Re: [R] reading data problem
Hi Jan;

Thanks so much. It is much appreciated. The problem has been solved.

Regards,
Greg

On Mon, Sep 24, 2018 at 3:05 PM Jan T Kim wrote:

> hmm... I don't see the quote="" parameter in your read.csv call
>
> Best regards, Jan

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] reading data problem
hmm... I don't see the quote="" parameter in your read.csv call

Best regards, Jan
--
Sent from my mobile. Apologies for typos and terseness

On Mon, Sep 24, 2018, 20:40 greg holly wrote:

> Hi Jan;
>
> Thanks so much for this. Yes, I did. Here is my code to read
> data: a<-read.csv("for_R_graphs.csv", header=T, sep=",")
Re: [R] reading data problem
Hi Jan;

Thanks so much for this. Yes, I did. Here is my code to read the data:

a<-read.csv("for_R_graphs.csv", header=T, sep=",")

On Mon, Sep 24, 2018 at 2:07 PM Jan T Kim via R-help wrote:

> Yet one more: have you tried adding quote="" to your read.table
> parameters? Quote characters have a 50% chance of being balanced,
> and they can encompass multiple lines...
Re: [R] reading data problem
Hi Bert;

Thanks for writing. Here are my answers to your questions:

1. What is your OS? What is your R version?
   *The version is 3.5.0*
2. How do you know that your data has 151 rows?
   *Because I looked in Excel; I also work on the same data in SAS*
3. Are there stray characters -- perhaps a stray eof -- in your data? Have you checked around row 96 to see what's there?
   *I don't think I have stray characters*
4. Are the data you did get in R what you expect?
   *I will run for some graphics*
5. Have you tried shutting down, restarting R, and rereading?
   *Yes, and again I had the same problem*

Regards,
Greg
Re: [R] reading data problem
Yet one more: have you tried adding quote="" to your read.table
parameters? Quote characters have a 50% chance of being balanced,
and they can encompass multiple lines...

On Mon, Sep 24, 2018 at 11:40:47AM -0700, Bert Gunter wrote:

> One more question:
>
> 5. Have you tried shutting down, restarting R, and rereading?
>
> -- Bert
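Jan's point about quote characters can be reproduced on a small file. The following is only a sketch under stated assumptions: the file contents and column names are invented for illustration, and it assumes the missing rows come from stray double quotes in the data, which the thread suggests but does not show.

```r
# Hypothetical data: 5 rows, two of which contain a stray double quote
# (an inch mark), so the quotes happen to be balanced across lines.
lines <- c("id,label,value",
           "1,alpha,10",
           "2,5\" pipe,20",
           "3,gamma,30",
           "4,6\" pipe,40",
           "5,epsilon,50")
tmp <- tempfile(fileext = ".csv")
writeLines(lines, tmp)

# Default quoting: the text between the two quote marks, line breaks
# included, is treated as part of one field, so rows are merged silently.
with_quotes <- read.csv(tmp, header = TRUE)

# Quoting disabled: every physical line of the file becomes one row.
no_quotes <- read.csv(tmp, header = TRUE, quote = "")

dim(with_quotes)
dim(no_quotes)
```

Comparing the two `dim()` results is a quick way to see whether quoting is what makes rows disappear. Note that quote="" is only appropriate when the file does not rely on quotes to protect embedded separators.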
Re: [R] reading data problem
One more question:

5. Have you tried shutting down, restarting R, and rereading?

-- Bert
Re: [R] reading data problem
*Perhaps* useful questions (perhaps *not*, though):

1. What is your OS? What is your R version?
2. How do you know that your data has 151 rows?
3. Are there stray characters -- perhaps a stray eof -- in your data? Have you checked around row 96 to see what's there?
4. Are the data you did get in R what you expect?

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Sep 24, 2018 at 11:27 AM greg holly wrote:

> Hi Dear all;
>
> I have a dataset of dimension 151*291. After reading the data into R I
> get data of dimension 96*291. I get no error message from R, and I cannot
> understand why the data are not read correctly.
>
> Here is my code to read the data:
> a<-read.table("for_R_graphs.csv", header=T, sep=",")
> a<-read.table("for_R_graphs.txt", header=T, sep="\t")
>
> Regards,
>
> Greg
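Questions 2 and 3 above can be checked from within R rather than in Excel. This is a minimal sketch on an invented file (the names and values are hypothetical); it compares the number of lines in the file with the number of rows read, and lists the lines containing characters that read.table treats as quotes by default.

```r
# Hypothetical file: 5 data rows, two of them with an apostrophe.
lines <- c("id,name,value",
           "1,Smith,10",
           "2,O'Brien,20",
           "3,Jones,30",
           "4,D'Arcy,40",
           "5,Brown,50")
tmp <- tempfile(fileext = ".csv")
writeLines(lines, tmp)

# Question 2: how many data rows does the file really have?
n_file <- length(readLines(tmp)) - 1   # minus the header line

# Same kind of call as in the original post; read.table's default
# quote set contains both the double quote and the apostrophe.
a <- read.table(tmp, header = TRUE, sep = ",")
n_read <- nrow(a)

# Question 3: which lines of the file contain a quote character?
suspects <- grep("[\"']", readLines(tmp))

c(in_file = n_file, read = n_read)
suspects
```

If `n_read` is smaller than `n_file`, the line numbers in `suspects` are the first places to look.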
Re: [R] reading data
Hi Jim,

With a little digging on my side, I have found why the script is skipping that file. The file is "ISO-8859 text, with CRLF line terminators".

The file should be ASCII. I converted it using dos2unix, which eliminated the CRLF line terminators, but I still cannot read it.

How can I read those files with "ISO-8859 text"?

On Tue, Jun 13, 2017 at 7:20 PM, jim holtman wrote:

> You need to provide reproducible data. What does the file contain? Why are
> you using 'sep=' when reading fixed format. You might be able to attach the
> '.txt' to your email to help with the problem. Also you did not state what
> the differences that you are seeing. So help us out here.
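One way to take both the encoding and the line terminators out of the picture is to read the bytes, convert them explicitly, and cut the fixed-width fields by hand. The sketch below is an illustration only: the file, its two-field layout (2-character code, 8-character name) and the column names are assumptions, not the layout of Adress1.txt.

```r
# Build a small hypothetical ISO-8859-1 file with CRLF line endings.
tmp <- tempfile(fileext = ".txt")
rows <- c("ABJos\u00e9    ", "CDAnn     ")
latin1_bytes <- iconv(paste0(rows, "\r\n", collapse = ""),
                      from = "UTF-8", to = "latin1", toRaw = TRUE)[[1]]
writeBin(latin1_bytes, tmp)

# Read the raw bytes and convert from Latin-1 to UTF-8 explicitly,
# so the result does not depend on the locale R is running in.
bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
text <- iconv(rawToChar(bytes), from = "latin1", to = "UTF-8")

# Split on LF, dropping an optional CR in front of it (CRLF files).
lines <- strsplit(text, "\r?\n")[[1]]

# Cut the fixed-width fields by character position.
df <- data.frame(code = substring(lines, 1, 2),
                 name = trimws(substring(lines, 3, 10)),
                 stringsAsFactors = FALSE)
df
```

read.fwf also has a fileEncoding argument, so passing fileEncoding = "latin1" with the file name (instead of a pre-opened connection) may be the shorter route when the session runs in a UTF-8 locale.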
Re: [R] reading data
You need to provide reproducible data. What does the file contain? Why are you using 'sep=' when reading a fixed format? You might be able to attach the '.txt' to your email to help with the problem. Also, you did not state what differences you are seeing. So help us out here.

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Tue, Jun 13, 2017 at 5:09 PM, Ashta wrote:

> Hi all,
>
> I am using R to extract data on a regular basis.
> However, sometimes using the same script and the same data I am
> getting different observations.
> The library I am using and how I am reading it is as follows.
>
> library(stringr)
> namelist <- file("Adress1.txt",encoding="ISO-8859-1")
> Name <- read.fwf(namelist,
>     colClasses="character", skip=2,sep="\t",fill=T,
>     width =c(2,8,1,1,1,1,1,1,9,5)+1,col.names=ccol)
>
> Can someone suggest how to track down the issue?
> Is it a library issue or a Java issue?
> May I read it as free format instead of fixed format?
>
> Thank you in advance
Re: [R] reading data
Try asking on the R-sig-geo mailing list.

Also, state what package(s) you are using, and include what you have already tried.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 1/19/17, 10:53 AM, "R-help on behalf of lily li" wrote:

    Hi R users,

    I'm trying to open netcdf files in R. Each nc file has daily climate
    measurements for a whole year, covering the whole US. How to limit the
    file to a specific rectangle?

    Thanks.
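Whatever package ends up being used, limiting a gridded file to a rectangle usually comes down to finding which longitude and latitude indices fall inside the box. The sketch below does that part in base R on an invented 1-degree grid; the helper name, the grid and the bounding box are all hypothetical, and it assumes the coordinate vectors are monotonic so the selected indices are contiguous.

```r
# Return 'start' and 'count' index vectors for a lon/lat rectangle.
# Assumes lon and lat are monotonic, so the matching indices are contiguous.
subset_window <- function(lon, lat, lon_range, lat_range) {
  i <- which(lon >= lon_range[1] & lon <= lon_range[2])
  j <- which(lat >= lat_range[1] & lat <= lat_range[2])
  stopifnot(length(i) > 0, length(j) > 0)
  list(start = c(min(i), min(j)), count = c(length(i), length(j)))
}

# Hypothetical 1-degree grid roughly covering the contiguous US.
lon <- seq(-125, -66, by = 1)
lat <- seq(24, 50, by = 1)

win <- subset_window(lon, lat, lon_range = c(-110, -100), lat_range = c(35, 42))
win

# With the ncdf4 package (an assumption; the thread names no package), the
# window could then be passed on. File name, variable name and the
# (lon, lat, time) dimension order below are made up:
# library(ncdf4)
# nc <- nc_open("climate_2010.nc")
# x  <- ncvar_get(nc, "tmax", start = c(win$start, 1), count = c(win$count, -1))
# nc_close(nc)
```

Reading only the window this way avoids loading the whole-US array for every day of the year.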
Re: [R] reading data into nested frames
Hi Ed,

I'm not sure I understand, but can't you read the files one by one and create one data.frame using rbind? It is easy to do in a loop too.

Best wishes,
Ulrik

On Thu, 2 Jun 2016, 20:23 Ed Siefker wrote:

> I have many data files named like this:
>
> E11.5-021415-dko-1-1-masked-bottom-area.tsv
> E11.5-021415-dko-1-1-masked-top-area.tsv
> E11.5-021415-dko-1-2-masked-bottom-area.tsv
> E11.5-021415-dko-1-2-masked-top-area.tsv
> E11.5-021415-dko-1-3-masked-bottom-area.tsv
> E11.5-021415-dko-1-3-masked-top-area.tsv
>
> age-date-genotype-num-slicenum-filler-position-data
>
> An individual sample is an age-date-geno-num, each sample has two
> parts, and is composed of around 10 slices. Each row of the tsv is an
> area which will be summed for the total area.
>
> What I want is a dataframe, with a row for each sample and a column
> for bottom and top. Under bottom and top, I want each element to be a
> dataframe with a row for each slice and a column for the area.
>
> So I can lapply over this list of files, use strsplit to pull out the
> slice num and put the area into the correct row of a dataframe easily
> enough. But I have a line for every datapoint, not sample, and there
> would be a dataframe for each area.
>
> How can I merge all the data for the slices into one data frame? Does
> this make sense?
> Thanks
> -Ed
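The read-and-rbind suggestion can be sketched as follows. This is an illustration under assumptions, not Ed's actual setup: the tiny files are generated here, each file is assumed to have a single column with the header `area`, and the file name fields are assumed not to contain hyphens themselves.

```r
# Make a few tiny hypothetical files so the sketch runs on its own.
data_dir <- tempfile("areas")
dir.create(data_dir)
files <- c("E11.5-021415-dko-1-1-masked-bottom-area.tsv",
           "E11.5-021415-dko-1-1-masked-top-area.tsv",
           "E11.5-021415-dko-1-2-masked-bottom-area.tsv",
           "E11.5-021415-dko-1-2-masked-top-area.tsv")
for (i in seq_along(files)) {
  write.table(data.frame(area = c(i, 10 * i)),
              file.path(data_dir, files[i]),
              sep = "\t", row.names = FALSE, quote = FALSE)
}

# Read one file and return one row: sample id, slice, position, summed area.
read_one <- function(path) {
  # age-date-genotype-num-slicenum-filler-position-data
  parts <- strsplit(sub("\\.tsv$", "", basename(path)), "-", fixed = TRUE)[[1]]
  areas <- read.delim(path, header = TRUE)
  data.frame(sample   = paste(parts[1:4], collapse = "-"),
             slice    = as.integer(parts[5]),
             position = parts[7],
             area     = sum(areas$area),
             stringsAsFactors = FALSE)
}

# One long data frame with a row per slice and position.
paths <- list.files(data_dir, pattern = "\\.tsv$", full.names = TRUE)
slices <- do.call(rbind, lapply(paths, read_one))

# Total area per sample and position, summed over slices.
totals <- aggregate(area ~ sample + position, data = slices, FUN = sum)
slices
totals
```

Keeping everything in one long data frame with `sample`, `slice` and `position` columns is a design choice that replaces the nested data frames asked about; the per-slice detail is still there and can be split out with `split()` when needed.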
Re: [R] Reading data with two rows of variable names using read.zoo
Thanks, Dan. Your code works fine.

But I have tens of countries (UK, JP, BR, US, ...), each of which has ten columns a1, a2, ..., a10 of data, so a little more automation is needed.

I have been trying to make a list of each country's data and use sapply to get

        UK JP
2009 Q2  6  5
2009 Q3  7  5
2009 Q4  8  7
2010 Q1  6  7
2010 Q2  6  3

But for me, it was not as easy as it looks...

Thank you in advance!
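The automation asked for here can be sketched in base R by reading the first header row separately and summing the columns that share a country code. It uses the sample data from the original question; the variable names are made up for illustration, and the approach assumes every data column belongs to exactly one country named in the first header row.

```r
txt <- "Index; UK; UK; JP; JP
Index; a1; a2; a1; a2
2009 2/4;2;4;3;2
2009 3/4;5;2;1;4
2009 4/4;7;1;1;6
2010 1/4;3;3;5;2
2010 2/4;5;1;2;1"

lines <- strsplit(txt, "\n")[[1]]

# Country code of each data column, taken from the first header row.
country <- trimws(strsplit(lines[1], ";")[[1]])[-1]

# The data rows, skipping both header rows.
dat <- read.table(text = lines[-(1:2)], sep = ";", as.is = TRUE)
values <- as.matrix(dat[, -1])

# For each country, sum all of its columns row by row.
totals <- sapply(unique(country), function(cc) {
  rowSums(values[, country == cc, drop = FALSE])
})

# Turn "2009 2/4" into "2009 Q2" for the row labels.
rownames(totals) <- sub("^([0-9]{4}) ([1-4])/4$", "\\1 Q\\2", dat$V1)
totals

# With the zoo package installed, something like
#   zoo::zoo(totals, zoo::as.yearqtr(rownames(totals)))
# should give the zoo object; it is left as a comment so the sketch runs without zoo.
```

Because the grouping is driven by the header row, the same code handles any number of countries and any number of columns per country.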
Re: [R] Reading data with two rows of variable names using read.zoo
Not a guru, but this isn't that hard. The following works with your sample data. It shouldn't be too difficult to modify for your full file.

library(zoo)
df <- read.table('path_to_your_data', sep=';', skip=2, as.is=TRUE)
str(df)
substr(df$V1,5,5) <- '-'                   # "2009 2/4" becomes "2009-2/4"
df$V1 <- as.yearqtr(substr(df$V1,1,6))     # "2009-2" is parsed as 2009 Q2
df$A <- rowSums(df[,c(2,3)])               # UK: a1 + a2
df$B <- rowSums(df[,c(4,5)])               # JP: a1 + a2
want <- as.zoo(df[,-c(2:5)])
want

Hope this is helpful,

Dan

Daniel Nordlund, PhD
Research and Data Analysis Division
Services & Enterprise Support Administration
Washington State Department of Social and Health Services

-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of ???
Sent: Tuesday, July 28, 2015 9:42 AM
To: r-help@r-project.org
Subject: [R] Reading data with two rows of variable names using read.zoo

Dear R gurus.

I have a data file which has two rows of variable names. And the time index has a little unusual format. I have no idea how to handle two names and awkward indexing for the quarters.

Lines <- "Index; UK; UK; JP; JP
Index; a1; a2; a1; a2
2009 2/4;2;4;3;2
2009 3/4;5;2;1;4
2009 4/4;7;1;1;6
2010 1/4;3;3;5;2
2010 2/4;5;1;2;1"

(a snippet from a big data set containing a1, a2, ..., a10 of many countries)

I want to sum a1 and a2 for UK, JP and obtain a zoo object like this:

        A B
2009 Q2 6 5
2009 Q3 7 5
2009 Q4 8 7
2010 Q1 6 7
2010 Q2 6 3

This looks quite challenging. Thanks for your time.
Re: [R] reading data using XTS package
On Tue, Nov 18, 2014 at 9:42 PM, Upananda Pani upananda.p...@gmail.com wrote:
> Dear All,
>
> I want to read my time series data using the xts package and then
> calculate returns using the PerformanceAnalytics package, but I am getting
> the following error. Please help me to solve the problem. The error follows:
>
> # Required Libraries
> library(xts)
> library(PerformanceAnalytics)
>
> # Reading Data
> x <- read.csv('crude.csv')
> y <- xts(x[,1:2], as.numeric(x[,2:2]), order.by=as.Date(x[,1], format='%d-%b-%y'))
> close <- y$close
>
> # Calculating Return
> rspot = Return.calculate(close, method = c("discrete"))
>
> Error in `/.default`(pr, lag(pr)) : non-numeric argument to binary operator
>
> I do not see where I am making the mistake.

I'm not sure what you were trying to do, because:
1) You didn't provide a reproducible example (I don't have 'crude.csv'), and
2) it doesn't make sense to use unnamed arguments for the first two
arguments, and then use a named argument to re-specify the second
argument (order.by).

You probably wanted something like:

close <- xts(x[,2], as.Date(x[,1], format='%d-%b-%y'))

> With sincere regards,
> Upananda
>
> --
> You may delay, but time will not.
>
> Research Scholar
> alternative mail id: up...@iitkgp.ac.in
> Department of HSS, IIT KGP
> KGP

--
Joshua Ulrich | about.me/joshuaulrich
FOSS Trading | www.fosstrading.com
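A likely reason for the "non-numeric argument" error can be shown without xts at all: when the date column is passed along with the prices, the data are held in a single matrix, and a matrix can only have one type. The sketch below uses an invented three-row stand-in for crude.csv (the values and column names are hypothetical) and computes discrete returns by hand for comparison.

```r
# Hypothetical stand-in for the first rows of 'crude.csv'.
x <- data.frame(date  = c("03-Nov-14", "04-Nov-14", "05-Nov-14"),
                close = c(80, 78, 79.56),
                stringsAsFactors = FALSE)

# Passing both columns puts the dates into the data matrix, and since
# a matrix holds one type only, the prices become character too.
both <- as.matrix(x[, 1:2])
is.character(both)

# Passing only the price column keeps the data numeric.
prices <- x[, 2]
is.numeric(prices)

# Discrete returns computed by hand: p[t] / p[t-1] - 1
ret <- prices[-1] / prices[-length(prices)] - 1
ret
```

With xts and PerformanceAnalytics loaded, the call suggested above, followed by Return.calculate(close, method = "discrete"), should produce the same returns with a leading NA for the first date.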
Re: [R] reading data from a web
You did not read the data with the commands you provided, since c1 is not defined, so read.fwf() fails immediately. Here is a solution that works for the link you provided, but it would need to be modified for months that do not have 30 days:

lnk <- "http://www.data.jma.go.jp/gmd/env/data/radiation/data/geppo/201004/DR201004_sap.txt"
raw <- readLines(lnk)                   # Read the file as text lines
raw <- raw[19:48]                       # Pull out the data
raw <- substr(raw, 16, nchar(raw))      # Strip the leading blanks
raw <- gsub("  +", ",", raw)            # Replace two or more blanks with a comma
raw <- gsub("\\.\\.\\.", "NA", raw)     # Replace ... with NA
Solar <- read.csv(text=raw, header=FALSE, colClasses=c("character",
    rep("numeric", 25)))
str(Solar)

'data.frame': 30 obs. of 26 variables:
 $ V1 : chr "4 1" "4 2" "4 3" "4 4" ...
 $ V2 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V3 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V4 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V5 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V6 : num NA NA NA NA NA NA NA NA NA NA ...
 $ V7 : num 0 0 0 2 0 8 0 75 2 0 ...
 $ V8 : num 0 0 17 133 0 27 36 218 1 1 ...
 $ V9 : num 0 98 29 205 0 23 4 280 1 0 ...
 $ V10: num 2 190 62 100 0 9 0 310 7 12 ...
 $ V11: num 0 237 49 227 86 9 0 321 0 0 ...
 $ V12: num 0 303 21 151 177 13 1 304 52 0 ...
 $ V13: num 0 286 72 199 131 8 2 320 33 6 ...
 $ V14: num 0 318 203 284 30 1 102 285 9 130 ...
 $ V15: num 0 314 241 282 10 0 43 286 93 107 ...
 $ V16: num 1 270 171 256 6 1 0 272 181 27 ...
 $ V17: num 3 190 100 214 34 0 11 255 177 0 ...
 $ V18: num 0 89 69 129 24 0 8 205 138 0 ...
 $ V19: num 0 7 2 27 2 0 0 80 30 0 ...
 $ V20: num 0 0 0 0 0 0 0 0 0 0 ...
 $ V21: num NA NA NA NA NA NA NA NA NA NA ...
 $ V22: num NA NA NA NA NA NA NA NA NA NA ...
 $ V23: num NA NA NA NA NA NA NA NA NA NA ...
 $ V24: num NA NA NA NA NA NA NA NA NA NA ...
 $ V25: num NA NA NA NA NA NA NA NA NA NA ...
 $ V26: num 6 2302 1036 2209 500 ...

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alemu Tadesse
Sent: Wednesday, October 29, 2014 2:21 PM
To: r-help@r-project.org
Subject: [R] reading data from a web

Dear All,

I have data of the format shown in the link http://www.data.jma.go.jp/gmd/env/data/radiation/data/geppo/201004/DR201004_sap.txt that I need to read. I have downloaded all the data from the link and I have it on my computer. I used the following script (got it from the web) and was able to read the data. But it is not in the format that I wanted. I want it as a data frame with clean numbers.

asNumeric <- function(x) as.numeric(as.character(x))
factorsNumeric <- function(data) modifyList(data, lapply(data[, sapply(data, is.logical)], asNumeric))
data=read.fwf(filename, widths=c(c1),skip=18, header=FALSE)
data$V2 <- as.numeric(gsub(" ","", as.character(data$V2) , fixed=TRUE))
f <- factorsNumeric(data)

Any help is appreciated.

Best,
Alemu
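The fixed line range 19:48 is what ties the solution above to 30-day months. One possible way around it is to pick the data lines by pattern instead of by position. The sketch below runs on invented lines rather than the real file, and the pattern is an assumption about the layout (15 leading blanks, then month and day separated by one blank), inferred from the `substr(raw, 16, ...)` step and the `"4 1"` values in the output above.

```r
# Hypothetical lines in the same style as the file: a header line plus
# two data lines with 15 leading blanks and "..." for missing values.
pad <- strrep(" ", 15)
raw <- c("Hourly global solar radiation",
         paste0(pad, "4 1   ...     0    12"),
         paste0(pad, "4 2   ...    98   190"))

# Keep only lines that look like data, whatever the length of the month.
is_data <- grepl("^ {15}[0-9]{1,2} [0-9]{1,2} ", raw)
body <- substr(raw[is_data], 16, nchar(raw[is_data]))

# Same clean-up steps as above: blanks to commas, "..." to NA.
csv <- gsub(" {2,}", ",", body)
csv <- gsub("...", "NA", csv, fixed = TRUE)

Solar <- read.csv(text = csv, header = FALSE,
                  colClasses = c("character", rep("numeric", 3)))
str(Solar)
```

Before relying on the pattern, it is worth checking `sum(is_data)` against the number of days in the month for a few real files.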
Re: [R] reading data saved with writeBin() into anything other than R
> For me that other software would probably be Octave. I'm interested if
> anyone here has read in these files using Octave, or a C program or
> anything else.

I typed 'octave read binary file' into google.com and the first hit was the Octave help file for its fread function. In C, fread is also a good way to go (C and Octave have different argument lists for their fread functions).

In the Linux shell you can use the od command.

% R --quiet
con <- gzcon(file("/tmp/file.gz", "wb")) # your gzcon("/tmp/file.gz", "wb") resulted in an error message
writeBin(c(121:130,129:121), con, size=2)
close(con)
q("no")
% zcat /tmp/file.gz | od --format d2
0000000    121    122    123    124    125    126    127    128
0000020    129    130    129    128    127    126    125    124
0000040    123    122    121
0000046

Bill Dunlap
TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mike Miller
Sent: Monday, April 21, 2014 6:00 PM
To: R-Help List
Subject: [R] reading data saved with writeBin() into anything other than R

After saving a file like so...

con <- gzcon("file.gz", "wb"))
writeBin(vector, con, size=2)
close(con)

I can read it back into R like so...

con <- gzcon("file.gz", "rb"))
vector <- readBin(con, integer(), 4800, size=2, signed=FALSE)
close(con)

...and I'm wondering what other programs might be able to read in these data. It seems to be very straightforward: when I store 5436 integers for each of 7696 subjects, at two bytes per integer that ought to be 5436*7696*2 = 83670912 bytes, and it is exactly that:

$ zcat file.gz | wc -c
83670912

So if I just convert every pair of bytes to an integer, I guess that will do it. I stored them this way because it was compact, but I guess this system can also work well when other software needs to read the data.

For me that other software would probably be Octave. I'm interested if anyone here has read in these files using Octave, or a C program or anything else. If I don't get a good answer here, I'll try the Octave list, and I'll send my best answers here.
The rest of this is some related info for readers of this list. You don't need to read below to answer my question above. Thanks.

In case anyone is interested, I did some comparisons of loading speed and file size for a number of ways of storing my data. These data all consist of positive numbers between 0 and 2, with three digits to the right of the decimal, so I can save them as floating point double-precision, or multiply by 1000 and store them as integers. The test here was for a matrix of 5000 x 7845 = 39,225,000 values. These are the file sizes:

202.1 MB  tab-delimited text file, original, uncompressed
 29.9 MB  tab-delimited text file, original, gzip compressed
187.7 MB  tab-delimited text file, integers, uncompressed
 24.6 MB  tab-delimited text file, integers, gzip compressed
 38.9 MB  R save() original numeric values (doubles)
 27.0 MB  R save() integers
 19.7 MB  R writeBin() 16-bit integer gzipped

So, for file size (important in my case), the gzipped writeBin() method storing 16-bit integers was the winner.
Impressively, storing the data that way and dividing by 1000 on the fly to return the original numbers was faster than reading an Rdata file of the matrix:

The integer text file:

system.time( D <- matrix( scan( file = "D/D000", what=integer(0) ), ncol=7845, byrow=TRUE ) )
Read 39225000 items
   user  system elapsed
 10.626   0.344  10.971

The R save() original numeric values (doubles):

system.time( load("D000_test.Rdata") )
   user  system elapsed
  5.579   0.119   5.698

The R save() integers:

system.time( load("D000_test.Rdata") )
   user  system elapsed
  4.863   0.050   4.913

The writeBin() 16-bit integer gzipped file:

con <- gzcon(file("D000_test.gz", "rb"))
system.time( D <- matrix( readBin( con, integer(), 7845*5000, size=2, signed=FALSE ), ncol=7845, byrow=TRUE ) )
   user  system elapsed
  3.769   0.138   3.906
close(con)

The writeBin() 16-bit integer gzipped file, converted to numeric by dividing by 1000 on the fly:

system.time( D <- matrix( readBin( con, integer(), 7845*5000, size=2, signed=FALSE ), ncol=7845, byrow=TRUE )/1000 )
   user  system elapsed
  4.159   0.237   4.397
close(con)

Best,
Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4J
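The storage scheme described in this message (multiply by 1000, write as 16-bit integers through gzcon(), divide by 1000 when reading back) can be sketched as a small round trip. The six sample values and the 2 x 3 matrix shape are made up for illustration; the temporary file stands in for the real data file.

```r
# Hypothetical sample: values between 0 and 2 with three decimals.
values <- c(0.125, 1.999, 0.000, 2.000, 0.731, 1.004)
scaled <- as.integer(round(values * 1000))   # fits in 16 bits (0..2000)

# Write as gzip-compressed 2-byte integers.
path <- tempfile(fileext = ".gz")
con <- gzcon(file(path, "wb"))
writeBin(scaled, con, size = 2L)
close(con)

# Read back as unsigned 2-byte integers and undo the scaling.
con <- gzcon(file(path, "rb"))
back <- readBin(con, integer(), n = length(scaled), size = 2L, signed = FALSE)
close(con)

restored <- matrix(back, ncol = 3, byrow = TRUE) / 1000
```

Both writeBin() and readBin() default to the platform's byte order; when another program has to read the file, passing an explicit endian = "little" (or "big") on both sides removes that ambiguity.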
Re: [R] reading data saved with writeBin() into anything other than R
On Tue, 22 Apr 2014, William Dunlap wrote:

>> For me that other software would probably be Octave. I'm interested if
>> anyone here has read in these files using Octave, or a C program or
>> anything else.
>
> I typed 'octave read binary file' into google.com and the first hit was
> the Octave help file for its fread function. In C, fread is also a good
> way to go (C and Octave have different argument lists for their fread
> functions). In the Linux shell you can use the od command.

Thanks! My mistake was that I was searching using "R" and "writebin" in my search string, which limited my results too severely. I actually figured that out before your message came in and felt a little embarrassed, and that has only gotten worse. But you did give me something cool that I didn't know:

> % R --quiet
> con <- gzcon(file("/tmp/file.gz", "wb")) # your gzcon("/tmp/file.gz", "wb") resulted in an error message
> writeBin(c(121:130,129:121), con, size=2)
> close(con)
> q("no")
> % zcat /tmp/file.gz | od --format d2
> 0000000    121    122    123    124    125    126    127    128
> 0000020    129    130    129    128    127    126    125    124
> 0000040    123    122    121
> 0000046

That's really neat. With my data, I can do this to return the original matrix:

zcat file.gz | od -vtd2 -w15392 -An > matrix.txt

It is quite fast, too:

$ time -p zcat D1.gz | od -vtd2 -w15392 -An > /dev/null
real 6.08
user 6.86
sys 0.08

If I had realized how little my writeBin() output files had to do with R, I probably wouldn't have posted here, but I'm glad I did.

FYI -- I was able to use fread() in Octave on the uncompressed version of the file, but it isn't handling the zipped version as expected. That's an Octave problem, so I'll deal with them on that one. I might not have zlib compiled in, or maybe they still have a bug in that function.

Thanks!
Mike
Re: [R] Reading data from Census API into R
I got it:

library(rjson)
library(plyr)
test<-fromJSON(file=url("http://api.census.gov/data/2010/sf1?key=mykey&get=P0030001,NAME&for=county:*&in=state:48"))
test2<-ldply(test)[-1,]
names(test2)<-ldply(test)[1,]
head(test2)
  P0030001             NAME state county
2    58458  Anderson County    48    001
3    14786   Andrews County    48    003
4    86771  Angelina County    48    005
5    23158   Aransas County    48    007
6     9054    Archer County    48    009
7     1901 Armstrong County    48    011

-----
Corey Sparks, PhD
Assistant Professor
Department of Demography
University of Texas at San Antonio
501 West César E. Chávez Blvd
Monterey Building 2.270C
San Antonio, TX 78207
210-458-3166
corey.sparks 'at' utsa.edu
coreysparks.weebly.com
--
View this message in context: http://r.789695.n4.nabble.com/Reading-data-from-Census-API-into-R-tp4684877p4684881.html
Sent from the R help mailing list archive at Nabble.com.
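The reshaping step in this message (first element holds the column names, the remaining elements hold the rows) can also be done in base R without plyr. A minimal sketch, assuming the parsed JSON arrives as a list of character vectors; the rows object below is a hand-built stand-in using two rows from the output shown above, not a live API response.

```r
# Stand-in for the parsed JSON: header row first, then one vector per county.
rows <- list(
  c("P0030001", "NAME", "state", "county"),
  c("58458", "Anderson County", "48", "001"),
  c("14786", "Andrews County", "48", "003")
)

# Bind the data rows into a character matrix, then label the columns
# with the header row.
body <- do.call(rbind, rows[-1])
census <- as.data.frame(body, stringsAsFactors = FALSE)
names(census) <- rows[[1]]

# Counts arrive as strings; convert the population column explicitly.
census$P0030001 <- as.integer(census$P0030001)
```

Keeping state and county as character preserves the leading zeros in codes such as "001".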
Re: [R] Reading data from Excel file in r
Take a look at the XLConnect package. I use it for all the reading/writing for Excel files.

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Mon, Nov 4, 2013 at 8:47 AM, Baro babak...@gmail.com wrote:
> Hi experts,
>
> I want to read data from an excel data like this: for the fifth column,
> from first row until 140 but only 1,3,5,7,...,139 (only 70 values).
>
> How can I do it in R?
>
> thanks
Re: [R] Reading data from Excel file in r
You can use the XLConnect package to read in a range of rows and columns, then define a function to subset the odd rows. For example,

library(XLConnect)
wb <- loadWorkbook("C:/temp/MyData.xls")
dat <- readWorksheet(wb, sheet=getSheets(wb)[1], startRow=1, endRow=139, startCol=5, endCol=5)
dat <- readWorksheet(wb, sheet=getSheets(wb)[1], startRow=1, endRow=79, startCol=5, endCol=5)
odds <- function(x) x[seq(1, length(x), 2)]
odds(unlist(dat))

Jean

On Mon, Nov 4, 2013 at 7:47 AM, Baro babak...@gmail.com wrote:
> Hi experts,
>
> I want to read data from an excel data like this: for the fifth column,
> from first row until 140 but only 1,3,5,7,...,139 (only 70 values).
>
> How can I do it in R?
>
> thanks
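The odd-row selection itself does not depend on XLConnect. A minimal base-R sketch of the same idea, where dat is a made-up one-column data frame standing in for whatever the worksheet read returns:

```r
# Hypothetical stand-in for the fifth worksheet column, rows 1..140.
dat <- data.frame(value = 1:140)

# Row positions 1, 3, 5, ..., 139.
odd_rows <- seq(1, nrow(dat), by = 2)
picked <- dat$value[odd_rows]
```

Indexing by row position after the read keeps the Excel access to a single rectangular range, which tends to be simpler than requesting 70 separate cells.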
Re: [R] Reading data from Excel file in r
thanks alot, but now I have another problem: my Excel file is very big and I get this error, which says: Error: OutOfMemoryError (Java): Java heap space Is there any way to read each value one by one and save them in an array?
Re: [R] Reading data from Excel file in r
Perhaps the discussion at this link will help ... (see especially the second answer). http://stackoverflow.com/questions/7963393/out-of-memory-error-java-when-using-r-and-xlconnect-package Jean
Re: [R] Reading data from Excel file in r
thanks, I changed my code, but still have the same problem :/
Re: [R] Reading data from Excel file in r
Is this an .xlsx file format? If so convert to .xls and try again. .xlsx is compressed and takes a lot more resources in XLConnect
Re: [R] Reading data from Excel file in r
thanks Jim, I have tried it but still the same error :/
Re: [R] Reading data from a text file conditionally skipping lines
Hi,

It would be better to give an example. If your dataset is like the one attached:

con<-file("Trial1.txt")
Lines1<- readLines(con)
close(con)

#If the data you wanted to extract is numeric and the header and footer are characters,
dat1<-read.table(text=Lines1[-grep("[A-Za-z]",Lines1)],sep="\t",header=FALSE)
dat1
#   V1 V2 V3 V4 V5
#1  38 43 39 44 45
#2  39 44 36 49 46
#3  42 45 47 49 37
#4  34 43 39 45 45
#5  38 42 39 44 47
#6  43 44 46 42 37
#7  32 49 38 42 45
#8  34 45 35 49 46
#9  44 45 46 49 37
#10 34 43 39 48 49
#11 38 42 39 47 47
#12 43 44 46 42 37
#13 37 43 39 44 45
#14 39 42 36 49 46
#15 42 45 47 49 37

#or
You mentioned that the data is repeated every so many lines. Here also, there is a repeating pattern.

head(Lines1,10)
#[1] "Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat."
#[2] "Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis"
#[3] "38\t43\t39\t44\t45"
#[4] "39\t44\t36\t49\t46"
#[5] "42\t45\t47\t49\t37"
#[6] "Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat."
#[7] "Vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi."
#[8] "34\t43\t39\t45\t45"
#[9] "38\t42\t39\t44\t47"
#[10] "43\t44\t46\t42\t37"

dat2<-read.table(text=Lines1[rep(rep(c(FALSE,TRUE),times=c(2,3)),5)],sep="\t",header=FALSE)
identical(dat1,dat2)
#[1] TRUE

A.K.

> I have a text file that is nicely formatted (tab separated). However, it
> has some header and footer information after every so many lines. I do
> not want to read this information in my dataframe. What is the best way
> to read this data into R?
>
> Thanks for all the help!

Attached file (Trial1.txt):

Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat.
Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis
38 43 39 44 45
39 44 36 49 46
42 45 47 49 37
Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat.
Vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.
34 43 39 45 45
38 42 39 44 47
43 44 46 42 37
Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat.
Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis
32 49 38 42 45
34 45 35 49 46
44 45 46 49 37
Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat.
Vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.
34 43 39 48 49
38 42 39 47 47
43 44 46 42 37
Lorem ipsum dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat.
Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis
37 43 39 44 45
39 42 36 49 46
42 45 47 49 37
Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat.
Vel illum dolore eu feugiat nulla facilisis at vero eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi.
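A variation on the first approach in this reply, offered only as a sketch: instead of dropping lines that contain letters, keep the lines that look like tab-separated integers. This may be more robust if header or footer lines could contain digits or punctuation only. The five sample lines are a shortened, hypothetical version of the attached file, and the pattern assumes whole-number fields.

```r
# Shortened stand-in for the file contents (as returned by readLines()).
lines <- c("Lorem ipsum dolor sit amet",
           "38\t43\t39\t44\t45",
           "39\t44\t36\t49\t46",
           "Duis autem vel eum iriure dolor",
           "34\t43\t39\t45\t45")

# TRUE for lines made only of integers separated by single tabs.
is_data <- grepl("^[0-9]+(\t[0-9]+)*$", lines)

# Parse only the matching lines.
dat <- read.table(text = lines[is_data], sep = "\t", header = FALSE)
```

If the real data contains decimals or negative numbers, the pattern would need to be widened accordingly.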
Re: [R] Reading Data
Hi,

I tried to read your data from the image:

OPENCUT<- read.table("OpenCut.dat",header=TRUE,sep="\t")
OPENCUT
          FC     LC  SR  DM
1  400030.34 1323.5   0 400
2   12680.13    2.5   0 180
3  472272.75 2004.7   3 300
4  332978.03 1301.3 106 180
5   98654.20  295.0   0 180
6   68142.05  259.9  69 125
7  178433.11  425.0  49 180
8   96765.83  635.5  12 180
9  204808.90  640.4   0 400
10 151760.20  357.0   0 180
11  91330.42  173.6   6 180
12  93154.33  197.5  16 125
13 121030.15  203.0  30 125
14  60132.75  160.0  26 125
15  32233.78   69.0   8  90
16      5.13  137.0   0 125
17 160791.82 1335.0   0 180
18  62531.76   80.5  21 180

str(OPENCUT)
#'data.frame': 18 obs. of 4 variables:
# $ FC: num 400030 12680 472273 332978 98654 ...
# $ LC: num 1323.5 2.5 2004.7 1301.3 295 ...
# $ SR: int 0 0 3 106 0 69 49 12 0 0 ...
# $ DM: int 400 180 300 180 180 125 180 180 400 180 ...

You didn't mention whether you attach(OPENCUT) or not. If you didn't attach the data:

OPENCUT$FC
# [1] 400030.34 12680.13 472272.75 332978.03 98654.20 68142.05 178433.11
# [8] 96765.83 204808.90 151760.20 91330.42 93154.33 121030.15 60132.75
#[15] 32233.78 5.13 160791.82 62531.76
OPENCUT$LC
# [1] 1323.5 2.5 2004.7 1301.3 295.0 259.9 425.0 635.5 640.4 357.0
#[11] 173.6 197.5 203.0 160.0 69.0 137.0 1335.0 80.5
OPENCUT$SR
# [1] 0 0 3 106 0 69 49 12 0 0 6 16 30 26 8 0 0 21
OPENCUT$DM
# [1] 400 180 300 180 180 125 180 180 400 180 180 125 125 125 90 125 180 180
OPENCUT[,1]
# [1] 400030.34 12680.13 472272.75 332978.03 98654.20 68142.05 178433.11
# [8] 96765.83 204808.90 151760.20 91330.42 93154.33 121030.15 60132.75
#[15] 32233.78 5.13 160791.82 62531.76

attach(OPENCUT) #this I won't recommend
FC
# [1] 400030.34 12680.13 472272.75 332978.03 98654.20 68142.05 178433.11
# [8] 96765.83 204808.90 151760.20 91330.42 93154.33 121030.15 60132.75
#[15] 32233.78 5.13 160791.82 62531.76
LC
# [1] 1323.5 2.5 2004.7 1301.3 295.0 259.9 425.0 635.5 640.4 357.0
#[11] 173.6 197.5 203.0 160.0 69.0 137.0 1335.0 80.5
SR
# [1] 0 0 3 106 0 69 49 12 0 0 6 16 30 26 8 0 0 21
DM
# [1] 400 180 300 180 180 125 180 180 400 180 180 125 125 125 90 125 180 180

Not sure how you got those errors. It would be better if you dput(OPENCUT).

A.K.

> Hi,
>
> I know there are several similar posts, but I really have trouble reading
> a simple .dat file and I don't understand why. In fact, I am able to
> import the data, but only the first and last columns are recognized as a
> proper vector set. Could you please tell me why I am not able to read the
> SR and LC columns?
>
> Please see the screenshot attached:
>
> As you can see, columns FC and DM are well read by R; however, the other
> two are not. I've also tried to import it as a csv sheet but the result
> was the same.
>
> Any help would be appreciated!
>
> Many thanks,
> Stoyan
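Since the reply advises against attach(), here is a small sketch of two alternatives: checking column types right after the read, and using with() to refer to columns by name without attaching. The inline text reuses the first three rows shown above; reading from a string via text= is only for illustration, in place of the real OpenCut.dat file.

```r
# First three rows of the data shown above, tab separated.
txt <- "FC\tLC\tSR\tDM
400030.34\t1323.5\t0\t400
12680.13\t2.5\t0\t180
472272.75\t2004.7\t3\t300"

opencut <- read.table(text = txt, header = TRUE, sep = "\t")

# Confirm every column was read as a number (not character or factor).
col_classes <- sapply(opencut, class)

# Refer to columns by name without attach().
mean_lc <- with(opencut, mean(LC))
total_sr <- with(opencut, sum(SR))
```

If a column shows up as character or factor in col_classes, that usually points to a stray non-numeric entry or a wrong separator in the file.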
Re: [R] reading data
Hi,

Try this:

files<-paste("MSMS_",23,"PepInfo.txt",sep="")
read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
lista<-do.call(c,lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
res2<-split(lista,names(lista))
res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})

#Freq whole data
res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z,levels=1:3))))))
names(res4)<- names(res2)
library(reshape2)
freq.i1<-do.call(rbind,lapply(res4,function(x) dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))
freq.i1
#          id 1  2 3
#group_a   a1 1 12 6
#group_c.1 c1 0 10 3
#group_c.2 c2 0 12 3
#group_c.3 c3 0 13 4
#group_t.1 t1 0 10 4
#group_t.2 t2 1 12 6

freq.rel.i1<- as.matrix(freq.i1[,-1]/rowSums(freq.i1[,-1]) )
freq.rel.i1
#                   1         2         3
#group_a   0.05263158 0.6315789 0.3157895
#group_c.1 0.00000000 0.7692308 0.2307692
#group_c.2 0.00000000 0.8000000 0.2000000
#group_c.3 0.00000000 0.7647059 0.2352941
#group_t.1 0.00000000 0.7142857 0.2857143
#group_t.2 0.05263158 0.6315789 0.3157895

#Freq with FDR<0.01
res5<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z[x[["FDR"]]<0.01],levels=1:3))))))
names(res5)<- names(res2)
freq.f1<- do.call(rbind,lapply(res5,function(x) dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))
freq.f1
#          id 1  2 3
#group_a   a1 1 10 5
#group_c.1 c1 0  7 2
#group_c.2 c2 0  8 2
#group_c.3 c3 0  6 4
#group_t.1 t1 0  7 4
#group_t.2 t2 1 10 5

freq.rel.f1<- as.matrix(freq.f1[,-1]/rowSums(freq.f1[,-1]))
colour<-sample(rainbow(nrow(freq.rel.i1)))
par(mfrow=c(1,2))
barplot(freq.rel.i1,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i1))
barplot(freq.rel.f1,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f1))
#change the legend position

Also, I didn't check the rest of the code from the chi-square test.

A.K.

From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Tuesday, February 19, 2013 4:19 PM
Subject: Re: reading data

Here is the code and some outputs.
z.plot <- function(directory,number) {
  #reading data
  setwd(directory)
  direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive = TRUE)
  directT <- direct[grepl("^t", direct)]
  directC <- direct[grepl("^c", direct)]
  lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
  listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t"))
  listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t"))

  #count different z values
  cab <- vector()
  for (i in 1:length(lista)) {
    dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
    dc<-table(dc$z)
    cab <- c(cab, names(dc))
  }

  #Relative freqs to construct the graph
  cab <- unique(cab)
  print(cab)
  ###[1] "2" "3" "1"
  d <- matrix(ncol=length(cab))
  dci<- d[-1,]
  dcf <- d[-1,]
  dti <- d[-1,]
  dtf <- d[-1,]
  for (i in 1:length(listaC)) {
    #Relative freq of all data
    dcc<-listaC[[i]]
    dcc<-table(factor(dcc$z, levels=cab))
    dci<- rbind(dci, dcc)
    rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")
    #Relative freq of data with FDR<0.01
    dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
    dcc1<-table(factor(dcc1$z, levels=cab))
    dcf<- rbind(dcf,dcc1)
    rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
  }
  for (i in 1:length(listaT)) {
    #Relative freq of all data
    dct<-listaT[[i]]
    dct<-table(factor(dct$z, levels=cab))
    dti<- rbind(dti, dct)
    rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")
    #Relative freq of data with FDR<0.01
    dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
    dct1<-table(factor(dct1$z, levels=cab))
    dtf<- rbind(dtf,dct1)
    rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
  }
  freq.i<-rbind(dci,dti)
  freq.f<-rbind(dcf,dtf)
  freq.rel.i<-freq.i/apply(freq.i,1,sum)
  freq.rel.f<-freq.f/apply(freq.f,1,sum)
  print(freq.i)
  ##    2 3 1
  #c1 10 3 0
  #c2 12 3 0
  #c3 13 4 0
  #t1 10 4 0
  #t2 12 6 1
  print(freq.f)
  ###   2 3 1
  #c1  7 2 0
  #c2  8 2 0
  #c3  6 4 0
  #t1  7 4 0
  #t2 10 5 1
  print(freq.rel.i)
  ###         2         3          1
  #c1 0.7692308 0.2307692 0.00000000
  #c2 0.8000000 0.2000000 0.00000000
  #c3 0.7647059 0.2352941 0.00000000
  #t1 0.7142857 0.2857143 0.00000000
  #t2 0.6315789 0.3157895 0.05263158
  print(freq.rel.f)
  ###         2         3          1
  #c1 0.7777778 0.2222222 0.00000000
  #c2 0.8000000 0.2000000 0.00000000
  #c3 0.6000000
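The core counting step in the code above (tabulate the charge z over fixed levels, once for all rows and once for rows with FDR below 0.01, then convert to relative frequencies) can be shown on a tiny made-up example. The vectors z and fdr are hypothetical and stand in for the z and FDR columns of one PepInfo file.

```r
# Hypothetical charge states and FDR values for six peptides.
z   <- c(2, 2, 3, 1, 2, 3)
fdr <- c(0.001, 0.2, 0.005, 0.5, 0.009, 0.02)

# Fixing the factor levels guarantees a count (possibly 0) for every
# charge state, so tables from different files line up when row-bound.
counts_all <- table(factor(z, levels = 1:3))
counts_fdr <- table(factor(z[fdr < 0.01], levels = 1:3))

# Relative frequencies within each table.
rel_all <- counts_all / sum(counts_all)
rel_fdr <- counts_fdr / sum(counts_fdr)
```

Using the same levels in every call is what lets rbind() stack the per-file tables into a matrix that barplot() can draw directly.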
Re: [R] reading data
Hi Vera,

Not sure I understand your question. Your statement:

"In my lista I can't merge rows to have the group, because the idea is for each file count frequencies of mm, when b<0.01. After that I want a graph like the graph in attach."

files<-paste("MSMS_",23,"PepInfo.txt",sep="")
read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
lista<-do.call(c,lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
res2<-split(lista,names(lista))
res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]], function(x) x[x[["b"]]<0.01,])))
names(res4)<- names(res2)
res4
lapply(res4,function(x) table(x$mm))
#$group_a
#2 3
#9 3
#$group_b
#2 3
#6 2
#$group_c
#2 3
#3 1

If you want the separate counts per a1,a2,a3 within the group:

res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]], function(x) table(x$mm[x[["b"]]<0.01]))))
names(res4)<- names(res2)
res4
#$group_a
#   2 3
#a1 3 1
#a2 3 1
#a3 3 1
#$group_b
#   2 3
#b1 3 1
#b2 3 1
#$group_c
#   2 3
#c1 3 1

I haven't gone through the rest of the code as I was not sure about what you want.

A.K.

From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Monday, February 18, 2013 10:27 AM
Subject: Re: reading data

Hi!!! I'm coming to ask a new question. I want a function to do my statistics.
I start with you had send me: z.plot - function(directory,number) { setwd(directory) indx-gsub([./],,list.dirs()) indx1- indx[indx!=] print(indx1) files-paste(MSMS_,number,PepInfo.txt,sep=) read.data-function(x) {names(x)-gsub(^(.*)\\/.*,\\1,x); lapply(x,function(y) read.table(y,header=TRUE,sep = \t,stringsAsFactors=FALSE,fill=TRUE))} lista-do.call(c,lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data)) print(lista) #names(lista)-paste(group_,gsub(\\d+,,names(lista)),sep=) ve = TRUE) } z.plot(C:/Users/Vera Costa/Desktop/dados.lixo,23) In my lista I can´t merge rows to have the group, because the idea is for each file count frequencies of mm, when b0.01. after that I want a graph like the graph in attach. When I had 2 groups and knew the name of the groups, I did the code (but Know I have more groups and, maybe, I don´t know the name of the groups): z.plot - function(directory,number) { #reading data setwd(directory) direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), full.names = FALSE, recursive = TRUE) directT - direct[grepl(^t, direct)] directC - direct[grepl(^c, direct)] lista-lapply(direct, function(x) read.table(x,header=TRUE, sep = \t)) listaC-lapply(directC, function(x) read.table(x,header=TRUE, sep = \t)) listaT-lapply(directT, function(x) read.table(x,header=TRUE, sep = \t)) #count different z values cab - vector() for (i in 1:length(lista)) { dc-lista[[i]][ifelse(lista[[i]]$FDR0.01, TRUE, FALSE),] dc-table(dc$z) cab - c(cab, names(dc)) } #Relative freqs to construct the graph cab - unique(cab) d - matrix(ncol=length(cab)) dci- d[-1,] dcf - d[-1,] dti - d[-1,] dtf - d[-1,] for (i in 1:length(listaC)) { #Relative freq of all data dcc-listaC[[i]] dcc-table(factor(dcc$z, levels=cab)) dci- rbind(dci, dcc) rownames(dci)-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = c) #Relative freq of data with FDR0.01 dcc1-listaC[[i]][ifelse(listaC[[i]]$FDR0.01, TRUE, FALSE),] dcc1-table(factor(dcc1$z, levels=cab)) dcf- 
rbind(dcf,dcc1) rownames(dcf)-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = c) } for (i in 1:length(listaT)) { #Relative freq of all data dct-listaT[[i]] dct-table(factor(dct$z, levels=cab)) dti- rbind(dti, dct) rownames(dti)-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = t) #Relative freq of data with FDR0.01 dct1-listaT[[i]][ifelse(listaT[[i]]$FDR0.01, TRUE, FALSE),] dct1-table(factor(dct1$z, levels=cab)) dtf- rbind(dtf,dct1) rownames(dtf)-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = t) } freq.i-rbind(dci,dti) freq.f-rbind(dcf,dtf) freq.rel.i-freq.i/apply(freq.i,1,sum) freq.rel.f-freq.f/apply(freq.f,1,sum) #Graph plot colour-sample(rainbow(nrow(freq.rel.i))) par(mfrow=c(1,2)) barplot(freq.rel.i,beside=T,main=(Sample),xlab=Charge,ylab=Relative Frequencies,col=colour,legend.text = rownames(freq.rel.i)) barplot(freq.rel.f,beside=T,main=(Sample with FDR0.01),xlab=Charge,ylab=Relative Frequencies,col=colour,legend.text = rownames(freq.rel.f)) #average of the group (except c1t1) freqs-rbind(dcf[-1,], dtf[-1,]) average-apply(freqs,2,mean) #chisquare test function chisq.test-function(x,y){ somax-sum(x) somay-sum(y) nj.-x+y nj-sum(nj.) ejx-(nj./nj)*somax ejy-(nj./nj)*somay ETx-((x-ejx)^2)/ejx
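For readers following the question about handling an unknown number of groups, here is a minimal sketch, not the code from the thread. It assumes one MSMS_23PepInfo.txt per sample directory, that the group is the directory name with trailing digits removed, and that the columns are called mm and b as in the replies. The directory tree and the demo values are hypothetical and are built in a temporary folder so the block runs on its own.

```r
# Hypothetical layout: <root>/a1/MSMS_23PepInfo.txt, <root>/a2/..., <root>/b1/...
root <- file.path(tempdir(), "zplot_demo")
unlink(root, recursive = TRUE)
for (d in c("a1", "a2", "b1")) {
  dir.create(file.path(root, d), recursive = TRUE)
  demo <- data.frame(mm = c(2, 2, 3, 1), b = c(0.001, 0.5, 0.002, 0.004))
  write.table(demo, file.path(root, d, "MSMS_23PepInfo.txt"),
              sep = "\t", row.names = FALSE, quote = FALSE)
}

# Find the files without changing the working directory
paths <- list.files(root, pattern = "^MSMS_23PepInfo\\.txt$",
                    recursive = TRUE, full.names = TRUE)
sample_id <- basename(dirname(paths))        # directory name of each file
group     <- sub("[0-9]+$", "", sample_id)   # strip trailing digits to get the group

lista <- lapply(paths, read.table, header = TRUE, sep = "\t",
                stringsAsFactors = FALSE, fill = TRUE)
names(lista) <- sample_id

# Shared set of mm levels, so every file contributes a row with the same columns
cab <- sort(unique(unlist(lapply(lista, function(x) x$mm[x$b < 0.01]))))

# One row of counts per file, restricted to rows with b < 0.01
counts <- do.call(rbind, lapply(lista, function(x)
  table(factor(x$mm[x$b < 0.01], levels = cab))))
rel <- counts / rowSums(counts)              # relative frequencies per file

# Per-group view, whatever the group names turn out to be
by_group <- split(as.data.frame(counts), group)
print(counts)
print(by_group)
```

Because the groups are derived from the directory names, nothing in the sketch has to know in advance whether they are called c and t, or a, b and c. A matrix like rel could be passed to barplot(..., beside = TRUE) in the same way the original function does.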
Re: [R] reading data
Hi, I am not able to open your graph. I am using linux. Also, the codes in the function are not reproducible directT - direct[grepl(^t, direct)] directC - direct[grepl(^c, direct)] It takes double the time to know what is going on. dir() #[1] a1 a2 a3 b1 b2 c1 direct- list.files(recursive=TRUE)[grepl(^a|^b,dir())] direct #[1] MSMS_23PepInfo.txt MSMS_23PepInfo.txt MSMS_23PepInfo.txt #[4] MSMS_23PepInfo.txt MSMS_23PepInfo.txt directA- list.files(recursive=TRUE)[grepl(^a,dir())] directB- list.files(recursive=TRUE)[grepl(^b,dir())] lista- lapply(direct,function(x) read.table(x,header=TRUE,stringsAsFactors=FALSE,sep=\t,fill=TRUE)) listaA-lapply(directA, function(x) read.table(x,header=TRUE, sep = \t,fill=TRUE)) listaB-lapply(directB, function(x) read.table(x,header=TRUE, sep = \t,fill=TRUE)) #here I am changing the names listaT, z, etc.. count different mm values cab - vector() for (i in 1:length(lista)) { dc-lista[[i]][ifelse(lista[[i]]$b0.01, TRUE, FALSE),] dc-table(dc$mm) cab - c(cab, names(dc)) } #Relative freqs to construct the graph cab - unique(cab) d - matrix(ncol=length(cab)) dci- d[-1,] dcf - d[-1,] dti - d[-1,] dtf - d[-1,] for (i in 1:length(listaA)) { #Relative freq of all data dcc-listaA[[i]] dcc-table(factor(dcc$mm, levels=cab)) dci- rbind(dci, dcc) rownames(dci)-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = a) #Relative freq of data with FDR0.01 dcc1-listaA[[i]][ifelse(listaA[[i]]$FDR0.01, TRUE, FALSE),] dcc1-table(factor(dcc1$mm, levels=cab)) dcf- rbind(dcf,dcc1) rownames(dcf)-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = a) } for (i in 1:length(listaB)) { #Relative freq of all data dct-listaB[[i]] dct-table(factor(dct$mm, levels=cab)) dti- rbind(dti, dct) rownames(dti)-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = b) #Relative freq of data with FDR0.01 dct1-listaB[[i]][ifelse(listaB[[i]]$FDR0.01, TRUE, FALSE),] dct1-table(factor(dct1$mm, levels=cab)) dtf- rbind(dtf,dct1) rownames(dtf)-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = b) } 
freq.i-rbind(dci,dti) freq.f-rbind(dcf,dtf) freq.rel.i-freq.i/apply(freq.i,1,sum) freq.rel.f-freq.f/apply(freq.f,1,sum) freq.i # 2 3 #a1 4 1 #a2 4 1 #a3 4 1 #b1 4 1 #b2 4 1 #b3 4 1 #b4 4 1 #result from my code. files-paste(MSMS_,23,PepInfo.txt,sep=) read.data-function(x) {names(x)-gsub(^(.*)\\/.*,\\1,x); lapply(x,function(y) read.table(y,header=TRUE,sep = \t,stringsAsFactors=FALSE,fill=TRUE))} lista-do.call(c,lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data)) names(lista)-paste(group_,gsub(\\d+,,names(lista)),sep=) res2-split(lista,names(lista)) res3- lapply(res2,function(x) {names(x)-paste(gsub(.*_,,names(x)),1:length(x),sep=);x}) res4-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]], function(x) table(x$mm[x[[b]]0.01] names(res4)- names(res2) res4 $group_a # 2 3 #a1 3 1 #a2 3 1 #a3 3 1 #$group_b # 2 3 #b1 3 1 #b2 3 1 #$group_c # 2 3 #c1 3 1 There is a difference in output from freq.i and res4. There were only two files under 'group_b`. So, check your codes. A.K. From: Vera Costa veracosta...@gmail.com To: arun smartpink...@yahoo.com Sent: Monday, February 18, 2013 10:27 AM Subject: Re: reading data Hi!!! I'm coming to ask a new question. I want a function to do my statistics. I start with you had send me: z.plot - function(directory,number) { setwd(directory) indx-gsub([./],,list.dirs()) indx1- indx[indx!=] print(indx1) files-paste(MSMS_,number,PepInfo.txt,sep=) read.data-function(x) {names(x)-gsub(^(.*)\\/.*,\\1,x); lapply(x,function(y) read.table(y,header=TRUE,sep = \t,stringsAsFactors=FALSE,fill=TRUE))} lista-do.call(c,lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data)) print(lista) #names(lista)-paste(group_,gsub(\\d+,,names(lista)),sep=) ve = TRUE) } z.plot(C:/Users/Vera Costa/Desktop/dados.lixo,23) In my lista I can´t merge rows to have the group, because the idea is for each file count frequencies of mm, when b0.01. after that I want a graph like the graph in attach. 
When I had 2 groups and knew the name of the groups, I did the code (but Know I have more groups and, maybe, I don´t know the name of the groups): z.plot - function(directory,number) { #reading data setwd(directory) direct-dir(directory,pattern = paste(MSMS_,number,PepInfo.txt,sep=), full.names = FALSE, recursive = TRUE) directT - direct[grepl(^t, direct)] directC - direct[grepl(^c, direct)] lista-lapply(direct, function(x) read.table(x,header=TRUE, sep = \t)) listaC-lapply(directC, function(x) read.table(x,header=TRUE, sep = \t)) listaT-lapply(directT, function(x) read.table(x,header=TRUE, sep = \t)) #count different z values cab - vector() for (i in 1:length(lista)) {
Re: [R] reading data
Hi Vera,

No problem. I am cc:ing to r-help.

A.K.

From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Sunday, February 17, 2013 5:44 AM
Subject: Re: reading data

Hi. Thank you. It works now :-) And yes, I use Windows. Thank you very much.

On 17 Feb 2013 00:44, arun smartpink...@yahoo.com wrote:

Hi Vera,
Have you tried the suggestion? Are you using Windows?
Thanks,
Arun

From: Vera Costa veracosta...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Saturday, February 16, 2013 7:10 PM
Subject: Re: reading data

Thank you. In mine, I have an error: 'what' must be a character string or a function. I need to do the equivalent on my system. Thank you and sorry one more time.

On 16 Feb 2013 23:53, arun smartpink...@yahoo.com wrote:

Hi,
You didn't mention what the error message was, or whether you are reading file names which are not m11kk.txt. It is working on my system as I run it again.

See ?c: combine values into a vector or list.

    sessionInfo()
    R version 2.15.1 (2012-06-22)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
     [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
     [7] LC_PAPER=C                 LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] stringr_0.6.2 reshape2_1.2.2

    loaded via a namespace (and not attached):
    [1] plyr_1.8

    #code
    res <- do.call(c, lapply(list.files(recursive=T)[grep("m11kk", list.files(recursive=T))], function(x) {names(x) <- gsub("^(.*)\\/.*", "\\1", x); lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))}))
    #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
    names(res) <- paste("group_", gsub("\\d+", "", names(res)), sep="")
    res2 <- split(res, names(res))
    res3 <- lapply(res2, function(x) {names(x) <- paste(gsub(".*_", "", names(x)), 1:length(x), sep=""); x})
    #result
    res3
#$group_a #$group_a$a1 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_a$a2 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_a$a3 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_b $group_b$b1 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_b$b2 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_c $group_c$c1 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 A.K. From: Vera Costa veracosta...@gmail.com To: arun smartpink...@yahoo.com Sent: Saturday, February 16, 2013 6:32 PM Subject: Re: reading data Sorry again... 
In: res-do.call(c,lapply(list.files(recursive=T)[grep(... What
Re: [R] reading data
Hi, Try by putting quotes ie. res- do.call(c,...) A.K. From: Vera Costa veracosta...@gmail.com To: arun smartpink...@yahoo.com Sent: Saturday, February 16, 2013 7:10 PM Subject: Re: reading data Thank you. In mine, I have an error 'what' must be a character string or a function. I need to do equivalent in my system. Thank you and sorry one more time. No dia 16 de Fev de 2013 23:53, arun smartpink...@yahoo.com escreveu: Hi, You didn't mention what the error message or whether you are reading file names which are not m11kk.txt. It is workiing on my system as I run it again. ?c() combine values into a vector or list. sessionInfo() R version 2.15.1 (2012-06-22) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8 [5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8 [7] LC_PAPER=C LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] stringr_0.6.2 reshape2_1.2.2 loaded via a namespace (and not attached): [1] plyr_1.8 #code res-do.call(c,lapply(list.files(recursive=T)[grep(m11kk,list.files(recursive=T))],function(x) {names(x)-gsub(^(.*)\\/.*,\\1,x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})) #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE names(res)-paste(group_,gsub(\\d+,,names(res)),sep=) res2-split(res,names(res)) res3- lapply(res2,function(x) {names(x)-paste(gsub(.*_,,names(x)),1:length(x),sep=);x}) #result res3 #$group_a #$group_a$a1 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_a$a2 Id M mm x b u k j y p v 1 aAA 1 2 
739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_a$a3 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_b $group_b$b1 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_b$b2 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 $group_c $group_c$c1 Id M mm x b u k j y p v 1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA 4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 A.K. From: Vera Costa veracosta...@gmail.com To: arun smartpink...@yahoo.com Sent: Saturday, February 16, 2013 6:32 PM Subject: Re: reading data Sorry again... In: res-do.call(c,lapply(list.files(recursive=T)[grep(... What is this c? In do.call(c, When I put this row im R, I have an error. Thank you No dia 15 de Fev de 2013 18:11, arun smartpink...@yahoo.com escreveu: Hi, No problem. BTW, these questions are not stupid.. 
Arun From: Vera Costa veracosta...@gmail.com To: arun smartpink...@yahoo.com Sent: Friday, February 15, 2013 1:08 PM Subject: Re: reading data Thank
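The thread does not say why do.call(c, ...) failed on one machine with "'what' must be a character string or a function" while the quoted form worked. One plausible cause, and it is only an assumption here, is an object named c in the workspace that is not a function. The sketch below reproduces that situation with a hypothetical leftover value.

```r
# Two one-element lists, shaped like the per-file results in the thread
pieces <- list(list(a1 = 1:3), list(b1 = 4:6))

c <- 5   # hypothetical leftover object that hides the function c() by name

# do.call(c, ...) now receives the number 5 instead of a function
err <- tryCatch(do.call(c, pieces), error = function(e) e)
print(conditionMessage(err))

# A character string is looked up as a function, so the numeric `c` is skipped
res <- do.call("c", pieces)
str(res)

rm(c)    # remove the masking object; the unquoted form works again
res2 <- do.call(c, pieces)
```

If that is what happened, either quoting the name or removing the stray object would fix it. Checking with ls() for objects named like base functions is a cheap first step.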
Re: [R] reading data
Hi, #working directory data1 #changed name data to data1. Added some files in each of sub directories a1, a2, etc. indx1- indx[indx!=] lapply(indx1,function(x) list.files(x)) #[[1]] #[1] a1.txt m11kk.txt #[[2]] #[1] a2.txt m11kk.txt #[[3]] #[1] a3.txt m11kk.txt #[[4]] #[1] b1.txt m11kk.txt #[[5]] #[1] b2.txt b3.txt m11kk.txt [[6]] [1] c1.txt c2.txt c3.txt c4.txt [5] m11kk.txt res-do.call(c,lapply(list.files(recursive=T)[grep(m11kk,list.files(recursive=T))],function(x) {names(x)-gsub(^(.*)\\/.*,\\1,x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})) #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE head(res,2) #$a1 # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #$a2 # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 If you want the names to be group_a, group_b etc. 
names(res)-paste(group_,gsub(\\d+,,names(res)),sep=) res[grep(group_b,names(res))] $group_b # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #$group_b # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 A.K. - Original Message - From: veracosta...@gmail.com veracosta...@gmail.com To: smartpink...@yahoo.com Cc: Sent: Friday, February 15, 2013 9:15 AM Subject: reading data Hi, I post yesterday and you helped me. I have little problem. At first, I never worked with regular expressions... The code that you gave me it's ok, but my files are inside the folders a1,a2,a3. I try to explain better. I have one folder named data. Inside this folder I have some other folders named a1,a2,b1,b2,...and inside of each one of that I have some files. I want only the file mm.txt (in all folders I have One file with this name). The name of the folder give me the name of the group,but I need to read the file inside. And after, have group_a, group_b...because I need to work with this data grouped (and know the name of the group). Thank you. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
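Since the question mentions being new to regular expressions, here is a small sketch of what the two gsub() patterns used in the reply do. The paths are made-up examples, not files from the thread.

```r
# Hypothetical relative paths, as list.files(recursive = TRUE) would return them
paths <- c("a1/m11kk.txt", "a2/m11kk.txt", "b1/m11kk.txt", "c1/m11kk.txt")

# "^(.*)\\/.*" captures everything before the last slash; "\\1" keeps only that capture
folder <- gsub("^(.*)\\/.*", "\\1", paths)

# "\\d+" matches one or more digits; replacing them with "" leaves the group letter
group <- paste("group_", gsub("\\d+", "", folder), sep = "")

print(folder)
print(group)

# Files belonging to each group
print(split(paths, group))
```

The same two steps are what turn the directory names a1, a2, b1 into the labels group_a and group_b in the reply above.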
Re: [R] reading data
HI, Just to add: res-do.call(c,lapply(list.files(recursive=T)[grep(m11kk,list.files(recursive=T))],function(x) {names(x)-gsub(^(.*)\\/.*,\\1,x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})) #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE names(res)-paste(group_,gsub(\\d+,,names(res)),sep=) res[grep(group_b,names(res))] I am not sure how you want the grouped data to look like. If you want something like this: res1-do.call(rbind,res) res2-lapply(split(res1,gsub([.0-9],,row.names(res1))),function(x) {row.names(x)-1:nrow(x);x}) res2 #$group_a # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #7 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #8 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #9 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #10 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #11 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #12 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #13 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #14 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #15 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #16 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #17 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #18 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #$group_b # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #7 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #8 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #9 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #10 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #11 aAAA 1 3 3660 0.0008600 18 3 AA 2 
20392 496 #12 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #$group_c # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #or if you want it like this: res2-split(res,names(res)) res2[[group_b]] #$group_b # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #$group_b # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 Hope this helps. A.K. - Original Message - From: veracosta...@gmail.com veracosta...@gmail.com To: smartpink...@yahoo.com Cc: Sent: Friday, February 15, 2013 9:15 AM Subject: reading data Hi, I post yesterday and you helped me. I have little problem. At first, I never worked with regular expressions... The code that you gave me it's ok, but my files are inside the folders a1,a2,a3. I try to explain better. I have one folder named data. Inside this folder I have some other folders named a1,a2,b1,b2,...and inside of each one of that I have some files. I want only the file mm.txt (in all folders I have One file with this name). The name of the folder give me the name of the group,but I need to read the file inside. And after, have group_a, group_b...because I need to work with this data grouped (and know the name of the group). Thank you. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained,
Re: [R] reading data
Hi,

No problem. See ?c: it concatenates its arguments into a vector or a list. If I use do.call(cbind, ...) or do.call(rbind, ...) instead:

    do.call(cbind, lapply(list.files(recursive=T)[grep("m11kk", list.files(recursive=T))], function(x) {names(x) <- gsub("^(.*)\\/.*", "\\1", x); lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))}))
    #   [,1]    [,2]    [,3]    [,4]    [,5]    [,6]
    #a1 List,11 List,11 List,11 List,11 List,11 List,11

    do.call(rbind, lapply(list.files(recursive=T)[grep("m11kk", list.files(recursive=T))], function(x) {names(x) <- gsub("^(.*)\\/.*", "\\1", x); lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))}))
    #     a1
    #[1,] List,11
    #[2,] List,11
    #[3,] List,11
    #[4,] List,11
    #[5,] List,11
    #[6,] List,11

Without do.call(c, ...) the result is a list within a list:

    restrial <- lapply(list.files(recursive=T)[grep("m11kk", list.files(recursive=T))], function(x) {names(x) <- gsub("^(.*)\\/.*", "\\1", x); lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))})
    str(restrial)
    #List of 6
    # $ :List of 1
    #  ..$ a1:'data.frame': 6 obs. of 11 variables:
    #  .. ..$ Id: chr [1:6] "aAA" "a" "aA" "aAA" ...
    #  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
    #  .. ..$ mm: int [1:6] 2 2 1 2 3 2
    #  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
    #-----
    str(res)
    #List of 6
    # $ a1:'data.frame': 6 obs. of 11 variables:
    #  ..$ Id: chr [1:6] "aAA" "a" "aA" "aAA" ...
    #  ..$ M : chr [1:6] "1" "1" "2" "1" ...
    #  ..$ mm: int [1:6] 2 2 1 2 3 2
    #  ..$ x : int [1:6] 739 2263 1 1965 3660 1972
    #-----

You mentioned naming these group_a, group_b, etc.:
names(res)-paste(group_,gsub(\\d+,,names(res)),sep=) res2-split(res,names(res)) res3- lapply(res2,function(x) {names(x)-paste(gsub(.*_,,names(x)),1:length(x),sep=);x}) res3$group_a $a1 # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #$a2 # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #$a3 # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 A.K. From: Vera Costa veracosta...@gmail.com To: arun smartpink...@yahoo.com Sent: Friday, February 15, 2013 12:39 PM Subject: Re: reading data Thank you very much and sorry my questions. But this code isn't grouping for letters sure? I mean, a1,a2,a3 is the same group, (the first letter give me the name of the group) Another question, in do.call, you did do.call (c,.) .What is c? Sorry 2013/2/15 arun smartpink...@yahoo.com HI, Just to add: res-do.call(c,lapply(list.files(recursive=T)[grep(m11kk,list.files(recursive=T))],function(x) {names(x)-gsub(^(.*)\\/.*,\\1,x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})) #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE names(res)-paste(group_,gsub(\\d+,,names(res)),sep=) res[grep(group_b,names(res))] I am not sure how you want the grouped data to look like. 
If you want something like this: res1-do.call(rbind,res) res2-lapply(split(res1,gsub([.0-9],,row.names(res1))),function(x) {row.names(x)-1:nrow(x);x}) res2 #$group_a # Id M mm x b u k j y p v #1 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #2 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #3 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #4 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #5 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #6 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #7 aAA 1 2 739 0.1257000 2 2 AA 2 8867 8926 #8 a 1 2 2263 0.0004000 2 2 AR 4 7640 8926 #9 aA 2 1 1 0.0845435 2 AA 2 6790 734,1092 NA #10 aAA 1 2 1965 0.0007000 4 3 AR 2 11616 8926 #11 aAAA 1 3 3660 0.0008600 18 3 AA 2 20392 496 #12 AA na 2 1972 0.0007000 11 3 AR 25 509 734 #13 aAA 1 2 739
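To see the difference between do.call(c, ...) and do.call(rbind, ...) without any files on disk, here is a small self-contained sketch. The read_one() helper is hypothetical: it stands in for the read.table() call in the thread and returns a one-element named list, as the anonymous function there does.

```r
# Hypothetical stand-in for reading one file: a named list holding one data frame
read_one <- function(id) {
  setNames(list(data.frame(mm = c(2, 3), x = c(10, 20))), id)
}

nested <- lapply(c("a1", "a2", "b1"), read_one)  # list of 3 lists, 1 data frame each

# c() concatenates the inner lists: one flat list of data frames named a1, a2, b1
flat <- do.call(c, nested)
print(names(flat))

# rbind() on the flat list stacks the data frames; row names keep the source name
stacked <- do.call(rbind, flat)
print(stacked)

# rbind() on the nested list builds a list-matrix instead, which is rarely wanted
grid <- do.call(rbind, nested)
print(dim(grid))
```

The flat list is the shape that names(), split() and further lapply() calls in the thread rely on, which is presumably why c was chosen there.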
Re: [R] reading data into R
Hello,

The error message is right: you have read the file but have NOT assigned it to an object, to a variable.

    mydata1 <- read.table("mydata1.csv", sep=",", header=T)

Now you can use the variable 'mydata1'. It's a data.frame, and you can see what it looks like with the following instructions.

    str(mydata1)    # str for structure
    head(mydata1)   # default is first 6 lines

Note also that you could have given your dataset a name different from the filename.

    mean(mydata1.csv&X)

Where have you found that syntax??? Correct:

    mean(mydata1$X)
    mean(mydata1[ , "X"])

You should read R-intro.pdf; it comes with any installation of R, in the folder doc. It covers beginner material that you could learn quickly.

Hope this helps,

Rui Barradas

--
View this message in context: http://r.789695.n4.nabble.com/reading-data-into-R-tp4630069p4630071.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] reading data into R
You need to assign your data set to something. Right now you're just reading it in and then throwing it away:

    dats <- read.csv("mydata1.csv")
    mean(dats$X)   # Dollar sign, not ampersand

Best,
Michael

On Tue, May 15, 2012 at 8:57 AM, jacaranda tree myjacara...@yahoo.com wrote:
> Hi
> I am really new to using R, so this is really beginner stuff! I created a
> very small data set in Excel and then converted it to a .csv file. I am able
> to open the data in R using the command
> read.table("mydata1.csv", sep=",", header=T) and it works just fine. But
> when I want to work on the data (e.g. calculate the mean of variable X) R
> says object not found. I tried the attach command or mean(mydata1.csv&X)
> but still I get the same error message. I don't understand why R is having
> difficulty finding a variable. I believe I am doing something wrong. I would
> really appreciate it if you could help me with this.
Re: [R] reading data into R
What was the exact syntax?

    read.table("mydata1.csv", sep=",", header=T)

will read the data but not save anything.

    mydat <- read.table("mydata1.csv", sep=",", header=T)

gives you a data.frame called mydat.

    mean(mydat$X)

should give you the mean of X.

John Kane
Kingston ON Canada

-----Original Message-----
From: myjacara...@yahoo.com
Sent: Tue, 15 May 2012 05:57:51 -0700 (PDT)
To: r-help@r-project.org
Subject: [R] reading data into R

Hi
I am really new to using R, so this is really beginner stuff! I created a very small data set in Excel and then converted it to a .csv file. I am able to open the data in R using the command read.table("mydata1.csv", sep=",", header=T) and it works just fine. But when I want to work on the data (e.g. calculate the mean of variable X) R says object not found. I tried the attach command or mean(mydata1.csv&X) but still I get the same error message. I don't understand why R is having difficulty finding a variable. I believe I am doing something wrong. I would really appreciate it if you could help me with this.
Re: [R] reading data into R
Hi!

You need to assign the output of read.table() to an object; this is how R works:

    mydata <- read.table("mydata1.csv", sep=",", header=T)
    mymean <- mean(mydata$var)

You should read some introductory material. I found this useful:
http://www.burns-stat.com/pages/Tutor/hints_R_begin.html

And then, there are tons of good books and documentation (go check CRAN as well).

HTH,
Ivan

PS: post in plain text

--
Ivan CALANDRA
Université de Bourgogne
UMR CNRS/uB 6282 Biogéosciences
6 Boulevard Gabriel
21000 Dijon, FRANCE
+33(0)3.80.39.63.06
ivan.calan...@u-bourgogne.fr
http://biogeosciences.u-bourgogne.fr/calandra

On 15/05/12 14:57, jacaranda tree wrote:
> Hi
> I am really new to using R, so this is really beginner stuff! I created a
> very small data set in Excel and then converted it to a .csv file. I am able
> to open the data in R using the command
> read.table("mydata1.csv", sep=",", header=T) and it works just fine. But
> when I want to work on the data (e.g. calculate the mean of variable X) R
> says object not found. I tried the attach command or mean(mydata1.csv&X)
> but still I get the same error message. I don't understand why R is having
> difficulty finding a variable. I believe I am doing something wrong. I would
> really appreciate it if you could help me with this.
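The point made in all of the replies, that read.table() returns a value which must be assigned, can be tried end to end with the sketch below. The CSV contents and the temporary file path are made up for illustration; only the file name mydata1.csv and the column name X come from the thread.

```r
# Write a tiny CSV to a temporary location (hypothetical data)
csv_path <- file.path(tempdir(), "mydata1.csv")
writeLines(c("X,Y", "1,10", "2,20", "3,30"), csv_path)

# Reading without assignment: the data frame is printed, then discarded
read.table(csv_path, sep = ",", header = TRUE)

# Reading with assignment: the data frame is kept under the name mydata1
mydata1 <- read.table(csv_path, sep = ",", header = TRUE)
str(mydata1)

# Columns are reached through the object, not through the file name
mean_x  <- mean(mydata1$X)
mean_x2 <- mean(mydata1[, "X"])
print(mean_x)
```

The object name is arbitrary; it does not have to match the file name, and the file name itself never becomes a variable.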
Re: [R] Reading data/variables
Thanks Sarah. I have read about the problems with attach(), and I will try to avoid it.

I have now found that the line causing the problem is:

    setwd("z:/homework")

With that line in place, either in a program or in Rprofile.site (?), the moment I start R and simply enter (before reading any data)

    summary(mydata)

I get sample statistics for a dozen variables!

Do not save the workspace? I thought the option to save/use a binary file was meant to be convenient. I like working in the same working directory, and I like .rdata files. Does this sound hopeless?

Thanks.

At 09:26 PM 11/15/2011, Sarah Goslee wrote:
> Hi,
>
> The obvious answer is don't use attach() and you'll never have that problem. And see further comments inline.
>
> On Tue, Nov 15, 2011 at 6:05 PM, Steven Yen s...@utk.edu wrote:
> > hw11$y <- hw11$e3
> > attach(hw11)
> > The following object(s) are masked _by_ '.GlobalEnv':
> >     y
>
> Look there. R even *told* you that it was going to use the y in the global environment rather than the one you were trying to attach.
>
> The other solution: don't save your workspace. Your other email on this topic suggested to me that there is a .RData file in your preferred working directory that contains an object y, and that's what is interfering with what you think should happen.
>
> Deleting that file, or using a different directory, or removing y before you attach the data frame would all work. But truly, the best possible strategy is to avoid using attach() so you don't have to worry about which object named y is really being used, because you specify it explicitly.
>
> --
> Sarah Goslee
> http://www.functionaldiversity.org

--
Steven T. Yen, Professor of Agricultural Economics
The University of Tennessee
http://web.utk.edu/~syen/
Re: [R] Reading data/variables
Well, if your problem is that a workspace is being loaded automatically and you don't want that workspace, you have several options:

1. Use a different directory for each project so that the file loaded by default is the correct one.
2. Don't save your workspace, but regenerate it each time.
3. Use R --vanilla or your OS's equivalent to start R without loading anything automatically, and use load() and save() to manually manage RData files.

Yes, it's convenient, but if you want to use a non-standard way of working you need to understand what you're doing.

Sarah

On Thu, Nov 17, 2011 at 3:10 AM, Steven Yen s...@utk.edu wrote:
> Thanks Sarah. I have read about the problems with attach(), and I will try to avoid it.
>
> I have now found that the line causing the problem is:
>     setwd("z:/homework")
> With that line in place, either in a program or in Rprofile.site (?), the moment I start R and simply enter (before reading any data)
>     summary(mydata)
> I get sample statistics for a dozen variables!
>
> Do not save the workspace? I thought the option to save/use a binary file was meant to be convenient. I like working in the same working directory, and I like .rdata files. Does this sound hopeless?
>
> Thanks.
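A rough sketch of option 3 above (managing RData files by hand with save() and load()). The object `y_old`, the environment `stash` and the temporary file are hypothetical stand-ins, not anything from the thread.

```r
# Hypothetical file name; a temporary file keeps the example self-contained.
rdata_path <- tempfile(fileext = ".RData")

# Save only the objects you name, rather than the whole workspace.
y_old <- c(0, 1, 0, 1)
save(y_old, file = rdata_path)
rm(y_old)

# Load into a separate environment first, so nothing in the global
# environment is overwritten, and list what the file contains.
stash <- new.env()
loaded_names <- load(rdata_path, envir = stash)
print(loaded_names)
```

Because load() returns the names of the restored objects, it is easy to see whether a file carries a stray `y` before anything reaches the global environment.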
Re: [R] Reading data/variables
A follow-up on the data/variable issue I posted earlier. Here is what I did, which was apparently causing the problem. I inserted the following line in my file Rprofile.site:

    setwd("z:/R")

Then, as soon as I start R (before I read any data), I issue

    summary(mydata)

and I get summary statistics for a dozen variables from a file that appears to reside in z:/R. Am I not supposed to set the working directory with the line above (which brings me to my preferred working directory each time I run R)?

At 06:05 PM 11/15/2011, Steven Yen wrote:
> Can someone help me with this variable/data reading issue? I read a csv file and transform/create an additional variable (called y). The first set of commands below produced different sample statistics for hw11$y and y. In the second set of commands I rename/use the variable name yy, and the sample statistics for hw11$yy and yy are identical. Using y <- yy fixed it, but I am not sure why I would need to do that. That y appeared to have come from a variable called y from another data frame (unrelated to the current run). Help!
>
> setwd("z:/homework")
> sink("z:/homework/hw11.our", append=T, split=T)
> hw11 <- read.csv("ij10b.csv", header=T)
> hw11$y <- hw11$e3
> attach(hw11)
> The following object(s) are masked _by_ '.GlobalEnv':
>     y
> (n <- dim(hw11)[1])
> [1] 13765
> summary(hw11$y)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>      0.  0.4500      1.  1.6726      2.    140.
> length(hw11$y)
> [1] 13765
> summary(y)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>     0.0     0.0     0.0 0.24958     0.0     1.0
> length(y)
> [1] 601
>
> setwd("z:/homework")
> sink("z:/homework/hw11.our", append=T, split=T)
> hw11 <- read.csv("ij10b.csv", header=T)
> hw11$yy <- hw11$e3
> attach(hw11)
> hw11$yy <- hw11$e3
> summary(hw11$yy)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>      0.  0.4500      1.  1.6726      2.    140.
> length(hw11$yy)
> [1] 13765
> summary(yy)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>      0.  0.4500      1.  1.6726      2.    140.
> length(yy)
> [1] 13765

--
Steven T. Yen, Professor of Agricultural Economics
The University of Tennessee
http://web.utk.edu/~syen/
Re: [R] Reading data/variables
Hi,

The obvious answer is don't use attach() and you'll never have that problem. And see further comments inline.

On Tue, Nov 15, 2011 at 6:05 PM, Steven Yen s...@utk.edu wrote:
> Can someone help me with this variable/data reading issue? I read a csv file and transform/create an additional variable (called y). The first set of commands below produced different sample statistics for hw11$y and y. In the second set of commands I rename/use the variable name yy, and the sample statistics for hw11$yy and yy are identical. Using y <- yy fixed it, but I am not sure why I would need to do that. That y appeared to have come from a variable called y from another data frame (unrelated to the current run). Help!
>
> setwd("z:/homework")
> sink("z:/homework/hw11.our", append=T, split=T)
> hw11 <- read.csv("ij10b.csv", header=T)
> hw11$y <- hw11$e3
> attach(hw11)
> The following object(s) are masked _by_ '.GlobalEnv':
>     y

Look there. R even *told* you that it was going to use the y in the global environment rather than the one you were trying to attach.

The other solution: don't save your workspace. Your other email on this topic suggested to me that there is a .RData file in your preferred working directory that contains an object y, and that's what is interfering with what you think should happen.

Deleting that file, or using a different directory, or removing y before you attach the data frame would all work. But truly, the best possible strategy is to avoid using attach() so you don't have to worry about which object named y is really being used, because you specify it explicitly.

> (n <- dim(hw11)[1])
> [1] 13765
> summary(hw11$y)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>      0.  0.4500      1.  1.6726      2.    140.
> length(hw11$y)
> [1] 13765
> summary(y)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>     0.0     0.0     0.0 0.24958     0.0     1.0
> length(y)
> [1] 601
>
> setwd("z:/homework")
> sink("z:/homework/hw11.our", append=T, split=T)
> hw11 <- read.csv("ij10b.csv", header=T)
> hw11$yy <- hw11$e3
> attach(hw11)
> hw11$yy <- hw11$e3
> summary(hw11$yy)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>      0.  0.4500      1.  1.6726      2.    140.
> length(hw11$yy)
> [1] 13765
> summary(yy)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>      0.  0.4500      1.  1.6726      2.    140.
> length(yy)
> [1] 13765

--
Sarah Goslee
http://www.functionaldiversity.org
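To make the masking issue in this reply concrete, here is a small sketch with made-up values; `hw` is a hypothetical stand-in for hw11, and the global `y` plays the part of an object restored from an old .RData file.

```r
# A global object named y, standing in for one restored from an old workspace.
y <- c(0, 0, 1)

# A data frame with its own column y (hypothetical values).
hw <- data.frame(e3 = c(0.5, 1.0, 2.5, 4.0))
hw$y <- hw$e3

# After attach(hw), a bare `y` would still find the global y first, because
# the global environment is searched before attached data frames.
# Naming the data frame, or using with(), avoids the ambiguity:
mean_explicit <- mean(hw$y)
mean_with <- with(hw, mean(y))

# A bare y refers to the unrelated global vector.
mean_global <- mean(y)

print(c(explicit = mean_explicit, with = mean_with, global = mean_global))
```

with() evaluates its expression inside the data frame first, so it gives the convenience of short names without touching the search path.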
Re: [R] Reading data with 'awk' - basics?
On Mon, Oct 17, 2011 at 9:23 AM, Brian Smith bsmith030...@gmail.com wrote:
> Hi,
>
> I have a large file from which I require a subset of rows. Instead of reading it all into memory, I use the awk command to get the relevant rows. However, I'm doing it pretty inefficiently, as I write the subset to disk before reading it into R. Is there a way that I can read it into an R object without writing to disk?
>
> For example, this is what I do currently:
>
> ## write test sample file
> mat1 <- matrix(sample(1:100,16),8,2)
> fname1 <- 'temp1.txt'
> fname2 <- 'temp2.txt'
> write.table(mat1,fname1,sep='\t',row.names=F,col.names=F)
>
> ## Read a subset of rows, write to file, and read from file
> system(paste("awk '(NR > 1 && NR < 4) {print $0}' ",fname1," > ",fname2,sep=''))
> mat2 <- read.table(fname2,sep='\t')
> print(mat2)
>
> # Is there a way that I can skip writing to disk?

See: http://tolstoy.newcastle.edu.au/R/e5/help/08/09/2129.html

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] Reading data with 'awk' - basics?
On Mon, 17 Oct 2011, Brian Smith wrote:

> Hi,
>
> I have a large file from which I require a subset of rows. Instead of reading it all into memory, I use the awk command to get the relevant rows. However, I'm doing it pretty inefficiently, as I write the subset to disk before reading it into R. Is there a way that I can read it into an R object without writing to disk?
>
> For example, this is what I do currently:
>
> ## write test sample file
> mat1 <- matrix(sample(1:100,16),8,2)
> fname1 <- 'temp1.txt'
> fname2 <- 'temp2.txt'
> write.table(mat1,fname1,sep='\t',row.names=F,col.names=F)
>
> ## Read a subset of rows, write to file, and read from file
> system(paste("awk '(NR > 1 && NR < 4) {print $0}' ",fname1," > ",fname2,sep=''))
> mat2 <- read.table(fname2,sep='\t')
> print(mat2)
>
> # Is there a way that I can skip writing to disk?

Use a pipe() connection.

> thanks!

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
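One way the pipe() suggestion might look in practice. This is a sketch under assumptions: it presumes a Unix-like system with awk on the PATH, and the row filter `NR > 1 && NR < 4` is a guess at the poster's intent, since the comparison operators do not survive in the archived text.

```r
# Write a small tab-separated test file (values are arbitrary).
mat1 <- matrix(1:16, 8, 2)
fname1 <- tempfile(fileext = ".txt")
write.table(mat1, fname1, sep = "\t", row.names = FALSE, col.names = FALSE)

# Build the shell command; shQuote() protects the file name.
cmd <- paste("awk 'NR > 1 && NR < 4'", shQuote(fname1))

# read.table() opens the pipe, reads awk's output directly, and closes it
# again, so no intermediate file is written.
mat2 <- read.table(pipe(cmd), sep = "\t")
print(mat2)
```

Only the rows awk prints ever reach R, which is the point when the full file is too large to read comfortably.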
Re: [R] Reading data with 'awk' - basics?
Got it. Thanks!

On Mon, Oct 17, 2011 at 9:40 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:
> On Mon, 17 Oct 2011, Brian Smith wrote:
> > I have a large file from which I require a subset of rows. [...] Is there a way that I can read it into an R object without writing to disk?
>
> Use a pipe() connection.
Re: [R] Reading data in lisp format
Thanks David,

The crx.data is a different database and I would like to use both. I have contacted the developer, but he has not answered me.

Regards,
Esteban

From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Wed 21/09/2011 17:08
To: ESTEBAN ALFARO CORTES
CC: r-help@r-project.org
Subject: Re: [R] Reading data in lisp format

> If you think that R is loosely typed, then examining LiSP code will change your mind, or at least give you a new data point further out on the Loose-Tight axis. I think you will need to do the processing by hand. The organization of the data is fairly clear. There are logical columns with values :neg and :pos, categorical columns with values in (id value) pairs, numeric ones, and then a group of computed columns at the bottom. It also appears that after the first enumeration of ids with logical values, subsequent logical variables are defined possibly with :pos values only.
>
> So I guess the counter-question is: How important is this particular dataset to you?? And a further question might be, are you sure that you don't want the dataset that is right next to it:
>
> ftp://ftp.ics.uci.edu/pub/machine-learning-databases/credit-screening/crx.data
>
> It is a well-behaved comma-separated file.
>
> --
> David Winsemius, MD
> West Hartford, CT
Re: [R] Reading data in lisp format
Thanks Cesar,

Any idea what to do with contents like these? This is what the file looks like:

;; positive examples represent people that were granted credit
(def-pred credit_screening :type (:person)
  :pos ((s1) (s2) (s4) (s5) (s6) (s7) (s8) (s9) (s14) (s15) (s17) (s18) (s19) (s21) (s22) (s24) (s28) (s29) (s31) (s32) (s35) (s38) (s40) (s41) (s42) (s43) (s45) (s46) (s47) (s49) (s50) (s51) (s53) (s54) (s55) (s56) (s57) (s59) (s61) (s62) (s63) (s64) (s65) (s66) (s69) (s70) (s71) (s72) (s73) (s74) (s75) (s76) (s77) (s78) (s79) (s80) (s81) (s83) (s84) (s85) (s86) (s87) (s89) (s90) (s91) (s92) (s93) (s94) (s96) (s97) (s98) (s100) (s103) (s104) (s106) (s108) (s110) (s116) (s117) (s118) (s119) (s121) (s122) (s123) (s124))
  :neg ((s3) (s10) (s11) (s12) (s13) (s16) (s20) (s23) (s25) (s26) (s27) (s30) (s33) (s34) (s36) (s37) (s39) (s44) (s48) (s52) (s58) (s60) (s67) (s68) (s82) (s88) (s95) (s99) (s101) (s102) (s105) (s107) (s109) (s111) (s112) (s113) (s114) (s115) (s120) (s125)))
(def-pred jobless :type (:person)
  :pos ((s3) (s10) (s12) (s23) (s34) (s39) (s44) (s56) (s60) (s82) (s85) (s88) (s99) (s115)))
;; item purchased that loan is for.
(def-pred purchase_item :type (:person :atom) :pos ((s1 pc) (s2 pc) (s3 pc) (s4 pc) (s5 pc) (s6 pc) (s7 pc) (s8 pc) (s9 pc) (s10 pc) (s11 car) (s12 car) (s13 car) (s14 car) (s15 car) (s16 car) (s17 car) (s18 car) (s19 car) (s20 car) (s21 stereo) (s22 stereo) (s23 stereo) (s24 stereo) (s25 stereo) (s26 stereo) (s27 stereo) (s28 stereo) (s29 stereo) (s30 stereo) (s31 stereo) (s32 stereo) (s33 stereo) (s34 stereo) (s35 stereo) (s36 stereo) (s37 stereo) (s38 stereo) (s39 stereo) (s40 stereo) (s41 stereo) (s42 jewel) (s43 jewel) (s44 jewel) (s45 jewel) (s46 jewel) (s47 jewel) (s48 jewel) (s49 jewel) (s50 jewel) (s51 jewel) (s52 jewel) (s53 jewel) (s54 jewel) (s55 jewel) (s56 jewel) (s57 jewel) (s58 jewel) (s59 jewel) (s60 jewel) (s61 jewel) (s62 jewel) (s63 medinstru) (s64 medinstru) (s65 medinstru) (s66 medinstru) (s67 medinstru) (s68 medinstru) (s69 medinstru) (s70 medinstru) (s71 medinstru) (s72 medinstru) (s73 medinstru) (s74 medinstru) (s75 medinstru) (s76 medinstru) (s77 medinstru) (s78 medinstru) (s79 medinstru) (s80 medinstru) (s81 medinstru) (s82 medinstru) (s83 medinstru) (s84 jewel) (s85 stereo) (s86 medinstru) (s87 stereo) (s88 stereo) (s89 stereo) (s90 stereo) (s91 stereo) (s92 medinstru) (s93 medinstru) (s94 medinstru) (s95 medinstru) (s96 jewel) (s97 jewel) (s98 jewel) (s99 jewel) (s100 jewel) (s101 jewel) (s102 jewel) (s103 jewel) (s104 jewel) (s105 jewel) (s106 bike) (s107 bike) (s108 bike) (s109 bike) (s110 bike) (s111 bike) (s112 bike) (s113 bike) (s114 bike) (s115 bike) (s116 furniture) (s117 furniture) (s118 furniture) (s119 furniture) (s120 furniture) (s121 furniture) (s122 furniture) (s123 furniture) (s124 furniture) (s125 furniture))) (def-pred male :type (:person) :pos ((s6) (s7) (s8) (s9) (s10) (s16) (s17) (s18) (s19) (s20) (s21) (s22) (s25) (s27) (s29) (s37) (s38) (s39) (s40) (s41) (s42) (s43) (s45) (s48) (s49) (s51) (s58) (s59) (s60) (s61) (s62) (s68) (s69) (s70) (s71) (s72) (s74) (s76) (s77) (s79) (s80) (s82) (s84) (s86) (s89) (s90) (s91) 
(s92) (s94) (s97) (s98) (s102) (s103) (s104) (s105) (s106) (s107) (s108) (s109) (s110) (s121) (s122) (s123) (s124) (s125))) (def-pred female :type (:person) :pos ((s1) (s2) (s3) (s4) (s5) (s11) (s12) (s13) (s14) (s15) (s23) (s24) (s26) (s28) (s30) (s31) (s32) (s33) (s34) (s35) (s36) (s44) (s46) (s47) (s50) (s52) (s53) (s54) (s55) (s56) (s57) (s63) (s64) (s65) (s66) (s67) (s73) (s75) (s78) (s81) (s83) (s85) (s87) (s88) (s93) (s95) (s96) (s99) (s100) (s101) (s111) (s112) (s113) (s114) (s115) (s116) (s117) (s118) (s119) (s120))) (def-pred unmarried :type (:person) :pos ((s1) (s2) (s5) (s6) (s7) (s11) (s13) (s14) (s16) (s18) (s22) (s25) (s26) (s28) (s30) (s31) (s32) (s33) (s34) (s37) (s41) (s43) (s46) (s48) (s50) (s52) (s53) (s54) (s55) (s59) (s60) (s63) (s68) (s70) (s74) (s75) (s76) (s78) (s82) (s84) (s86) (s87) (s90) (s93) (s95) (s96) (s97) (s100) (s101) (s102) (s104) (s105) (s106) (s107) (s108) (s109) (s114) (s118) (s123))) ;; people who live in a problematic region (def-pred problematic_region :type (:person) :pos ((s3) (s5) (s23) (s30) (s33) (s39) (s48) (s60) (s68) (s72) (s76) (s78) (s84) (s105))) (def-pred age :type (:person :number) :pos ((s1 18) (s2 20) (s3 25) (s4 40) (s5 50) (s6 18) (s7 22) (s8 28) (s9 40) (s10 50) (s11 18) (s12 20) (s13 25) (s14 38) (s15 50) (s16 19) (s17 21) (s18 25) (s19 38) (s20 50) (s21 42) (s22 28) (s23 55) (s24 21) (s25 81) (s26 23) (s27 35) (s28 47) (s29 98) (s30 68) (s31 27) (s32 19) (s33 23) (s34 25) (s35 31) (s36 34) (s37 20) (s38 32) (s39 38) (s40 45) (s41 57) (s42 25) (s43 42) (s44 61) (s45 48)
Re: [R] Reading data in lisp format
On 22/09/11 10:13, ESTEBAN ALFARO CORTES wrote:
> Thanks David,
>
> The crx.data is a different database and I would like to use both. I have contacted the developer, but he has not answered me.

I would suggest talking to people who use emacs, as emacs is written in lisp (well, elisp), and particularly to the guys from org-mode (http://orgmode.org/), because they deal with all kinds of languages (including R): http://orgmode.org/worg/org-contrib/babel/. They have a very helpful mailing list (http://orgmode.org/index.html#sec-5-2). I am quite sure that somebody there should be able to help.

Rainer

--
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation Biology, UCT), Dipl. Phys. (Germany)
Centre of Excellence for Invasion Biology
Stellenbosch University
South Africa
Tel:     +33 - (0)9 53 10 27 44
Cell:    +33 - (0)6 85 62 59 98
Fax:     +33 - (0)9 58 10 27 44
Fax (D): +49 - (0)3 21 21 25 22 44
email: rai...@krugs.de
Skype: RMkrug
Re: [R] Reading data in lisp format
On 21/9/2011 07:39, ESTEBAN ALFARO CORTES wrote:
> Hi,
>
> I am trying to read the credit.lisp file of the Japanese credit database in the UCI repository, but it is in lisp format, which I do not know how to read. I have not found how to do that in the foreign library.
>
> http://archive.ics.uci.edu/ml/datasets/Japanese+Credit+Screening
>
> Could anyone help me?

Esteban,

"Lisp files" may mean a lot of different things, so an authoritative answer is not possible without further qualification. They _may_ be files that can be read by XLisp-Stat, or they may be a data format coming from another program written in Lisp.

For the former, you could have XLisp-Stat read the file and export it in a more manageable format (plain text, csv, and with some add-ons [like VisTa] Excel). For the latter, the only option would be to ask the developer of the program either for the structure of the file or for an export to another format R can read.

HTH
--
Cesar Rabak
Re: [R] Reading data in lisp format
If you think that R is loosely typed, then examining LiSP code will change your mind, or at least give you a new data point further out on the Loose-Tight axis. I think you will need to do the processing by hand. The organization of the data is fairly clear. There are logical columns with values :neg and :pos, categorical columns with values in (id value) pairs, numeric ones, and then a group of computed columns at the bottom. It also appears that after the first enumeration of ids with logical values, subsequent logical variables are defined possibly with :pos values only.

So I guess the counter-question is: How important is this particular dataset to you?? And a further question might be, are you sure that you don't want the dataset that is right next to it:

ftp://ftp.ics.uci.edu/pub/machine-learning-databases/credit-screening/crx.data

It is a well-behaved comma-separated file.

--
David.

On Sep 21, 2011, at 6:39 AM, ESTEBAN ALFARO CORTES wrote:
> Hi,
>
> I am trying to read the credit.lisp file of the Japanese credit database in the UCI repository, but it is in lisp format, which I do not know how to read. I have not found how to do that in the foreign library.
>
> http://archive.ics.uci.edu/ml/datasets/Japanese+Credit+Screening
>
> Could anyone help me?
>
> Best regards,
> Esteban Alfaro
>
> PS: This is my first time on r-help, so I apologize for any inconvenience.

David Winsemius, MD
West Hartford, CT
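As an illustration of what "processing by hand" could look like in R, here is a rough sketch using base regular expressions on a short excerpt shaped like the predicates in credit.lisp. It is not a Lisp parser: the excerpt, the patterns and the names `lisp_text`, `jobless_ids` and `people` are all assumptions made for the example, and the real file may need more careful handling.

```r
# A short excerpt in the same shape as the credit.lisp predicates.
lisp_text <- paste(
  "(def-pred jobless :type (:person) :pos ((s3) (s10) (s12)))",
  "(def-pred age :type (:person :number) :pos ((s1 18) (s2 20) (s3 25)))"
)

# Pull out one predicate: from its name to the first run of three
# closing parentheses (non-greedy match).
get_block <- function(name, text) {
  pattern <- paste0("\\(def-pred ", name, " .*?\\)\\)\\)")
  regmatches(text, regexpr(pattern, text, perl = TRUE))
}

# Logical predicate: the ids listed after :pos are the TRUE cases.
jobless_block <- get_block("jobless", lisp_text)
jobless_ids <- regmatches(jobless_block,
                          gregexpr("\\(s[0-9]+\\)", jobless_block))[[1]]
jobless_ids <- gsub("[()]", "", jobless_ids)

# Numeric predicate: (id value) pairs.
age_block <- get_block("age", lisp_text)
age_pairs <- regmatches(age_block,
                        gregexpr("\\(s[0-9]+ [0-9]+\\)", age_block))[[1]]
age_parts <- strsplit(gsub("[()]", "", age_pairs), " ")

# Assemble one row per person.
people <- data.frame(
  id = sapply(age_parts, `[`, 1),
  age = as.numeric(sapply(age_parts, `[`, 2)),
  stringsAsFactors = FALSE
)
people$jobless <- people$id %in% jobless_ids
print(people)
```

Each further predicate would add one column in the same way: ids listed under :pos become TRUE, and (id value) pairs become a value column matched on id.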
Re: [R] reading data from multiple files with multiple headers
If you know how many lines to skip, you can set skip=xx in read.table. The question is what to do when the number of lines to skip varies from file to file but a marker such as ~A indicates the beginning of the data. In that case, read the file with readLines, use grep to find the line where the marker occurs, and then pass that line number as skip to read.table to read your data.

HTH

Weidong Gu

On Tue, Aug 30, 2011 at 10:23 PM, Julius Tesoro jutes...@yahoo.com wrote:
> Dear All,
>
> I have many files with a lot of headers and text at the beginning of the file. The headers are not uniform, though, and they vary in size. Is there a way to read a table and skip all of the headers/text above it until I encounter a certain text pattern?
>
> Here is an example. I just want to read the table after the ~A.
>
> ~Version Information
> VERS. 2.00: CWLS log ASCII Standard -VERSION 2.00
> WRAP. NO: One line per depth step
> #
> #
> ~Well Information Block
> #MNEM.UNIT Data Type Description
> #- ---
> STRT.M 51.000 :START DEPTH
> STOP.M .010 :STOP DEPTH
> STEP.M -.010 :STEP
> #
> #
> ~Curve Information Block
> #MNEM.UNIT Curve Description
> #- -
> DEPT.M :DEPTH
> GRDE.GAPI :GAMMA FROM DENSITY TOOL
> CODE.G/C3 :COMPENSATED DENSITY
> #
> #
> #
> ~A Depth GRDE CODE LSDU BRDU CADE DENL DENB ADEN VL2F VL4F VL6F VL2A
> 51.000 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.990 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.980 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.970 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.960 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.950 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.940 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
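A minimal sketch of the readLines/grep/skip approach suggested in this reply, assuming (as in the sample file) that the line starting with ~A also carries the column names. The helper name read_after_marker and its marker argument are made up for illustration; they are not from the thread.

```r
# Hypothetical helper: read the table that follows the "~A" marker line.
read_after_marker <- function(path, marker = "^~A") {
  lines <- readLines(path)
  start <- grep(marker, lines)[1]          # first line matching the marker
  if (is.na(start)) stop("no line matching ", marker, " in ", path)
  # Assumption: the marker line holds the column names after the marker.
  col_names <- strsplit(trimws(sub(marker, "", lines[start])), "\\s+")[[1]]
  # skip = start drops everything up to and including the marker line.
  read.table(path, skip = start, header = FALSE, col.names = col_names)
}
```

For many files, something like lapply(list.files(pattern = "\\.las$"), read_after_marker) would apply it to each one; the .las extension is only a guess at how the files are named.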
Re: [R] reading data from multiple files with multiple headers
Use readLines to read in the entire file, find the line where your data starts, and write the lines from there on to a temporary file with writeLines. You can then read that file with read.table; you will have 'skipped' the extra header data.

On Aug 30, 2011, at 22:23, Julius Tesoro jutes...@yahoo.com wrote:
> Dear All,
>
> I have many files with a lot of headers and text at the beginning of the file. The headers are not uniform, though, and they vary in size. Is there a way to read a table and skip all of the headers/text above it until I encounter a certain text pattern?
>
> Here is an example. I just want to read the table after the ~A.
>
> ~Version Information
> VERS. 2.00: CWLS log ASCII Standard -VERSION 2.00
> WRAP. NO: One line per depth step
> #
> #
> ~Well Information Block
> #MNEM.UNIT Data Type Description
> #- ---
> STRT.M 51.000 :START DEPTH
> STOP.M .010 :STOP DEPTH
> STEP.M -.010 :STEP
> #
> #
> ~Curve Information Block
> #MNEM.UNIT Curve Description
> #--
> DEPT.M :DEPTH
> GRDE.GAPI :GAMMA FROM DENSITY TOOL
> CODE.G/C3 :COMPENSATED DENSITY
> #
> #
> #
> ~A Depth GRDE CODE LSDU BRDU CADE DENL DENB ADEN VL2F VL4F VL6F VL2A
> 51.000 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.990 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.980 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.970 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.960 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.950 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
> 50.940 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25 -999.25
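The temporary-file route described in this reply could look roughly like the sketch below. It again assumes that the ~A line carries the column names; read_via_tempfile is a hypothetical name used only for this example.

```r
# Hypothetical helper: copy everything from the marker line onward to a
# temporary file, then let read.table parse that file as usual.
read_via_tempfile <- function(path, marker = "^~A") {
  lines <- readLines(path)
  start <- grep(marker, lines)[1]
  if (is.na(start)) stop("no line matching ", marker, " in ", path)
  body <- lines[start:length(lines)]
  body[1] <- sub(marker, "", body[1])   # what remains is the header row
  tmp <- tempfile(fileext = ".txt")
  on.exit(unlink(tmp))                  # remove the temporary file afterwards
  writeLines(body, tmp)
  read.table(tmp, header = TRUE)
}
```

Recent versions of R also accept read.table(text = body, header = TRUE), which would avoid the temporary file altogether.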
Re: [R] reading data from password protected url
Hi Duncan,

Your method works well for my situation when I make only one call to the database/URL with the login info. Our database is configured like the first situation (cookies) that you described below. Now I need to make multiple successive calls to get data for different sites in the database (one call per site), and this sometimes fails. Is there something that needs to be done to re-initialize? Do I need to log out before making the second call?

Thanks
Steve

===
Steven R. Corsi
Phone: (608) 821-3835
Research Hydrologist
email: srco...@usgs.gov
U.S. Geological Survey
Wisconsin Water Science Center
8505 Research Way
Middleton, WI 53562
===

On 6/25/2011 6:16 PM, Duncan Temple Lang wrote:
> Hi Steve
>
> RCurl can help you when you need to have more control over Web requests. The details vary from Web site to Web site and with the different ways to specify passwords, etc.
>
> If the JSESSIONID and NCES_JSESSIONID are regular cookies and returned in the first request as cookies, then you can just have RCurl handle the cookies. But the basics for your case are
>
>   library(RCurl)
>   h = getCurlHandle( cookiefile = "" )
>
> Then make your Web request using getURLContent(), getForm() or postForm(), making certain to pass the curl handle stored in h in each call, e.g.
>
>   ans = getForm(yourURL, login = "bob", password = "jane", curl = h)
>   txt = getURLContent(dataURL, curl = h)
>
> If JSESSIONID and NCES_JSESSIONID are not returned as cookies but as HTTP header fields, then you need to process the header. Something like
>
>   rdr = dynCurlReader(h)
>   ans = getForm(yourURL, login = "bob", password = "jane", curl = h, header = rdr$update)
>
> Then the header from the HTTP response is available as rdr$header() and you can use parseHTTPHeader(rdr$header()) to convert it into a named vector.
>
> HTH,
> D.
Re: [R] reading data from password protected url
Hi Steve

RCurl can help you when you need to have more control over Web requests. The details vary from Web site to Web site and with the different ways to specify passwords, etc.

If the JSESSIONID and NCES_JSESSIONID are regular cookies and returned in the first request as cookies, then you can just have RCurl handle the cookies. But the basics for your case are

  library(RCurl)
  h = getCurlHandle( cookiefile = "" )

Then make your Web request using getURLContent(), getForm() or postForm(), making certain to pass the curl handle stored in h in each call, e.g.

  ans = getForm(yourURL, login = "bob", password = "jane", curl = h)
  txt = getURLContent(dataURL, curl = h)

If JSESSIONID and NCES_JSESSIONID are not returned as cookies but as HTTP header fields, then you need to process the header. Something like

  rdr = dynCurlReader(h)
  ans = getForm(yourURL, login = "bob", password = "jane", curl = h, header = rdr$update)

Then the header from the HTTP response is available as rdr$header() and you can use parseHTTPHeader(rdr$header()) to convert it into a named vector.

HTH,
D.

On 6/24/11 2:12 PM, Steven R Corsi wrote:
> I am trying to retrieve data from a password protected database. I have login information and the proper url. When I make a request to the url, I get back some info, but need to read the hidden header information that has JSESSIONID and NCES_JSESSIONID. They need to be used to set cookies before sending off the actual url request that will result in the data transfer. Any help would be much appreciated.
>
> Thanks
> Steve
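As a rough sketch of the cookie case, the handle-reuse idea can be wrapped in a function so that one login is followed by several data requests on the same handle. The function name fetch_sites and its arguments are hypothetical, and the form field names login and password are simply the ones used in the example above; a real site will likely use different fields. Whether this is enough for a given server is an assumption, not something established in the thread.

```r
# Hypothetical wrapper: log in once, then fetch several URLs with the same
# curl handle so that session cookies set at login are sent on each request.
# RCurl is referenced with :: so that defining the function needs no network.
fetch_sites <- function(login_url, data_urls, user, password) {
  h <- RCurl::getCurlHandle(cookiefile = "")   # empty string enables cookie handling
  # Assumed form fields "login" and "password", as in the example above.
  RCurl::getForm(login_url, login = user, password = password, curl = h)
  # One request per site, all sharing the handle h.
  lapply(data_urls, function(u) RCurl::getURLContent(u, curl = h))
}
```

A call would look like fetch_sites(yourURL, c(siteURL1, siteURL2), "bob", "jane"), where the URL variables are placeholders.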
Re: [R] Reading Data from mle into excel?
Can I use sink() to transfer the MLE results, which are an S4 object, to a text file? Can someone show me how to do this?
Re: [R] Reading Data from mle into excel?
The sink function will write to a file what normally shows up on the screen after running some code. So while it is possible to use it to capture the output of the mle command and read the results into Excel, I don't see anything useful that you could then do with it in Excel.

If you can tell us more about what your ultimate goal is and what you want to do with the results, then we can give better advice on either how to get the pieces you want into Excel or, probably better, how to accomplish what you want in R without needing to involve Excel at all.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Bazman76
Sent: Tuesday, May 31, 2011 9:04 AM
To: r-help@r-project.org
Subject: Re: [R] Reading Data from mle into excel?

Can I use sink() to transfer the MLE results which are a S4 type object to a text file? Can someone show me how to do this?
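For completeness, a small sketch of what sink() does with an mle fit. The simulated data and the normal model (parameterised with a log standard deviation so that no bounds are needed) are stand-ins invented for this example, not the model from the thread.

```r
library(stats4)

# Stand-in data and likelihood, invented for illustration only.
set.seed(1)
y <- rnorm(50, mean = 2, sd = 1)
nll <- function(mu = 0, lsigma = 0) {
  -sum(dnorm(y, mean = mu, sd = exp(lsigma), log = TRUE))
}
fit <- mle(nll, start = list(mu = 1, lsigma = 0))

# Divert console output to a text file, print the summary, then restore.
out_file <- tempfile(fileext = ".txt")
sink(out_file)
print(summary(fit))   # explicit print() is needed inside scripts and loops
sink()

txt <- readLines(out_file)   # the file now holds the printed summary
```

The file contains the same text that would have appeared on screen, which is why it is awkward to work with in a spreadsheet.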
Re: [R] Reading Data from mle into excel?
Hi Greg,

I have about 40 time series, and I have to run a separate MLE on each of them. I will be experimenting with different starting values for the parameters etc., so some way to automate the process will be useful. I think I can just about do this part (see the code above), but as I can't do the second part I can't check it properly.

For the second part, I simply want to take the results of all the MLE calculations (the parameter estimates, their standard errors, and the actual value of the likelihood ratio) so that I can compare them and present them to my supervisor. The last part must be done in Excel, as my supervisor has not been converted to R yet.

Kind Regards
Hugh

Date: Tue, 31 May 2011 08:24:08 -0700
From: ml-node+3563453-1045326083-236...@n4.nabble.com
To: h_a_patie...@hotmail.com
Subject: Re: Reading Data from mle into excel?

The sink function will write to a file what normally shows up on the screen after running some code. So while it is possible to use it to capture the output of the mle command and read the results into excel, I don't see anything useful that you could then do with it in excel. If you can tell us more about what your ultimate goal is, what you want to do with the results, then we can give better advice on either how to get the pieces you want into excel, or probably better, how to accomplish what you want in R without needing to involve excel at all.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[hidden email]
801.408.8111
Re: [R] Reading Data from mle into excel?
I did not see any code above, but you could write a simple function that does the mle fit (is this mle from the stats4 package?), then extracts the information that you want and puts it into a vector, something like:

  out <- c( coef(fit), sqrt(diag(vcov(fit))), ll=logLik(fit) )

and returns the vector of the pieces that you want. Then you can use the sapply function to run the fits and return a matrix with the coefficients, etc. You can then use write.csv to create a csv file of the results that your advisor can open in Excel (or there are several ways to transfer the contents of a matrix to Excel).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Bazman76
Sent: Tuesday, May 31, 2011 9:40 AM
To: r-help@r-project.org
Subject: Re: [R] Reading Data from mle into excel?

Hi Greg,

I have about 40 time series, and I have to run a separate MLE on each of them. I will be experimenting with different starting values for the parameters etc., so some way to automate the process will be useful. I think I can just about do this part (see the code above), but as I can't do the second part I can't check it properly.

For the second part, I simply want to take the results of all the MLE calculations (the parameter estimates, their standard errors, and the actual value of the likelihood ratio) so that I can compare them and present them to my supervisor. The last part must be done in Excel, as my supervisor has not been converted to R yet.

Kind Regards
Hugh
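The recipe in this reply (fit, extract a vector, sapply, write.csv) might be put together as in the sketch below. The three simulated series, the simple normal likelihood, and the names fit_one, series and out_path are assumptions made for illustration; the poster's actual model is the OU likelihood shown elsewhere in the thread.

```r
library(stats4)

# Stand-in data: three simulated series, one per column.
set.seed(42)
series <- data.frame(a = rnorm(100, 1, 2),
                     b = rnorm(100, -1, 0.5),
                     c = rnorm(100, 0, 1))

# Fit one series and return estimates, standard errors and log-likelihood.
fit_one <- function(x) {
  nll <- function(mu = 0, lsigma = 0) {
    -sum(dnorm(x, mean = mu, sd = exp(lsigma), log = TRUE))
  }
  fit <- mle(nll, start = list(mu = mean(x), lsigma = log(sd(x))))
  est <- coef(fit)
  se <- sqrt(diag(vcov(fit)))
  names(se) <- paste0("se_", names(est))
  c(est, se, logLik = as.numeric(logLik(fit)))
}

# sapply returns one column per series; transpose to one row per series.
results <- t(sapply(series, fit_one))

# A csv file opens directly in Excel.
out_path <- file.path(tempdir(), "mle_results.csv")
write.csv(results, out_path)
```

Each row of results then corresponds to one series, which is a convenient layout for comparing fits side by side.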
Re: [R] Reading Data from mle into excel?
Greg, that's it! Thank you, thank you, thank you. So simple in the end!

From: greg.s...@imail.org
To: h_a_patie...@hotmail.com; r-help@r-project.org
Date: Tue, 31 May 2011 10:27:13 -0600
Subject: RE: [R] Reading Data from mle into excel?

I did not see any code above, but you could write a simple function that does the mle fit (is this mle from the stats4 package?), then extracts the information that you want and puts it into a vector, something like:

  out <- c( coef(fit), sqrt(diag(vcov(fit))), ll=logLik(fit) )

and returns the vector of the pieces that you want. Then you can use the sapply function to run the fits and return a matrix with the coefficients, etc. You can then use write.csv to create a csv file of the results that your advisor can open in excel (or there are several ways to transfer the contents of a matrix to excel).

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111
Re: [R] Reading Data from mle into excel?
Thanks for all your help. I have taken a slightly different route, but I think I am getting there:

  library(plyr)
  #setwd("C:/Documents and Settings/Hugh/My Documents/PhD")
  #files <- list.files("C:/Documents and Settings/Hugh/My Documents/PhD/", pattern = "Swaption Vols.csv")
  #vols <- lapply(files, read.csv, header = TRUE)
  vols = read.csv(file = "C:/Documents and Settings/Hugh/My Documents/PhD/Swaption vols.csv",
                  header = TRUE, sep = ",")

  # conditional density of the OU process
  dcOU <- function(x, t, x0, theta, log = FALSE) {
    Ex <- theta[1]/theta[2] + (x0 - theta[1]/theta[2]) * exp(-theta[2]*t)
    Vx <- theta[3]^2 * (1 - exp(-2*theta[2]*t)) / (2*theta[2])
    dnorm(x, mean = Ex, sd = sqrt(Vx), log = log)
  }

  # negative log-likelihood; uses the global series X
  OU.lik <- function(theta1, theta2, theta3) {
    n <- length(X)
    dt <- deltat(X)
    -sum(dcOU(X[2:n], dt, X[1:(n-1)], c(theta1, theta2, theta3), log = TRUE))
  }

  require(stats4)
  require(sde)
  nc = ncol(vols)
  for (i in 2:nc) {
    X <- ts(vols[, i])
    mle(OU.lik, start = list(theta1 = 1, theta2 = 1, theta3 = 1),
        method = "L-BFGS-B", lower = c(-Inf, -Inf, -Inf), upper = c(Inf, Inf, Inf)) -> fit
    summary(fit)
  }

Right now summary(fit) gives the summary results for the 53rd column, so the code is working correctly. How can I save these summary results in an array or data table on each loop?
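One possible way to keep the per-column results is sketched below: store each summary in a list inside the loop, then pull the coefficient table out of each summary afterwards. Note that a bare summary(fit) inside a for loop is not printed automatically, so the results have to be stored or printed explicitly. The data frame vols here is simulated, and the simple normal likelihood replaces OU.lik so that the example runs on its own; both are assumptions for illustration.

```r
library(stats4)

# Stand-in for the real file: a date column followed by two series.
set.seed(1)
vols <- data.frame(date = 1:80,
                   s1 = rnorm(80, 0.2, 0.05),
                   s2 = rnorm(80, 0.3, 0.10))

nc <- ncol(vols)
summaries <- vector("list", nc - 1)      # one slot per series
names(summaries) <- names(vols)[-1]

for (i in 2:nc) {
  x <- vols[, i]
  # Simplified likelihood used instead of OU.lik for this sketch.
  nll <- function(mu = 0, lsigma = 0) {
    -sum(dnorm(x, mean = mu, sd = exp(lsigma), log = TRUE))
  }
  fit <- mle(nll, start = list(mu = mean(x), lsigma = log(sd(x))))
  summaries[[i - 1]] <- summary(fit)     # keep the whole summary object
}

# coef() on a summary gives the Estimate / Std. Error matrix;
# the m2logL slot holds -2 log L.
est_table <- t(sapply(summaries, function(s) {
  c(coef(s)[, "Estimate"], se = coef(s)[, "Std. Error"], m2logL = s@m2logL)
}))
```

est_table has one row per series and can be passed to write.csv for use in Excel.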
Re: [R] Reading Data from mle into excel?
Hi:

This isn't too hard to do. The strategy is basically this:

(1) Create a list of file names. (See ?list.files for some ideas)
(2) Read the data files from (1) into a list.
(3) Create a function to apply to each data frame in the list.
(4) Apply the function to each data frame.
(5) Extract the coefficients/output and put them into a new data frame or list.

Here's a really simple example, but it exemplifies the process. I created five data frames and exported them to .csv files in my current directory. The game is to create a vector of file names, use lapply() to pass them into a list, create a function to extract the coefficients from a linear regression model, and then use ldply() from the plyr package to combine steps (4) and (5) above.

  files <- paste('dat', 1:5, '.csv', sep = '')

  # Code to create the output files in case you want to try out the example
  # wf <- function(f)
  #   write.csv(data.frame(x1 = rnorm(10), x2 = rnorm(10), y = rnorm(10)),
  #             file = f, quote = FALSE, row.names = FALSE)
  # lapply(files, wf)

  library(plyr)
  datalist <- lapply(files, read.csv, header = TRUE)
  lfun <- function(d) coef(lm(y ~ x1 + x2, data = d))
  ldply(datalist, lfun)

You should look at ?list.files to help create the file list you need. You should then be able to use something similar to the code that generates datalist to get the data frames into a single list. Obviously, my function is a lot simpler than yours, but the principle is that the function should work for any generic data object in your list. I like the ldply() function for this simple example because it's 'one-stop shopping', but the base package alternative would be something like

  do.call(rbind, lapply(datalist, lfun))

The challenge in your problem is that you want to return a coefficient matrix, which raises a new set of issues. If you use the do.call() approach, it won't keep track of the corresponding data frame to which the results pertain. ldply() won't work because the coefficient matrix is not a data frame, and if you coerce it to one, the row names (variables) will disappear. One approach is to use llply() instead; consider the following function and its application:

  # This function returns a matrix of coefficients, standard errors,
  # t-tests of significance and p-values rather than a vector
  lfun2 <- function(d) summary(lm(y ~ ., data = d))['coefficients']

  # returns a list rather than a data frame
  llply(datalist, lfun2)

It's not hard to figure out how to write the results to one or more files from there. You may want to adapt your function to get the output you need.

HTH,
Dennis

On Mon, May 23, 2011 at 2:32 PM, Bazman76 h_a_patie...@hotmail.com wrote:
> Hi there,
>
> I ran the following code:
>
>   vols=read.csv(file="C:/Documents and Settings/Hugh/My Documents/PhD/Swaption vols.csv",
>                 header=TRUE, sep=",")
>   X <- ts(vols[,2])
>   #X
>
>   dcOU <- function(x,t,x0,theta,log=FALSE){
>     Ex <- theta[1]/theta[2]+(x0-theta[1]/theta[2])*exp(-theta[2]*t)
>     Vx <- theta[3]^2*(1-exp(-2*theta[2]*t))/(2*theta[2])
>     dnorm(x,mean=Ex,sd=sqrt(Vx),log=log)
>   }
>
>   OU.lik <- function(theta1,theta2,theta3){
>     n <- length(X)
>     dt <- deltat(X)
>     -sum(dcOU(X[2:n],dt,X[1:(n-1)],c(theta1,theta2,theta3),log=TRUE))
>   }
>
>   require(stats4)
>   require(sde)
>   set.seed(1)
>   #X <- sde.sim(model="OU",theta=c(3,1,2),N=1,delta=1)
>   mle(OU.lik,start=list(theta1=1,theta2=1,theta3=1),
>       method="L-BFGS-B",lower=c(-Inf,-Inf,-Inf),upper=c(Inf,Inf,Inf)) -> fit
>   summary(fit)
>
>   #ex3.01 R
>   prof <- profile(fit)
>   par(mfrow=c(1,3))
>   plot(prof)
>   par(mfrow=c(1,1))
>   vcov(fit)
>
> I run the code above and I get:
>
>   summary(fit)
>   Maximum likelihood estimation
>
>   Call:
>   mle(minuslogl = OU.lik, start = list(theta1 = 1, theta2 = 1, theta3 = 1),
>       method = "L-BFGS-B", lower = c(-Inf, -Inf, -Inf), upper = c(Inf, Inf, Inf))
>
>   Coefficients:
>            Estimate  Std. Error
>   theta1 0.03595581 0.013929892
>   theta2 4.30910365 1.663781710
>   theta3 0.02120220 0.004067477
>
>   -2 log L: -5136.327
>
> I need to run the same analysis for 40 different time series. I want to be able to collate all the estimates of theta and the associated standard errors and then transfer them into Excel.
>
> Can someone please point me to some R code that will allow me to do this?
>
> Thanks
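A base-R sketch of the five-step strategy above, including one way to keep track of which file each row of results came from when using do.call(rbind, ...): name the list elements before applying the function. The temporary directory, the simulated files and the variable names are all invented for the example.

```r
# Create five small csv files in a temporary directory (stand-in data).
dir_tmp <- tempfile("csvdemo")
dir.create(dir_tmp)
set.seed(1)
for (f in file.path(dir_tmp, paste0("dat", 1:5, ".csv"))) {
  write.csv(data.frame(x1 = rnorm(10), x2 = rnorm(10), y = rnorm(10)),
            file = f, row.names = FALSE)
}

# (1) list the files, (2) read them into a list named by file
found <- list.files(dir_tmp, pattern = "\\.csv$", full.names = TRUE)
datalist <- lapply(found, read.csv)
names(datalist) <- basename(found)

# (3) function applied to each data frame
lfun <- function(d) coef(lm(y ~ x1 + x2, data = d))

# (4) and (5): apply and bind; list names become row names
coefs <- do.call(rbind, lapply(datalist, lfun))
```

Because the list is named, each row of coefs is labelled with its file name, so the link between results and source file is not lost.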
Re: [R] Reading Data from mle into excel?
I think cognizance should be taken of fortune("very uneasy").

cheers,
Rolf Turner
Re: [R] Reading Data from mle into excel?
Hi Scott,

Thanks for this. I have some questions below.

Thanks,
Hugh

Date: Mon, 23 May 2011 17:32:52 -0500
From: scttchamberla...@gmail.com
To: h_a_patie...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Reading Data from mle into excel?

> I would read the datasets into a list first, something like this, which
> will make a list of dataframes:
>
>   filenames <- dir()  # where only filenames you want to read in are in this directory
>   dataframelist <- lapply(filenames, read.csv, header = TRUE, sep = ",")

OK, I tried the code you suggested and I get:

  filenames <- dir("C:/Documents and Settings/Hugh/My Documents/PhD/Swaption vols.csv")
  dataframelist <- lapply(filenames, read.csv, header = TRUE, sep = ",")
  dataframelist
  list()
  list
  function (...)  .Primitive("list")

Is this correct? I actually need only one file; all the time series are stored in separate columns.

  vols=read.csv(file="C:/Documents and Settings/Hugh/My Documents/PhD/Swaption vols.csv", header=TRUE, sep=",")
  X <- ts(vols[,2])

Can I still use this format?

> You should be able to put the whole procedure, after reading in
> dataframes, into one lapply perhaps, e.g.,
>
>   lapply(dataframelist, yourfunction)

As for my function: the call to mle involves calls to other functions.

  dcOU <- function(x, t, x0, theta, log = FALSE) {
    Ex <- theta[1]/theta[2] + (x0 - theta[1]/theta[2]) * exp(-theta[2]*t)
    Vx <- theta[3]^2 * (1 - exp(-2*theta[2]*t)) / (2*theta[2])
    dnorm(x, mean = Ex, sd = sqrt(Vx), log = log)
  }
  OU.lik <- function(theta1, theta2, theta3) {
    n <- length(X)
    dt <- deltat(X)
    -sum(dcOU(X[2:n], dt, X[1:(n-1)], c(theta1, theta2, theta3), log = TRUE))
  }
  require(stats4)
  require(sde)
  set.seed(1)
  #X <- sde.sim(model="OU", theta=c(3,1,2), N=1, delta=1)
  mle(OU.lik, start = list(theta1=1, theta2=1, theta3=1),
      method = "L-BFGS-B", lower = c(-Inf,-Inf,-Inf), upper = c(Inf,Inf,Inf)) -> fit
  summary(fit)

Should I store each function in a separate script? If so, how do I then make sure that they are available in the workspace? Assuming that vols contains the dataframelist, how would I call the mle function using lapply like you showed?

> where dataframelist is a list of dataframes, and yourfunction is a
> function that does all the procedures for one dataset. The function
> 'yourfunction' will be applied to each dataset in the list separately,
> then the results output into a list.
>
> Then, if the results from each dataset will have the same dimensions, you
> can do something like ldply using package plyr:
>
>   ldply(output, 'identity')  # where 'output' is the output list of results from the lapply call above
>
> This will give you a data frame of all the results.
>
> Scott Chamberlain
> Rice University, EEB Dept.
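A note on the empty list() reported above: dir() is an alias for list.files(), which lists the files inside a directory, so handing it the path of a single .csv file returns character(0) and lapply() then has nothing to read. The following is a minimal sketch only; the temporary directory, the file names and the column names are made up for illustration and are not from the thread.

```r
# Hypothetical setup: two small CSV files in a temporary directory
data_dir <- file.path(tempdir(), "vols_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(a = 1:3, b = 4:6),
          file.path(data_dir, "series1.csv"), row.names = FALSE)
write.csv(data.frame(a = 7:9, b = 1:3),
          file.path(data_dir, "series2.csv"), row.names = FALSE)

# Point list.files() at the directory, not at a single file
filenames <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
dataframelist <- lapply(filenames, read.csv, header = TRUE)
names(dataframelist) <- basename(filenames)

# With one file whose columns are the series, no list of files is needed:
# lapply() over the data frame visits one column at a time
vols <- dataframelist[["series1.csv"]]
series <- lapply(vols, ts)
```

If all 40 series live in one file, the second form (one read.csv call, then lapply over the columns) is probably the simpler route.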
Re: [R] Reading Data from mle into excel?
I would read the datasets into a list first, something like this, which will make a list of dataframes:

  filenames <- dir()  # where only filenames you want to read in are in this directory
  dataframelist <- lapply(filenames, read.csv, header = TRUE, sep = ",")

You should be able to put the whole procedure, after reading in dataframes, into one lapply perhaps, e.g.,

  lapply(dataframelist, yourfunction)

where dataframelist is a list of dataframes, and yourfunction is a function that does all the procedures for one dataset. The function 'yourfunction' will be applied to each dataset in the list separately, then the results output into a list.

Then, if the results from each dataset will have the same dimensions, you can do something like ldply using package plyr:

  ldply(output, 'identity')  # where 'output' is the output list of results from the lapply call above

This will give you a data frame of all the results.

Scott Chamberlain
Rice University, EEB Dept.

On Monday, May 23, 2011 at 4:32 PM, Bazman76 wrote:

> Hi there,
>
> I ran the following code:
>
>   vols=read.csv(file="C:/Documents and Settings/Hugh/My Documents/PhD/Swaption vols.csv", header=TRUE, sep=",")
>   X <- ts(vols[,2])
>   #X
>   dcOU <- function(x, t, x0, theta, log = FALSE) {
>     Ex <- theta[1]/theta[2] + (x0 - theta[1]/theta[2]) * exp(-theta[2]*t)
>     Vx <- theta[3]^2 * (1 - exp(-2*theta[2]*t)) / (2*theta[2])
>     dnorm(x, mean = Ex, sd = sqrt(Vx), log = log)
>   }
>   OU.lik <- function(theta1, theta2, theta3) {
>     n <- length(X)
>     dt <- deltat(X)
>     -sum(dcOU(X[2:n], dt, X[1:(n-1)], c(theta1, theta2, theta3), log = TRUE))
>   }
>   require(stats4)
>   require(sde)
>   set.seed(1)
>   #X <- sde.sim(model="OU", theta=c(3,1,2), N=1, delta=1)
>   mle(OU.lik, start = list(theta1=1, theta2=1, theta3=1),
>       method = "L-BFGS-B", lower = c(-Inf,-Inf,-Inf), upper = c(Inf,Inf,Inf)) -> fit
>   summary(fit)
>   #ex3.01 R
>   prof <- profile(fit)
>   par(mfrow=c(1,3))
>   plot(prof)
>   par(mfrow=c(1,1))
>   vcov(fit)
>
> I run the code above and I get:
>
>   summary(fit)
>   Maximum likelihood estimation
>
>   Call:
>   mle(minuslogl = OU.lik, start = list(theta1 = 1, theta2 = 1, theta3 = 1),
>       method = "L-BFGS-B", lower = c(-Inf, -Inf, -Inf), upper = c(Inf, Inf, Inf))
>
>   Coefficients:
>            Estimate  Std. Error
>   theta1 0.03595581 0.013929892
>   theta2 4.30910365 1.663781710
>   theta3 0.02120220 0.004067477
>
>   -2 log L: -5136.327
>
> I need to run the same analysis for 40 different time series. I want to
> be able to collate all the estimates of theta and the associated standard
> errors and then transfer them into Excel.
>
> Can someone please point me to some R code that will allow me to do this?
>
> Thanks
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Reading-Data-from-mle-into-excel-tp3545569p3545569.html
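To make the "one function per dataset, then collate" idea concrete, here is a hedged sketch using only stats4. It is not the poster's OU model: the normal likelihood, the simulated columns s1 to s3, the helper name fit_one and the output file name are all stand-ins chosen so the example runs on its own. The point it illustrates is passing the series into the likelihood as an argument (rather than reading a global X), pulling estimates and standard errors out of each fit, and writing one table that Excel can open.

```r
library(stats4)

# Stand-in data: three simulated columns (the real input would be 'vols')
set.seed(1)
vols <- data.frame(s1 = rnorm(200, 1, 2),
                   s2 = rnorm(200, 0, 1),
                   s3 = rnorm(200, -1, 0.5))

# Hypothetical helper: fit one series, return estimates and standard errors.
# The normal likelihood is a placeholder for OU.lik.
fit_one <- function(x) {
  nll <- function(mu, sigma) -sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
  fit <- mle(nll, start = list(mu = mean(x), sigma = sd(x)),
             method = "L-BFGS-B", lower = c(-Inf, 1e-6), upper = c(Inf, Inf))
  data.frame(parameter = names(coef(fit)),
             estimate  = unname(coef(fit)),
             std_error = unname(sqrt(diag(vcov(fit)))))
}

# One fit per column, then stack the small tables into one data frame
fits <- lapply(vols, fit_one)
results <- do.call(rbind, lapply(names(fits), function(nm) {
  cbind(series = nm, fits[[nm]])
}))
rownames(results) <- NULL

# A CSV file opens directly in Excel
out_file <- file.path(tempdir(), "mle_results.csv")
write.csv(results, out_file, row.names = FALSE)
```

Swapping in the OU likelihood would mean replacing nll and the start and bound values; whether the optimiser behaves for every series is something to check case by case.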
Re: [R] Reading data into
Try this:

  input <- readLines(textConnection("a 1 89 2 79 3 92
  b 3 45 4 65"))
  closeAllConnections()
  # now parse each line to create a dataframe with each row being the score
  result <- NULL
  for (i in input){
      x <- strsplit(i, '[[:space:]]+')[[1]]
      x.l <- length(x)
      result <- rbind(result,
          data.frame(judge = paste("Judge_", rep(x[1], x.l %/% 2), sep = ''),
                     poster = as.integer(x[seq(2, x.l, 2)]),
                     score = as.integer(x[seq(3, x.l, 2)]),
                     stringsAsFactors = FALSE))
  }
  require(reshape)
  cast(result, poster ~ judge, value = 'score')

    poster Judge_a Judge_b
  1      1      89      NA
  2      2      79      NA
  3      3      92      45
  4      4      NA      65

On Mon, Oct 4, 2010 at 4:19 PM, Federman, Douglas douglas.feder...@utoledo.edu wrote:

> I have data in the following form:
>
>   judge poster score poster score poster score
>   a     1      89    2      79    3      92
>   b     3      45    4      65
>
> and am trying to get it to the following:
>
>   Poster Judge_A Judge_B Judge_C
>   1      89
>   2      79
>   3      92      45
>   4              65
>
> Any hints would be appreciated.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
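The solution above depends on the reshape package. Roughly the same table can be sketched in base R alone; the version below is an illustration under that assumption, using the two sample lines from the question and tapply() to spread scores into one column per judge. The variable names long and wide are made up here.

```r
input <- c("a 1 89 2 79 3 92",
           "b 3 45 4 65")

# One small data frame per judge: the values alternate poster, score
rows <- lapply(strsplit(input, "[[:space:]]+"), function(x) {
  vals <- as.integer(x[-1])
  data.frame(judge  = paste0("Judge_", x[1]),
             poster = vals[c(TRUE, FALSE)],
             score  = vals[c(FALSE, TRUE)],
             stringsAsFactors = FALSE)
})
long <- do.call(rbind, rows)

# Rows are posters, columns are judges; empty combinations come out as NA
wide <- with(long, tapply(score, list(poster = poster, judge = judge),
                          FUN = function(v) v[1]))
print(wide)
```

The result is a matrix with poster numbers as row names; as.data.frame(wide) converts it if a data frame is needed.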
Re: [R] Reading data from xls..........please help
Hi

r-help-boun...@r-project.org wrote on 16.06.2010 22:14:33:

> Thanks for your reply. Possibly I do not have perl, although I am not
> sure. How can I find out whether I have it? If I don't have it, where can
> I download it from?

Do you have Excel? If yes, you can:

  Open Excel
  Select the data you want to transfer to R
  Press Ctrl-C to copy it to the clipboard
  Open R and type

    mydata <- read.delim("clipboard")

  at the console.

Regards
Petr
Re: [R] Reading data from xls..........please help
Surely you could also save the Excel spreadsheet with the relevant data as a text file, and then read it into R as normal? Select "Save As" in Excel and then change "Save as type" to "Text (Tab delimited) (*.txt)". Save it in the directory you are using in R (or change the directory in R to where you would like to keep your data), and then use read.table on the text version of the file.

Just ensure that you do not have spaces between words in each cell; e.g. if you have a column named "variable type", change it to "variable_type", otherwise you'll get an error when you try to read it into R. You might also find it useful to change any missing (empty) cells in Excel to NA before reading it in.

Petr PIKAL petr.pi...@precheza.cz 2010/06/18 01:42 PM wrote:

> Do you have Excel? If yes, you can:
>
>   Open Excel
>   Select the data you want to transfer to R
>   Press Ctrl-C to copy it to the clipboard
>   Open R and type
>
>     mydata <- read.delim("clipboard")
>
>   at the console.
>
> Regards
> Petr
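As a small illustration of reading a tab-delimited export, here is a sketch that assumes a made-up file written to a temporary location. It suggests that when the separator is given explicitly as a tab, spaces inside cells are kept (column names are still converted to syntactic names), and blank numeric cells are read as NA without editing the spreadsheet first.

```r
# Hypothetical tab-delimited export written to a temporary file
txt_file <- tempfile(fileext = ".txt")
writeLines(c("variable type\tvalue",
             "control\t1.5",
             "treated\t",
             "treated\t2.5"), txt_file)

# sep = "\t" splits on tabs only, so "variable type" stays one field;
# the header becomes the syntactic name variable.type
mydata <- read.delim(txt_file, header = TRUE, sep = "\t")
str(mydata)
```

With the default whitespace separator of read.table, the same file would indeed be split at the space in the header, which is presumably the error the advice above guards against.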
Re: [R] Reading data from xls..........please help
If you're on Windows and you never installed Perl, then you don't have it. Another easy way to find out is to type "perl" in the search window under the Start menu. If there's no perl.exe on your computer, you don't have it.

Take a look at: http://www.perl.org/

If you download Perl, it doesn't really matter that much whether you take the Strawberry or the ActiveState version. I took ActiveState, but feel free to differ.

http://www.activestate.com/activeperl

Cheers
Joris

On Wed, Jun 16, 2010 at 10:14 PM, Christofer Bogaso bogaso.christo...@gmail.com wrote:

> Thanks for your reply. Possibly I do not have perl, although I am not
> sure. How can I find out whether I have it? If I don't have it, where can
> I download it from?

--
Joris Meys
Statistical consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
tel : +32 9 264 59 87
joris.m...@ugent.be
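The check can also be done from inside R. A minimal sketch: Sys.which() returns the full path of an executable found on the PATH, or an empty string when it is not found. The perl.exe location shown in the final comment is a hypothetical example, not a path taken from this thread.

```r
perl_path <- Sys.which("perl")   # "" when perl is not found on the PATH

if (nzchar(perl_path)) {
  message("perl found at: ", perl_path)
} else {
  message("perl not found on the PATH; install it or pass its location explicitly")
}

# With gdata, the location can then be supplied through the perl argument,
# e.g. (hypothetical path):
#   read.xls(xlsfile, perl = "C:/Perl/bin/perl.exe")
```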
Re: [R] Reading data from xls..........please help
Hi

r-help-boun...@r-project.org wrote on 18.06.2010 14:00:47:

> Surely you could also save the Excel spreadsheet with the relevant data
> as a text file, and then read it into R as normal? Select "Save As" in
> Excel and then change "Save as type" to "Text (Tab delimited) (*.txt)".
> Save it in the directory you are using in R (or change the directory in R
> to where you would like to keep your data), and then use read.table on
> the text version of the file.
>
> Just ensure that you do not have spaces between words in each cell; e.g.
> if you have a column named "variable type", change it to "variable_type",
> otherwise you'll get an error when you try to read it into R. You might
> also find it useful to change any missing (empty) cells in Excel to NA
> before reading it in.

I would not recommend it. If you put NA into an Excel numeric column, the column will start to behave like character and you need to do some fiddling to turn it back to numeric.

Regards
Petr
Re: [R] Reading data from xls..........please help
On Wed, Jun 16, 2010 at 2:29 PM, Christofer Bogaso bogaso.christo...@gmail.com wrote:

> Can anyone help me with how to read an xls file into R? I have tried the
> following:
>
>   library(gdata)
>   xlsfile <- file.path(.path.package('gdata'),'xls','iris.xls')
>   read.xls(xlsfile)
>
> I got the following error:
>
>   Converting xls file to csv file...
>   Error in system(cmd, intern = !verbose) : perl not found
>   Error in file.exists(tfn) : invalid 'file' argument

Either you don't have perl, or you do have it but R can't find it because it's not on your path. You can either add perl to your path or use the perl= argument to give read.xls the path. See ?read.xls and note the examples in the Examples section at the bottom.

> Question *1) What is the way to get it working?*
>
> The 2nd approach I tried was with the RODBC package:
>
>   library(RODBC)
>   odbcConnectExcel(xlsfile)
>
> Here I got the following report:
>
>   RODBC Connection 4
>   Details:
>     case=nochange
>     DBQ=C:\PROGRA~1\R\R-211~1.1\library\gdata\xls\iris.xls
>     DefaultDir=C:\PROGRA~1\R\R-211~1.1\library\gdata\xls
>     Driver={Microsoft Excel Driver (*.xls)}
>     DriverId=790
>     MaxBufferSize=2048
>     PageTimeout=5
>
> My question is *2) How do I retrieve data here?*
>
> Thanks for your time.

Read the documentation that comes with RODBC:

  vignette("RODBC", package = "RODBC")

Also there is a list of many alternatives here:

http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windowss=excel
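For the second question (getting data out of the RODBC connection), the general pattern is to keep the connection object, fetch a sheet from it, and close it afterwards. The sketch below is untested here because it needs the RODBC package and a Windows Excel ODBC driver; the wrapper name read_excel_sheet and the sheet name "Sheet1" are assumptions for illustration, and the real sheet names should be taken from the sqlTables() output.

```r
# Sketch only: needs the RODBC package and a Windows Excel ODBC driver
read_excel_sheet <- function(xlsfile, sheet) {
  if (!requireNamespace("RODBC", quietly = TRUE)) {
    stop("package 'RODBC' is not installed")
  }
  channel <- RODBC::odbcConnectExcel(xlsfile)   # keep the connection object
  on.exit(RODBC::odbcClose(channel))            # always close it again
  print(RODBC::sqlTables(channel))              # lists the available sheets
  RODBC::sqlFetch(channel, sheet)               # returns the sheet as a data frame
}

# Usage (assumed sheet name):
#   iris_df <- read_excel_sheet(xlsfile, "Sheet1")
```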
Re: [R] Reading data from xls..........please help
On Wed, Jun 16, 2010 at 7:29 PM, Christofer Bogaso bogaso.christo...@gmail.com wrote:

> Can anyone help me with how to read an xls file into R? I have tried the
> following:
>
>   library(gdata)
>   xlsfile <- file.path(.path.package('gdata'),'xls','iris.xls')
>   read.xls(xlsfile)
>
> I got the following error:
>
>   Converting xls file to csv file...
>   Error in system(cmd, intern = !verbose) : perl not found
>   Error in file.exists(tfn) : invalid 'file' argument
>
> Question *1) What is the way to get it working?*

Works for me on Ubuntu 9.10 with R 2.10.1, so swapping your OS and R version to that will get it working... What OS/R are you on?

Note it says 'perl not found'. That's because it hasn't found perl. Do you have perl on your system? Do you need to specify the path to perl, as in the examples for Windows in help(read.xls)?

Barry
Re: [R] Reading data from xls..........please help
Thanks for your reply. Possibly I do not have perl, although I am not sure. How can I find out whether I have it? If I don't have it, where can I download it from?

On Thu, Jun 17, 2010 at 12:57 AM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote:

> Works for me on Ubuntu 9.10 with R 2.10.1, so swapping your OS and R
> version to that will get it working... What OS/R are you on?
>
> Note it says 'perl not found'. That's because it hasn't found perl. Do
> you have perl on your system? Do you need to specify the path to perl, as
> in the examples for Windows in help(read.xls)?
>
> Barry
Re: [R] Reading data file with both fixed and tab-delimited fields
I tried to shoehorn the read.* functions into matching both the fixed-width and the variable-width fields in the data, but no obvious way presented itself. (read.fwf reads fixed-width data properly, but the rest of the fields must be processed separately; maybe insert NULL stubs in the remaining fields and fill them in later?)

One way is to sidestep the entire issue and convert the structured data you have into a csv file using sed (usually available on most *nix systems) with something like:

  cat data | sed -r 's/^(..)(.)(..)(.{6})(..)[ \t]*([^ \t]*)[ \t]*([^ \t]*)[ \t]*([^ \t]*)[ \t]*([^ \t]*)[ \t]*([^ \t]*)/\1,\2,\3,\4,\5,\6,\7,\8,\9/' | less

Check that the output is all right, then use the resulting .csv file directly in R with read.csv.

If that does not satisfy you, maybe the R Wizards on the list can point you to a native R way of doing this, possibly using scan? I'm not sure though.

Hope this helps,
Chillu

On Tue, Mar 2, 2010 at 9:42 PM, Marshall Feldman ma...@uri.edu wrote:

> Hello R wizards,
>
> What is the best way to read a data file containing both fixed-width and
> tab-delimited fields? (More detail follows.)
>
> _*Details:*_
>
> The U.S. Bureau of Labor Statistics provides local area unemployment
> statistics at ftp://ftp.bls.gov/pub/time.series/la/, and the data are
> documented in the file la.txt ftp://ftp.bls.gov/pub/time.series/la/la.txt.
> Each data file has five tab-delimited fields:
>
>   * series_id
>   * year
>   * period (codes for things like quarter or month of year)
>   * value
>   * footnote_codes
>
> The series_id consists of five fixed-width subfields (length in
> parentheses):
>
>   * survey abbreviation (2)
>   * seasonal code (1)
>   * area type code (2)
>   * area code (6)
>   * measure code (2)
>
> So an example record might be:
>
>   LASPS36040003 1990M01 8.8 L
>
> I want to read in the data in one pass and convert them to a data frame
> with the following columns (actual name, class in parentheses):
>
>   Survey abbreviation (survey, character)
>   Seasonal (seasonal, logical seasonal=T)
>   Area type (area_type_code, factor)
>   Area (area_code, factor)
>   Measure (measure_code, factor)
>   Year (year, Date)
>   Period (period, factor)
>   Value (value, numeric)
>   Footnote (footnote_codes, character but see note)
>
> (Regarding the Footnote, I have to look at the data more. If there's just
> one code per record, this will be a factor; if there are multiple, it
> will either be character or a list. For now I'm making it only
> character.)
>
> Currently I can read the data just fine using read.table, but this makes
> series_id the first variable. I want to break out the subfields as
> separate columns.
>
> Any suggestions?
>
> Thanks.
> Marsh Feldman
Re: [R] Reading data file with both fixed and tab-delimited fields
Ah, I should have mentioned this. Personally I work on Macs (Leopard) and PCs (XP Pro and XP Pro x64). Even though the PCs do have Cygwin, I'm trying to make this code portable. So I want to avoid such things as sed, perl, etc.

I want to do this in R, even if processing is a bit slower. Eventually, I'll hide the code in a class, so the code can be a bit complex.

Marsh Feldman

On 3/2/2010 12:29 PM, Chidambaram Annamalai wrote:

> I tried to shoehorn the read.* functions into matching both the
> fixed-width and the variable-width fields in the data, but no obvious way
> presented itself. (read.fwf reads fixed-width data properly, but the rest
> of the fields must be processed separately; maybe insert NULL stubs in
> the remaining fields and fill them in later?)
>
> One way is to sidestep the entire issue and convert the structured data
> you have into a csv file using sed (usually available on most *nix
> systems) with something like:
>
>   cat data | sed -r 's/^(..)(.)(..)(.{6})(..)[ \t]*([^ \t]*)[ \t]*([^ \t]*)[ \t]*([^ \t]*)[ \t]*([^ \t]*)[ \t]*([^ \t]*)/\1,\2,\3,\4,\5,\6,\7,\8,\9/' | less
>
> Check that the output is all right, then use the resulting .csv file
> directly in R with read.csv.
>
> If that does not satisfy you, maybe the R Wizards on the list can point
> you to a native R way of doing this, possibly using scan? I'm not sure
> though.
>
> Hope this helps,
> Chillu

--
Dr. Marshall Feldman, PhD
Director of Research and Academic Affairs
Center for Urban Studies and Research
The University of Rhode Island
email: marsh @ uri .edu (remove spaces)

Contact Information:

Kingston:
202 Hart House
Charles T. Schmidt Labor Research Center
The University of Rhode Island
36 Upper College Road
Kingston, RI 02881-0815
tel. (401) 874-5953, fax (401) 874-5511

Providence:
206E Shepard Building
URI Feinstein Providence Campus
80 Washington Street
Providence, RI 02903-1819
tel. (401) 277-5218, fax (401) 277-5464
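One R-only route, offered as a sketch rather than a definitive answer, is to read the five tab-delimited fields as character and then cut series_id into its fixed-width subfields with substr(). The sample record below is typed in by hand with tabs between the fields, the column widths are those listed in the question, and treating seasonal code "S" as TRUE is an assumption about the BLS coding that should be checked against la.txt.

```r
# One made-up record in the layout described above (tab-delimited, with header)
sample_text <- paste("series_id\tyear\tperiod\tvalue\tfootnote_codes",
                     "LASPS36040003\t1990\tM01\t8.8\tL", sep = "\n")
raw <- read.delim(text = sample_text, colClasses = "character",
                  strip.white = TRUE)

# Widths of the fixed-width subfields inside series_id (from the question)
widths <- c(survey = 2, seasonal = 1, area_type_code = 2,
            area_code = 6, measure_code = 2)
ends   <- cumsum(widths)
starts <- ends - widths + 1
parts  <- lapply(seq_along(widths), function(i) {
  substr(raw$series_id, starts[i], ends[i])
})
names(parts) <- names(widths)

# Subfields first, then the remaining tab-delimited columns
bls <- data.frame(parts, raw[-1], stringsAsFactors = FALSE)
bls$seasonal <- bls$seasonal == "S"        # assumption: "S" = seasonally adjusted
bls$value    <- as.numeric(bls$value)
bls$year     <- as.integer(bls$year)
for (col in c("area_type_code", "area_code", "measure_code", "period")) {
  bls[[col]] <- factor(bls[[col]])
}
str(bls)
```

For a real file, the text = argument would be replaced by the file path or URL; everything after the read step is vectorised, so it should carry over to many rows unchanged.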
Re: [R] reading data from web data sources
Try this. First we read the raw lines into R, using grepl to remove any lines containing a character that is not a digit, a dot or a space. Then we look for the year lines and repeat them down V1 using cumsum. Finally we omit the year lines.

  myURL <- "http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat"
  raw.lines <- readLines(myURL)
  DF <- read.table(textConnection(raw.lines[!grepl("[^ 0-9.]", raw.lines)]), fill = TRUE)
  DF$V1 <- DF[cumsum(is.na(DF[[2]])), 1]
  DF <- na.omit(DF)
  head(DF)

On Sat, Feb 27, 2010 at 6:32 AM, Tim Coote tim+r-project@coote.org wrote:

> Hullo
>
> I'm trying to read some time series data of meteorological records that
> are available on the web (e.g.
> http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat). I'd
> like to be able to read the digital data directly into R. However, I
> cannot work out the right function and set of parameters to use. It could
> be that the only practical route is to write a parser, possibly in some
> other language, reformat the files and then read these into R.
>
> As far as I can tell, the informal grammar of the file is:
>
>   comments terminated by a blank line
>   [year number on a line on its own
>    daily readings lines ]+
>
> and the daily readings are of the form:
>
>   whitespace day number [whitespace reading on day of month] 12
>
> Readings for days in months where a day does not exist have special
> values. Missing values have a different special value.
>
> And then I've got the problem of iterating over all relevant files to get
> a whole time series.
>
> Is there a way to read this type of file into R? I've read all of the
> examples that I can find, but cannot work out how to do it. I don't think
> that read.table can handle the separate sections of data representing
> each year. read.ftable can maybe be coerced to parse the data, but I
> cannot see how after reading the documentation and experimenting with the
> parameters.
>
> I'm using R 2.10.1 on OS X 10.5.8 and 2.10.0 on Fedora 10.
>
> Any help/suggestions would be greatly appreciated. I can see that this
> type of issue is likely to grow in importance, and I'd also like to give
> the data owners suggestions on how to reformat their data so that it is
> easier to consume by machines, while being easy to read for humans.
>
> The early records are a serious machine parsing challenge as they are
> tiff images of old notebooks ;-)
>
> tia
>
> Tim
> Tim Coote
> t...@coote.org
> vincit veritas
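The same idea (drop non-numeric lines, spot the year lines, carry the year down with cumsum) can be sketched on a small made-up sample so that each step is visible. Everything about the sample is an assumption: it has three readings per day instead of twelve, and it treats -999 and -888 as the two special values without knowing which one means what in the real files. This variant allows a minus sign in the filter and sets the special values to NA explicitly rather than relying on na.omit.

```r
# Made-up sample in the shape described in the question
raw_lines <- c("Some header comment",
               "",
               "1910",
               " 1 5.1 4.9 6.0",
               " 2 5.0 -999 6.1",
               "1911",
               " 1 4.8 5.2 -888",
               " 2 4.7 5.3 6.2")

# Keep lines made only of digits, dots, minus signs and spaces
keep <- grepl("^[ 0-9.-]+$", raw_lines) & nzchar(trimws(raw_lines))
fields <- strsplit(trimws(raw_lines[keep]), "[[:space:]]+")

# A year line has a single field; cumsum() carries each year down its block
is_year <- lengths(fields) == 1
year <- as.integer(unlist(fields[is_year]))[cumsum(is_year)]

readings <- do.call(rbind, lapply(fields[!is_year], as.numeric))
DF <- data.frame(year = year[!is_year], readings)
names(DF) <- c("year", "day", paste0("m", 1:3))

# Assumed special values, recoded explicitly
DF[DF == -999 | DF == -888] <- NA
print(DF)
```

Looping the same steps over several decade files and combining the pieces with rbind would give the whole series, provided every file follows the same layout.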
Re: [R] reading data from web data sources
Mark Leeds pointed out to me that the code wrapped around in the post, so it may not be obvious that the regular expression in the grepl call contains a space:

  "[^ 0-9.]"

On Sat, Feb 27, 2010 at 7:15 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:

> Try this. First we read the raw lines into R, using grepl to remove any
> lines containing a character that is not a digit, a dot or a space. Then
> we look for the year lines and repeat them down V1 using cumsum. Finally
> we omit the year lines.
>
>   myURL <- "http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat"
>   raw.lines <- readLines(myURL)
>   DF <- read.table(textConnection(raw.lines[!grepl("[^ 0-9.]", raw.lines)]), fill = TRUE)
>   DF$V1 <- DF[cumsum(is.na(DF[[2]])), 1]
>   DF <- na.omit(DF)
>   head(DF)
Re: [R] reading data from web data sources
Thanks, Gabor. My takeaway from this and Phil's post is that I'm going to have to construct some code to do the parsing, rather than use a standard function.

I'm afraid that neither approach works yet: Gabor's has an off-by-one error (days start on the 2nd, not the 1st), and the years get messed up around the 29th day. I think that the na.omit(DF) line is throwing out the baby with the bathwater. It's interesting that this approach is based on read.table; I'd assumed that I'd need read.ftable, whose documentation I couldn't understand. What is it that's removing the -999 and -888 values in this code? They seem to be gone, but I cannot see why.

Phil's reads in the data, but interleaves rows with just a year and all other values as NA.

Tim

Tim Coote
t...@coote.org
vincit veritas
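Tim's question about the vanishing -999 and -888 values can be illustrated with a small offline sketch. The lines below are invented stand-ins for the file layout, not the real data; the two patterns are the ones quoted in the thread. Because the first character class does not list the minus sign, any line holding a negative sentinel counts as containing a "foreign" character and is dropped along with the comment lines.

```r
# Toy lines mimicking the file layout (invented values, not the real data).
raw.lines <- c(
  "Daily soil temperatures",   # comment line: contains letters
  "1910",                      # year line
  "  1  6.4  6.5  6.3",        # ordinary daily readings
  " 29  7.1 -888  7.0",        # line holding one of the sentinel values
  " 30  7.2 -999  7.1"         # line holding the other sentinel value
)

# First pattern: only space, digits and period are allowed,
# so a line containing "-" is rejected as a whole.
keep.old <- !grepl("[^ 0-9.]", raw.lines)

# Second pattern: "-" is allowed too (listed first so it is taken literally).
keep.new <- !grepl("[^- 0-9.]", raw.lines)

raw.lines[keep.old]   # year line and the ordinary readings only
raw.lines[keep.new]   # sentinel lines survive as well
```

With the second pattern the sentinel lines reach read.table, where an argument such as na.strings can then decide how they are treated.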
Re: [R] reading data from web data sources
No one else posted, so the other post you are referring to must have been an email to you, not a post. We did not see it.

By "off by one" I think you are referring to the row names, which are meaningless, rather than the day numbers. The data for day 1 is present, not missing. The example code did replace the day number column with the year, since the days were just sequential and therefore derivable, but it's trivial to keep them if that is important to you, and we have made that change below.

The previous code used grepl to kick out lines that had any character not in the set: space, digit and period. In this version we add the minus sign to that set. We also corrected the year column, added column names and converted all -999 strings to NA. Due to this last point we cannot use na.omit any more, but we now have iy available, which distinguishes between year rows and other rows.

Every line here has been indented, so anything that starts at the left column must have been word wrapped in transmission.

  myURL <- "http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat"
  raw.lines <- readLines(myURL)
  DF <- read.table(textConnection(raw.lines[!grepl("[^- 0-9.]", raw.lines)]),
    fill = TRUE, col.names = c("day", month.abb), na.strings = "-999")
  iy <- is.na(DF[[2]]) # is year row
  DF$year <- DF[iy, 1][cumsum(iy)]
  DF <- DF[!iy, ]
  DF
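The difference between the first and the corrected way of filling in the year column is easy to see on a tiny data frame. This is a minimal sketch on invented values; the names wrong, right and block are introduced here for illustration only. The first version indexes the rows of DF with the running block number, while the corrected version indexes the vector of years with it.

```r
# Toy stand-in for DF after read.table(..., fill = TRUE):
# year rows have NA in the second column.
DF <- data.frame(day = c(1910, 1, 2, 1911, 1, 2),
                 Jan = c(NA, 6.4, 6.5, NA, 5.9, 6.0))

iy <- is.na(DF[[2]])        # TRUE on year rows
block <- cumsum(iy)         # 1 1 1 2 2 2: which year block each row belongs to

wrong <- DF[block, 1]       # picks rows 1 and 2 of DF, so block 2 gets day 1
right <- DF[iy, 1][block]   # picks from the vector of years instead

DF$year <- right
DF <- DF[!iy, ]             # drop the year rows, keep the daily rows
DF
```

This appears to match the symptom reported earlier in the thread, where the years went wrong after the first block.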
Re: [R] reading data from web data sources
Sorry, I forgot to cc the group:

Tim -
   Here's a way to read the data into a list, with one entry per year:

  x = read.table('http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat',
                 header=FALSE,fill=TRUE,skip=13)
  cts = apply(x,1,function(x)sum(is.na(x)))
  wh = which(cts == 12)
  start = wh+1
  end = c(wh[-1] - 1,nrow(x))
  ans = mapply(function(i,j)x[i:j,],start,end,SIMPLIFY=FALSE)
  names(ans) = x[wh,1]

Hope this helps.
                                       - Phil Spector
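Phil's splitting idea can be tried without a network connection. The sketch below uses an invented three-column data frame, so the test for a year row is written as "all columns but the first are NA" (ncol(x) - 1) rather than the literal 12 used for the real twelve-month file; that generalisation is an assumption made for the toy data.

```r
# Toy table: year rows carry only the first column, daily rows are complete.
x <- data.frame(V1 = c(1910, 1, 2, 1911, 1, 2, 3),
                V2 = c(NA, 6.4, 6.5, NA, 5.9, 6.0, 6.1),
                V3 = c(NA, 7.0, 7.1, NA, 6.8, 6.9, 7.0))

cts <- apply(x, 1, function(r) sum(is.na(r)))   # number of NAs in each row
wh <- which(cts == ncol(x) - 1)                 # positions of the year rows
start <- wh + 1                                 # first daily row of each year
end <- c(wh[-1] - 1, nrow(x))                   # last daily row of each year

# One data frame per year, named by the year found on the year row.
ans <- mapply(function(i, j) x[i:j, ], start, end, SIMPLIFY = FALSE)
names(ans) <- x[wh, 1]
ans[["1911"]]
```

Each list entry holds only the daily rows, so the year rows themselves never appear in the result.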
Re: [R] reading data from web data sources
On Feb 27, 2010, at 4:33 PM, Gabor Grothendieck wrote:
>   myURL <- "http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat"
>   raw.lines <- readLines(myURL)
>   DF <- read.table(textConnection(raw.lines[!grepl("[^- 0-9.]", raw.lines)]),
>     fill = TRUE, col.names = c("day", month.abb), na.strings = "-999")
>   iy <- is.na(DF[[2]]) # is year row
>   DF$year <- DF[iy, 1][cumsum(iy)]
>   DF <- DF[!iy, ]
>   DF

Wouldn't they be of more value if they were sequential?

  dta <- data.matrix(DF[, -c(1,14)])
  dtafrm <- data.frame(rdta=dta[!is.na(dta)],
                       dom= DF[row(dta)[!is.na(dta)], 1],
                       month= col(dta)[!is.na(dta)])
  # adding a year column would be trivial.

  sum(dtafrm$month ==2)
  [1] 282
  sum(dtafrm$month ==12)
  [1] 310

  plot(dtafrm$rdta, type="l")

Yes, I know that zoo() might be better, but I'm still a zoobie, or would that be newzer? So, is there a zooisher function I should learn that would strip out the NAs and incorporate the data values?

-- 
David.
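The row()/col() indexing that David uses to flatten the month columns can be checked on a toy data frame. This is a sketch on invented values; the names ok and long are introduced here for illustration. A logical mask over the matrix selects values in column-major order, and applying the same mask to row(dta) and col(dta) recovers the day and month position of each kept value.

```r
# Toy wide table: one row per day of month, one column per month.
DF <- data.frame(day = 1:3,
                 Jan = c(6.4, 6.5, 6.3),
                 Feb = c(7.0, NA, 7.2))

dta <- data.matrix(DF[, -1])   # numeric matrix of readings only
ok <- !is.na(dta)              # mask of cells that hold a reading

long <- data.frame(value = dta[ok],                 # readings, column by column
                   dom   = DF$day[row(dta)[ok]],    # day of month for each reading
                   month = col(dta)[ok])            # month number for each reading
long
```

Because the mask is applied column by column, the result is already ordered by month and then by day within each month.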
Re: [R] reading data from web data sources
Tim -
   I don't understand what you mean about interleaving rows. I'm guessing that you want a single large data frame with all the data, and not a list with each year separately. If that's the case:

  x = read.table('http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat',
                 header=FALSE,fill=TRUE,skip=13)
  cts = apply(x,1,function(x)sum(is.na(x)))
  wh = which(cts == 12)
  start = wh+1
  end = c(wh[-1] - 1,nrow(x))
  ans = mapply(function(i,j)x[i:j,],start,end,SIMPLIFY=FALSE)
  names(ans) = x[wh,1]
  alldat = do.call(rbind,ans)
  alldat$year = rep(names(ans),sapply(ans,nrow))
  names(alldat) = c('day',month.name,'year')

On the other hand, if you want a long data frame with month, day, year and value:

  longdat = reshape(alldat,idvar=c('day','year'),
                    varying=list(month.name),direction='long',times=month.name)
  names(longdat)[c(3,4)] = c('Month','value')

Next, if you want to create a Date variable:

  longdat = transform(longdat,date=as.Date(paste(Month,day,year),'%B %d %Y'))
  longdat = na.omit(longdat)
  longdat = longdat[order(longdat$date),]

and finally:

  library(zoo)   # zoo() comes from the zoo package
  zoodat = zoo(longdat$value,longdat$date)

which should be suitable for time series analysis.

Hope this helps.
                                       - Phil
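The wide-to-long step followed by date construction can be tried on a few invented rows using base R only. This is a sketch, not Phil's exact code: it numbers the months (times = 1:2 with "%m") instead of using month names with "%B", since "%B" matches month names in the current locale and the numeric form avoids that dependency. The column names value and month are chosen here for illustration.

```r
# Toy wide table for one year; Feb 29 and Feb 30 do not exist in 1910.
alldat <- data.frame(day  = c(28, 29, 30),
                     Jan  = c(6.4, 6.5, 6.3),
                     Feb  = c(7.0, NA, NA),
                     year = 1910)

# One row per (day, year, month) combination.
longdat <- reshape(alldat, idvar = c("day", "year"),
                   varying = list(c("Jan", "Feb")), v.names = "value",
                   timevar = "month", times = 1:2, direction = "long")

# Build a Date from year, month number and day.
longdat$date <- as.Date(paste(longdat$year, longdat$month, longdat$day),
                        "%Y %m %d")

longdat <- na.omit(longdat)                 # drop rows without a reading or date
longdat <- longdat[order(longdat$date), ]   # chronological order
longdat
```

The resulting value and date columns are in the shape that a time series constructor such as zoo() expects.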
Re: [R] reading data from web data sources
On Feb 27, 2010, at 6:17 PM, Phil Spector wrote:
> and finally:
>
>   zoodat = zoo(longdat$value,longdat$date)
>
> which should be suitable for time series analysis.

OK, I think I get it (from Gabor's DF):

  dta <- data.matrix(DF[, -c(1,14)])
  dtafrm <- data.frame(rdta=dta[!is.na(dta)],
                       d.o.m= DF[row(dta)[!is.na(dta)], 1],
                       month= col(dta)[!is.na(dta)],
                       year=DF[row(dta)[!is.na(dta)], 14])
  library(zoo)
  zoodat2 <- with(dtafrm, zoo(rdta, as.Date(paste(month, d.o.m, year), "%m %d %Y")))

  str(zoodat2)
  ‘zoo’ series from 1910-01-01 to 1919-12-31
    Data: num [1:3652] 6.4 6.5 6.3 6.7 6.7 6.8 7 7.1 7.1 7.2 ...
    Index: Class 'Date'  num [1:3652] -21915 -21914 -21913 -21912 -21911 ...
Re: [R] reading data from web data sources
Here is a continuation to turn DF into a zoo series. It depends on the fact that all NAs are structural, i.e. they indicate dates which cannot exist, such as Feb 31, as opposed to missing data. dd is the data as one long series with component names being the dates in the indicated format. That is converted to a zoo series in the next statement using the Date class:

  dd <- na.omit(unlist(by(DF[2:13], DF$year, c)))
  library(zoo)
  z <- zoo(unname(dd), as.Date(names(dd), "%Y.%b%d"))

Here are the first few and last few in z:

  head(z)
  1910-01-01 1910-01-02 1910-01-03 1910-01-04 1910-01-05 1910-01-06
         6.4        6.5        6.3        6.7        6.7        6.8
  tail(z)
  1919-12-26 1919-12-27 1919-12-28 1919-12-29 1919-12-30 1919-12-31
         6.7        6.6        6.6        6.5        6.4        6.4

On Sat, Feb 27, 2010 at 4:33 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> No one else posted, so the other post you are referring to must have been an email to you, not a post. We did not see it.
>
> By "one off" I think you are referring to the row names, which are meaningless, rather than the day numbers. The data for day 1 is present, not missing. The example code did replace the day number column with the year, since the days were just sequential and therefore derivable, but it's trivial to keep them if that is important to you, and we have made that change below.
>
> The previous code used grep to kick out lines that had any character not in the set of space, digit and dot; in this version we add the minus sign to that set. We also corrected the year column, added column names and converted all -999 strings to NA. Due to this last point we cannot use na.omit any more, but we now have iy available, which distinguishes between year rows and other rows.
>
> Every line here has been indented, so anything that starts at the left column must have been word wrapped in transmission.
>
>   myURL <- "http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat"
>   raw.lines <- readLines(myURL)
>   DF <- read.table(textConnection(raw.lines[!grepl("[^- 0-9.]", raw.lines)]), fill = TRUE, col.names = c("day", month.abb), na.strings = "-999")
>   iy <- is.na(DF[[2]]) # is year row
>   DF$year <- DF[iy, 1][cumsum(iy)]
>   DF <- DF[!iy, ]
>   DF
>
> On Sat, Feb 27, 2010 at 3:28 PM, Tim Coote tim+r-project@coote.org wrote:
>> Thanks, Gabor. My take away from this and Phil's post is that I'm going to have to construct some code to do the parsing, rather than use a standard function.
>
> I think the other "post" must have been directly to you. We didn't see it.
>
>> I'm afraid that neither approach works yet: Gabor's has an off-by-one error (days start on the 2nd, not the first), and the years get messed up around the 29th day. I think that na.omit(DF) line is throwing out the baby with the bathwater. It's interesting that this approach is based on read.table; I'd assumed that I'd need read.ftable, which I couldn't understand the documentation for. What is it that's removing the -999 and -888 values in this code? They seem to be gone, but I cannot see why.
>>
>> Phil's reads in the data, but interleaves rows with just a year and all other values as NA.
>>
>> Tim
>>
>> On 27 Feb 2010, at 17:33, Gabor Grothendieck wrote:
>>> Mark Leeds pointed out to me that the code wrapped around in the post, so it may not be obvious that the regular expression in the grep is (i.e. it contains a space):
>>>
>>>   "[^ 0-9.]"
>>>
>>> On Sat, Feb 27, 2010 at 7:15 AM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
>>>> Try this. First we read the raw lines into R, using grep to remove any lines containing a character that is not a number or space. Then we look for the year lines and repeat them down V1 using cumsum. Finally we omit the year lines.
>>>>
>>>>   myURL <- "http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat"
>>>>   raw.lines <- readLines(myURL)
>>>>   DF <- read.table(textConnection(raw.lines[!grepl("[^ 0-9.]", raw.lines)]), fill = TRUE)
>>>>   DF$V1 <- DF[cumsum(is.na(DF[[2]])), 1]
>>>>   DF <- na.omit(DF)
>>>>   head(DF)
>>>>
>>>> On Sat, Feb 27, 2010 at 6:32 AM, Tim Coote tim+r-project@coote.org wrote:
>>>>> Hullo
>>>>>
>>>>> I'm trying to read some time series data of meteorological records that are available on the web (e.g. http://climate.arm.ac.uk/calibrated/soil/dsoil100_cal_1910-1919.dat). I'd like to be able to read the digital data directly into R. However, I cannot work out the right function and set of parameters to use. It could be that the only practical route is to write a parser, possibly in some other language, reformat the files and then read these into R.
>>>>>
>>>>> As far as I can tell, the informal grammar of the file is:
>>>>>
>>>>>   comments terminated by a blank line
>>>>>   [year number on a line on its own
>>>>>    daily readings lines ]+
>>>>>
>>>>> and the daily readings are of the form:
>>>>>
>>>>>   whitespace day number [whitespace reading on day of month] 12
>>>>>
>>>>> Readings for days in months where a day does not exist have special values. Missing values have a different special value. And then
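The approach in this thread (filter out non-numeric lines, read with fill = TRUE, then carry each year down with cumsum) can be tried without network access on a few made-up lines. The following is only a sketch under stated assumptions: the toy data, the three month columns, and the final reshaping with ISOdate() are inventions for the example, not the contents of the Armagh file and not the zoo-based code posted in the thread.

```r
# Made-up stand-in for the remote file: a comment line, a blank line,
# then year lines followed by day rows (3 months only; the real file has 12).
raw.lines <- c(
  "Toy soil temperatures (not real data)",
  "",
  "1910",
  "  1   6.4   5.1   4.0",
  " 30   6.5  -999   4.2",
  "1911",
  "  1   7.0   5.5   4.4",
  " 30   7.1  -999   4.5"
)

# Drop every line containing a character other than minus, space, digit or dot.
numeric.lines <- raw.lines[!grepl("[^- 0-9.]", raw.lines)]

# Year lines have a single field, so fill = TRUE pads them with NA.
DF <- read.table(text = numeric.lines, fill = TRUE,
                 col.names = c("day", "Jan", "Feb", "Mar"),
                 na.strings = "-999")

iy <- is.na(DF[[2]])              # TRUE on year rows (assumes Jan is never NA on a day row)
DF$year <- DF[iy, 1][cumsum(iy)]  # carry each year down to the rows below it
DF <- DF[!iy, ]                   # keep only the day rows

# Reshape to one row per calendar date; ISOdate() yields NA for Feb 30.
long <- data.frame(
  date  = as.Date(ISOdate(rep(DF$year, 3), rep(1:3, each = nrow(DF)), rep(DF$day, 3))),
  value = unlist(DF[c("Jan", "Feb", "Mar")], use.names = FALSE)
)
long <- long[!is.na(long$date), ]   # drop dates that cannot exist
long <- long[order(long$date), ]
print(long)
```

Dropping rows by impossible date rather than by NA value keeps genuinely missing readings in the series, which matters if the file uses a second code for missing data.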
Re: [R] Reading data
Hi Val,

I am not sure what you are trying to do. read.table is not used to open an R script, but to read a data file. You will also need to give the extension of the file when using the command (someone please correct me if I am wrong). If you wish to open an R script, I would just use the GUI menu: File -> Open script, and find your script there.

Good luck,
Tal

On Wed, Oct 28, 2009 at 4:04 PM, Val valkr...@gmail.com wrote:
> Hi User's,
>
> This might be a simple question but it is giving me a hard time as I am a new user. I installed R version 2.9.2 (2009-08-24).
>
> 1. I just copied a short script from Fox (2002) as practice and wanted to save it as Rossi.R. The system saved it without complaint, but when I looked at it using Windows Explorer it is not a *.R file but only Rossi. Why is this happening?
>
> 2. The script and the data files are in the same working directory. When I run the following script
>
>   Rossi <- read.table('Rossi',header=T)
>   Rossi[1:5,1:10]
>
> I got the following error messages
>
>   Error in file(file, "r") : cannot open the connection
>   In addition: Warning message:
>   In file(file, "r") : cannot open file 'Rossi': No such file or directory
>   Rossi[1:5,1:10]
>   Error: object 'Rossi' not found
>
> Thank you for your help in advance
> Val
Re: [R] Reading data
Hi Val,

Windows does not display file extensions by default. Check the 'Type' column; it should read 'R file'. Keep in mind what you are dealing with: Rossi.R is a script, so you cannot open it with read.table. You have to use source() for that. Moreover, use the extension as well (Rossi.R, not Rossi).

Cheers!
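The distinction drawn here, source() for scripts and read.table() for data, can be seen side by side in a small sketch. The file names and contents below are hypothetical and are written to a temporary directory purely for illustration.

```r
# Hypothetical file names in a temporary directory, for illustration only.
demo_dir    <- tempdir()
script_file <- file.path(demo_dir, "demo_script.R")
data_file   <- file.path(demo_dir, "demo_data.dat")

# A script contains R code; a data file contains rows and columns.
writeLines("x <- 1:3; total <- sum(x)", script_file)
writeLines(c("id age", "1 27", "2 18"), data_file)   # made-up values

source(script_file)                            # runs the code; defines x and total
demo <- read.table(data_file, header = TRUE)   # parses the text into a data frame

print(total)
print(demo)
```

Note that both calls are given the full file name including its extension.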
Re: [R] Reading data
On Oct 28, 2009, at 10:04 AM, Val wrote:

> Hi User's,
>
> This might be a simple question but it is giving me a hard time as I am a new user. I installed R version 2.9.2 (2009-08-24)
>
> 1. I just copied a short script from Fox (2002) as a practice and wanted to save it as Rossi.R.

How?

> The system saved it without complain but when I looked at using a windows explorer it is not *.R file but only Rossi. Why this is happening?

If you were to include the code, we perhaps could tell you. In its default mode Windows may be hiding the extension from you. (Or possibly because R does not postpend file types and (I am now guessing here about a package I have not used and don't even know if you are) neither does Rcmdr.)

> 2. the script and the data files are in the same working directory. When I run the following script
>
>   Rossi <- read.table('Rossi',header=T)

No path specification. And ??? I thought you said it was a script, which would have been loaded with source().

>   Rossi[1:5,1:10]

But this suggests you are using it as data. What do you get when you type this:

  getwd()

Maybe if you tried (untested)... Nah ... not going to do further guessing. Read the posting guide and supply the missing elements.

> I got the following error messages
>
>   Error in file(file, "r") : cannot open the connection
>   In addition: Warning message:
>   In file(file, "r") : cannot open file 'Rossi': No such file or directory
>   Rossi[1:5,1:10]
>   Error: object 'Rossi' not found
>
> Thank you for your help in advance
> Val

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] Reading data
The working directory is

  getwd()
  [1] "C:/Documents and Settings/Val/My Documents"

The data file (Rossi.dat) and the script (Rossi.R) are in
"C:/Documents and Settings/Val/My Documents/R_data/prd"

How should I write the command to read the file?

  source(???) # what should be included here?
  Rossi <- read.table('Rossi.dat',header=T)

I still got the same error message.

  Error in file(file, "r") : cannot open the connection
  In addition: Warning message:
  In file(file, "r") : cannot open file 'Rossi': No such file or directory

Thanks
Val
Re: [R] Reading data
Val, please take it slow, you are missing basic stuff here.

(1) Windows Explorer may hide extensions; the 'Type' column should read 'R file' anyway.
(2) Script files are included in your workspace with the command source(). Please type ?source for details.
(3) You should call files with their path and extension (in your case 'Rossi.R').

Hope the above helps,
Re: [R] Reading data
On Oct 28, 2009, at 10:55 AM, Val wrote:

> The working directory is
>
>   getwd()
>   [1] "C:/Documents and Settings/Val/My Documents"
>
> The data file (Rossi.dat) and the script (Rossi.R) are in
> "C:/Documents and Settings/Val/My Documents/R_data/prd"

So you are not giving a proper path when you issue the read.table command. The default path, when not explicitly provided, is the working directory, and you have stored your data elsewhere.

> How should I write to read the file?
>
>   source(???) # what should be included here?

The guess I was about to make when I realized you were conflating data and scripts was that you might want:

  Rossi <- read.table(paste(getwd(), 'Rossi.dat', sep="/"), header=T)

# but that would not have been effectively different from the default behavior. So you instead want:

  Rossi <- read.table(paste("C:/Documents and Settings/Val/My Documents/R_data/prd", "Rossi.dat", sep="/"), header=T)

Only if you wanted to read in a script with valid R code would you use source().

> I still got the same error message.
>
>   Error in file(file, "r") : cannot open the connection
>   In addition: Warning message:
>   In file(file, "r") : cannot open file 'Rossi': No such file or directory

--
David

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
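The point about paths can be checked before calling read.table at all. A minimal sketch, assuming a stand-in folder created under tempdir() rather than the real "My Documents" location, and made-up file contents:

```r
# Hypothetical stand-in for ".../My Documents/R_data/prd", created under tempdir().
data_dir <- file.path(tempdir(), "My Documents", "R_data", "prd")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)
writeLines(c("a b", "1 2", "3 4"), file.path(data_dir, "Rossi.dat"))  # made-up contents

# A bare name such as "Rossi.dat" is looked up in getwd(), not in data_dir.
print(getwd())

# file.path() joins components with "/", the same result as paste(..., sep = "/").
full_path <- file.path(data_dir, "Rossi.dat")
same_path <- paste(data_dir, "Rossi.dat", sep = "/")

print(file.exists(full_path))   # confirm the file is there before reading it
Rossi <- read.table(full_path, header = TRUE)
print(Rossi)
```

If file.exists() returns FALSE for the path being passed, read.table will fail with the "cannot open file" warning seen in this thread, so the check isolates path problems from parsing problems.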
Re: [R] Reading data
> Val, please take it slow, you are missing basic stuff here.
>
> (1) Windows Explorer may hide extensions; the 'Type' column should read 'R file' anyway.

Yes, I looked at it and it only shows the type. To check, I downloaded another script with the R extension, test.R, and the Type column shows the exact extension (i.e., test.R).

> (2) Script files are included in your workspace with the command source(). Please type ?source for details.
> (3) You should call files with their path and extensions (in your case 'Rossi.R')

I can open the script using this command:

  Rossi <- read.table(file.choose(), header=T)

Why can I not open it with this command?

  Rossi <- read.table("C:/Documents and Settings/Val/My Documents/R_data/prd/Rossi.dat", header=T)

David, you suggested using

  Rossi <- read.table(paste("C:/Documents and Settings/Val/My Documents/R_data/prd", "Rossi.dat", sep="/"), header=T)

This is not working either; I got the same error message.

Any help is highly appreciated
Val

> Hope the above help,
Re: [R] Reading data
David Winsemius wrote:
> So you instead want:
>
>   Rossi <- read.table(paste("C:/Documents and Settings/Val/My Documents/R_data/prd", "Rossi.dat", sep="/"), header=T)

Sometimes it's easiest to use

  Rossi <- read.table(file.choose(), header=TRUE)

which allows the mouse-addicted to click away.

-Peter Ehlers
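Since file.choose() is reported to work in this thread while a typed path does not, one way to compare the two is to keep and print the path that file.choose() returns. This is only a sketch: the fallback file below is hypothetical and exists solely so the example also runs in a non-interactive session.

```r
# Hypothetical demo file, used only when no file dialog is available.
demo_file <- tempfile(pattern = "Rossi_", fileext = ".dat")
writeLines(c("a b", "1 2"), demo_file)   # made-up contents

# In an interactive session, pick the file with the mouse; otherwise use the demo file.
path <- if (interactive()) file.choose() else demo_file

# Show exactly what R was given, so it can be compared with the hand-typed path.
print(normalizePath(path, winslash = "/"))

Rossi <- read.table(path, header = TRUE)
print(head(Rossi))
```

Comparing the printed path character by character with the typed one would show any difference in folder names, spaces or extension.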
Re: [R] Reading data
On Oct 28, 2009, at 11:46 AM, Val wrote:

> I can open the script using this command,
>
>   Rossi <- read.table(file.choose(), header=T)
>
> Why I can not open with this command?
>
>   Rossi <- read.table("C:/Documents and Settings/Val/My Documents/R_data/prd/Rossi.dat", header=T)
>
> David, you suggested to use
>
>   Rossi <- read.table(paste("C:/Documents and Settings/Val/My Documents/R_data/prd", "Rossi.dat", sep="/"), header=T)
>
> This is not working as well, I got the same error message.

H0: there is no file by that name in that directory.
HA: (or Windows and the email process is mucking up the spaces in the path). I do not see a space between "My" and "Documents" in the email representation.

I originally asked and you never answered... HOW did you save Rossi or Rossi.dat? Code and output ... we want all your code and console output!

So, please reproduce complete code and complete error messages. There are often details in those messages that new users are unable to decode.

> Any help is highly appreciated
> Val

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
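The first hypothesis (no file by that name in that directory) can be tested from within R by listing what the directory actually contains. The sketch below simulates one possible cause, an editor that silently appended ".txt" to the name; that cause is an assumption for illustration and is not confirmed anywhere in this thread.

```r
# Hypothetical folder under tempdir(), standing in for the real data directory.
data_dir <- file.path(tempdir(), "prd_demo")
dir.create(data_dir, showWarnings = FALSE)

# Simulate a file whose real name differs from the one the user expects.
writeLines(c("a b", "1 2"), file.path(data_dir, "Rossi.dat.txt"))  # made-up contents

wanted <- file.path(data_dir, "Rossi.dat")
print(file.exists(wanted))   # the expected name is not present

# List the names R can actually see, including any hidden extension.
candidates <- list.files(data_dir, pattern = "^Rossi", full.names = TRUE)
print(basename(candidates))

# Read whichever file is really there.
Rossi <- read.table(candidates[1], header = TRUE)
```

Unlike Windows Explorer, list.files() always shows the complete file name, so it sidesteps the hidden-extension issue raised earlier.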
Re: [R] Reading data
On Wed, Oct 28, 2009 at 11:59 AM, David Winsemius dwinsem...@comcast.net wrote:

> H0: there is no file by that name in that directory.
> HA: (or Windows and the email process is mucking up the spaces in the path). I do not see a space between "My" and "Documents" in the email representation.
>
> I originally asked and you never answered... HOW did you save Rossi or Rossi.dat? Code and output ... we want all your code and console output!

Sorry for that, and this is the code that was saved as Ross.R. Now I managed to save it as *.R. By default, when I clicked File -> Save as, the window asks for a file name and shows "Save as type: R files (*.R)". In my case I was typing only the file name Ross without the extension, assuming that the window would append the extension since it offered R files (*.R). I thought it is just like other Windows programs such as Word or Excel. Now I have to type the full file name Ross.R.

The script file name is Ross.R

  Rossi <- read.table(paste("C:/Documents and Settings/Val/My Documents/R_data/prd", "Rossi.dat", sep="/"), header=T)

The console output is

  Error in file(file, "r") : cannot open the connection
  In addition: Warning message:
  In file(file, "r") : cannot open file 'C:/Documents and Settings/Val/My Documents/R_data/prd/Rossi.dat': No such file or directory

> So, please reproduce complete code and complete error messages. There are often details in those messages that new users are unable to decode.
Re: [R] Reading data
On Oct 28, 2009, at 12:21 PM, Val wrote:
On Wed, Oct 28, 2009 at 11:59 AM, David Winsemius dwinsem...@comcast.net wrote:
On Oct 28, 2009, at 11:46 AM, Val wrote:

Val, please take it slow, you are missing basic stuff here.

(1) Windows Explorer may hide extensions; the 'Type' column should read 'R file' anyway.

*Yes, I looked at it and it only shows the type. To check, I downloaded another script with the R extension, test.R, and the type column shows the exact extension (i.e., test.R).*

(2) Script files are included in your workspace with the command source(). Please type ?source for details.

(3) You should call files with their path and extensions (in your case 'Rossi.R').

I can open the script using this command,

  Rossi <- read.table(file.choose(), header=T)

*Why can I not open it with this command?*

  Rossi <- read.table("C:/Documents and Settings/Val/My Documents/R_data/prd/Rossi.dat", header=T)

*David,* you suggested using

  Rossi <- read.table(paste("C:/Documents and Settings/Val/My Documents/R_data/prd", "Rossi.dat", sep="/"), header=T)

This is not working either; I got the same error message.

H0: there is no file by that name in that directory.
HA: (or Windows and the email process are mucking up the spaces in the path). I do not see a space between My and Documents in the email representation.

I originally asked and you never answered... HOW did you save Rossi or Rossi.dat? Code and output ... we want all your code and console output!

Sorry for that and this is the code that was saved as Ross.R.

This? What was this? In my opinion, indefinite pronouns should be banned from discourse when discussing computer programs.

Now I managed to save it as *.R. By default, when I clicked File -> Save as, the window asks for a file name and shows "Save as type: R files (*.R)". In my case I was typing only the file name Ross without the extension, assuming that the window would append the extension since it offered R files (*.R). I thought it was just like other Windows programs such as Word or Excel. Now I have to type the full file name Ross.R.

There still appears to be confusion about data files and scripts. Do you have both? I was asking about the file that you were hoping to read with the read.table command. Please stop referring to creation of scripts. The read.table command is not to be used for accessing scripts. Only source() would be so used.

Again. Please produce the original code you used to create the file which you are hoping to access using read.table.

The script file name is Ross.R

Which is of no interest to us unless you named a data file incorrectly. We still would need to know HOW it was created. What were the commands? What was in it?

  Rossi <- read.table(paste("C:/Documents and Settings/Val/My Documents/R_data/prd", "Rossi.dat", sep="/"), header=T)

the console output is

  Error in file(file, "r") : cannot open the connection
  In addition: Warning message:
  In file(file, "r") : cannot open file 'C:/Documents and Settings/Val/My Documents/R_data/prd/Rossi.dat': No such file or directory

So, please reproduce complete code and complete error messages. There are often details in those messages that new users are unable to decode.

Any help is highly appreciated
Val

Hope the above helps,

On Wed, Oct 28, 2009 at 3:55 PM, Val valkr...@gmail.com wrote:

The working directory is

  getwd()
  [1] "C:/Documents and Settings/Val/My Documents"

The data file (Rossi.dat) and the script (Rossi.R) are in C:/Documents and Settings/Val/My Documents/R_data/prd

How should I write to read the file?

  source(???)   # what should be included here?
  Rossi <- read.table('Rossi.dat', header=T)

I still got the same error message.

  Error in file(file, "r") : cannot open the connection
  In addition: Warning message:
  In file(file, "r") : cannot open file 'Rossi': No such file or directory

Thanks
Val

On Wed, Oct 28, 2009 at 10:32 AM, David Winsemius dwinsem...@comcast.net wrote:
On Oct 28, 2009, at 10:04 AM, Val wrote:

Hi Users,

This might be a simple question but it is giving me a hard time as I am a new user. I installed R version 2.9.2 (2009-08-24).

1. I just copied a short script from Fox (2002) as practice and wanted to save it as Rossi.R.

How?

The system saved it without complaint, but when I looked at it using Windows Explorer it is not a *.R file but only Rossi. Why is this happening?

If you were to include the code, we perhaps could tell you. In its default mode Windows may be hiding the extension from you. (Or possibly because R does not postpend file types and (I am now guessing here about a package I have not used and don't even know if you are) neither does Rcmdr.)

2. The script and the data files are in the same working directory. When I run the following script

  Rossi <-
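The script-versus-data distinction David keeps pressing on can be sketched as follows. This is only an illustration under stated assumptions: the temporary directory, the file contents, and the object name n_weeks are made up for the example and are not taken from Val's files.

```r
# Everything is written to a temporary directory; names and values are hypothetical.
tmp <- tempfile("rossi_demo_")
dir.create(tmp)

data_file   <- file.path(tmp, "Rossi.dat")
script_file <- file.path(tmp, "Rossi.R")

# A tiny whitespace-separated data file with a header row (made-up values).
writeLines(c("week arrest age",
             "20 1 27",
             "17 1 18",
             "25 1 19"), data_file)

# A script is R code; this one only defines a single object.
writeLines("n_weeks <- 52", script_file)

# read.table() parses a data file into a data frame ...
Rossi <- read.table(data_file, header = TRUE)

# ... while source() evaluates the R code stored in a script.
source(script_file, local = TRUE)

print(dim(Rossi))
print(n_weeks)
```

Calling read.table() on the .R file, or source() on the .dat file, would mix the two roles up, which seems to be the confusion in this exchange.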
Re: [R] Reading data
On Wed, Oct 28, 2009 at 1:08 PM, David Winsemius dwinsem...@comcast.net wrote:
> [...]
> Which is of no interest to us unless you named a data file incorrectly. We still would need to know HOW it was created. What were the commands? What was in it?

David,

Yes, you are right! I used Notepad to create the data, and when I added .txt to the file name it worked. Sorry for the confusion.

  Rossi <- read.table(paste("C:/Documents and Settings/Val/My Documents/R_data/prd", "Rossi.dat.txt", sep="/"), header=T)

Thanks a lot for your patience
Val
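The working-directory issue in this thread (getwd() pointing at My Documents while the data sat two folders below it) can be checked from inside R before calling read.table(). A minimal sketch, assuming a made-up directory layout in a temporary folder that only mimics the one described; none of the contents come from the real Rossi.dat:

```r
# Hypothetical layout mirroring the thread: data lives in <base>/R_data/prd.
base_dir <- tempfile("my_documents_")
data_dir <- file.path(base_dir, "R_data", "prd")
dir.create(data_dir, recursive = TRUE)
writeLines(c("x y", "1 2"), file.path(data_dir, "Rossi.dat"))

old_wd <- setwd(base_dir)   # pretend this is the working directory

# A bare file name is looked up in the working directory only.
found_bare <- file.exists("Rossi.dat")

# A relative path that spells out the subdirectories is found.
rel_path  <- file.path("R_data", "prd", "Rossi.dat")
found_rel <- file.exists(rel_path)

Rossi <- read.table(rel_path, header = TRUE)

setwd(old_wd)               # restore the previous working directory

print(c(bare = found_bare, relative = found_rel))
```

file.path() is used instead of paste(..., sep="/") purely as a matter of taste; both build the same string here.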
Re: [R] Reading data
Sometimes it is easiest to open a file using a file selection widget. I keep this in my .Rprofile:

  getOpenFile <- function(...){
    require(tcltk)
    return(tclvalue(tkgetOpenFile()))
  }

With this you can find your file and open it with

  rel <- read.table(getOpenFile(), quote="", header=FALSE, sep="", col.names=c("id","orel","nrel"))

or

  filename <- getOpenFile()
  rel <- read.table(filename, quote="", header=FALSE, sep="", col.names=c("id","orel","nrel"))

Mike

P.S. I keep a couple of functions on hand for choosing writable files and directories too...

  getSaveFile <- function(...){
    require(tcltk)
    return(tclvalue(tkgetSaveFile()))
  }

  chooseDir <- function(...){
    require(tcltk)
    return(tclvalue(tkchooseDirectory()))
  }
Re: [R] Reading data
You can use R.utils (on CRAN) to help you figure out why the file is not found or not readable.

  library(R.utils);
  pathname <- "C:/Documents and Settings/ashta/My Documents/R_data/rel.dat";
  pathname <- Arguments$getReadablePathname(pathname);
  rel <- read.table(pathname, quote="", header=FALSE, sep="", col.names=c("id","orel","nrel"));

If the file is not found it gives an error and tries to tell you why, e.g.

  Arguments$getReadablePathname("C:/Windows/system32/cmd.exe")
  [1] "C:/Windows/system32/cmd.exe"

  Arguments$getReadablePathname("C:/Windows/system323/cmd.exe")
  Error in list(`Arguments$getReadablePathname("C:/Windows/system323/cmd.exe")` = <environment>, :
  [2009-09-25 10:11:57] Exception: Pathname not found: C:/Windows/system323/cmd.exe (C:/Windows/ exists, but nothing beyond)
    at throw(Exception(...))
    at throw.default("Pathname not found: ", pathname, reason)
    at throw("Pathname not found: ", pathname, reason)
    at method(static, ...)
    at Arguments$getReadablePathname("C:/Windows/system323/cmd.exe")

It will also tell you if the file exists, but you don't have the permission to read it.

Second, your error message reports on a pathname that starts with 'file=', which I've never seen;

  cannot open file 'file=C:/Documents and Settings/sewalem/MyDocuments/R_data/rel.dat': Invalid argument

What version of R are you using, i.e. what does sessionInfo() give?

Third, it is true that backslashes need to be escaped. However, *forward-slashes* work with *any platform*. I stick with the latter so I don't have to think about it. It should make no difference in your case.

My $.02

/Henrik

On Fri, Sep 25, 2009 at 7:32 AM, Michael A. Miller mmill...@iupui.edu wrote:
> Sometimes it is easiest to open a file using a file selection widget.
> [...]
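The "C:/Windows/ exists, but nothing beyond" style of diagnosis in Henrik's message can be approximated in base R when R.utils is not installed. This is a rough sketch, not what R.utils does internally; the helper name deepest_existing and the temporary paths are hypothetical.

```r
# Return the longest leading part of `path` that actually exists on disk.
# Hypothetical helper; base R only.
deepest_existing <- function(path) {
  path <- normalizePath(path, winslash = "/", mustWork = FALSE)
  while (!file.exists(path)) {
    parent <- dirname(path)
    # Stop if dirname() no longer shortens the path (nothing exists at all).
    if (identical(parent, path)) return(NA_character_)
    path <- parent
  }
  path
}

# Usage: `base` exists, but the R_data folder below it does not.
base <- tempfile("demo_")
dir.create(base)
bad  <- file.path(base, "R_data", "rel.dat")

print(file.exists(bad))
print(deepest_existing(bad))
```

Comparing the returned directory with the full path shows at which component the path stops being valid, which is usually where the typo or the missing folder is.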
Re: [R] Reading data
On Fri, Sep 25, 2009 at 10:18 AM, Henrik Bengtsson h...@stat.berkeley.edu wrote:
> [...]
> Second, your error message reports on a pathname that starts with 'file=', which I've never seen;
>
>   cannot open file 'file=C:/Documents and Settings/sewalem/MyDocuments/R_data/rel.dat': Invalid argument
>
> What version of R are you using, i.e. what does sessionInfo() give?

Did you *really* do?

  rel <- read.table("C:/Documents and Settings/sewalem/MyDocuments/R_data/rel.dat", quote="", header=FALSE, sep="", col.names=c("id","orel","nrel"))

or did you try to do:

  rel <- read.table(file="C:/Documents and Settings/sewalem/MyDocuments/R_data/rel.dat", quote="", header=FALSE, sep="", col.names=c("id","orel","nrel"))

but wrote?

  rel <- read.table("file=C:/Documents and Settings/sewalem/MyDocuments/R_data/rel.dat", quote="", header=FALSE, sep="", col.names=c("id","orel","nrel"))

/H

> [...]
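The quotation marks in this archive have been lost, so exactly where "file=" sat relative to the quotes is an inference from the error message. Assuming that reading, the difference between naming the argument and putting "file=" inside the string can be sketched like this, using a made-up data file in a temporary location:

```r
# Hypothetical three-column file standing in for rel.dat.
data_file <- tempfile(fileext = ".dat")
writeLines(c("1 0.5 0.7", "2 0.1 0.9"), data_file)

# Correct: `file` is the argument name and the path is the string.
rel <- read.table(file = data_file, header = FALSE,
                  col.names = c("id", "orel", "nrel"))

# Mistake: "file=" is part of the string, so R looks for a file whose
# name literally begins with "file=" and cannot open it.
bad_path <- paste0("file=", data_file)
result <- tryCatch(
  suppressWarnings(read.table(bad_path, header = FALSE)),
  error = function(e) conditionMessage(e)
)

print(rel)
print(result)
```

The second call fails with a connection error, and the accompanying warning (suppressed here) quotes the path including the stray "file=" prefix, as in the message from the original post.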
Re: [R] Reading data
On 09/23/2009 10:42 PM, Ashta wrote:
> Dear R-users,
>
> I am a new user of R. I am eager to learn about it.
>
> I wanted to read and summarize a simple data file. I used the following,
>
>   rel <- read.table("C:/Documents and Settings/ashta/My Documents/R_data/rel.dat", quote="", header=FALSE, sep="", col.names=c("id","orel","nrel"))
>   summary(rel)
>
> Below is the error message,
>
>   rel <- read.table("C:/Documents and Settings/ashta/My Documents/R_data/rel.dat", quote="", header=FALSE, sep="", col.names=
>   + c("id","orel","nrel"))
>   Error in file(file, "r") : cannot open the connection
>   In addition: Warning message:
>   In file(file, "r") : cannot open file 'file=C:/Documents and Settings/sewalem/My Documents/R_data/rel.dat': Invalid argument
>   summary(rel)
>   Error in summary(rel) : object 'rel' not found
>
> Does it need a library? Where can I get the library?

Hi Ashta,

If you have checked that the file rel.dat is really there where you think it is, there is a nasty trick that Windows plays with many files. For example, if you have created this file in Notepad and saved it, you may find that .txt has been added to the filename. So the real filename is rel.dat.txt. Of course, Windows won't show you that unless you go into Folder Options in Windows Explorer and turn off the "Hide known extensions" option. This is a wild guess, but it has happened to me so often that I am wary of it.

Jim
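Jim's guess about a hidden .txt extension can be checked from R itself, since list.files() reports full file names regardless of what Windows Explorer displays. A small sketch, assuming a temporary folder and a dummy file that only simulate the situation:

```r
# Simulate Notepad silently appending ".txt" (hypothetical folder and file).
data_dir <- tempfile("R_data_")
dir.create(data_dir)
writeLines("1 0.5 0.7", file.path(data_dir, "rel.dat.txt"))

# The name we believe the file has:
wanted <- file.path(data_dir, "rel.dat")
print(file.exists(wanted))

# What is really in the folder, extensions included:
candidates <- list.files(data_dir, pattern = "^rel\\.dat", full.names = TRUE)
print(basename(candidates))
```

If the listing shows rel.dat.txt, either rename the file or pass that full name to read.table().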