Re: [R] Dataframes and text identifier columns
Hi.

Well, Case is probably a factor, which is basically a numeric vector with labels. That is useful for some operations, but it has some features that lead to this behaviour. I do not have your exact code available, but I presume you use c or cbind somewhere.

> Case <- factor(letters[1:4])
> Case
[1] a b c d
Levels: a b c d
> c(Case, 1)
[1] 1 2 3 4 1
> cbind(Case, rep(1, 4))
     Case
[1,]    1 1
[2,]    2 1
[3,]    3 1
[4,]    4 1

You can try changing Case to character with as.character(Case) before the loop.

Regards
Petr

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Brian Willis
> Sent: Thursday, July 03, 2014 12:07 PM
> To: r-help@r-project.org
> Subject: Re: [R] Dataframes and text identifier columns
>
> Thank you for the suggestion
>
> What seems to work is assigning out_put$Case <- Inp_dat$Case
> that is
>
> for(i in 1:4)
> {
> ...
> Case<- Inp_dat$Case[i]
> …
>
> out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)
>
> }
>
> out_put$Case <- Inp_dat$Case
>
> out_put
>
> What I don't understand is why I need to do this, and why, when adding rows
> to out_put[i,] within the loop, the Case column has an integer label
> assigned and not the text label.
>
> Further it seems I cannot correct this within the loop?
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Dataframes-and-text-identifier-columns-tp4693184p4693443.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
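The factor behaviour described above can be reproduced without c or cbind. The following is only a minimal sketch: x is a hypothetical stand-in for a character column such as out_put$Case, and the labels are the ones used in the thread. Assigning a factor element into a character vector keeps just the integer code, while as.character() keeps the label.

```r
# A factor stores integer codes plus a "levels" attribute holding the labels.
Case <- factor(c("Andrew", "Burt", "Charlie", "Dave"))
codes <- as.integer(Case)      # the underlying codes
labs  <- as.character(Case)    # the text labels

# x is a hypothetical stand-in for a character column like out_put$Case.
x <- character(4)
x[2] <- Case[2]                # the factor's integer code is coerced to text
x[3] <- as.character(Case[3])  # converting first keeps the label
print(x)
```

Under these assumptions the second element ends up as the code and the third as the label, which matches the integer labels seen in the Case column.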
Re: [R] Dataframes and text identifier columns
Thank you for the suggestion

What seems to work is assigning out_put$Case <- Inp_dat$Case
that is

for(i in 1:4)
{
...
Case<- Inp_dat$Case[i]
…

out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)

}

out_put$Case <- Inp_dat$Case

out_put

What I don't understand is why I need to do this, and why, when adding rows to out_put[i,] within the loop, the Case column has an integer label assigned and not the text label.

Further it seems I cannot correct this within the loop?

--
View this message in context: http://r.789695.n4.nabble.com/Dataframes-and-text-identifier-columns-tp4693184p4693443.html
Sent from the R help mailing list archive at Nabble.com.
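One possible way to correct this inside the loop is sketched here under stated assumptions: Inp_dat is a toy stand-in built with stringsAsFactors=TRUE so that Case is a factor, out_put is cut down to two columns, and stdL is a placeholder value rather than the real computation, which the post elides.

```r
# Toy stand-in for the input file; stringsAsFactors = TRUE makes Case a factor.
Inp_dat <- data.frame(Case = c("Andrew", "Burt", "Charlie", "Dave"),
                      p = c(0.01, 0.111, 0.022, 0.028),
                      stringsAsFactors = TRUE)

# Empty result frame with a character Case column (reduced to two columns).
out_put <- data.frame(Case = character(), StdL = numeric(),
                      stringsAsFactors = FALSE)

for (i in 1:4) {
  Case <- as.character(Inp_dat$Case[i])  # take the label, not the factor
  stdL <- Inp_dat$p[i] * 100             # placeholder, NOT the real computation
  # stringsAsFactors = FALSE stops data.frame() turning Case back into a factor
  out_put[i, ] <- data.frame(Case, stdL, stringsAsFactors = FALSE)
}
print(out_put)
```

Both steps seem to be needed: as.character() on the way out of Inp_dat, and stringsAsFactors = FALSE when the one-row data frame is built.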
Re: [R] Dataframes and text identifier columns
Hi

What is the problem? Some comments inline.

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Brian Willis
> Sent: Wednesday, July 02, 2014 1:33 PM
> To: r-help@r-project.org
> Subject: Re: [R] Dataframes and text identifier columns
>
> Apologies, I was trying to simplify the programme and left out the four
> input files. The files on Andrew, Burt, Charlie and Dave have the same
> format of one factor and 13 numeric variables with repeated
> measurements, e.g.
>
> Study  v1   v2   v3    v4    v5      v6    v7    v8     v9    v10   v11   v12    v13
> A      153  4.0  2.00  2.00  145.00  0.67  0.01  49.00  0.34  0.04  0.96  -3.24  0.04
> B      96   33   3.0   13.0  47.0    0.9   0.2   4.2    0.1   0.5   0.5   -0.7   -0.7
>
> Inp_dat is
>
> Case     r     p      SE      n
> Andrew   0.03  0.01   0.0004  500
> Burt     0.08  0.111  0.04    50
> Charlie  0.04  0.022  0.0005  200
> Dave     0.2   0.028  0.006   85
>
> out_put starts as an empty data frame and rows are added incrementally, one
> for Andrew, one for Burt etc.
> The code is
>
> Andrew<-read.csv("/File /Andrew.csv")
> Burt<-read.csv("/File /Burt.csv")
> Charlie<-read.csv("/File /Charlie.csv")
> Dave<-read.csv("/File /Dave.csv")
>
> Inp_dat<- read.csv("/File/Input data.csv")
>
> out_put<-data.frame(Case=character(), StdL=numeric(), StdPP=numeric(),
> StdSE=numeric(), L=numeric(), MRPP=numeric(), MRSE=numeric(),
> stringsAsFactors=FALSE)
>
> for(i in 1:4)
> {
> if (i==1) b<-Andrew
> if (i==2) b<-Burt
> if (i==3) b<-Charlie
> if (i==4) b<-Dave

^ You do not use b in your further code, so this is not necessary.

>
> pr <- Inp_dat$p[i]
> SE_pr <- Inp_dat$SE[i]
> r<- Inp_dat$r[i]
> n<- Inp_dat$n[i]
> Case<- Inp_dat$Case[i]
> …
>
> out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)
>
> }
> out_put
>
>   Case      StdL      StdPP      StdSE         L      MRPP       MRSE
> 1    1 19.466823 0.16432300 0.03137456 26.002294 0.2080145 0.03804692
> 2    2  2.334130 0.22566939 0.08962662  5.095703 0.3888451 0.08399101
> 3    3  2.588678 0.05502765 0.00454159 42.058326 0.4861511 0.02128030
> 4    4  7.857898 0.18457822 0.04372297  4.705487 0.1193687 0.01921609
>
> The Cases are labelled as integers 1 corresponding to Andrew, 2
> corresponding to Burt etc. instead of the intended text labels Andrew,
> Burt, Charlie and Dave.

If you want to change Case to labels, just use

out_put$Case <- factor(out_put$Case, labels = levels(Inp_dat$Case))

Regards
Petr
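The relabelling idea can be tried on toy data. This is only a sketch, assuming the intent is factor(x, labels = levels(...)); codes and Case_in are hypothetical stand-ins for out_put$Case and Inp_dat$Case.

```r
# Stand-ins: the codes that ended up in out_put$Case, and the input factor.
codes   <- c("1", "2", "3", "4")
Case_in <- factor(c("Andrew", "Burt", "Charlie", "Dave"))

# Relabel: the sorted unique codes "1".."4" are replaced by the level names.
relabelled <- factor(codes, labels = levels(Case_in))

# Alternative: use each code as an index into the levels.
via_index <- levels(Case_in)[as.integer(codes)]

print(as.character(relabelled))
print(via_index)
```

The first form relies on every code being present and on the text sort order of the codes matching the level order, which may stop holding with ten or more levels ("10" sorts before "2"). The indexing form does not depend on that.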
Re: [R] Dataframes and text identifier columns
Apologies, I was trying to simplify the programme and left out the four input files. The files on Andrew, Burt, Charlie and Dave have the same format of one factor and 13 numeric variables with repeated measurements, e.g.

Study  v1   v2   v3    v4    v5      v6    v7    v8     v9    v10   v11   v12    v13
A      153  4.0  2.00  2.00  145.00  0.67  0.01  49.00  0.34  0.04  0.96  -3.24  0.04
B      96   33   3.0   13.0  47.0    0.9   0.2   4.2    0.1   0.5   0.5   -0.7   -0.7

Inp_dat is

Case     r     p      SE      n
Andrew   0.03  0.01   0.0004  500
Burt     0.08  0.111  0.04    50
Charlie  0.04  0.022  0.0005  200
Dave     0.2   0.028  0.006   85

out_put starts as an empty data frame and rows are added incrementally, one for Andrew, one for Burt etc.
The code is

Andrew<-read.csv("/File /Andrew.csv")
Burt<-read.csv("/File /Burt.csv")
Charlie<-read.csv("/File /Charlie.csv")
Dave<-read.csv("/File /Dave.csv")

Inp_dat<- read.csv("/File/Input data.csv")

out_put<-data.frame(Case=character(), StdL=numeric(), StdPP=numeric(),
StdSE=numeric(), L=numeric(), MRPP=numeric(), MRSE=numeric(),
stringsAsFactors=FALSE)

for(i in 1:4)
{
if (i==1) b<-Andrew
if (i==2) b<-Burt
if (i==3) b<-Charlie
if (i==4) b<-Dave

pr <- Inp_dat$p[i]
SE_pr <- Inp_dat$SE[i]
r<- Inp_dat$r[i]
n<- Inp_dat$n[i]
Case<- Inp_dat$Case[i]
…

out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)

}
out_put

  Case      StdL      StdPP      StdSE         L      MRPP       MRSE
1    1 19.466823 0.16432300 0.03137456 26.002294 0.2080145 0.03804692
2    2  2.334130 0.22566939 0.08962662  5.095703 0.3888451 0.08399101
3    3  2.588678 0.05502765 0.00454159 42.058326 0.4861511 0.02128030
4    4  7.857898 0.18457822 0.04372297  4.705487 0.1193687 0.01921609

The Cases are labelled as integers, 1 corresponding to Andrew, 2 corresponding to Burt etc., instead of the intended text labels Andrew, Burt, Charlie and Dave.

Note all other columns are correct. Furthermore

str(Case)
 Factor w/ 4 levels "Andrew","Burt",..: 4

str(out_put)
'data.frame': 4 obs. of 7 variables:
 $ Case : chr "1" "2" "3" "4"
 $ StdL : num 19.47 2.33 2.59 7.86
 etc

I have tried changing the line

Case<- Inp_dat$Case[i]

to

Case<- levels(Inp_dat$Case)[i]

and this gives the following output

  Case      StdL      StdPP      StdSE         L      MRPP       MRSE
1    1 19.466823 0.16432300 0.03137456 26.002294 0.2080145 0.03804692
2    1  2.334130 0.22566939 0.08962662  5.095703 0.3888451 0.08399101
3    1  2.588678 0.05502765 0.00454159 42.058326 0.4861511 0.02128030
4    1  7.857898 0.18457822 0.04372297  4.705487 0.1193687 0.01921609

str(Case)
 chr "Dave"

and

str(out_put)
'data.frame': 4 obs. of 7 variables:
 $ Case : chr "1" "1" "1" "1"
 $ StdL : num 19.47 2.33 2.59 7.86
 etc

I’ve also tried adding, as suggested, stringsAsFactors=FALSE to the read.csv call:

Inp_dat<- read.csv("/File/Input data.csv", stringsAsFactors=FALSE)

This gives the same as the 2nd output above.

--
View this message in context: http://r.789695.n4.nabble.com/Dataframes-and-text-identifier-columns-tp4693184p4693389.html
Sent from the R help mailing list archive at Nabble.com.
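A likely explanation for the second output, where every Case is "1" even though str(Case) shows chr "Dave", is that data.frame() itself converts the string to a one-level factor when strings are treated as factors (the default before R 4.0). The sketch below sets stringsAsFactors = TRUE explicitly to mimic that default; row and x are hypothetical names and 7.86 is just the rounded StdL value from the post.

```r
Case <- "Dave"   # a plain character value, as in the second attempt

# With strings treated as factors, Case becomes a factor with a single level.
row <- data.frame(Case, stdL = 7.86, stringsAsFactors = TRUE)
str(row$Case)

# Writing that factor into a character column stores its code, always 1.
x <- character(1)
x[1] <- row$Case
print(x)
```

Because each one-row data frame has its own single-level factor, the code is 1 on every pass through the loop, whichever name was supplied.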
Re: [R] Dataframes and text identifier columns
Please stay on the list; I am returning this to the list.

David's comment that b is never used might be relevant.

I think the next step is for you to post a small dataset that causes the problem; dput(head(Inp_dat)) might be enough.

Rich

On Sun, Jun 29, 2014 at 3:11 PM, Brian Willis wrote:
> Hi
>
> This does not work either. Comes up with same result
>
> Regards
>
> Brian
>
> **
>
> Dr Brian H Willis
> NIHR Clinical Lecturer
> Primary Care Clinical Sciences
> University of Birmingham
> Edgbaston
> Birmingham. B15 2TT
> tel +44 121 414 7979
> b.h.wil...@bham.ac.uk
> http://www.birmingham.ac.uk/brian-willis
>
> From: Richard M. Heiberger [r...@temple.edu]
> Sent: 29 June 2014 18:34
> To: Brian Willis
> Cc: r-help
> Subject: Re: [R] Dataframes and text identifier columns
>
> Inp_dat<- read.csv("/File/Input data.csv")
> out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)
>
> These two statements look like the source of the problem.
> Add the optional argument
> stringsAsFactors=FALSE
> to both.
> Rich
Re: [R] Dataframes and text identifier columns
Inp_dat<- read.csv("/File/Input data.csv")
out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)

These two statements look like the source of the problem.
Add the optional argument
stringsAsFactors=FALSE
to both.

Rich

On Sun, Jun 29, 2014 at 8:46 AM, Brian Willis wrote:
> I am trying to incrementally add rows to an empty data frame. The data has 7
> columns, the last 6 are numeric.
>
> For the first column I would like to include a text identifier, called
> ‘Case’ like Andrew, Burt, Charlie etc. that is also output to a data frame –
> this is where I am having the problem
>
> The identifiers are read in from a file, which automatically assigns them
> to a data frame
> Code:
>
> Inp_dat<- read.csv("/File/Input data.csv")
>
> out_put<-data.frame(Case=character(), StdL=numeric(), StdPP=numeric(),
> StdSE=numeric(), L=numeric(), MRPP=numeric(), MRSE=numeric(),
> stringsAsFactors=FALSE)
>
> for(i in 1:4)
> {
> if (i==1) b<-Andrew
> if (i==2) b<-Burt
> if (i==3) b<-Charlie
> if (i==4) b<-Dave
>
> pr <- Inp_dat$p[i]
> SE_pr <- Inp_dat$SE[i]
> r<- Inp_dat$r[i]
> n<- Inp_dat$n[i]
> Case<- levels(Inp_dat$Case)[i]
> …
>
> out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)
>
> }
> out_put
>
>   Case      StdL      StdPP      StdSE         L      MRPP       MRSE
> 1    1 19.466823 0.16432300 0.03137456 26.002294 0.2080145 0.03804692
> 2    1  2.334130 0.22566939 0.08962662  5.095703 0.3888451 0.08399101
> 3    1  2.588678 0.05502765 0.00454159 42.058326 0.4861511 0.02128030
> 4    1  7.857898 0.18457822 0.04372297  4.705487 0.1193687 0.01921609
>
> Unfortunately the Case column loses the labels Andrew, Burt …
>
> I’ve tried all ways to keep these, but am clearly doing something wrong. Can
> anyone help?
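A small sketch of the stringsAsFactors=FALSE suggestion, under assumptions: the CSV text is an inline toy version of the input file with only the Case and p columns, and 2.33 is a placeholder value. One side effect worth checking is that once Case is read as character, levels(Inp_dat$Case) is NULL, so a line such as Case <- levels(Inp_dat$Case)[i] would need to become Case <- Inp_dat$Case[i].

```r
# Inline toy version of the input file (only two of its columns).
csv_text <- "Case,p\nAndrew,0.01\nBurt,0.111\nCharlie,0.022\nDave,0.028"

# Reading side: keep Case as character rather than factor.
Inp_dat <- read.csv(text = csv_text, stringsAsFactors = FALSE)
case_is_char <- is.character(Inp_dat$Case)
no_levels    <- is.null(levels(Inp_dat$Case))  # a character vector has no levels

# Writing side: keep the one-row data frame's Case as character too.
row <- data.frame(Case = Inp_dat$Case[2], stdL = 2.33,  # 2.33 is a placeholder
                  stringsAsFactors = FALSE)
str(row)
```

Setting the argument in only one of the two places, or keeping the levels() lookup, could plausibly still give the outputs reported elsewhere in the thread.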
Re: [R] Dataframes and text identifier columns
On Jun 29, 2014, at 5:46 AM, Brian Willis wrote:

> I am trying to incrementally add rows to an empty data frame. The data has 7
> columns, the last 6 are numeric.
>
> For the first column I would like to include a text identifier, called
> ‘Case’ like Andrew, Burt, Charlie etc. that is also output to a data frame –
> this is where I am having the problem
>
> The identifiers are read in from a file, which automatically assigns them
> to a data frame
> Code:
>
> Inp_dat<- read.csv("/File/Input data.csv")
>
> out_put<-data.frame(Case=character(), StdL=numeric(), StdPP=numeric(),
> StdSE=numeric(), L=numeric(), MRPP=numeric(), MRSE=numeric(),
> stringsAsFactors=FALSE)
>
> for(i in 1:4)
> {
> if (i==1) b<-Andrew

# These assignments do not appear to ever get used.

> if (i==2) b<-Burt
> if (i==3) b<-Charlie
> if (i==4) b<-Dave
>
> pr <- Inp_dat$p[i]
> SE_pr <- Inp_dat$SE[i]
> r<- Inp_dat$r[i]
> n<- Inp_dat$n[i]
> Case<- levels(Inp_dat$Case)[i]

# Since the question involves failure of this step, there wouldn't appear to be much we could say unless you provided the output of levels(Inp_dat$Case).

> …
>
> out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)

The values stdL through PP_SE seem to come from someplace else.

> }
> out_put
>
>   Case      StdL      StdPP      StdSE         L      MRPP       MRSE
> 1    1 19.466823 0.16432300 0.03137456 26.002294 0.2080145 0.03804692
> 2    1  2.334130 0.22566939 0.08962662  5.095703 0.3888451 0.08399101
> 3    1  2.588678 0.05502765 0.00454159 42.058326 0.4861511 0.02128030
> 4    1  7.857898 0.18457822 0.04372297  4.705487 0.1193687 0.01921609
>
> Unfortunately the Case column loses the labels Andrew, Burt …

If you were thinking that the values Andrew and Burt would come from that loop, then it's unclear why that would be so, since the value of "b" is never used after it is defined.

> I’ve tried all ways to keep these, but am clearly doing something wrong. Can
> anyone help?

David Winsemius, MD
Alameda, CA, USA
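If b is meant to select the per-case data, one alternative to the chain of if statements is a named list indexed by case name, with the rows collected and bound once at the end. This is a different technique from the one in the post, sketched under assumptions: studies holds two tiny made-up data frames in place of the CSV files, and mean(b$v1) is a placeholder for the real statistic.

```r
# Hypothetical stand-ins for the per-case files read with read.csv().
studies <- list(
  Andrew = data.frame(v1 = c(153, 96)),
  Burt   = data.frame(v1 = c(10, 20))
)

# Build one single-row data frame per case, keeping Case as character.
rows <- lapply(names(studies), function(nm) {
  b <- studies[[nm]]                  # b is actually used here
  data.frame(Case = nm,
             StdL = mean(b$v1),       # placeholder, NOT the real computation
             stringsAsFactors = FALSE)
})

# Bind all rows in one step instead of growing out_put inside a loop.
out_put <- do.call(rbind, rows)
print(out_put)
```

Since the case name travels with its data, the label cannot drift out of step with the row it describes.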
[R] Dataframes and text identifier columns
I am trying to incrementally add rows to an empty data frame. The data has 7 columns, the last 6 are numeric.

For the first column I would like to include a text identifier, called ‘Case’ like Andrew, Burt, Charlie etc. that is also output to a data frame – this is where I am having the problem

The identifiers are read in from a file, which automatically assigns them to a data frame
Code:

Inp_dat<- read.csv("/File/Input data.csv")

out_put<-data.frame(Case=character(), StdL=numeric(), StdPP=numeric(),
StdSE=numeric(), L=numeric(), MRPP=numeric(), MRSE=numeric(),
stringsAsFactors=FALSE)

for(i in 1:4)
{
if (i==1) b<-Andrew
if (i==2) b<-Burt
if (i==3) b<-Charlie
if (i==4) b<-Dave

pr <- Inp_dat$p[i]
SE_pr <- Inp_dat$SE[i]
r<- Inp_dat$r[i]
n<- Inp_dat$n[i]
Case<- levels(Inp_dat$Case)[i]
…

out_put[i,]<-data.frame(Case, stdL, stdPP, stdSE, L, PP, PP_SE)

}
out_put

  Case      StdL      StdPP      StdSE         L      MRPP       MRSE
1    1 19.466823 0.16432300 0.03137456 26.002294 0.2080145 0.03804692
2    1  2.334130 0.22566939 0.08962662  5.095703 0.3888451 0.08399101
3    1  2.588678 0.05502765 0.00454159 42.058326 0.4861511 0.02128030
4    1  7.857898 0.18457822 0.04372297  4.705487 0.1193687 0.01921609

Unfortunately the Case column loses the labels Andrew, Burt …

I’ve tried all ways to keep these, but am clearly doing something wrong. Can anyone help?

--
View this message in context: http://r.789695.n4.nabble.com/Dataframes-and-text-identifier-columns-tp4693184.html
Sent from the R help mailing list archive at Nabble.com.