[R] counting frequencies across two columns
I've got a data frame describing comments on an electronic journal, wherein each row is a unique comment, like so: commentID author articleID 1 1 smith 2 2 2 jones 3 3 3 andrews 2 4 4 jones 1 5 5 johnson 3 6 6 smith 2 I want know the number of unique authors per article. I can get a table of article frequencies with table(articleID), but I can't figure out how to count frequencies in a different column. I'm sure there's an easy way, but I guess I'm too new at this to find it. Thanks for your help! Jason Priem PhD student, School of Information and Library Science, University of North Carolina-Chapel Hill __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] counting frequencies across two columns
Hi Jason, As your example is not reproducible, may be something like: myFreq-data.frame(table(articleID, author)) if you want to know only those articles with 1 author, you can try subset(myFreq, Freq==1) or something like. bests milton On Sun, Nov 1, 2009 at 2:20 AM, Jason Priem pr...@email.unc.edu wrote: I've got a data frame describing comments on an electronic journal, wherein each row is a unique comment, like so: commentID author articleID 1 1 smith 2 2 2 jones 3 3 3 andrews 2 4 4 jones 1 5 5 johnson 3 6 6 smith 2 I want know the number of unique authors per article. I can get a table of article frequencies with table(articleID), but I can't figure out how to count frequencies in a different column. I'm sure there's an easy way, but I guess I'm too new at this to find it. Thanks for your help! Jason Priem PhD student, School of Information and Library Science, University of North Carolina-Chapel Hill __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] counting frequencies across two columns
On Sun, 01-Nov-2009 at 01:20AM -0500, Jason Priem wrote: I've got a data frame describing comments on an electronic journal, wherein each row is a unique comment, like so: commentID author articleID 1 1 smith 2 2 2 jones 3 3 3 andrews 2 4 4 jones 1 5 5 johnson 3 6 6 smith 2 Let's call that dataframe x I want know the number of unique authors per article. I can get a table of article frequencies with table(articleID), but I can't figure out how to count frequencies in a different column. I'm sure there's an easy way, but I guess I'm too new at this to find it. I'm not clear what you require, but maybe it's this: with(x, table(articleID, author)) articleID andrews johnson jones smith 1 0 0 1 0 2 1 0 0 2 3 0 1 1 0 Is that anything like what you're after? -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) . Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] wilcox.test construction in r
Hi, I am very confused with constructing the wilcox.test in R. I have two populations 'original' and 'test'. I want to know if the 'test' is generally 'lower' than original. I use alpha of 0.05. So do I write the function as wilcox.test(original, test, alternative=l)? or wlcox.test(original, test, alternative = g)? or wilcox.test(test, original, alternative=g)? or wilcox.test(test, original, alternative=l)? How do I interpret the p-value given my criteria? Do I reject null when p-value less than 0.05? or greater than 0.95? Not a statistics major here so I'm really confused. Need some help. Thanks. -- View this message in context: http://old.nabble.com/wilcox.test-construction-in-r-tp26148779p26148779.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] wilcox.test construction in r
On Sun, 1 Nov 2009 00:47:50 -0700 (PDT) jomni jom...@gmail.com wrote: J So do I write the function as wilcox.test(original, test, J alternative=l)? or wlcox.test(original, test, alternative = g)? J or wilcox.test(test, original, alternative=g)? J or wilcox.test(test, original, alternative=l)? J How do I interpret the p-value given my criteria? J Do I reject null when p-value less than 0.05? J or greater than 0.95? The interpretation of the p depends on how you have tested the hypothesis. J Not a statistics major here so I'm really confused. You don't need to be that but please read the documentation and try the given examples in the documentation. If you would have typed example(wilcox.test) you would have seen for example: wlcx.t ## Two-sample test. wlcx.t ## Hollander Wolfe (1973), 69f. wlcx.t ## Permeability constants of the human chorioamnion (a placental wlcx.t ## membrane) at term (x) and between 12 to 26 weeks gestational wlcx.t ## age (y). The alternative of interest is greater permeability wlcx.t ## of the human chorioamnion for the term pregnancy. wlcx.t x - c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46) wlcx.t y - c(1.15, 0.88, 0.90, 0.74, 1.21) wlcx.t wilcox.test(x, y, alternative = g)# greater Wilcoxon rank sum test data: x and y W = 35, p-value = 0.1272 alternative hypothesis: true location shift is greater than 0 This I think makes it very easy to interprete. Here it is tested as the text says whether x is greater than y. So if you want to test the hypothesis that x is smaller than y so you do wilcox.test(x,y,alternative=less) then the lower your p is the higher is the probability that the samples are different. hence p0.05 would match your confidence level. Now the surprising news: wilcox.test(y,x,alternative=greater) would work as well! If you are in doubt create an x and an y where you are sure that x is smaller than y. One final remark: if you have ties (several identical values in one sample) you should use wilcox_test of the coin package. hth Stefan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] re ct.hclust and horizontal dendrograms
As from subject: is it possible to use the rect.hclust (or something equivalent) with horizontal dendrograms? thanks nico -- View this message in context: http://old.nabble.com/rect.hclust-and-horizontal-dendrograms-tp26148886p26148886.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] convert list to Dataframe
Hi. I have a huge list called twitter: dim(twitter) NULL str(twitter) List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535 12210;10:47:37;20;10;2009;David_Stringer;William Hague heading Washington meets Gen. Jim Jones, Sen. John McCain others. Will Obama team raise worries EU ties?;London, England;United Kingdom;Greater London;Westminster;;51.5001524;-0.1262362 12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses wearing thin Obama, media pals... http://tinyurl.com/yfw6cd9;So. California;USA;CA;;;36.778261;-119.4179324 12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama Afghanistan troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama #video;USA;USA37.09024;-95.712891 ... .. ..- attr(*, Author)= chr(0) .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..- attr(*, Description)= chr(0) .. ..- attr(*, Heading)= chr(0) .. ..- attr(*, ID)= chr 1 .. ..- attr(*, Language)= chr en .. ..- attr(*, LocalMetaData)= list() .. ..- attr(*, Origin)= chr(0) - attr(*, CMetaData)=List of 3 ..$ NodeID : num 0 ..$ MetaData:List of 2 .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..$ creator: Named chr .. .. ..- attr(*, names)= chr LOGNAME ..$ Children: NULL ..- attr(*, class)= chr MetaDataNode - attr(*, DMetaData)='data.frame': 1 obs. of 1 variable: ..$ MetaID: num 0 - attr(*, class)= chr [1:3] VCorpus Corpus list It contains tweets but in many languages. The columns are separated by semi-colons. I am using the tm package and it is a corpus. It looks like this: 547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1 day :p;Huddersfield/Lincoln;United Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296 547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro computador da facool? BOM DIA.;Belo Horizonte - MG - BR;Brazil;MG;;;-19.8157306;-43.9542226 547284;06:37:17;21;10;2009;romanotr;Вау, Репортеры без границ опубликовали список стран со свободой слова, из 173 Грузия на 81 месте опережая Украину. Успехи,успехи...;Portugal Aveiro;Portugal;Aveiro;;;40.6411848;-8.6536169 547285;06:37:18;21;10;2009;Y_T_;Playing: Beth Orton lt\;Someone's Daughtergt\;;Kanazawa, Japan;Japan;Ishikawa Prefecture;;;36.5613254;136.6562051 Error: invalid input '547286;06:37:18;21;10;2009;Atogey;支æŒä½ ,国家需è¦ä»–们,但是国家的未æ¥ä¸èƒ½é 他们…RT @zuola ￿我觉得 @wenyunc I want to convert it to fields or columns and so I thought I should convert it to a dataframe. I tried twitterDF-as.data.frame(twitter) Error in sort.list(y) : invalid input '547286;06:37:18;21;10;2009;Atogey;支æŒä½ ,国家需è¦ä»–们,但是国家的未æ¥ä¸èƒ½é 他们…RT @zuola ￿我觉得 @wenyunchao 一点都ä¸ä¹è§‚。真æ£çš„ä¹è§‚åº”è¯¥æ˜¯ï¼šä½ å…³æˆ‘åˆæ€Žä¹ˆæ ·ï¼Œåæ£æ”¿æ²»æ–—争ä¸ä¼šä¸¢æŽ‰æ€§å‘½ï¼Œè€å出æ¥åŽæ›´æ˜¯ä¸€æ¡å¥½æ±‰ã€‚北风还是èˆä¸å¾—*霸地ä½ã€è‚‰ã€ä¹¦ã€å¥³äººå’Œç½‘络的,ä¸è¿‡ç‰¢é‡Œä¸ä¼šæ供这些。å¦â€¦;山西,浙江;China;Zhejiang;;;28.695035;119.751054' in 'utf8towcs' Can anyone suggest what I can do? P.S. Actually, I would love to remove all the non-English tweets but I have no clue about how to do that. -- View this message in context: http://old.nabble.com/convert-list-to-Dataframe-tp26148889p26148889.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to properly shade the background panels of an xyplot?
On Fri, Oct 30, 2009 at 8:33 PM, Ottorino-Luca Pantani ottorino-luca.pant...@unifi.it wrote: Dear R users, this is a follow up of this message http://tolstoy.newcastle.edu.au/R/e6/help/09/05/13897.html I'm reproducing the core of it for convenience. // / data(Oats, package = MEMSS) / / tp1.oats - xyplot(yield ~ nitro | Variety + Block, / / data = Oats, / / panel = function(x, y, subscripts, ...) { / / # How to normalize my heatmap metric with / / # the value of the panel that has maximum average ? / / # metric = eval(mean(y)/ max(mean-of-each-panel) / / metric = eval(mean(y)/max(Oats$yield)) / / panel.fill(col = gray(metric)) / / panel.lines(x,y) / / } / / ) / / print(tp1.oats) / // xyplot(yield ~ nitro | Variety + Block, data = Oats, max.mean = max(with(Oats, tapply(yield, list(Variety, Block), mean))), panel = function(x, y, subscripts, max.mean, ...) { metric = mean(y)/max.mean panel.fill(col = gray(metric)) panel.lines(x,y) }) or xyplot(yield ~ nitro | Variety + Block, data = Oats, aux.env = new.env(parent = emptyenv()), prepanel = function(x, y, aux.env, ...) { aux.env$max.mean.y - max(aux.env$max.mean.y, mean(y)) list() }, panel = function(x, y, subscripts, aux.env, ...) { metric = mean(y) / aux.env$max.mean.y panel.fill(col = gray(metric)) panel.lines(x,y) }) -Deepayan The result is a trellis object in which the background colour of the panels is an outcome of the data contained in the panel itself. After all, this is what panel = function (x,y . is meant for, right ? But what, if I want to highlight some panels ? Arbitrarily or conditioned by another variable. Say I want to shade in gray only the upper right panels (Block VI, Victory and Marvellous varieties ) See ?which.packet, which will tell you the current levels of the conditioning variables. So something like panel = function(x, y, subscripts, aux.env, ...) { wp - which.packet() if (levels(Oats$Variety)[wp[1]] %in% c(Victory, Marvellous) || ...) panel.fill(col = gray(metric)) panel.lines(x,y) }) -Deepayan Given a data frame like this, with a variable intended to set the colour of the panel background / data(Oats, package = MEMSS) /Oats1 - cbind.data.frame(Oats, Highlight = ifelse(Oats$Block == VI Oats$Variety %in% c(Victory, Marvellous ), gray, transparent) ) which is a possible code ? I (more or less) know how to manage the data in the panel, but I cannot imagine how to do it with variables external to the panel itself. I suppose that the panel functions are not useful here. I'm wandering through par.settings, themes and panel.fill, but still without success. Any hint ? -- Ottorino-Luca Pantani, Università di Firenze Dip. Scienza del Suolo e Nutrizione della Pianta P.zle Cascine 28 50144 Firenze Italia Ubuntu 8.04.3 LTS -- GNU Emacs 23.0.60.1 (x86_64-pc-linux-gnu, GTK+ Version 2.12.9) ESS version 5.5 -- R 2.9.2 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] counting frequencies across two columns
On Nov 1, 2009, at 1:59 AM, Patrick Connolly wrote: On Sun, 01-Nov-2009 at 01:20AM -0500, Jason Priem wrote: I've got a data frame describing comments on an electronic journal, wherein each row is a unique comment, like so: commentID author articleID 1 1 smith 2 2 2 jones 3 3 3 andrews 2 4 4 jones 1 5 5 johnson 3 6 6 smith 2 Let's call that dataframe x I want know the number of unique authors per article. I can get a table of article frequencies with table(articleID), but I can't figure out how to count frequencies in a different column. I'm sure there's an easy way, but I guess I'm too new at this to find it. I'm not clear what you require, but maybe it's this: with(x, table(articleID, author)) articleID andrews johnson jones smith 1 0 0 1 0 2 1 0 0 2 3 0 1 1 0 Is that anything like what you're after? You've had two guesses so far and my guess increments the count. Were you attempting to specify this? df1 - read.table(textConnection(commentID author articleID 1 1 smith 2 2 2 jones 3 3 3 andrews 2 4 4 jones 1 5 5 johnson 3 6 6 smith 2), header=T) lapply( lapply(tapply(df1$author, df1$articleID, I), unique) , length) $`1` [1] 1 $`2` [1] 2 $`3` [1] 2 Or delivered in matrix form (and using Connolly's approach as intermediate: apply( with(df1, table(articleID, author)), 1, function(x) sum(x0) ) 1 2 3 1 2 2 -- ~ .~ .~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] convert list to Dataframe
Three suggestions: -- drop the idea of using a dataframe. It's only appropriate when the data is rectangular. -- look at strsplit for separating at @ characters. -- post the output of dput() on your sample, since email is probably not capable of rendering this data without creating distortions. -- David On Nov 1, 2009, at 7:43 AM, onyourmark wrote: Hi. I have a huge list called twitter: dim(twitter) NULL str(twitter) This looks to have been converted into an R object through soe process on some unspecified input. You should describe that process, and hte only unambiguous method of doing so is by including the code. List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535 12210;10:47:37;20;10;2009;David_Stringer;William Hague heading Washington meets Gen. Jim Jones, Sen. John McCain others. Will Obama team raise worries EU ties?;London, England;United Kingdom;Greater London;Westminster;;51.5001524;-0.1262362 12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses wearing thin Obama, media pals... http://tinyurl.com/yfw6cd9;So. California;USA;CA;;;36.778261;-119.4179324 12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama Afghanistan troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama #video;USA;USA37.09024;-95.712891 ... .. ..- attr(*, Author)= chr(0) .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..- attr(*, Description)= chr(0) .. ..- attr(*, Heading)= chr(0) .. ..- attr(*, ID)= chr 1 .. ..- attr(*, Language)= chr en .. ..- attr(*, LocalMetaData)= list() .. ..- attr(*, Origin)= chr(0) - attr(*, CMetaData)=List of 3 ..$ NodeID : num 0 ..$ MetaData:List of 2 .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..$ creator: Named chr .. .. ..- attr(*, names)= chr LOGNAME ..$ Children: NULL ..- attr(*, class)= chr MetaDataNode - attr(*, DMetaData)='data.frame': 1 obs. of 1 variable: ..$ MetaID: num 0 - attr(*, class)= chr [1:3] VCorpus Corpus list It contains tweets but in many languages. The columns are separated by semi-colons. I am using the tm package and it is a corpus. It looks like this: It is difficult to see any connection with what you have above. 547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1 day :p;Huddersfield/Lincoln;United Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296 547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro computador da facool? BOM DIA.;Belo Horizonte - MG - BR;Brazil;MG;;;-19.8157306;-43.9542226 547284;06:37:17;21;10;2009;romanotr;Вау, Репортеры без границ опубликовали список стран со свободой слова, из 173 Грузия на 81 месте опережая Украину. Успехи,успехи...;Portugal Aveiro;Portugal;Aveiro;;; 40.6411848;-8.6536169 547285;06:37:18;21;10;2009;Y_T_;Playing: Beth Orton lt\;Someone's Daughtergt\;;Kanazawa, Japan;Japan;Ishikawa Prefecture;;;36.5613254;136.6562051 Error: invalid input '547286;06:37:18;21;10;2009;Atogey;支æŒä½ ,国家需è¦ä»– ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ ¥ä¸èƒ½é 他们…RT @zuola ￿我觉得 @wenyunc I want to convert it to fields or columns and so I thought I should convert it to a dataframe. I tried twitterDF-as.data.frame(twitter) Error in sort.list(y) : invalid input '547286;06:37:18;21;10;2009;Atogey;支æŒä½ ,国家需è¦ä»– ä»¬ï¼Œä½†æ˜¯å›½å®¶çš„æœªæ ¥ä¸èƒ½é 他们…RT @zuola ￿我觉得 @wenyunchao 一点都ä¸ä¹è§‚ã€‚çœŸæ £çš„ä¹è§‚åº”è¯¥æ˜¯ï¼šä½ å… ³æˆ‘åˆæ€Žä¹ˆæ ·ï¼Œåæ £æ”¿æ²»æ– —争ä¸ä¼šä¸¢æŽ‰æ€§å‘½ï¼Œè€å 出æ¥åŽæ›´æ˜¯ä¸€æ¡å¥½æ±‰ã€ ‚北风还是èˆä¸å¾—*霸地ä½ã €è‚‰ã€ä¹¦ã€å ¥³äººå’Œç½‘络的,ä¸è¿‡ç‰ ¢é‡Œä¸ä¼šæ供这些。å¦â €¦;山西,浙江;China;Zhejiang;;; 28.695035;119.751054' in 'utf8towcs' Can anyone suggest what I can do? P.S. Actually, I would love to remove all the non-English tweets but I have no clue about how to do that. -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] convert list to Dataframe
Hello. The fields are separated by a ';'. I think that the data is rectangular in the sense that there are about 15 fields for each row. Some of the fields are empty. In the dput() display below, it seems that the rows are delimited by ' ' . Any idea from this? Here is the end of the output for dput(twitter) 4927861;05:04:14;28;10;2009;HOYTSTHEATRES;GameStop Brings 15K Manage Holiday Rush [Black Friday] http://bit.ly/2d3OJg;Australia;Australia-25.274398;133.775136;, 4927863;05:04:14;28;10;2009;padden;Rachel master chef cook anytime!;Sydney, Australia;Australia;NSW;;;-33.867139;151.207114, 4927878;05:04:17;28;10;2009;GSpotMagazine;The penalty success bored attentions people formerly snubbed you. -Mary Wilson Little #quote;UK;United Kingdom55.378051;-3.435973, 4927885;05:04:20;28;10;2009;super_assassin;@triplejsr flight conchords, pleaaase :) thanks rosie xx;Australia;Australia-25.274398;133.775136, 4927893;05:04:21;28;10;2009;SLMFE;Gestern:Achso,ja okey,um 5 nach las ich jemanden komen der dir die Akupunkturnadel(zb 5!im Ohr!)entfernt..Um 10 n. kommt immer noch keiner..;Germany;Germany51.165691;10.451526, 4927901;05:04:23;28;10;2009;mikesemple;HHS Secretary pushes health care reform rural America: By Christopher Smart The health-care crisis .. http://bit.ly/49Iqcu;London;United Kingdom;Greater London;Westminster;;51.5001524;-0.1262362, 4927913;05:04:26;28;10;2009;coax_k;Facebook Headquarters Studio O+A: San Francisco based interior design firm Studio O+A designed .. http://bit.ly/hdqWp;Sydney;Australia;NSW;;;-33.867139;151.207114; ), Author = character(0), DateTimeStamp = structure(list(sec = 56.404713898, min = 46L, hour = 4L, mday = 31L, mon = 9L, year = 109L, wday = 6L, yday = 303L, isdst = 0L), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt), tzone = GMT), Description = character(0), Heading = character(0), ID = 1, Language = en, LocalMetaData = list(), Origin = character(0), class = c(PlainTextDocument, TextDocument, character))), CMetaData = structure(list(NodeID = 0, MetaData = structure(list(create_date = structure(list(sec = 56.4059998989105, min = 46L, hour = 4L, mday = 31L, mon = 9L, year = 109L, wday = 6L, yday = 303L, isdst = 0L), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst ), class = c(POSIXt, POSIXlt), tzone = GMT), creator = structure(, .Names = LOGNAME)), .Names = c(create_date, creator)), Children = NULL), .Names = c(NodeID, MetaData, Children), class = MetaDataNode), DMetaData = structure(list( MetaID = 0), .Names = MetaID, row.names = c(NA, -1L), class = data.frame), class = c(VCorpus, Corpus, list)) onyourmark wrote: Hi. I have a huge list called twitter: dim(twitter) NULL str(twitter) List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535 12210;10:47:37;20;10;2009;David_Stringer;William Hague heading Washington meets Gen. Jim Jones, Sen. John McCain others. Will Obama team raise worries EU ties?;London, England;United Kingdom;Greater London;Westminster;;51.5001524;-0.1262362 12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses wearing thin Obama, media pals... http://tinyurl.com/yfw6cd9;So. California;USA;CA;;;36.778261;-119.4179324 12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama Afghanistan troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama #video;USA;USA37.09024;-95.712891 ... .. ..- attr(*, Author)= chr(0) .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..- attr(*, Description)= chr(0) .. ..- attr(*, Heading)= chr(0) .. ..- attr(*, ID)= chr 1 .. ..- attr(*, Language)= chr en .. ..- attr(*, LocalMetaData)= list() .. ..- attr(*, Origin)= chr(0) - attr(*, CMetaData)=List of 3 ..$ NodeID : num 0 ..$ MetaData:List of 2 .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..$ creator: Named chr .. .. ..- attr(*, names)= chr LOGNAME ..$ Children: NULL ..- attr(*, class)= chr MetaDataNode - attr(*, DMetaData)='data.frame': 1 obs. of 1 variable: ..$ MetaID: num 0 - attr(*, class)= chr [1:3] VCorpus Corpus list It contains tweets but in many languages. The columns are separated by semi-colons. I am using the tm package and it is a corpus. It looks like this: 547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1 day :p;Huddersfield/Lincoln;United Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296 547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro computador da facool? BOM DIA.;Belo Horizonte - MG - BR;Brazil;MG;;;-19.8157306;-43.9542226
Re: [R] convert list to Dataframe
On 01/11/2009 7:43 AM, onyourmark wrote: Hi. I have a huge list called twitter: It's a list, but more importantly it's a VCorpus and a Corpus. You should use the functions appropriate to those classes to extract the strings making up the data, declare their encoding properly (or convert them to your native encoding), then use read.delim() on a textConnection to read them in. Duncan Murdoch dim(twitter) NULL str(twitter) List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535 12210;10:47:37;20;10;2009;David_Stringer;William Hague heading Washington meets Gen. Jim Jones, Sen. John McCain others. Will Obama team raise worries EU ties?;London, England;United Kingdom;Greater London;Westminster;;51.5001524;-0.1262362 12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses wearing thin Obama, media pals... http://tinyurl.com/yfw6cd9;So. California;USA;CA;;;36.778261;-119.4179324 12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama Afghanistan troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama #video;USA;USA37.09024;-95.712891 ... .. ..- attr(*, Author)= chr(0) .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..- attr(*, Description)= chr(0) .. ..- attr(*, Heading)= chr(0) .. ..- attr(*, ID)= chr 1 .. ..- attr(*, Language)= chr en .. ..- attr(*, LocalMetaData)= list() .. ..- attr(*, Origin)= chr(0) - attr(*, CMetaData)=List of 3 ..$ NodeID : num 0 ..$ MetaData:List of 2 .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..$ creator: Named chr .. .. ..- attr(*, names)= chr LOGNAME ..$ Children: NULL ..- attr(*, class)= chr MetaDataNode - attr(*, DMetaData)='data.frame': 1 obs. of 1 variable: ..$ MetaID: num 0 - attr(*, class)= chr [1:3] VCorpus Corpus list It contains tweets but in many languages. The columns are separated by semi-colons. I am using the tm package and it is a corpus. It looks like this: 547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1 day :p;Huddersfield/Lincoln;United Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296 547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro computador da facool? BOM DIA.;Belo Horizonte - MG - BR;Brazil;MG;;;-19.8157306;-43.9542226 547284;06:37:17;21;10;2009;romanotr;Вау, Репортеры без границ опубликовали список стран со свободой слова, из 173 Грузия на 81 месте опережая Украину. Успехи,успехи...;Portugal Aveiro;Portugal;Aveiro;;;40.6411848;-8.6536169 547285;06:37:18;21;10;2009;Y_T_;Playing: Beth Orton lt\;Someone's Daughtergt\;;Kanazawa, Japan;Japan;Ishikawa Prefecture;;;36.5613254;136.6562051 Error: invalid input '547286;06:37:18;21;10;2009;Atogey;支æŒä½ ,国家需è¦ä»–们,但是国家的未æ¥ä¸èƒ½é 他们…RT @zuola ￿我觉得 @wenyunc I want to convert it to fields or columns and so I thought I should convert it to a dataframe. I tried twitterDF-as.data.frame(twitter) Error in sort.list(y) : invalid input '547286;06:37:18;21;10;2009;Atogey;支æŒä½ ,国家需è¦ä»–们,但是国家的未æ¥ä¸èƒ½é 他们…RT @zuola ￿我觉得 @wenyunchao 一点都ä¸ä¹è§‚。真æ£çš„ä¹è§‚åº”è¯¥æ˜¯ï¼šä½ å…³æˆ‘åˆæ€Žä¹ˆæ ·ï¼Œåæ£æ”¿æ²»æ–—争ä¸ä¼šä¸¢æŽ‰æ€§å‘½ï¼Œè€å出æ¥åŽæ›´æ˜¯ä¸€æ¡å¥½æ±‰ã€‚北风还是èˆä¸å¾—*霸地ä½ã€è‚‰ã€ä¹¦ã€å¥³äººå’Œç½‘络的,ä¸è¿‡ç‰¢é‡Œä¸ä¼šæ供这些。å¦â€¦;山西,浙江;China;Zhejiang;;;28.695035;119.751054' in 'utf8towcs' Can anyone suggest what I can do? P.S. Actually, I would love to remove all the non-English tweets but I have no clue about how to do that. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Decomposition method
Dear All, I'm doing an imputation on the missing data and is looking for a decomposition method in R. Could somone advice me the way to do this? Thank you Fir __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] convert list to Dataframe
On Nov 1, 2009, at 8:24 AM, onyourmark wrote: Hello. The fields are separated by a ';'. I think that the data is rectangular in the sense that there are about 15 fields for each row. There either are 15 fields or there aren't. You can't make a dataframe with an approximate number of fields. In the fragment below there appear to be 14 fields. Try: twitfrag - strsplit(c(4927861;05:04:14;28;10;2009;HOYTSTHEATRES;GameStop Brings 15K Manage Holiday Rush [Black Friday] http://bit.ly/2d3OJg;Australia;Australia-25.274398;133.775136 , 4927863;05:04:14;28;10;2009;padden;Rachel master chef cook anytime!;Sydney, Australia;Australia;NSW;;;-33.867139;151.207114, 4927878;05:04:17;28;10;2009;GSpotMagazine;The penalty success bored attentions people formerly snubbed you. -Mary Wilson Little #quote;UK;United Kingdom55.378051;-3.435973, 4927885;05:04:20;28;10;2009;super_assassin;@triplejsr flight conchords, pleaaase :) thanks rosie xx;Australia;Australia-25.274398;133.775136, 4927893;05:04:21;28;10;2009;SLMFE;Gestern:Achso,ja okey,um 5 nach las ich jemanden komen der dir die Akupunkturnadel(zb 5!im Ohr!)entfernt..Um 10 n. kommt immer noch keiner..;Germany;Germany51.165691;10.451526, 4927901;05:04:23;28;10;2009;mikesemple;HHS Secretary pushes health care reform rural America: By Christopher Smart The health-care crisis .. http://bit.ly/49Iqcu;London;United Kingdom;Greater London;Westminster;;51.5001524;-0.1262362, 4927913;05:04:26;28;10;2009;coax_k;Facebook Headquarters Studio O+A: San Francisco based interior design firm Studio O+A designed .. http://bit.ly/hdqWp;Sydney;Australia;NSW;;;-33.867139;151.207114; ), ;) twitfrag I think you will see some patterns emerging. Some of the fields are empty. In the dput() display below, it seems that the rows are delimited by ' ' . Any idea from this? They are strings (in our aRgot, objects of type character.) That is an effect of whatever processing you have done with components of the tm package, the entirety of which you are failing to share with us. Here is the end of the output for dput(twitter) The whole point of using dput is to create a complete representation of an object. 4927861;05:04:14;28;10;2009;HOYTSTHEATRES;GameStop Brings 15K Manage Holiday Rush [Black Friday] http://bit.ly/2d3OJg;Australia;Australia-25.274398;133.775136;, 4927863;05:04:14;28;10;2009;padden;Rachel master chef cook anytime!;Sydney, Australia;Australia;NSW;;;-33.867139;151.207114, 4927878;05:04:17;28;10;2009;GSpotMagazine;The penalty success bored attentions people formerly snubbed you. -Mary Wilson Little #quote;UK;United Kingdom55.378051;-3.435973, 4927885;05:04:20;28;10;2009;super_assassin;@triplejsr flight conchords, pleaaase :) thanks rosie xx;Australia;Australia-25.274398;133.775136, 4927893;05:04:21;28;10;2009;SLMFE;Gestern:Achso,ja okey,um 5 nach las ich jemanden komen der dir die Akupunkturnadel(zb 5!im Ohr!)entfernt..Um 10 n. kommt immer noch keiner..;Germany;Germany51.165691;10.451526, 4927901;05:04:23;28;10;2009;mikesemple;HHS Secretary pushes health care reform rural America: By Christopher Smart The health-care crisis .. http://bit.ly/49Iqcu;London;United Kingdom;Greater London;Westminster;;51.5001524;-0.1262362, 4927913;05:04:26;28;10;2009;coax_k;Facebook Headquarters Studio O +A: San Francisco based interior design firm Studio O+A designed .. http://bit.ly/hdqWp;Sydney;Australia;NSW;;;-33.867139;151.207114; ), Author = character(0), DateTimeStamp = structure(list(sec = 56.404713898, min = 46L, hour = 4L, mday = 31L, mon = 9L, year = 109L, wday = 6L, yday = 303L, isdst = 0L), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst), class = c(POSIXt, POSIXlt), tzone = GMT), Description = character(0), Heading = character(0), ID = 1, Language = en, LocalMetaData = list(), Origin = character(0), class = c(PlainTextDocument, TextDocument, character))), CMetaData = structure(list(NodeID = 0, MetaData = structure(list(create_date = structure(list(sec = 56.4059998989105, min = 46L, hour = 4L, mday = 31L, mon = 9L, year = 109L, wday = 6L, yday = 303L, isdst = 0L), .Names = c(sec, min, hour, mday, mon, year, wday, yday, isdst ), class = c(POSIXt, POSIXlt), tzone = GMT), creator = structure(, .Names = LOGNAME)), .Names = c(create_date, creator)), Children = NULL), .Names = c(NodeID, MetaData, Children), class = MetaDataNode), DMetaData = structure(list( MetaID = 0), .Names = MetaID, row.names = c(NA, -1L), class = data.frame), class = c(VCorpus, Corpus, list)) onyourmark wrote: Hi. I have a huge list called twitter: dim(twitter) NULL str(twitter) List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday..
Re: [R] convert list to Dataframe
I did this on the source files which were semi-colon delimted (to delimit the fields, I am not sure what character denotes the new tweet) After loading the tm package txt - system.file(texts, txt, package = tm) (twitter - Corpus(DirSource(txt), + readerControl = list(language = lat))) then twitter - tm_map(twitter, removeWords, stopwords(english)) That last command took about an hour to complete. onyourmark wrote: Hi. I have a huge list called twitter: dim(twitter) NULL str(twitter) List of 1 $ :Classes 'PlainTextDocument', 'TextDocument', 'character' atomic [1:35575] 11999;10:47:14;20;10;2009;ObamaLouverture;Trails Mixed Lessons For Governance From Campaigner-in-chief: President obama jumps campaign 09 tuesday.. http://bit.ly/2eHMaN;Florida;USA;FL;;;27.6648274;-81.5157535 12210;10:47:37;20;10;2009;David_Stringer;William Hague heading Washington meets Gen. Jim Jones, Sen. John McCain others. Will Obama team raise worries EU ties?;London, England;United Kingdom;Greater London;Westminster;;51.5001524;-0.1262362 12355;10:47:53;20;10;2009;Singsabit;RT @Drudge_Report PAPER: Excuses wearing thin Obama, media pals... http://tinyurl.com/yfw6cd9;So. California;USA;CA;;;36.778261;-119.4179324 12407;10:47:59;20;10;2009;obamavideonews;Obama News Obama Afghanistan troop decision timing (AFP) : AFP - Pres.. http://bit.ly/3KPUr8 #obama #video;USA;USA37.09024;-95.712891 ... .. ..- attr(*, Author)= chr(0) .. ..- attr(*, DateTimeStamp)= POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..- attr(*, Description)= chr(0) .. ..- attr(*, Heading)= chr(0) .. ..- attr(*, ID)= chr 1 .. ..- attr(*, Language)= chr en .. ..- attr(*, LocalMetaData)= list() .. ..- attr(*, Origin)= chr(0) - attr(*, CMetaData)=List of 3 ..$ NodeID : num 0 ..$ MetaData:List of 2 .. ..$ create_date: POSIXlt[1:9], format: 2009-10-31 04:46:56 .. ..$ creator: Named chr .. .. ..- attr(*, names)= chr LOGNAME ..$ Children: NULL ..- attr(*, class)= chr MetaDataNode - attr(*, DMetaData)='data.frame': 1 obs. of 1 variable: ..$ MetaID: num 0 - attr(*, class)= chr [1:3] VCorpus Corpus list It contains tweets but in many languages. The columns are separated by semi-colons. I am using the tm package and it is a corpus. It looks like this: 547282;06:37:17;21;10;2009;dani_jade18;@Laura_Whyte1 day :p;Huddersfield/Lincoln;United Kingdom;Kirklees;Kirklees;;53.6468475;-1.7727296 547283;06:37:17;21;10;2009;fabiomafra;alguém traz mais lenha pro computador da facool? BOM DIA.;Belo Horizonte - MG - BR;Brazil;MG;;;-19.8157306;-43.9542226 547284;06:37:17;21;10;2009;romanotr;Вау, Репортеры без границ опубликовали список стран со свободой слова, из 173 Грузия на 81 месте опережая Украину. Успехи,успехи...;Portugal Aveiro;Portugal;Aveiro;;;40.6411848;-8.6536169 547285;06:37:18;21;10;2009;Y_T_;Playing: Beth Orton lt\;Someone's Daughtergt\;;Kanazawa, Japan;Japan;Ishikawa Prefecture;;;36.5613254;136.6562051 Error: invalid input '547286;06:37:18;21;10;2009;Atogey;支æŒä½ ,国家需è¦ä»–们,但是国家的未æ¥ä¸èƒ½é 他们…RT @zuola ￿我觉得 @wenyunc I want to convert it to fields or columns and so I thought I should convert it to a dataframe. I tried twitterDF-as.data.frame(twitter) Error in sort.list(y) : invalid input '547286;06:37:18;21;10;2009;Atogey;支æŒä½ ,国家需è¦ä»–们,但是国家的未æ¥ä¸èƒ½é 他们…RT @zuola ￿我觉得 @wenyunchao 一点都ä¸ä¹è§‚。真æ£çš„ä¹è§‚åº”è¯¥æ˜¯ï¼šä½ å…³æˆ‘åˆæ€Žä¹ˆæ ·ï¼Œåæ£æ”¿æ²»æ–—争ä¸ä¼šä¸¢æŽ‰æ€§å‘½ï¼Œè€å出æ¥åŽæ›´æ˜¯ä¸€æ¡å¥½æ±‰ã€‚北风还是èˆä¸å¾—*霸地ä½ã€è‚‰ã€ä¹¦ã€å¥³äººå’Œç½‘络的,ä¸è¿‡ç‰¢é‡Œä¸ä¼šæ供这些。å¦â€¦;山西,浙江;China;Zhejiang;;;28.695035;119.751054' in 'utf8towcs' Can anyone suggest what I can do? P.S. Actually, I would love to remove all the non-English tweets but I have no clue about how to do that. -- View this message in context: http://old.nabble.com/convert-list-to-Dataframe-tp26148889p26148898.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] package lme4
Hi R Users, When I use package lme4 for mixed model analysis, I can't distinguish the significant and insignificant variables from all random independent variables. Here is my data and result: Data: Rice-data.frame(Yield=c(8,7,4,9,7,6,9,8,8,8,7,5,9,9,5,7,7,8,8,8,4,8,6,4,8,8,9), Variety=rep(rep(c(A1,A2,A3),each=3),3), Stand=rep(c(B1,B2,B3),9), Block=rep(1:3,each=9)) Rice.lmer-lmer(Yield ~ (1|Variety) + (1|Stand) + (1|Block) + (1|Variety:Stand), data = Rice) Result: Linear mixed model fit by REML Formula: Yield ~ (1 | Variety) + (1 | Stand) + (1 | Block) + (1 | Variety:Stand) Data: Rice AIC BIC logLik deviance REMLdev 96.25 104.0 -42.1285.33 84.25 Random effects: GroupsNameVariance Std.Dev. Variety:Stand (Intercept) 1.345679 1.16003 Block (Intercept) 0.00 0.0 Stand (Intercept) 0.89 0.94281 Variety (Intercept) 0.024691 0.15714 Residual 0.67 0.81650 Number of obs: 27, groups: Variety:Stand, 9; Block, 3; Stand, 3; Variety, 3 Fixed effects: Estimate Std. Error t value (Intercept) 7.1852 0.6919 10.38 Can you give me some advice for recognizing the significant variables among random effects above without other calculating. Any suggestions will be appreciated. Wenjun [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SEM with Unbalanced Panel
Dear R Experts, I have a query concerning SEMs in a multilevel context. I have an unbalanced panel where children are nested in families, which in turn are nested in districts. The problem is the following: the outcome variable of interest is measured at the child level but the (explanatory) latent variable is at the family level. Most the variables are discrete and I am able to fit the model using the sem() and hetcor() packages ignoring the hierarchical structure of the data. But, how do I incorporate the fact that the outcome variable is at the child level and the latent variable at the family level? Any help would be greatly appreciated. Thanks a lot already Henry -- http://portal.gmx.net/de/go/dsl02 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] counting frequencies across two columns
Hi Jason, If I understand correctly, you are looking for something along the lines of with(X, tapply(author, articleID, function(x) length(unique(x # 1 2 3 # 1 2 2 with X your data frame. HTH, Jorge On Sun, Nov 1, 2009 at 1:20 AM, Jason Priem wrote: I've got a data frame describing comments on an electronic journal, wherein each row is a unique comment, like so: commentID author articleID 1 1 smith 2 2 2 jones 3 3 3 andrews 2 4 4 jones 1 5 5 johnson 3 6 6 smith 2 I want know the number of unique authors per article. I can get a table of article frequencies with table(articleID), but I can't figure out how to count frequencies in a different column. I'm sure there's an easy way, but I guess I'm too new at this to find it. Thanks for your help! Jason Priem PhD student, School of Information and Library Science, University of North Carolina-Chapel Hill __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] wilcox.test construction in r
Dear Stefan, See two comments inserted below. Stefan Grosse wrote: On Sun, 1 Nov 2009 00:47:50 -0700 (PDT) jomni jom...@gmail.com wrote: J So do I write the function as wilcox.test(original, test, J alternative=l)? or wlcox.test(original, test, alternative = g)? J or wilcox.test(test, original, alternative=g)? J or wilcox.test(test, original, alternative=l)? J How do I interpret the p-value given my criteria? J Do I reject null when p-value less than 0.05? J or greater than 0.95? The interpretation of the p depends on how you have tested the hypothesis. J Not a statistics major here so I'm really confused. You don't need to be that but please read the documentation and try the given examples in the documentation. Comment 1: As you point out, one should at least scan the documentation. Here's a quote from ?wilcox.test: 'the one-sided alternative greater is that x is shifted to the right of y' That's pretty unambiguous. If you would have typed example(wilcox.test) you would have seen for example: wlcx.t ## Two-sample test. wlcx.t ## Hollander Wolfe (1973), 69f. wlcx.t ## Permeability constants of the human chorioamnion (a placental wlcx.t ## membrane) at term (x) and between 12 to 26 weeks gestational wlcx.t ## age (y). The alternative of interest is greater permeability wlcx.t ## of the human chorioamnion for the term pregnancy. wlcx.t x - c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46) wlcx.t y - c(1.15, 0.88, 0.90, 0.74, 1.21) wlcx.t wilcox.test(x, y, alternative = g)# greater Wilcoxon rank sum test data: x and y W = 35, p-value = 0.1272 alternative hypothesis: true location shift is greater than 0 This I think makes it very easy to interprete. Here it is tested as the text says whether x is greater than y. So if you want to test the hypothesis that x is smaller than y so you do wilcox.test(x,y,alternative=less) then the lower your p is the higher is the probability that the samples are different. hence p0.05 would match your confidence level. Now the Comment 2: I know that you know better, but with p-values it's always best to be careful with the language. ... the probability that the _samples_ are different makes little sense. The samples _are_ different, period (or why do the test?). The p-value says something about the distribution from which the samples are obtained. Cheers, Peter Ehlers surprising news: wilcox.test(y,x,alternative=greater) would work as well! If you are in doubt create an x and an y where you are sure that x is smaller than y. One final remark: if you have ties (several identical values in one sample) you should use wilcox_test of the coin package. hth Stefan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix^(-1/2)
On Oct 31, 2009, at 9:33 PM, David Winsemius wrote: On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote: Dear R-Help Team, as a R novice I have a (maybe for you very simple question), how do I get the following solved in R: Let R be a n x n matrix: \mid R\mid^{-\frac{1}{2}} solve(A) gives me the inverse of the matrix R, however not the ^(-1/2) of the matrix... GIYF: (and Bill Venables if friendly, too.) http://www.lmgtfy.com/?q=powers+of+matrix+r-project I had assumed that the first hit I got: https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html ... would be the first hit anybody got, but that's not necessarily true now and especially for the future. And further searching within the results produced this more recent Maechler posting: https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html For the Mac users, there appears to be no binary, but the source compiles without error on a 64-bit version of R 2.10.0: install.packages(expm,repos=http://R-Forge.R-project.org;, type=source) #The suggested code throws an error, so my very minor revision would be: library(expm) ?%^% -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to loop thru a matrix or data frame , and append calculations to a new data frame?
Duh, thought of that after I'd left for dinner :( --- On Sat, 10/31/09, David Winsemius dwinsem...@comcast.net wrote: From: David Winsemius dwinsem...@comcast.net Subject: Re: [R] how to loop thru a matrix or data frame , and append calculations to a new data frame? To: John Kane jrkrid...@yahoo.ca Cc: r-help@r-project.org, Robert Wilkins robst...@gmail.com Received: Saturday, October 31, 2009, 2:00 PM On Oct 31, 2009, at 12:33 PM, John Kane wrote: Do you mean to apply the same calculation to each element? ? apply mydata - data.frame(aa = 1:5, bb= 11:15) mp5 - function(x) x*5 mp5data - apply(mydata, 2, mp5) mp5data It would have been much more compact (and in the spirit of functional programming) to just do: mp5data - mydata*5 # binary operations applied to dataframes give sensible results when the data types allow. mp5data aa bb [1,] 5 55 [2,] 10 60 [3,] 15 65 [4,] 20 70 [5,] 25 75 Many times the loops are implicit in the vectorized design of R. And that solution would not result in the structure requested, for which some further straightening would be needed: mp5data - as.vector(as.matrix(mydata)*5) mp5data [1] 5 10 15 20 25 55 60 65 70 75 -- David This is functionally equivelent to a double if loop. mydata - data.frame(aa = 1:5, bb= 11:15) newdata - data.frame(matrix(rep(NA,10), nrow=5)) for (i in 1:length(mydata[1,])) { for (j in 1:5) { newdata[j,i] - mydata[j,i]*5 } } newdata --- On Fri, 10/30/09, Robert Wilkins robst...@gmail.com wrote: From: Robert Wilkins robst...@gmail.com Subject: [R] how to loop thru a matrix or data frame , and append calculations to a new data frame? To: r-help@r-project.org Received: Friday, October 30, 2009, 11:46 AM How do you do a double loop through a matrix or data frame , and within each iteration , do a calculation and append it to a new, second data frame? (So, if your original matrix or data frame is 4 x 5 , then 20 calculations are done, and the new data frame, which had 0 rows to start with, now has 20 rows) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ Yahoo! Canada Toolbar: Search from anywhere on the web, and bookmark your favourite sites. Download it now http://ca.toolbar.yahoo.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix^(-1/2)
A question, a comment, and an alternative answer to matrix^(-1/2): QUESTION: What's the status of the expm package, mentioned in the email you cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried both install.packages('expm') and install.packages(expm,repos=http://R-Forge.R-project.org;), and got package 'expm' is not available in both cases. COMMENT: The solution proposed by Venables rests on Sylvester's matrix theorem, which essentially says that if a matrix A is diagonalizable with eigenvalue decomposition eigA - eigen(A) and f: D → C with D ⊂ C be a function for which f(A) is well defined (http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem), then f(A) = with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). Maechler and others have noted that this can be one of the least accurate and most computationally expensive ways to compute f(A). ALTERNATIVE ANSWER: For A^(-1/2), if A is symmetric and nonnegative definite, then solve(chol(A)) would be a very good way to compute it. Hope this helps, Spencer David Winsemius wrote: On Oct 31, 2009, at 9:33 PM, David Winsemius wrote: On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote: Dear R-Help Team, as a R novice I have a (maybe for you very simple question), how do I get the following solved in R: Let R be a n x n matrix: \mid R\mid^{-\frac{1}{2}} solve(A) gives me the inverse of the matrix R, however not the ^(-1/2) of the matrix... GIYF: (and Bill Venables if friendly, too.) http://www.lmgtfy.com/?q=powers+of+matrix+r-project I had assumed that the first hit I got: https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html ... would be the first hit anybody got, but that's not necessarily true now and especially for the future. And further searching within the results produced this more recent Maechler posting: https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html For the Mac users, there appears to be no binary, but the source compiles without error on a 64-bit version of R 2.10.0: install.packages(expm,repos=http://R-Forge.R-project.org;, type=source) #The suggested code throws an error, so my very minor revision would be: library(expm) ?%^% -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix^(-1/2)
On Sun, 1 Nov 2009, spencerg wrote: A question, a comment, and an alternative answer to matrix^(-1/2): QUESTION: What's the status of the expm package, mentioned in the email you cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried both install.packages('expm') and install.packages(expm,repos=http://R-Forge.R-project.org;), and got package 'expm' is not available in both cases. Try http://r-forge.r-project.org/projects/expm/ HTH, Chuck COMMENT: The solution proposed by Venables rests on Sylvester's matrix theorem, which essentially says that if a matrix A is diagonalizable with eigenvalue decomposition eigA - eigen(A) and f: D → C with D ⊂ C be a function for which f(A) is well defined (http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem), then f(A) = with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). Maechler and others have noted that this can be one of the least accurate and most computationally expensive ways to compute f(A). ALTERNATIVE ANSWER: For A^(-1/2), if A is symmetric and nonnegative definite, then solve(chol(A)) would be a very good way to compute it. Hope this helps, Spencer David Winsemius wrote: On Oct 31, 2009, at 9:33 PM, David Winsemius wrote: On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote: Dear R-Help Team, as a R novice I have a (maybe for you very simple question), how do I get the following solved in R: Let R be a n x n matrix: \mid R\mid^{-\frac{1}{2}} solve(A) gives me the inverse of the matrix R, however not the ^(-1/2) of the matrix... GIYF: (and Bill Venables if friendly, too.) http://www.lmgtfy.com/?q=powers+of+matrix+r-project I had assumed that the first hit I got: https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html ... would be the first hit anybody got, but that's not necessarily true now and especially for the future. And further searching within the results produced this more recent Maechler posting: https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html For the Mac users, there appears to be no binary, but the source compiles without error on a 64-bit version of R 2.10.0: install.packages(expm,repos=http://R-Forge.R-project.org;, type=source) #The suggested code throws an error, so my very minor revision would be: library(expm) ?%^% -- Spencer Graves, PE, PhD President and Chief Operating Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoiding loop
On Sat, 31 Oct 2009, parkbomee wrote: Thank you both. However, using tapply() instead of a loop does not seem to improve my code much. I am using this inside of an optimization function, and it still takes more than it needs... Well, you haven't given us much to work with. The optimization choices depend on the particulars of your problem, which you've not detailed. It does not take long to run the tapply() code once, so you need to do it many times. Right? If so, you need to say how the structure varies (rendered as DF in David's reply) from iteration to iteration in the optimization. If it turns out that only 'value' changes and that the number of different times is not too large, then precomputing suitable indicator matrics may help: mat1 - model.matrix( ~ 0 + factor(time):as.numeric(choice==1),DF) mat2 - model.matrix( ~ 0 + factor(time), DF ) Inside the optimization use something like with(DF,(value%*%mat1)/(value%*%mat2)) If the structure can change or the number of unique times is large, then with so simple a calculation you should probably just inline some C code. http://cran.r-project.org/web/packages/inline/index.html HTH, Chuck CC: bbom...@hotmail.com; r-help@r-project.org From: dwinsem...@comcast.net To: d.rizopou...@erasmusmc.nl Subject: Re: [R] avoiding loop Date: Sat, 31 Oct 2009 22:26:17 -0400 This is pretty much equivalent: tapply(DF$value[DF$choice==1], DF$time[DF$choice==1], sum) / tapply(DF$value, DF$time, sum) And both will probably fail if the number of groups with choice==1 is different than the number overall. -- David. On Oct 31, 2009, at 5:14 PM, Dimitris Rizopoulos wrote: one approach is the following: # say 'DF' is your data frame, then with(DF, { ind - choice == 1 n - tapply(value[ind], time[ind], sum) d - tapply(value, time, sum) n / d }) I hope it helps. Best, Dimitris parkbomee wrote: Hi all, I am trying to figure out a way to improve my code's efficiency by avoiding the use of loop. I want to calculate a conditional mean(?) given time. For example, from the data below, I want to calculate sum((value| choice==1)/sum(value)) across time. Is there a way to do it without using a loop? time cum_time choicevalue 1 4 1 3 1 4 0 2 1 4 0 3 1 4 0 3 2 6 1 4 2 6 0 4 2 6 0 2 2 6 0 4 2 6 0 2 2 6 0 2 3 4 1 2 3 4 0 3 3 4 0 5 3 4 0 2 My code looks like objective[1] = value[1] / sum(value[1:cum_time[1]) for (i in 2:max(time)){ objective[i] = value[cum_time[i-1]+1] / sum(value[(cum_time[i-1]+1) : cum_time[i])]) } sum(objective) Anyone have an idea that I can do this without using a loop?? Thanks. _ [[elided Hotmail spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT _ [[elided Hotmail spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problems whit seasonal ARIMA
Hello, I have daily wind speed data and need to fit seasonal ARIMA model, problem is that my period is 365. But when I use arima(...) function, with period 365, Im getting error message: Error in makeARIMA(trarma[[1]], trarma[[2]], Delta, kappa) : maximum supported lag is 350. Can someone help me with this problem? Thank you Sincerely yours, Laura Saltyte __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Package zelig
Confirmed with recent Zelig and R-2.10.0. CCing Kosuke Imai, the Zelig maintainer: Please can you take a look at this one and additionally fix the warnings given on your package's overview page at http://cran.r-project.org/web/checks/check_results_Zelig.html Please upload a fixed version with increased version number to CRAN. Best wishes, Uwe Ligges sayan dasgupta wrote: hello all I am using the R package Zelig for some tobit regression with robust standard errors. I have got R version 2.9.2 (2009-08-24) and Zelig Version: 3.4-5 when i do demo(robust) It ends like this way data(coalition) # Fit the model with robust standard error user.prompt() Press return to continue: z.out3 - zelig(Surv(duration, ciep12) ~ polar + numst2 + crisis, model = weibull, data = coalition, cluster = invest, robust = TRUE) *It is giving the following error *Error in rowsum.default(resid(fit, dfbeta), cluster) : 'x' must be numeric Please help where it is going wrong Thanks and Regards Sayan Dasgupta [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix^(-1/2)
Hi, Chuck: Thanks very much, but why do I get package 'expm' is not available from install.packages(expm,repos=http://R-Forge.R-project.org;)? Best Wishes, Spencer Graves Charles C. Berry wrote: On Sun, 1 Nov 2009, spencerg wrote: A question, a comment, and an alternative answer to matrix^(-1/2): QUESTION: What's the status of the expm package, mentioned in the email you cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried both install.packages('expm') and install.packages(expm,repos=http://R-Forge.R-project.org;), and got package 'expm' is not available in both cases. Try http://r-forge.r-project.org/projects/expm/ HTH, Chuck COMMENT: The solution proposed by Venables rests on Sylvester's matrix theorem, which essentially says that if a matrix A is diagonalizable with eigenvalue decomposition eigA - eigen(A) and f: D → C with D ⊂ C be a function for which f(A) is well defined (http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem), then f(A) = with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). Maechler and others have noted that this can be one of the least accurate and most computationally expensive ways to compute f(A). ALTERNATIVE ANSWER: For A^(-1/2), if A is symmetric and nonnegative definite, then solve(chol(A)) would be a very good way to compute it. Hope this helps, Spencer David Winsemius wrote: On Oct 31, 2009, at 9:33 PM, David Winsemius wrote: On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote: Dear R-Help Team, as a R novice I have a (maybe for you very simple question), how do Iget the following solved in R: Let R be a n x n matrix: \mid R\mid^{-\frac{1}{2}} solve(A) gives me the inverse of the matrix R, however not the ^(-1/2)of the matrix... GIYF: (and Bill Venables if friendly, too.) http://www.lmgtfy.com/?q=powers+of+matrix+r-project I had assumed that the first hit I got: https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html ... would be the first hit anybody got, but that's not necessarily true now and especially for the future. And further searching within the results produced this more recent Maechler posting: https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html For the Mac users, there appears to be no binary, but the source compiles without error on a 64-bit version of R 2.10.0: install.packages(expm,repos=http://R-Forge.R-project.org;, type=source) #The suggested code throws an error, so my very minor revision would be: library(expm) ?%^% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.eduUC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fast optimizer
Hi, Here is one approach to solve your likelihood maximization subject to constraints. I have written a function called `constrOptim.nl' that can solve constrained optimization problems. I will demonstrate this with a simulation. # 1. Simulate the data (x,y). a - 4 b - 1 c - 2.5 p - 0.3 n - 100 z - rbinom(n, size=1, prob=p) x - ifelse(z, rpois(n, a), rpois(n, a+c)) y - ifelse(z, rpois(n, b+c), rpois(n, b)) # 2. Define the log-likelihood to be maximized mloglik - function(par, xx, y, ...){ # Note that I have named `xx' instead of `x' to avoid conflict with another function p - par[1] a - par[2] b - par[3] cc - par[4] sum(log(p*dpois(xx,a)*dpois(y,b+cc) + (1-p)*dpois(xx,a+cc)*dpois(y,b))) } # 3. Write the constraint function to define inequalities hin - function(par, ...) { # All constraints are defined such that h[i] 0 for all i. h - rep(NA, 7) h[1] - par[1] h[2] - 1 - par[1] h[3] - par[2] h[4] - par[3] h[5] - par[4] h[6] - par[2] - par[4] h[7] - par[3] - par[4] h } # Now, we are almost ready to maximize the log-likelihood. We need a feasible starting value. # We construct a projection function that can convert any arbitrary real vector to a feasible starting vector. # 5. Finding a feasible starting vector of parameter values project - function(par, lower, upper) { # a function to map a random vector to a feasible vector slack - 1.e-14 if (par[1] = 0) par[1] - slack if (par[1] = 1) par[1] - 1 - slack if (par[2] = 0) par[2] - slack if (par[3] = 0) par[3] - slack par[4] - min(par[2], par[3], par[4]) - slack par } p0c - runif(4, c(0,1,1,1), c(1, 5, 5, 2)) # a random candidate starting value p0 - project(p0c) # feasible starting value # 6. Optimization source(constrOptim_nl.R) # we source the `constrOptim.nl' function ans - constrOptim.nl(par=p0, fn=mloglik, hin=hin, xx=x, y=y, control=list(fnscale=-1)) # Note: we are maximizing ans This concludes our brief tutorial. Hope this helps, Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Ravi Varadhan rvarad...@jhmi.edu Date: Thursday, October 29, 2009 10:23 pm Subject: Re: [R] Fast optimizer To: R_help Help rhelp...@gmail.com Cc: r-help@r-project.org, r-sig-fina...@stat.math.ethz.ch This optimization should not take you 1-2 mins. My guess is that your coding of the likelihood is inefficient, perhaps containing for loops. Would you mind sharing your code with us? As far as incorporating inequality constraints, there are at least 4 approaches: 1. You could use `constrOptim' to incorporate inequality constraints. 2. I have written a function `constrOptim.nl' that is better than `constrOptim', and it can be used here. 3. The projection method in spg() function in the BB package can be used. 4. You can create a penalized objective function, where the inequalities c a and c b can be penalized. Then you can use optim's L-BFGS-B or spg() or nlminb(). Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: R_help Help rhelp...@gmail.com Date: Thursday, October 29, 2009 9:21 pm Subject: Re: [R] Fast optimizer To: Ravi Varadhan rvarad...@jhmi.edu Cc: r-help@r-project.org, r-sig-fina...@stat.math.ethz.ch Ok. I have the following likelihood function. L - p*dpois(x,a)*dpois(y,b+c)+(1-p)*dpois(x,a+c)*dpois(y,b) where I have 100 points of (x,y) and parameters c(a,b,c,p) to estimate. Constraints are: 0 p 1 a,b,c 0 c a c b I construct a loglikelihood function out of this. First ignoring the last two constraints, it takes optim with box constraint about 1-2 min to estimate this. I have to estimate the MLE on about 200 rolling windows. This will take very long. Is there any faster implementation? Secondly, I cannot incorporate the last two contraints using optim function. Thank you, rc On Thu, Oct 29, 2009 at 9:02 PM, Ravi Varadhan rvarad...@jhmi.edu wrote: You have hardly given us any information for us to be able to help you. Give us more information on your problem, and, if possible, a minimal, self-contained example of what you are trying to do. Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: R_help Help rhelp...@gmail.com Date:
Re: [R] matrix^(-1/2)
On Nov 1, 2009, at 1:46 PM, spencerg wrote: Hi, Chuck: Thanks very much, but why do I get package 'expm' is not available from install.packages(expm,repos=http://R-Forge.R-project.org )? In my case I think it was it is because there is no 2.10 branch to either the: http://r-forge.r-project.org/bin/macosx/leopard/contrib/... or the http://r-forge.r-project.org/bin/macosx/universal/contrib/...trees. I tried a variety of stems for the installer but got these messages: Warning: unable to access index for repository http://r-forge.r-project.org/bin/macosx/universal/contrib/latest/bin/macosx/leopard/contrib/2.10 Warning: unable to access index for repository http://r-forge.r-project.org/bin/macosx/universal/contrib/bin/macosx/leopard/contrib/2.10 Warning: unable to access index for repository http://r-forge.r-project.org/bin/macosx/leopard/contrib/2.10 So I wonder if the package installers' expectations for the r-forge repository are matching up with the tree structures. I should also note that the matpow or %^% functions in expm would not address the OP's question since they require that the exponent be positive. -- David. Best Wishes, Spencer Graves Charles C. Berry wrote: On Sun, 1 Nov 2009, spencerg wrote: A question, a comment, and an alternative answer to matrix^(-1/2): QUESTION: What's the status of the expm package, mentioned in the email you cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried both install.packages('expm') and install.packages(expm,repos=http://R-Forge.R-project.org;), and got package 'expm' is not available in both cases. Try http://r-forge.r-project.org/projects/expm/ HTH, Chuck COMMENT: The solution proposed by Venables rests on Sylvester's matrix theorem, which essentially says that if a matrix A is diagonalizable with eigenvalue decomposition eigA - eigen(A) and f: D → C with D ⊂ C be a function for which f(A) is well defined (http://en.wikipedia.org/wiki/Sylvester%27s_matrix_the orem), then f(A) = with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). Maechler and others have noted that this can be one of the least accurate and most computationally expensive ways to compute f(A). ALTERNATIVE ANSWER: For A^(-1/2), if A is symmetric and nonnegative definite, then solve(chol(A)) would be a very good way to compute it. Hope this helps, Spencer David Winsemius wrote: On Oct 31, 2009, at 9:33 PM, David Winsemius wrote: On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote: Dear R-Help Team, as a R novice I have a (maybe for you very simple question), how do Iget the following solved in R: Let R be a n x n matrix: \mid R\mid^{-\frac{1}{2}} solve(A) gives me the inverse of the matrix R, however not the ^(-1/2)of the matrix... GIYF: (and Bill Venables if friendly, too.) http://www.lmgtfy.com/?q=powers+of+matrix+r-project I had assumed that the first hit I got: https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html ... would be the first hit anybody got, but that's not necessarily true now and especially for the future. And further searching within the results produced this more recent Maechler posting: https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html For the Mac users, there appears to be no binary, but the source compiles without error on a 64-bit version of R 2.10.0: install.packages(expm,repos=http://R-Forge.R-project.org;, type=source) #The suggested code throws an error, so my very minor revision would be: library(expm) ?%^% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/ Preventive Medicine E mailto:cbe...@tajo.ucsd.eduUC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problems whit seasonal ARIMA
On 2/11/2009, at 2:49 AM, Laura Saltyte wrote: Hello, I have daily wind speed data and need ``need'' is probably not the operative word! to fit seasonal ARIMA model, problem is that my period is 365. But when I use arima(...) function, with period 365, I’m getting error message: “Error in makeARIMA(trarma[[1]], trarma[[2]], Delta, kappa) : maximum supported lag is 350”. Can someone help me with this problem? Well, you could dig into the code and rewrite it to revise the maximum lag. But that would be pretty silly. A seasonal model with a lag of 365 is downright ridiculous. A seasonal model of lag ``s'' assumes that X_{t-s} has an influence upon or contains information about (or is non-trivially correlated with) X_t. Do you *really* believe that wind speed from a year ago contains information relevant to current wind speed? There is almost surely a periodic *trend* with period 365 (well, maybe period 365.25?) in daily wind speed data, but that's a different issue. If you have data for sufficiently many years you could try estimating such a trend simply by averaging over all occurrences of a given day to estimate the trend value for that day. You could also try fitting a sinusoid (probably a fairly high order sinusoid ) What you really should do depends on the (practical) question that you are trying to answer. cheers, Rolf Turner ## Attention: This e-mail message is privileged and confidential. If you are not the intended recipient please delete the message and notify the sender. Any views or opinions presented are solely those of the author. This e-mail has been scanned and cleared by MailMarshal www.marshalsoftware.com ## __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoiding loop
What you need to do is to understand how to use Rprof so that you can determine where the time is being spent. It probably indicates that this is not the source of slowness in your optimization function. How much time are we talking about? You may spent more time trying to optimize the function than just running the current version even if it is slow (slow is a relative term and does not hold much meaning without some context round it). On Sat, Oct 31, 2009 at 11:36 PM, parkbomee bbom...@hotmail.com wrote: Thank you both. However, using tapply() instead of a loop does not seem to improve my code much. I am using this inside of an optimization function, and it still takes more than it needs... CC: bbom...@hotmail.com; r-help@r-project.org From: dwinsem...@comcast.net To: d.rizopou...@erasmusmc.nl Subject: Re: [R] avoiding loop Date: Sat, 31 Oct 2009 22:26:17 -0400 This is pretty much equivalent: tapply(DF$value[DF$choice==1], DF$time[DF$choice==1], sum) / tapply(DF$value, DF$time, sum) And both will probably fail if the number of groups with choice==1 is different than the number overall. -- David. On Oct 31, 2009, at 5:14 PM, Dimitris Rizopoulos wrote: one approach is the following: # say 'DF' is your data frame, then with(DF, { ind - choice == 1 n - tapply(value[ind], time[ind], sum) d - tapply(value, time, sum) n / d }) I hope it helps. Best, Dimitris parkbomee wrote: Hi all, I am trying to figure out a way to improve my code's efficiency by avoiding the use of loop. I want to calculate a conditional mean(?) given time. For example, from the data below, I want to calculate sum((value| choice==1)/sum(value)) across time. Is there a way to do it without using a loop? time cum_time choice value 1 4 1 3 1 4 0 2 1 4 0 3 1 4 0 3 2 6 1 4 2 6 0 4 2 6 0 2 2 6 0 4 2 6 0 2 2 6 0 2 3 4 1 2 3 4 0 3 3 4 0 5 3 4 0 2 My code looks like objective[1] = value[1] / sum(value[1:cum_time[1]) for (i in 2:max(time)){ objective[i] = value[cum_time[i-1]+1] / sum(value[(cum_time[i-1]+1) : cum_time[i])]) } sum(objective) Anyone have an idea that I can do this without using a loop?? Thanks. _ [[elided Hotmail spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT _ [[elided Hotmail spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] intigrate function and absolute error
Hi Can we get the result of an intigration without the absolute error? for example f1-function(x1){(1/gamma(alpha))*x1^(alpha-1)*exp(-x1)*log(x1)} I1-integrate(f1, 0, (max(cc)-tau1+(theta2/theta1)*tau1)/theta2) I1 0.08007414 with absolute error 7.2e-05 I need the answer 0.08007414 withou the other part(with absolute error 7.2e-05) how can we do that? thank you and take care Laila [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] intigrate function and absolute error
Hi Laila, Here is a suggestion: res - integrate(dnorm, -1.96, 1.96) res # 0.9500042 with absolute error 1.0e-11 res[[1]] # [1] 0.9500042 res$value # [1] 0.9500042 HTH, Jorge On Sun, Nov 1, 2009 at 3:02 PM, Laila Alkhalfan wrote: Hi Can we get the result of an intigration without the absolute error? for example f1-function(x1){(1/gamma(alpha))*x1^(alpha-1)*exp(-x1)*log(x1)} I1-integrate(f1, 0, (max(cc)-tau1+(theta2/theta1)*tau1)/theta2) I1 0.08007414 with absolute error 7.2e-05 I need the answer 0.08007414 withou the other part(with absolute error 7.2e-05) how can we do that? thank you and take care Laila [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Calculate Volume in a PCA
Hi, my data frame consist of 8 Variables and 120 000 observations. With those datas I am running a PCA and after I want to calculate the Volume of the PCA-cloud of certain subsets of my data. Does anyone have an idea about a function that can do this? Thanks [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix^(-1/2)
On Sun, 1 Nov 2009, David Winsemius wrote: On Nov 1, 2009, at 1:46 PM, spencerg wrote: Hi, Chuck: Thanks very much, but why do I get package 'expm' is not available from install.packages(expm,repos=http://R-Forge.R-project.org;)? In my case I think it was it is because there is no 2.10 branch to either the: http: //r-forge.r-project.org/bin/macosx/leopard/contrib/... or the http: //r-forge.r-project.org/bin/macosx/universal/contrib/...trees. I tried a variety of stems for the installer but got these messages: Warning: unable to access index for repository http://r-forge.r-project.org/bin/macosx/universal/contrib/latest/bin/macosx/leopard/contrib/2.10 Warning: unable to access index for repository http://r-forge.r-project.org/bin/macosx/universal/contrib/bin/macosx/leopard/contrib/2.10 Warning: unable to access index for repository http://r-forge.r-project.org/bin/macosx/leopard/contrib/2.10 So I wonder if the package installers' expectations for the r-forge repository are matching up with the tree structures. Right. FWIW, the source install works OK on my linux box: sessionInfo() R version 2.10.0 (2009-10-26) x86_64-pc-linux-gnu [output truncated] I should also note that the matpow or %^% functions in expm would not address the OP's question since they require that the exponent be positive. Roger that. If solve(chol(A)) isn't good enough a symmetric inverse square root is available from expm as 'solve( sqrtm( A ) )' Chuck -- David. Best Wishes, Spencer Graves Charles C. Berry wrote: On Sun, 1 Nov 2009, spencerg wrote: A question, a comment, and an alternative answer to matrix^(-1/2): QUESTION: What's the status of the expm package, mentioned in the email you cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried both install.packages('expm') and install.packages(expm,repos=http://R-Forge.R-project.org;), and got package 'expm' is not available in both cases. Try http://r-forge.r-project.org/projects/expm/ HTH, Chuck COMMENT: The solution proposed by Venables rests on Sylvester's matrix theorem, which essentially says that if a matrix A is diagonalizable with eigenvalue decomposition eigA - eigen(A) and f: D → C with D ⊂ C be a function for which f(A) is well defined (http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem), then f(A) = with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). Maechler and others have noted that this can be one of the least accurate and most computationally expensive ways to compute f(A). ALTERNATIVE ANSWER: For A^(-1/2), if A is symmetric and nonnegative definite, then solve(chol(A)) would be a very good way to compute it. Hope this helps, Spencer David Winsemius wrote: On Oct 31, 2009, at 9:33 PM, David Winsemius wrote: On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote: Dear R-Help Team, as a R novice I have a (maybe for you very simple question), how do Iget the following solved in R: Let R be a n x n matrix: \mid R\mid^{-\frac{1}{2}} solve(A) gives me the inverse of the matrix R, however not the ^(-1/2)of the matrix... GIYF: (and Bill Venables if friendly, too.) http://www.lmgtfy.com/?q=powers+of+matrix+r-project I had assumed that the first hit I got: https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html ... would be the first hit anybody got, but that's not necessarily true now and especially for the future. And further searching within the results produced this more recent Maechler posting: https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html For the Mac users, there appears to be no binary, but the source compiles without error on a 64-bit version of R 2.10.0: install.packages(expm,repos=http://R-Forge.R-project.org;, type=source) #The suggested code throws an error, so my very minor revision would be: library(expm) ?%^% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.eduUC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901 David Winsemius, MD Heritage Laboratories West Hartford, CT Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:cbe...@tajo.ucsd.edu UC San Diego
Re: [R] underflow of fisher.test result
On Tue, Oct 20, 2009 at 8:14 AM, Ted Harding ted.hard...@manchester.ac.uk wrote: On 20-Oct-09 13:34:49, Peng Yu wrote: fisher.test() gives a very small p-value, which is underflow on my machine. However, the log of it should not be underflow. I'm wondering if there is a way get log() of a small p-value. An approximation is acceptable in this case. Thank you! fisher.test(rbind(c(1,10),c(10,1000)))$p.value [1] 0 I have not attempted an exact approach (though may do so later), but the P-value in question is about 1e-15000 so (using log to base e) log(P) approx = -33000 In such a case, P=0 is a pretty good approximation! Which prompts the question: Why the interest in having the value of such a very small number? Is there any existing method that could return a log() of a very small p-value? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Collapse factor levels
I'm sure this is simple enough, but an R site search on my subject terms did suggest a solution. I have a numeric vector with many values that I wish to create a factor from having only a few levels. Here is a toy example. x - 1:10 x - factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C)) x [1] A A A B B B C C C C Levels: A A A B B B C C C C summary(x) A A A B B B C C C C 3 0 0 3 0 0 4 0 0 0 So, there are clearly still 10 underlying levels. The results I would like to see from printing the value and summary(x) are: x [1] A A A B B B C C C C Levels: A B C summary(x) A B C 3 3 4 Hopefully this makes sense. Thanks, Kevin -- Kevin E. Thorpe Biostatistician/Trialist, Knowledge Translation Program Assistant Professor, Dalla Lana School of Public Health University of Toronto email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Collapse factor levels
Hi Kevin, Here are two suggestions: # Combination of levels() and table() table(levels(x)) # A B C # 3 3 4 # Or defining a function mysummary - function(x) table(levels(x)) # you can easily improve it :-) mysummary(x) # A B C # 3 3 4 HTH, Jorge On Sun, Nov 1, 2009 at 3:51 PM, Kevin E. Thorpe wrote: I'm sure this is simple enough, but an R site search on my subject terms did suggest a solution. I have a numeric vector with many values that I wish to create a factor from having only a few levels. Here is a toy example. x - 1:10 x - factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C)) x [1] A A A B B B C C C C Levels: A A A B B B C C C C summary(x) A A A B B B C C C C 3 0 0 3 0 0 4 0 0 0 So, there are clearly still 10 underlying levels. The results I would like to see from printing the value and summary(x) are: x [1] A A A B B B C C C C Levels: A B C summary(x) A B C 3 3 4 Hopefully this makes sense. Thanks, Kevin -- Kevin E. Thorpe Biostatistician/Trialist, Knowledge Translation Program Assistant Professor, Dalla Lana School of Public Health University of Toronto email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoiding loop
Thank you all. What Chuck has suggested might not be applicable since the number of different times is around 40,000. The object of optimization in my function is the varying value, which is basically data * parameter, of which parameter is the object of optimization.. And from the r profiling with a subset of data, I got this report..any idea what Anonymous is? $by.total total.time total.pct self.time self.pct Anonymous 571.56 100.0 0.02 0.0 optim 571.56 100.0 0.00 0.0 fn571.54 100.0 0.98 0.2 eval 423.74 74.1 0.00 0.0 with.default 423.74 74.1 0.00 0.0 with 423.74 74.1 0.00 0.0 tapply414.28 72.5 13.84 2.4 lapply255.48 44.7 76.94 13.5 factor127.68 22.3 11.08 1.9 unlist120.54 21.1 80.46 14.1 FUN94.16 16.5 94.16 16.5 . . . . . Date: Sun, 1 Nov 2009 15:35:41 -0400 Subject: Re: [R] avoiding loop From: jholt...@gmail.com To: bbom...@hotmail.com CC: dwinsem...@comcast.net; d.rizopou...@erasmusmc.nl; r-help@r-project.org What you need to do is to understand how to use Rprof so that you can determine where the time is being spent. It probably indicates that this is not the source of slowness in your optimization function. How much time are we talking about? You may spent more time trying to optimize the function than just running the current version even if it is slow (slow is a relative term and does not hold much meaning without some context round it). On Sat, Oct 31, 2009 at 11:36 PM, parkbomee bbom...@hotmail.com wrote: Thank you both. However, using tapply() instead of a loop does not seem to improve my code much. I am using this inside of an optimization function, and it still takes more than it needs... CC: bbom...@hotmail.com; r-help@r-project.org From: dwinsem...@comcast.net To: d.rizopou...@erasmusmc.nl Subject: Re: [R] avoiding loop Date: Sat, 31 Oct 2009 22:26:17 -0400 This is pretty much equivalent: tapply(DF$value[DF$choice==1], DF$time[DF$choice==1], sum) / tapply(DF$value, DF$time, sum) And both will probably fail if the number of groups with choice==1 is different than the number overall. -- David. On Oct 31, 2009, at 5:14 PM, Dimitris Rizopoulos wrote: one approach is the following: # say 'DF' is your data frame, then with(DF, { ind - choice == 1 n - tapply(value[ind], time[ind], sum) d - tapply(value, time, sum) n / d }) I hope it helps. Best, Dimitris parkbomee wrote: Hi all, I am trying to figure out a way to improve my code's efficiency by avoiding the use of loop. I want to calculate a conditional mean(?) given time. For example, from the data below, I want to calculate sum((value| choice==1)/sum(value)) across time. Is there a way to do it without using a loop? time cum_time choicevalue 1 4 1 3 1 4 0 2 1 4 0 3 1 4 0 3 2 6 1 4 2 6 0 4 2 6 0 2 2 6 0 4 2 6 0 2 2 6 0 2 3 4 1 2 3 4 0 3 3 4 0 5 3 4 0 2 My code looks like objective[1] = value[1] / sum(value[1:cum_time[1]) for (i in 2:max(time)){ objective[i] = value[cum_time[i-1]+1] / sum(value[(cum_time[i-1]+1) : cum_time[i])]) } sum(objective) Anyone have an idea that I can do this without using a loop?? Thanks. _ [[elided Hotmail spam]] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus University Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide
Re: [R] Collapse factor levels
On Nov 1, 2009, at 3:51 PM, Kevin E. Thorpe wrote: I'm sure this is simple enough, but an R site search on my subject terms did suggest a solution. I have a numeric vector with many values that I wish to create a factor from having only a few levels. Here is a toy example. x - 1:10 x - factor (x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C)) You have thusly created a pathological situation. In 2.10.0 this is what you might see: x - factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C)) Warning message: In `levels-`(`*tmp*`, value = c(A, A, A, B, B, B, C, : duplicated levels will not be allowed in factors anymore What you _should_ have done was: x2 - factor(c(A,A,A,B,B,B,C,C,C,C)) The usual approach to getting rid of unused factor levels is just to apply the function factor() again without additional arguments. x - factor(x) # the x was from your code Warning message: In `levels-`(`*tmp*`, value = c(A, A, A, B, B, B, C, : duplicated levels will not be allowed in factors anymore # but that will be the last time you will see the warning.. summary(x) A B C 3 3 4 -- David. x [1] A A A B B B C C C C Levels: A A A B B B C C C C summary(x) A A A B B B C C C C 3 0 0 3 0 0 4 0 0 0 So, there are clearly still 10 underlying levels. The results I would like to see from printing the value and summary(x) are: x [1] A A A B B B C C C C Levels: A B C summary(x) A B C 3 3 4 Hopefully this makes sense. Thanks, Kevin -- Kevin E. Thorpe Biostatistician/Trialist, Knowledge Translation Program Assistant Professor, Dalla Lana School of Public Health University of Toronto email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Collapse factor levels
Kevin E. Thorpe wrote: I'm sure this is simple enough, but an R site search on my subject terms did suggest a solution. I have a numeric vector with many values that I wish to create a factor from having only a few levels. Here is a toy example. x - 1:10 x - factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C)) x [1] A A A B B B C C C C Levels: A A A B B B C C C C summary(x) A A A B B B C C C C 3 0 0 3 0 0 4 0 0 0 So, there are clearly still 10 underlying levels. The results I would like to see from printing the value and summary(x) are: x [1] A A A B B B C C C C Levels: A B C summary(x) A B C 3 3 4 Hopefully this makes sense. Thanks, Kevin It's an anomaly inherited frokm S-PLUS (or so I have been told). Actually, with the current R, you should get a warning: x - 1:10 x - factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C)) Warning message: In `levels-`(`*tmp*`, value = c(A, A, A, B, B, B, C, : duplicated levels will not be allowed in factors anymore This works (as documented on the help page for levels!): x - 1:10 x - factor(x,levels=1:10) levels(x) - c(A,A,A,B,B,B,C,C,C,C) table(x) x A B C 3 3 4 -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Collapse factor levels
Peter Dalgaard wrote: Kevin E. Thorpe wrote: I'm sure this is simple enough, but an R site search on my subject terms did suggest a solution. I have a numeric vector with many values that I wish to create a factor from having only a few levels. Here is a toy example. x - 1:10 x - factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C)) x [1] A A A B B B C C C C Levels: A A A B B B C C C C summary(x) A A A B B B C C C C 3 0 0 3 0 0 4 0 0 0 So, there are clearly still 10 underlying levels. The results I would like to see from printing the value and summary(x) are: x [1] A A A B B B C C C C Levels: A B C summary(x) A B C 3 3 4 Hopefully this makes sense. Thanks, Kevin It's an anomaly inherited frokm S-PLUS (or so I have been told). Actually, with the current R, you should get a warning: x - 1:10 x - factor(x,levels=1:10,labels=c(A,A,A,B,B,B,C,C,C,C)) Warning message: In `levels-`(`*tmp*`, value = c(A, A, A, B, B, B, C, : duplicated levels will not be allowed in factors anymore This works (as documented on the help page for levels!): x - 1:10 x - factor(x,levels=1:10) levels(x) - c(A,A,A,B,B,B,C,C,C,C) table(x) x A B C 3 3 4 Thanks. That's exactly what I need. I knew it was simple. I've even used levels() before, but it just didn't occur to me this time. I'm clearly not on current R. :-) When I have some time, I'll upgrade. Kevin -- Kevin E. Thorpe Biostatistician/Trialist, Knowledge Translation Program Assistant Professor, Dalla Lana School of Public Health University of Toronto email: kevin.tho...@utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] intigrate function and absolute error
You just need to extract the list component named value: I1$value Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Laila Alkhalfan laila...@gmail.com Date: Sunday, November 1, 2009 3:03 pm Subject: [R] intigrate function and absolute error To: r-help@r-project.org Hi Can we get the result of an intigration without the absolute error? for example f1-function(x1){(1/gamma(alpha))*x1^(alpha-1)*exp(-x1)*log(x1)} I1-integrate(f1, 0, (max(cc)-tau1+(theta2/theta1)*tau1)/theta2) I1 0.08007414 with absolute error 7.2e-05 I need the answer 0.08007414 withou the other part(with absolute error 7.2e-05) how can we do that? thank you and take care Laila [[alternative HTML version deleted]] __ R-help@r-project.org mailing list PLEASE do read the posting guide and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] polar.plot
Jim and John, Thanks for your replies. I ended up using both suggestions to plot the full tide series and then overlay the averages for rise and fall in different colours, which illustrated very well how such parameters can be misleadings in some cases. Regards, Tony 2009/10/31 Jim Lemon j...@bitwrit.com.au On 10/31/2009 04:49 AM, Tony Greig wrote: Hi, Two questions: 1 - Say I have average speed and directions for tide and I would like to plot them on a polar plot, but with different colors so I can indicate the two directions. I'm using polar.plot from the plotrix library. How can I add a second b and dir.b series to a polar.plot? library(plotrix) a = 3 dir.a = 85 b = 4 dir.b = 250 polar.plot(a, dir.a, start = 90, clockwise = T, show.grid.labels = T, radial.lim=c(0,5), line.col=2, lwd=2) 2 - Which parameter in polar.plot can I use to set the orientation for the grid labels which seem to default at 90 in my example above? Hi Tony, The first one is easy: polar.plot(c(a,b), c(dir.a,dir.b), start = 90, clockwise = T, show.grid.labels=FALSE, radial.lim=c(0,5), line.col=2, lwd=2) I have had one other person as about an add option, and I might include that in a future version. The second one is a bit harder. You probably noticed that I changed the show.grid.labels argument to FALSE in the above. par(xpd=TRUE) boxed.labels(rep(0,5),1:5,1:5,border=NA) par(xpd=FALSE) This will put the labels vertically up from the center. Doing fancier things like having the labels at an angle would require calculating the positions, which isn't too hard. Jim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] matrix^(-1/2)
On Nov 1, 2009, at 3:12 PM, Charles C. Berry wrote: On Sun, 1 Nov 2009, David Winsemius wrote: On Nov 1, 2009, at 1:46 PM, spencerg wrote: Hi, Chuck: Thanks very much, but why do I get package 'expm' is not available from install.packages(expm,repos=http://R-Forge.R- project.org)? In my case I think it was it is because there is no 2.10 branch to either the: snipped details So I wonder if the package installers' expectations for the r-forge repository are matching up with the tree structures. Right. FWIW, the source install works OK on my linux box: sessionInfo() R version 2.10.0 (2009-10-26) x86_64-pc-linux-gnu Yes, as I said, I had already succeeded in a source compilation using: install.packages(expm,repos=http://R-Forge.R-project.org;, type=source) (I am not quite sure why R was able to navigate the tree, but suspect my issues may stem from generally using the 64-bit Mac version of R.) I should also note that the matpow or %^% functions in expm would not address the OP's question since they require that the exponent be positive. Roger that. If solve(chol(A)) isn't good enough a symmetric inverse square root is available from expm as 'solve( sqrtm( A ) )' I (as a noobisher matrix mechanic) have discovered that the symmetric part is essential. Experiments with non-symmetric examples have come to a singularly bad end. The OP did offer a symmetric example, so he probably was more matrix-savvy than I. I am not sufficiently knowledgeable to design the error checking. I think a useful addition to %^% would be properly designed negative fractional matrix powers with more informative error messages for the matrix klutzes among us. I hope I am correct in thinking that M %^% (-1/integer) has meaning for integers other than 2, correct? And noticing that the current version of %^% rounds the exponent, I think that raises the question (given that the matrix argument were symmetric), would one round the 1/ negative-exponent? And if so, Up or down? -- David. Chuck -- David. Best Wishes, Spencer Graves Charles C. Berry wrote: On Sun, 1 Nov 2009, spencerg wrote: A question, a comment, and an alternative answer to matrix^(-1/2): QUESTION: What's the status of the expm package, mentioned in the email you cited from Martin Maechler, dated Apr 5 19:52:09 CEST 2008? I tried both install.packages('expm') and install.packages(expm,repos=http://R-Forge.R-project.org;), and got package 'expm' is not available in both cases. Try http://r-forge.r-project.org/projects/expm/ HTH, Chuck COMMENT: The solution proposed by Venables rests on Sylvester's matrix theorem, which essentially says that if a matrix A is diagonalizable with eigenvalue decomposition eigA - eigen(A) and f: D → C with D ⊂ C be a function for which f(A) is well defined (http://en.wikipedia.org/wiki/Sylvester%27s_matrix_theorem ), then f(A) = with(eigA, vectors %*% diag(f(values)) %*% solve(vectors)). Maechler and others have noted that this can be one of the least accurate and most computationally expensive ways to compute f(A). ALTERNATIVE ANSWER: For A^(-1/2), if A is symmetric and nonnegative definite, then solve(chol(A)) would be a very good way to compute it. Hope this helps, Spencer David Winsemius wrote: On Oct 31, 2009, at 9:33 PM, David Winsemius wrote: On Oct 31, 2009, at 4:39 PM, Kajan Saied wrote: Dear R-Help Team, as a R novice I have a (maybe for you very simple question), how do Iget the following solved in R: Let R be a n x n matrix: \mid R\mid^{-\frac{1}{2}} solve(A) gives me the inverse of the matrix R, however not the ^(-1/2)of the matrix... GIYF: (and Bill Venables if friendly, too.) http://www.lmgtfy.com/?q=powers+of+matrix+r-project I had assumed that the first hit I got: https://stat.ethz.ch/pipermail/r-help/2008-April/160662.html ... would be the first hit anybody got, but that's not necessarilytrue now and especially for the future. And further searching within the results produced this more recent Maechler posting: https://stat.ethz.ch/pipermail/r-devel/2008-April/048969.html For the Mac users, there appears to be no binary, but the sourcecompiles without error on a 64-bit version of R 2.10.0: install.packages(expm,repos=http://R-Forge.R-project.org , type=source) #The suggested code throws an error, so my very minor revision wouldbe: library(expm) ?%^% __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained,
[R] Time Series methods
Hello, I have a quick question about time series methodology. If I want to display a boxplot of time series data, sorted by period, I can type: boxplot(data ~ cycle(data)); where data is of class ts Is there a similar method for calculating, say, the median value of each time step within the series? (So for a monthly data set, calculate median for all Januarys, all Februarys, all Marchs, etc.) Thanks, Tim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Time Series methods
introduce a factor variable with the months and then use tapply? Kjetil On Sun, Nov 1, 2009 at 9:07 PM, Tim Bean timb...@gmail.com wrote: Hello, I have a quick question about time series methodology. If I want to display a boxplot of time series data, sorted by period, I can type: boxplot(data ~ cycle(data)); where data is of class ts Is there a similar method for calculating, say, the median value of each time step within the series? (So for a monthly data set, calculate median for all Januarys, all Februarys, all Marchs, etc.) Thanks, Tim [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] convert list to numeric
I would like to preface this by saying that I am new to R, so I would ask that you be patient and thorough, so that I'm not completely clueless. I am trying to convert a list to numeric so that I can perform computations on it (specifically mean-center the variable), but I am running into problems. I have imported the data set into task (data frame). The data frame is made of factors with variable names in the first row. I am running a loop to set a variable equal to a column in the data frame. Here is an example of my problem: for (i in 1:dim(task)[2]){ predictor.loop - c(task[i]) predictor.loop.mc - predictor.loop - mean(predictor.loop, na.rm=T) } I get the following error: Error in predictor.loop - mean(predictor.loop, na.rm = T) : non-numeric argument to binary operator In addition: Warning message: In mean.default(predictor.loop, na.rm = T) : argument is not numeric or logical: returning NA The column is entirely made up of numerical data, except for the header, which is a string. My problem is that I receive an error because the predictor.loop variable is not numerical, so I need to find a way to convert it. I tried using: predictor.loop - c(as.numeric(task[i])) But I get the following error: Error: (list) object cannot be coerced to type 'double' If I call the variable, I can assign it to a numerical list (e.g., predictor loop - task$variablename), but since I am assigning the variable in a loop, I have to find another way as the variable name would have to change in each loop iteration. Any help would be greatly appreciated. Thanks! -- View this message in context: http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Document related data sets.
The manual on Writing R Extensions says that one should ``only document a ***single*** data object per Rd file''. My recollection is that in the past this commandment (which I occasionally found to be inconvenient) was enforced. I recently saw a post to R-help which appeared to indicate that one ***can*** document multiple (presumably related) data sets in a single Rd file. I experimented and found that the construction \usage{ melvin clyde irving fred } which I recollect as *not* working, now appears to work. This in effect permits one to document (conveniently) melvin, clyde, irving, and fred in one data file. Along with, of course, having \alias{melvin} \alias{clyde} \alias{irving} \alias{fred} under the \name{ } field at the top of the file.) (a) Has the situation indeed changed, or is my ageing memory playing tricks on me again? (b) If indeed the situation has changed (or perhaps even if it hasn't) shouldn't that line ``only document a ***single*** data object per Rd file'' be deleted from ``Writing R Extensions''? cheers, Rolf Turner ## Attention:\ This e-mail message is privileged and confid...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculate Volume in a PCA
Katja Seis wrote: Hi, my data frame consist of 8 Variables and 120 000 observations. With those datas I am running a PCA and after I want to calculate the Volume of the PCA-cloud of certain subsets of my data. Does anyone have an idea about a function that can do this? Not really quite enough information here , although some googling suggests that you are looking for a technique used in MRI studies? http://www.ncbi.nlm.nih.gov/pubmed?term=%22pca%20volume%22itool=QuerySuggestion as a wild guess, I'm going to suggest that the answer **might** be that you should take your PCA standard deviations and use the formula for the volume of a (hyper)-ellipsoid [http://en.wikipedia.org/wiki/Ellipsoid#Volume]. I don't know the general formula (pi*s_1*s_2 for an ellipse, 4/3*pi*s_1*s_2*s_3 for an ellipsoid ...) but in any case the answer is in general proportional to the product of all of the standard deviations ... for a princomp() fit, prod(x$sdev). In general you can do this without calculating PCA explicitly -- if your data matrix is d, then prod(sqrt(diag(eigen(vcov(d))$values))) should do it. Of course, I haven't tested or proofread these suggestions carefully, use at your own risk ... -- View this message in context: http://old.nabble.com/Calculate-Volume-in-a-PCA-tp26154363p26155426.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] convert list to numeric
it appears that what you really want is to use: task[[i]] instead of task[i] b On Nov 1, 2009, at 11:04 PM, dadrivr wrote: I would like to preface this by saying that I am new to R, so I would ask that you be patient and thorough, so that I'm not completely clueless. I am trying to convert a list to numeric so that I can perform computations on it (specifically mean-center the variable), but I am running into problems. I have imported the data set into task (data frame). The data frame is made of factors with variable names in the first row. I am running a loop to set a variable equal to a column in the data frame. Here is an example of my problem: for (i in 1:dim(task)[2]){ predictor.loop - c(task[i]) predictor.loop.mc - predictor.loop - mean(predictor.loop, na.rm=T) } I get the following error: Error in predictor.loop - mean(predictor.loop, na.rm = T) : non-numeric argument to binary operator In addition: Warning message: In mean.default(predictor.loop, na.rm = T) : argument is not numeric or logical: returning NA The column is entirely made up of numerical data, except for the header, which is a string. My problem is that I receive an error because the predictor.loop variable is not numerical, so I need to find a way to convert it. I tried using: predictor.loop - c(as.numeric(task[i])) But I get the following error: Error: (list) object cannot be coerced to type 'double' If I call the variable, I can assign it to a numerical list (e.g., predictor loop - task$variablename), but since I am assigning the variable in a loop, I have to find another way as the variable name would have to change in each loop iteration. Any help would be greatly appreciated. Thanks! -- View this message in context: http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] convert list to numeric
Can you at least provide an 'str' of the 'task' object (not sure if it is a dataframe; you said a 'list') so that we know what it looks like. It would also be helpful if you would provide a subset of the data that we could try out a script on it. The best way would be to post the first 10 or so rows (and limit columns it more that 10 or so). A convenient way is to : dput(task[1:10,]) and then cut/paste the result into your email. On Sun, Nov 1, 2009 at 8:04 PM, dadrivr dadr...@gmail.com wrote: I would like to preface this by saying that I am new to R, so I would ask that you be patient and thorough, so that I'm not completely clueless. I am trying to convert a list to numeric so that I can perform computations on it (specifically mean-center the variable), but I am running into problems. I have imported the data set into task (data frame). The data frame is made of factors with variable names in the first row. I am running a loop to set a variable equal to a column in the data frame. Here is an example of my problem: for (i in 1:dim(task)[2]){ predictor.loop - c(task[i]) predictor.loop.mc - predictor.loop - mean(predictor.loop, na.rm=T) } I get the following error: Error in predictor.loop - mean(predictor.loop, na.rm = T) : non-numeric argument to binary operator In addition: Warning message: In mean.default(predictor.loop, na.rm = T) : argument is not numeric or logical: returning NA The column is entirely made up of numerical data, except for the header, which is a string. My problem is that I receive an error because the predictor.loop variable is not numerical, so I need to find a way to convert it. I tried using: predictor.loop - c(as.numeric(task[i])) But I get the following error: Error: (list) object cannot be coerced to type 'double' If I call the variable, I can assign it to a numerical list (e.g., predictor loop - task$variablename), but since I am assigning the variable in a loop, I have to find another way as the variable name would have to change in each loop iteration. Any help would be greatly appreciated. Thanks! -- View this message in context: http://old.nabble.com/convert-list-to-numeric-tp26155039p26155039.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] save an object by dynamicly created name
Hi, I would like to save a few dynamically created objects to disk. The following is the basic flow of the code segment for(i = 1:10) { m = i:5 save(m, file = ...) ## ??? } To distinguish different objects to be saved, I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... I tried a couple of methods on translating between object names and strings (below) but couldn't get it to work. https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html Any suggestions would be appreciated. thanks Hao -- View this message in context: http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Bagging with SVM
Dear sir, If I want to use bagging with SVM, which package should I choose?Thanks! Best wishes,Jie _ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] save an object by dynamicly created name
path - data; dir.create(path); for (i in 1:10) { m - i:5; filename - sprintf(m%02d.Rbin, i); pathname - file.path(path, filename); save(m, file=pathname); } /H On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote: Hi, I would like to save a few dynamically created objects to disk. The following is the basic flow of the code segment for(i = 1:10) { m = i:5 save(m, file = ...) ## ??? } To distinguish different objects to be saved, I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... I tried a couple of methods on translating between object names and strings (below) but couldn't get it to work. https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html Any suggestions would be appreciated. thanks Hao -- View this message in context: http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bagging with SVM
There is no package that does this, but you can do it yourself pretty easily (although I think that it is a waste of time to bag this particular model). You can use the function bagEarth in the caret package as a prototype. Max On Sun, Nov 1, 2009 at 8:32 PM, 柯洁 kejiefina...@hotmail.com wrote: Dear sir, If I want to use bagging with SVM, which package should I choose?Thanks! Best wishes,Jie _ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Max __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] save an object by dynamicly created name
On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote: path - data; dir.create(path); for (i in 1:10) { m - i:5; filename - sprintf(m%02d.Rbin, i); pathname - file.path(path, filename); save(m, file=pathname); } That would result in each of the ten files containing an object with the same name == m. (Also on my system R data files have type Rdta.) So I thought what was requested might have been a slight mod: path - ~/; dir.create(path); for (i in 1:10) { assign( paste(m, i, sep=), i:5) filename - sprintf(m%02d.Rdta, i) pathname - file.path(path, filename) obj =get(paste(m, i, sep=)) save(obj, file=pathname) } -- David. /H On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote: Hi, I would like to save a few dynamically created objects to disk. The following is the basic flow of the code segment for(i = 1:10) { m = i:5 save(m, file = ...) ## ??? } To distinguish different objects to be saved, I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... I tried a couple of methods on translating between object names and strings (below) but couldn't get it to work. https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html Any suggestions would be appreciated. thanks Hao -- View this message in context: http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to execute a funcition which name is stored in a string?
Hi, everybody Is there any way to execute a function, which name is stored in a string. such as: a - ls() foo(a) ## same as ls() itself. Or, to execute a R command, which is stored in a string such as: a - m1 - matrix(1:9,3,3) foo(a) ## same as the assignment itself __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using exists with coef from an arima fit
Dear R People: I have the output from an arima model fit in an object xxx. I want to verify that the ma1 coefficient is there, so I did the following: xxx$coef ar1ar2ma1 intercept 1.3841297 -0.4985667 -0.996 -0.1091657 str(xxx$coef) Named num [1:4] 1.384 -0.499 -1 -0.109 - attr(*, names)= chr [1:4] ar1 ar2 ma1 intercept exists('xxx$coef[ma1]') [1] FALSE Other than using xxx$coef[3], is there another way to check this, please? Thanks in advance, Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] save an object by dynamicly created name
On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius dwinsem...@comcast.net wrote: On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote: path - data; dir.create(path); for (i in 1:10) { m - i:5; filename - sprintf(m%02d.Rbin, i); pathname - file.path(path, filename); save(m, file=pathname); } That would result in each of the ten files containing an object with the same name == m. (Also on my system R data files have type Rdta.) So I thought what was requested might have been a slight mod: path - ~/; dir.create(path); for (i in 1:10) { assign( paste(m, i, sep=), i:5) filename - sprintf(m%02d.Rdta, i) pathname - file.path(path, filename) obj =get(paste(m, i, sep=)) save(obj, file=pathname) } Then a more convenient solution is to use saveObject() and loadObject() of R.utils. saveObject() does not save the name of the object save. If you want to save multiple objects, the wrap them up in a list. loadObject() does not assign variable, but instead return them. Example: library(R.utils); x - list(a=1,b=LETTERS,c=Sys.time()); saveObject(x, file=foo.Rbin); y - loadObject(foo.Rbin); stopifnot(identical(x,y)); So, for the original example, I'd recommend: library(R.utils); path - data; mkdirs(path); for (i in 1:10) { m - i:5; filename - sprintf(m%02d.Rbin, i); saveObject(m, file=filename, path=path); } and loading the objects back as: for (i in 1:10) { filename - sprintf(m%02d.Rbin, i); m - loadObject(filename, path=path); print(m); } /Henrik -- David. /H On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote: Hi, I would like to save a few dynamically created objects to disk. The following is the basic flow of the code segment for(i = 1:10) { m = i:5 save(m, file = ...) ## ??? } To distinguish different objects to be saved, I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... I tried a couple of methods on translating between object names and strings (below) but couldn't get it to work. https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html Any suggestions would be appreciated. thanks Hao -- View this message in context: http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using exists with coef from an arima fit: please ignore
Got it! There is an xxx$model$theta that does the trick. Thanks, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to execute a funcition which name is stored in a string?
On Nov 1, 2009, at 11:07 PM, Ning Ma wrote: Hi, everybody Is there any way to execute a function, which name is stored in a string. such as: a - ls() foo(a) ## same as ls() itself. Need to leave the () off. fstr - sum eval(parse(text=fstr))(1:5) [1] 15 Or, to execute a R command, which is stored in a string such as: a - m1 - matrix(1:9,3,3) foo(a) ## same as the assignment itself eval(parse(text=m1 - matrix(1:9,3,3))) m1 [,1] [,2] [,3] [1,]147 [2,]258 [3,]369 -- David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] unusual result with any
Dear R People: I am working with a numeric vector and trying to use the any function. However, I'm getting some odd results, as shown below: xy [1] 0.7305081 2.4224211 str(xy) num [1:2] 0.73 2.42 any(xy) 1 [1] FALSE Warning message: In any(xy) : coercing argument of type 'double' to logical What am I doing wrong please? Thanks in advance, Sincerely, Erin -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] unusual result with any :please ignore
Dear R People: Sorry to be so pesky tonight. I figured it out. -- Erin Hodgess Associate Professor Department of Computer and Mathematical Sciences University of Houston - Downtown mailto: erinm.hodg...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unusual result with any
Erin Hodgess schrieb: xy [1] 0.7305081 2.4224211 str(xy) num [1:2] 0.73 2.42 any(xy) 1 [1] FALSE Warning message: In any(xy) : coercing argument of type 'double' to logical What am I doing wrong please? xy 1 should return TRUE FALSE, and you want to apply any() to that. Thus: any(xy 1) any(xy) returns TRUE, as the nonzero numbers are coerced to TRUE When TRUE is compared with 1, it is coerced to a number (no warning is issued here), namely 1. 1 1 returns FALSE. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] save an object by dynamicly created name
On Nov 1, 2009, at 11:28 PM, Henrik Bengtsson wrote: On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius dwinsem...@comcast.net wrote: On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote: path - data; dir.create(path); for (i in 1:10) { m - i:5; filename - sprintf(m%02d.Rbin, i); pathname - file.path(path, filename); save(m, file=pathname); } That would result in each of the ten files containing an object with the same name == m. (Also on my system R data files have type Rdta.) So I thought what was requested might have been a slight mod: path - ~/; dir.create(path); for (i in 1:10) { assign( paste(m, i, sep=), i:5) filename - sprintf(m%02d.Rdta, i) pathname - file.path(path, filename) obj =get(paste(m, i, sep=)) save(obj, file=pathname) } Then a more convenient solution is to use saveObject() and loadObject() of R.utils. saveObject() does not save the name of the object save. The OP asked for this outcome : I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... If you want to save multiple objects, the wrap them up in a list. I agree that a list would makes sense if it were to be stored in one file , although it was not what requested. But wouldn't that require assign()-ing a name before list()-wrapping? I suppose we ought to mention that the use of assign to create a variable is a FAQ ... 7.21? Yep, I have now referred to it a sufficient number of times to refer to it by number. http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f -- David loadObject() does not assign variable, but instead return them. Example: library(R.utils); x - list(a=1,b=LETTERS,c=Sys.time()); saveObject(x, file=foo.Rbin); y - loadObject(foo.Rbin); stopifnot(identical(x,y)); So, for the original example, I'd recommend: library(R.utils); path - data; mkdirs(path); for (i in 1:10) { m - i:5; filename - sprintf(m%02d.Rbin, i); saveObject(m, file=filename, path=path); } and loading the objects back as: for (i in 1:10) { filename - sprintf(m%02d.Rbin, i); m - loadObject(filename, path=path); print(m); } /Henrik -- David. /H On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote: Hi, I would like to save a few dynamically created objects to disk. The following is the basic flow of the code segment for(i = 1:10) { m = i:5 save(m, file = ...) ## ??? } To distinguish different objects to be saved, I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... I tried a couple of methods on translating between object names and strings (below) but couldn't get it to work. https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html Any suggestions would be appreciated. thanks Hao -- View this message in context: http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] save an object by dynamicly created name
On Sun, Nov 1, 2009 at 9:18 PM, David Winsemius dwinsem...@comcast.net wrote: On Nov 1, 2009, at 11:28 PM, Henrik Bengtsson wrote: On Sun, Nov 1, 2009 at 7:48 PM, David Winsemius dwinsem...@comcast.net wrote: On Nov 1, 2009, at 10:16 PM, Henrik Bengtsson wrote: path - data; dir.create(path); for (i in 1:10) { m - i:5; filename - sprintf(m%02d.Rbin, i); pathname - file.path(path, filename); save(m, file=pathname); } That would result in each of the ten files containing an object with the same name == m. (Also on my system R data files have type Rdta.) So I thought what was requested might have been a slight mod: path - ~/; dir.create(path); for (i in 1:10) { assign( paste(m, i, sep=), i:5) filename - sprintf(m%02d.Rdta, i) pathname - file.path(path, filename) obj =get(paste(m, i, sep=)) save(obj, file=pathname) } Then a more convenient solution is to use saveObject() and loadObject() of R.utils. saveObject() does not save the name of the object save. The OP asked for this outcome : I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... If you want to save multiple objects, the wrap them up in a list. I agree that a list would makes sense if it were to be stored in one file , although it was not what requested. That comment was not for the OP, but for saveObject()/loadObject() in general. But wouldn't that require assign()-ing a name before list()-wrapping? Nope, the whole point of using saveObject()/loadObject() is to save the objects/values without their names that you happens to choose in the current session, and to avoid overwriting existing ones in your next session. My example could also have been: library(R.utils); saveObject(list(a=1,b=LETTERS,c=Sys.time()), file=foo.Rbin); y - loadObject(foo.Rbin); z - loadObject(foo.Rbin); stopifnot(identical(y,z)); If you really want to attach the elements of the saved list, do: attachLocally(loadObject(foo.Rbin)); str(a) num 1 str(b) chr [1:26] A B C D E F G H I J ... str(c) POSIXct[1:1], format: 2009-11-01 21:30:41 I suppose we ought to mention that the use of assign to create a variable is a FAQ ... 7.21? Yep, I have now referred to it a sufficient number of times to refer to it by number. http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f My personal take on assign() and get() is that if you find yourself using them (at this level), there is a good chance there exists a better solution that you should use instead. My $.02 /H -- David loadObject() does not assign variable, but instead return them. Example: library(R.utils); x - list(a=1,b=LETTERS,c=Sys.time()); saveObject(x, file=foo.Rbin); y - loadObject(foo.Rbin); stopifnot(identical(x,y)); So, for the original example, I'd recommend: library(R.utils); path - data; mkdirs(path); for (i in 1:10) { m - i:5; filename - sprintf(m%02d.Rbin, i); saveObject(m, file=filename, path=path); } and loading the objects back as: for (i in 1:10) { filename - sprintf(m%02d.Rbin, i); m - loadObject(filename, path=path); print(m); } /Henrik -- David. /H On Sun, Nov 1, 2009 at 6:53 PM, jeffc h...@andrew.cmu.edu wrote: Hi, I would like to save a few dynamically created objects to disk. The following is the basic flow of the code segment for(i = 1:10) { m = i:5 save(m, file = ...) ## ??? } To distinguish different objects to be saved, I would like to save m as m1, m2, m3 ..., to file /home/data/m1, /home/data/m2, home/data/m3, ... I tried a couple of methods on translating between object names and strings (below) but couldn't get it to work. https://stat.ethz.ch/pipermail/r-help/2008-November/178965.html http://tolstoy.newcastle.edu.au/R/help/04/08/2673.html Any suggestions would be appreciated. thanks Hao -- View this message in context: http://old.nabble.com/save-an-object-by-dynamicly-created-name-tp26155437p26155437.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD
[R] Using processed objects as arguments of a function
Dear R users, I wish to utilise processed and saved objects as arguments of a function. Specifically, I have created objects using *assign* *paste* functions with an incremental index i, the names of the objects are: fund1, fund2, fund3,., fund80,. (where the numerical value increments according to the index i class of these objects are dataframes) I wish to collapse these objects row wisely using *rbind* function. paste(fund, 1:i, sep = ) results in list of objects as characters... get(paste(fund, 1:i, sep = )) outputs fund1... Are there any methods to use these objects as an argument of rbind to collapse the dataframes? Your expertise in resolving this issue would be highly appreciated. Steven [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoiding loop
parkbomee bbom...@hotmail.com writes: Thank you all. What Chuck has suggested might not be applicable since the number of different times is around 40,000. The object of optimization in my function is the varying value, which is basically data * parameter, of which parameter is the object of optimization.. And from the r profiling with a subset of data, I got this report..any idea what Anonymous is? $by.total total.time total.pct self.time self.pct Anonymous 571.56 100.0 0.02 0.0 optim 571.56 100.0 0.00 0.0 fn571.54 100.0 0.98 0.2 You're giving us 'by.total', so these are saying that all the time was spent in these functions or the functions they called. Probably all are in 'optim' and its arguments; since little self.time is spent here, there isn't much to work with eval 423.74 74.1 0.00 0.0 with.default 423.74 74.1 0.00 0.0 with 423.74 74.1 0.00 0.0 These are probably in the internals of optim, where the function you're trying to optimize is being set up for evaluation. Again there's little self.time, and all these say is that a big piece of the time is being spent in code called by this code. tapply414.28 72.5 13.84 2.4 lapply255.48 44.7 76.94 13.5 factor127.68 22.3 11.08 1.9 unlist120.54 21.1 80.46 14.1 FUN94.16 16.5 94.16 16.5 these look like they are tapply-related calls (looking at the code for tapply, it calls lapply, factor, and unlist, and FUN is the function argument to tapply), perhaps from the function you're optimizing (did you implement this as suggested below? it would really help to have a possibly simplified version of the code you're calling). There is material to work with here, as apparently a fairly large amount of self.time is being spent in each of these functions. So here's a sample data set n - 10 set.seed(123) df - data.frame(time=sort(as.integer(ceiling(runif(n)*n/5))), value=ceiling(runif(n)*5)) It would have been helpful for you to provide reproducible code like that above, so that the characteristics of your data were easily reproducible. Let's time tapply replicate(5, { + system.time(x0 - tapply0(df$value, df$time, sum), gcFirst=TRUE)[[1]] + }) [1] 0.316 0.316 0.308 0.320 0.304 tapply is quite general, but in your case I think you'd be happy with tapply1 - function(X, INDEX, FUN) unlist(lapply(split(X, INDEX), FUN), use.names=FALSE) replicate(5, { + system.time(x1 - tapply1(df$value, df$time, sum), gcFirst=TRUE)[[1]] + }) [1] 0.156 0.148 0.152 0.144 0.152 so about twice the speed (timing depends quite a bit on what 'time' is, integer or numeric or character or factor). The vector values of the two calculations are identical, though tapply presents the data as an array with column names identical(as.vector(x0), x1) [1] TRUE tapply allows FUN to be anything, but if the interest is in the sum of each time interval, and the time intervals can be assumed to be sorted (sorting is not expensive, so could be done on the fly), then tapply2 - function(X, INDEX) { csum - cumsum(c(0, X)) idx - diff(INDEX) != 0 csum[c(FALSE, idx, TRUE)] - csum[c(TRUE, idx, FALSE)] } calculates the cumulative sum and the points in INDEX where the time intervals change. It then takes the difference over the appropriate interval. replicate(5, { + system.time(x2 - tapply2(df$value, df$time), gcFirst=TRUE)[[1]] + }) [1] 0.024 0.024 0.024 0.024 0.024 identical(as.vector(x0), x2) [1] TRUE This approach could be subject to rounding error (if csum gets very large and the intervals remain small). To calculate values where choice == 1 I think you'd want to tapply2(df$value * (df$choice==1), df$time) rather than sub-setting, so that the result of tapply2 is always a vector of the same length even when some time intervals never have choice==1. Because tapply in these examples seems so fast compared to your calculation, I wonder whether optim is evaluating your function many times, and that reformulating the optimization might lead to a very substantial speed-up? Martin . . . . . Date: Sun, 1 Nov 2009 15:35:41 -0400 Subject: Re: [R] avoiding loop From: jholt...@gmail.com To: bbom...@hotmail.com CC: dwinsem...@comcast.net; d.rizopou...@erasmusmc.nl; r-help@r-project.org What you need to do is to understand how to use Rprof so that you can determine where the time is being spent. It probably indicates that this is not the source of slowness in your optimization function. How much time are we talking about? You may spent more time trying to optimize the function than just running
Re: [R] How to execute a funcition which name is stored in a string?
On Sun, Nov 1, 2009 at 8:07 PM, Ning Ma pnin...@gmail.com wrote: Hi, everybody Is there any way to execute a function, which name is stored in a string. such as: a - ls() foo(a) ## same as ls() itself. One way to accomplish this by using get() to search for a function that matches your string. You can assign the return value of get() to a variable which may be used like the original function object: a - 'ls' foo - get( a, mode = 'function' ) foo() [1] a foo -Charlie __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RMySQL and Stored procedures
Can someone provide me a good way to circumvent the lack of calling Stored Procedures from RMySQL? I can not rewrite those stored procedures on R because there a lot more folks here that only understands SQL. The stored procedure returns a resultset. Thanks Caveman __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to execute a funcition which name is stored in a string?
?do.call /H On Sun, Nov 1, 2009 at 10:58 PM, Charlie Sharpsteen ch...@sharpsteen.net wrote: On Sun, Nov 1, 2009 at 8:07 PM, Ning Ma pnin...@gmail.com wrote: Hi, everybody Is there any way to execute a function, which name is stored in a string. such as: a - ls() foo(a) ## same as ls() itself. One way to accomplish this by using get() to search for a function that matches your string. You can assign the return value of get() to a variable which may be used like the original function object: a - 'ls' foo - get( a, mode = 'function' ) foo() [1] a foo -Charlie __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.