[Rd] Bug in as.character? (PR#14206)
A long formula which is converted using as.character loses its last part:
``diagonal = 1e-12)''

A shorter formula is OK, though.

Best,
Håvard

Browse[2]> formula.str
y ~ -1 + b1 + b2 + b3 + b4 + b5 + b6 + b7 + b8 + b9 + b10 + b11 + b12 +
    b13 + b14 + b15 + b16 + b17 + b18 + b19 + b20 + b21 + b22 + b23 + b24 +
    b25 + b26 + b27 + b28 + b29 + b30 + b31 + b32 + b33 + b34 + b35 + b36 +
    b37 + b38 + b39 + b40 + b41 + b42 + b43 + b44 + b45 + b46 + b47 + b48 +
    b49 + elevation + f(idx, model = "sphere", sphere.dir = "global_temperature_80s",
    T.order = 2, K.order = 2, T.model = "rotsym", K.model = "rotsym",
    initial = c(-4, 1, 0), param = c(-4, 0.01, 3, 0.01, 0, 1),
    replicate = replicate, diagonal = 1e-12)

Browse[2]> formula.str[3]
-1 + b1 + b2 + b3 + b4 + b5 + b6 + b7 + b8 + b9 + b10 + b11 + b12 +
    b13 + b14 + b15 + b16 + b17 + b18 + b19 + b20 + b21 + b22 + b23 + b24 +
    b25 + b26 + b27 + b28 + b29 + b30 + b31 + b32 + b33 + b34 + b35 + b36 +
    b37 + b38 + b39 + b40 + b41 + b42 + b43 + b44 + b45 + b46 + b47 + b48 +
    b49 + elevation + f(idx, model = "sphere", sphere.dir = "global_temperature_80s",
    T.order = 2, K.order = 2, T.model = "rotsym", K.model = "rotsym",
    initial = c(-4, 1, 0), param = c(-4, 0.01, 3, 0.01, 0, 1),
    replicate = replicate, diagonal = 1e-12)()

Browse[2]> as.character(formula.str[3])
[1] "-1 + b1 + b2 + b3 + b4 + b5 + b6 + b7 + b8 + b9 + b10 + b11 + b12 + b13 + b14 + b15 + b16 + b17 + b18 + b19 + b20 + b21 + b22 + b23 + b24 + b25 + b26 + b27 + b28 + b29 + b30 + b31 + b32 + b33 + b34 + b35 + b36 + b37 + b38 + b39 + b40 + b41 + b42 + b43 + b44 + b45 + b46 + b47 + b48 + b49 + elevation + f(idx, model = \"sphere\", sphere.dir = \"global_temperature_80s\", T.order = 2, K.order = 2, T.model = \"rotsym\", K.model = \"rotsym\", initial = c(-4, 1, 0), param = c(-4, 0.01, 3, 0.01, 0, 1), replicate = replicate, "

--please do not edit the information below--

Version:
 platform = x86_64-redhat-linux-gnu
 arch = x86_64
 os = linux-gnu
 system = x86_64, linux-gnu
 status =
 major = 2
 minor = 10.1
 year = 2009
 month = 12
 day = 14
 svn rev = 50720
 language = R
 version.string = R version 2.10.1 (2009-12-14)

Locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_DK.utf8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

Search Path:
 .GlobalEnv, package:stats, package:graphics, package:grDevices,
 package:datasets, package:INLA, package:R.utils, package:R.oo,
 package:utils, package:R.methodsS3, package:methods, Autoloads,
 package:base

--
 Håvard Rue
 Department of Mathematical Sciences
 Norwegian University of Science and Technology
 N-7491 Trondheim, Norway
 Voice: +47-7359-3533    URL  : http://www.math.ntnu.no/~hrue
 Fax  : +47-7359-3524    Email: havard@math.ntnu.no

 This message was created in a Microsoft-free computing environment.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Bug in as.character? (PR#14206)
havard@math.ntnu.no wrote:
> A long formula which is converted using as.character loses its last part:
> ``diagonal = 1e-12)''
>
> A shorter formula is OK, though.

(If you have to put a ? in a bug report, ask instead!)

This is entirely consistent with help(as.character):

Note:
     'as.character' truncates components of language objects to 500
     characters (was about 70 before 1.3.1).

If you insist on working with very long formulas in their character
representation, you need to use deparse() and deal with the resulting
multi-line character vectors.

(I can't tell what you're trying to do, but update.formula() may provide
a cleaner way of modifying formulas.)

--
Peter Dalgaard, Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918   FAX: (+45) 35327907
(p.dalga...@biostat.ku.dk)
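The deparse() route mentioned in the reply might look roughly like the sketch below. The formula is built only for illustration: `b1`..`b49`, `elevation`, `idx` and `f()` are placeholders that are never evaluated, and the variable names `rhs_lines`, `rhs_text` and `form2` are made up here. deparse() returns one string per output line, so the pieces are joined afterwards; update() is shown as one possible way to edit a formula without going through text at all.

```r
# Illustrative formula only: b1..b49, elevation, idx and f() are placeholders.
terms_txt <- c("-1", paste0("b", 1:49), "elevation",
               "f(idx, model = \"sphere\", initial = c(-4, 1, 0), diagonal = 1e-12)")
form <- as.formula(paste("y ~", paste(terms_txt, collapse = " + ")))

# deparse() splits a long expression over several strings ...
rhs_lines <- deparse(form[[3]])
# ... so join them (dropping the continuation indent) to get one string.
rhs_text <- paste(trimws(rhs_lines), collapse = " ")

length(rhs_lines) > 1                                # more than one piece
grepl("diagonal = 1e-12)", rhs_text, fixed = TRUE)   # the tail is still there

# Editing the formula as a formula rather than as text:
form2 <- update(form, . ~ . - elevation)
"elevation" %in% all.vars(form2)
```

Whether the joined string or the multi-line vector is more convenient depends on what is done with it next; the sketch only shows that nothing is cut off.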
Re: [Rd] Why is there no c.factor?
On Thu, Feb 4, 2010 at 12:03 PM, Hadley Wickham had...@rice.edu wrote:
>> I'd propose the following: If the sets of levels of all arguments are
>> the same, then c.factor() would return a factor with the common set of
>> levels; if the sets of levels differ, then, as Hadley suggests, the
>> level-set of the result would be the union of sets of levels of the
>> arguments, but a warning would be issued.
>
> I like this compromise (as long as there was an argument to suppress
> the warning)

If I provided code to do this, along with the warnings for ordered
factors and using the optimisation suggested by Matthew, is there any
member of R core who would be interested in sponsoring it?

Hadley

--
http://had.co.nz/
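The compromise being discussed could be sketched along the following lines. This is not the code Hadley offered to write: the function name `concat_factors`, the `warn` argument and the wording of the warnings are all assumptions made for illustration, and the function is deliberately not called `c.factor` so that it does not interfere with `c()`.

```r
# Hypothetical helper illustrating the proposal; names and messages are made up.
concat_factors <- function(..., warn = TRUE) {
  fs <- list(...)
  stopifnot(all(vapply(fs, is.factor, logical(1))))
  lev_sets <- lapply(fs, levels)

  # Same set of levels everywhere? (order of levels is not compared)
  same <- all(vapply(lev_sets,
                     function(l) setequal(l, lev_sets[[1]]),
                     logical(1)))
  if (!same && warn)
    warning("factor levels differ; result uses the union of levels")

  # The result is a plain factor, so any ordering is lost.
  if (any(vapply(fs, is.ordered, logical(1))) && warn)
    warning("ordered factor(s) supplied; result is unordered")

  all_levels <- unique(unlist(lev_sets))
  factor(unlist(lapply(fs, as.character)), levels = all_levels)
}

a <- factor(c("x", "y"))
b <- factor(c("y", "z"))
concat_factors(a, a)                 # common levels, no warning
concat_factors(a, b, warn = FALSE)   # union of levels, warning suppressed
```

Going through as.character() keeps the sketch short at the cost of speed; it says nothing about how a base R method should be implemented.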
Re: [Rd] Why is there no c.factor?
Hadley Wickham wrote:
> On Thu, Feb 4, 2010 at 12:03 PM, Hadley Wickham had...@rice.edu wrote:
>>> I'd propose the following: If the sets of levels of all arguments are
>>> the same, then c.factor() would return a factor with the common set of
>>> levels; if the sets of levels differ, then, as Hadley suggests, the
>>> level-set of the result would be the union of sets of levels of the
>>> arguments, but a warning would be issued.
>>
>> I like this compromise (as long as there was an argument to suppress
>> the warning)
>
> If I provided code to do this, along with the warnings for ordered
> factors and using the optimisation suggested by Matthew, is there any
> member of R core would be interested in sponsoring it?
>
> Hadley

Messing with c() is a bit unattractive (I'm not too happy with the other
c methods either; normally c() strips attributes and reduces to the base
class, and those obviously do not), but a more general concat() function
has been suggested a number of times. With a suitable range of methods,
this could also be used to reimplement rbind.data.frame (which,
incidentally, already contains a method for concatenating factors, with
several ugly warts!)

--
Peter Dalgaard, Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918   FAX: (+45) 35327907
(p.dalga...@biostat.ku.dk)
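The remark about c() stripping attributes, and about existing c methods that do not, can be seen in a small example. The attribute name `units` is made up for illustration; the Date case uses the c() method for Dates that ships with base R.

```r
# A plain integer vector carrying a custom attribute.
x <- structure(1:3, units = "cm")
y <- c(x, x)
attributes(y)        # NULL: the "units" attribute has been dropped

# A class with its own c() method keeps its class.
d <- as.Date("2010-02-05")
class(c(d, d + 1))   # "Date"
```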
Re: [Rd] Why is there no c.factor?
-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Peter Dalgaard
Sent: Friday, February 05, 2010 7:41 AM
To: Hadley Wickham
Cc: John Fox; r-devel@r-project.org; Thomas Lumley
Subject: Re: [Rd] Why is there no c.factor?

> Messing with c() is a bit unattractive (I'm not too happy with the other
> c methods either; normally c() strips attributes and reduces to the base
> class, and those obviously do not), but a more general concat() function
> has been suggested a number of times. With a suitable range of methods,
> this could also be used to reimplement rbind.data.frame (which,
> incidentally, already contains a method for concatenating factors, with
> several ugly warts!)

Yes, c() should have been put on the deprecated list a couple of decades
ago, since people expect it to do too many incompatible things. And
factor should have been a virtual class, with subclasses FixedLevels
(e.g., Sex) or AdHocLevels (e.g., FamilyName), so c() and [<-() could do
the appropriate thing in either case.

Back to reality, S+ has a concat(...) function, whose comments say

   # This function works like c() except that names of arguments are
   # ignored.  That is, it concatenates its arguments into a single
   # S vector object, without considering the names of the arguments,
   # in the order that the arguments are given.
   #
   # To make this function work for new classes, it is only necessary
   # to make methods for the concat.two function, which concatenates
   # two vectors; recursion will take care of the rest.

concat() is not generic but it repeatedly calls concat.two(x,y), an
SV4-generic that dispatches on the classes of x and y. Thus you can
easily predict the class of concat(x,y,z), although it may not be the
same as the class of concat(z,y,x), given suitably bizarre methods for
concat.two().

concat() doesn't get a lot of use but I think the idea is sound. Perhaps
that model would work well for a concatenation function in R.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
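A rough R analogue of that design, as a sketch only: this is not the S+ source, and the S4 generic, its two methods and the use of Reduce() are assumptions chosen to mirror the description (a non-generic concat() that repeatedly calls a two-argument generic dispatching on both classes).

```r
library(methods)

# Two-argument generic; dispatch depends on the classes of both x and y.
setGeneric("concat.two", function(x, y) standardGeneric("concat.two"))

# Default method: fall back to c().
setMethod("concat.two", signature("ANY", "ANY"),
          function(x, y) c(x, y))

# Factor method: result carries the union of the two level sets.
setMethod("concat.two", signature("factor", "factor"),
          function(x, y) {
            lev <- union(levels(x), levels(y))
            factor(c(as.character(x), as.character(y)), levels = lev)
          })

# concat() itself is not generic: it ignores argument names and folds
# the arguments pairwise from left to right.
concat <- function(...) {
  args <- unname(list(...))
  Reduce(concat.two, args)
}

concat(a = 1, b = 2, 3)
concat(factor("a"), factor(c("b", "a")))
```

Because the fold runs left to right, the class of the result is determined by the sequence of pairwise methods, which is the predictability (and the order dependence) described above.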
Re: [Rd] Why is there no c.factor?
> c() should have been put on the deprecated list a couple of decades ago

Don't you dare!

> Back to reality

phew! had me worried there.

c() is no problem at all for lists, Dates and most simple vector types;
why deprecate something solely because it doesn't behave for something
it doesn't claim to work on?

Steve E
Re: [Rd] Why is there no c.factor?
> concat() doesn't get a lot of use

How do you know? Maybe it's used a lot but the users had no need to tell
you what they were using. The exact opposite might in fact be the case,
i.e. because concat is so good in splus, you just never hear of problems
with it from the users. That might be a very good sign.

> perhaps that model would work well for a concatenation function in R

I'd be happy to test it. I'm a bit concerned about performance though,
given what you said about repeated recursive calls, and dispatch. Could
you run the following test in s-plus please and post back the timing? If
this small 100MB example was fine, then we could proceed to a 64bit 10GB
test. This is quite nippy at the moment in R (1.1sec). I'd be happy with
a better way as long as speed wasn't compromised.

    set.seed(1)
    L = as.vector(outer(LETTERS,LETTERS,paste,sep=""))   # union set of 676 levels
    F = lapply(1:100, function(i) {                      # create 100 factors
        f = sample(1:100, 1*1024^2 / 4, replace=TRUE)    # each factor 1MB large (262144 integers), plus small amount for the levels
        levels(f) = sample(L,100)                        # pick 100 levels from the union set
        class(f) = "factor"
        f
    })

    > head(F[[1]])
    [1] RT DM CO JV BG KU
    100 Levels: YC FO PN IL CB CY HQ ...
    > head(F[[2]])
    [1] RK PD FE SG SJ CQ
    100 Levels: JV FV DX NL XB ND CY QQ ...

With c.factor from data.table, as posted, placed in .GlobalEnv

    > system.time(G <- do.call(c,F))
       user  system elapsed
       0.81    0.32    1.12
    > head(G)
    [1] RT DM CO JV BG KU     # looks right, comparing to F[[1]] above
    676 Levels: AA AB AC AD AE AF AG AH AI AJ AK AL AM AN AO AP AQ AR AS AT AU AV AW AX AY AZ BA BB BC BD BE BF ... ZZ
    > G[262145:262150]
    [1] RK PD FE SG SJ CQ     # looks right, comparing to F[[2]] above
    676 Levels: AA AB AC AD AE AF AG AH AI AJ AK AL AM AN AO AP AQ AR AS AT AU AV AW AX AY AZ BA BB BC BD BE BF ... ZZ
    > identical(as.character(G),as.character(unlist(F)))
    [1] TRUE

So I guess this would be compared to the following in splus?

    system.time(G <- do.call(concat, F))

or maybe it's just the following:

    system.time(G <- concat(F))

I don't have splus so I can't test that myself.
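The c.factor from data.table that was timed above is not shown in the thread. One way a union-of-levels concatenation can avoid converting every element to character is to remap integer codes, sketched below on a scaled-down version of the test data. The function name `concat_codes` is made up, and this is an assumption about the general technique, not a reproduction of the posted code.

```r
# Hypothetical helper: concatenate a list of factors by remapping integer codes.
concat_codes <- function(fs) {
  all_levels <- unique(unlist(lapply(fs, levels)))
  codes <- unlist(lapply(fs, function(f) {
    # Position of each of f's levels in the union, indexed by f's codes.
    match(levels(f), all_levels)[as.integer(f)]
  }))
  structure(codes, levels = all_levels, class = "factor")
}

# Scaled-down data in the style of the benchmark (10 factors of 1000 values).
set.seed(1)
L <- as.vector(outer(LETTERS, LETTERS, paste, sep = ""))
small <- lapply(1:10, function(i) {
  f <- sample(1:100, 1000, replace = TRUE)
  levels(f) <- sample(L, 100)
  class(f) <- "factor"
  f
})

G <- concat_codes(small)
identical(as.character(G), unlist(lapply(small, as.character)))
```

Only the level sets (100 strings per factor here) are matched as strings; the data themselves are handled as integers, which is where any saving would come from. Unlike the result shown in the thread, the union here is kept in order of first appearance rather than sorted.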
Re: [Rd] Why is there no c.factor?
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Matthew Dowle
Sent: Friday, February 05, 2010 11:17 AM
To: r-de...@stat.math.ethz.ch
Subject: Re: [Rd] Why is there no c.factor?

>> concat() doesn't get a lot of use
>
> How do you know? Maybe its used a lot but the users had no need to tell
> you what they were using. The exact opposite might in fact be the case
> i.e. because concat is so good in splus, you just never hear of problems
> with it from the users. That might be a very good sign.

We don't use concat in many of our functions. It tends to be used only
where c fails. It is slower than c(), in part because it is an SV4
generic while c is a .Internal (the fastest S+ interface to C code).
concat() is also written entirely in S code, with calls to heavyweights
like sapply. Writing it in C would speed it up a lot.

    > sys.time(for(i in 1:1)c(1,2))
    [1] 0.27 0.27
    > sys.time(for(i in 1:1)concat(1,2))
    [1] 20.29 20.29
    > sys.time(for(i in 1:1)concat.two(1,2))
    [1] 0.52 0.52

The last just calls the default method of concat.two, which is a call to
c().
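The performance concern about repeated pairwise calls can be explored in R with a small experiment. This sketch uses plain integer vectors and base c() only; `pieces` and the repetition counts are arbitrary choices, and no particular timings are claimed since they depend on the machine and R version.

```r
# 200 integer vectors of length 1000 each (arbitrary sizes for illustration).
pieces <- lapply(1:200, function(i) seq_len(1000) + i)

# One call that sees all arguments at once ...
all_at_once <- do.call(c, pieces)
# ... versus a left-to-right pairwise fold, which builds a growing
# intermediate result at every step.
pairwise <- Reduce(c, pieces)

identical(all_at_once, pairwise)   # same answer either way

# Compare the cost of the two strategies on this machine.
system.time(for (k in 1:20) do.call(c, pieces))
system.time(for (k in 1:20) Reduce(c, pieces))
```

The pairwise fold copies the accumulated result each time it grows, so its cost would be expected to rise faster with the number of pieces; method dispatch on every pair, as in the concat.two design, would add to that.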