Herve (et al.),

On Fri, May 22, 2020 at 3:16 PM Hervé Pagès <hpa...@fredhutch.org> wrote:

> Gabe,
>
> It's the current behavior of paste() that is a major source of bugs:
>
>   ## Add "rs" prefix to SNP ids and collapse them in a
>   ## comma-separated string.
>   collapse_snp_ids <- function(snp_ids)
>       paste("rs", snp_ids, sep="", collapse=",")
>
>   snp_groups <- list(
>       group1=c(55, 22, 200),
>       group2=integer(0),
>       group3=c(99, 550)
>   )
>
>   vapply(snp_groups, collapse_snp_ids, character(1))
>   #            group1            group2            group3
>   # "rs55,rs22,rs200"              "rs"      "rs99,rs550"
>
> This has hit me so many times!
>
> Now with 'recycle0=TRUE', we finally have the opportunity to make it do
> the right thing. Let's not miss that opportunity.
>

I see what you're saying, but I don't know. Maybe my intuition is just
different, but when I collapse multiple character vectors together, I expect
all the characters from each of those vectors to be in the resulting
collapsed one. In your example it's a string literal to be added elementwise
as a prefix, but what if it is another vector of length > 1? Wouldn't it be
strange that all those values are wiped and absent from the resulting string?

Maybe it's just me. For example, with
paste(x, y, z, sep = "", collapse = ", ", recycle0 = TRUE), if length(y) is 0,
it literally makes no difference what x and z are.

I seem to be largely outvoted anyway, so we will see what Martin and others
who may pop up might think, but I raised the points I wanted to raise, so
we'll see where things ultimately fall.

~G

>
> Cheers,
> H.
>
>
> On 5/22/20 11:26, Gabriel Becker wrote:
> > I understand that this is consistent but it also strikes me as an
> > enormous 'gotcha' of a magnitude that 'we' are trying to avoid/smooth
> > over at this point in user-facing R space.
> >
> > For the record I'm not suggesting it should return something other than
> > "", and in particular I'm not arguing that any call to paste /that does
> > not return an error/ with non-NULL collapse should return a character
> > vector of length one.
> >
> > Rather I'm pointing out that it could (perhaps should, imo) simply be an
> > error, which is also consistent, in the strict sense, with
> > previous behavior in that it is the developer simply declining to extend
> > the recycle0 argument to the full parameter space (there is no rule that
> > says we must do so; arguments whose use is incompatible with other
> > arguments can be reasonable and called for).
> >
> > I don't feel super strongly that returning "" in this and similar
> > cases is horrible and should never happen, but I'd bet dollars to donuts
> > that to the extent that behavior occurs it will be a disproportionately
> > major source of bugs, and I think that's at least worth considering in
> > addition to pure consistency.
> >
> > ~G
> >
> > On Fri, May 22, 2020 at 9:50 AM William Dunlap <wdun...@tibco.com> wrote:
> >
> > I agree with Herve, processing collapse happens last so
> > collapse=non-NULL always leads to a single character string being
> > returned, the same as paste(collapse=""). See the altPaste function
> > I posted yesterday.
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
> >
> > On Fri, May 22, 2020 at 9:12 AM Hervé Pagès <hpa...@fredhutch.org> wrote:
> >
> > I think that
> >
> >   paste(c("a", "b"), NULL, c("c", "d"), sep = " ", collapse = ",",
> >         recycle0=TRUE)
> >
> > should just return an empty string and don't see why it needs to emit a
> > warning or raise an error. To me it does exactly what the user is asking
> > for, which is to change how the 3 arguments are recycled **before** the
> > 'sep' operation.
> >
> > The 'recycle0' argument has no business in the 'collapse' operation
> > (which comes after the 'sep' operation): this operation still behaves
> > like it always had.
> >
> > That's all there is to it.
> >
> > H.
> >
> >
> > On 5/22/20 03:00, Gabriel Becker wrote:
> > > Hi Martin et al,
> > >
> > >
> > >
> > > On Thu, May 21, 2020 at 9:42 AM Martin Maechler
> > > <maech...@stat.math.ethz.ch> wrote:
> > >
> > > >>>>> Hervé Pagès
> > > >>>>>     on Fri, 15 May 2020 13:44:28 -0700 writes:
> > >
> > > > There is still the situation where **both** 'sep' and 'collapse' are
> > > > specified:
> > >
> > > >> paste(integer(0), "nth", sep="", collapse=",")
> > > > [1] "nth"
> > >
> > > > In that case 'recycle0' should **not** be ignored i.e.
> > >
> > > > paste(integer(0), "nth", sep="", collapse=",", recycle0=TRUE)
> > >
> > > > should return the empty string (and not character(0) like it does at the
> > > > moment).
> > >
> > > > In other words, 'recycle0' should only control the first operation (the
> > > > operation controlled by 'sep'). Which makes plenty of sense: the 1st
> > > > operation is binary (or n-ary) while the collapse operation is unary.
> > > > There is no concept of recycling in the context of unary operations.
> > >
> > > Interesting, ..., and sounding somewhat convincing.
> > >
> > > > On 5/15/20 11:25, Gabriel Becker wrote:
> > > >> Hi all,
> > > >>
> > > >> This makes sense to me, but I would think that recycle0 and collapse
> > > >> should actually be incompatible and paste should throw an error if
> > > >> recycle0 were TRUE and collapse were declared in the same call. I don't
> > > >> think the value of recycle0 should be silently ignored if it is actively
> > > >> specified.
> > > >>
> > > >> ~G
> > >
> > > Just to summarize what I think we should know and agree (or be
> > > "disproven") and where this comes from ...
> > >
> > > 1) recycle0 is a new R 4.0.0 option in paste() / paste0() which by default
> > >    (recycle0 = FALSE) should (and *does* AFAIK) not change anything,
> > >    hence paste() / paste0() behave completely back-compatible
> > >    if recycle0 is kept to FALSE.
> > >
> > > 2) recycle0 = TRUE is meant to give different behavior, notably
> > >    0-length arguments (among '...') should result in 0-length results.
> > >
> > >    The above does not specify what this means in detail, see 3)
> > >
> > > 3) The current R 4.0.0 implementation (for which I'm primarily responsible)
> > >    and help(paste) are in accordance.
> > >    Notably the help page (Arguments -> 'recycle0' ; Details 1st para ; Examples)
> > >    says and shows how the 4.0.0 implementation has been meant to work.
> > >
> > > 4) Several provenly smart members of the R community argue that
> > >    both the implementation and the documentation of 'recycle0 =
> > >    TRUE' should be changed to be more logical / coherent / sensical ..
> > >
> > > Is the above all correct in your view?
> > >
> > > Assuming yes, I read basically two proposals, both agreeing
> > > that recycle0 = TRUE should only ever apply to the action of 'sep'
> > > but not the action of 'collapse'.
> > >
> > > 1) Bill and Hervé (I think) propose that 'recycle0' should have
> > >    no effect whenever 'collapse = <string>'
> > >
> > > 2) Gabe proposes that 'collapse = <string>' and 'recycle0 = TRUE'
> > >    should be declared incompatible and error. If going in that
> > >    direction, I could also see them to give a warning (and
> > >    continue as if recycle0 = FALSE).
> > >
> > >
> > > Herve makes a good point about when sep and collapse are both set. That
> > > said, if the user explicitly sets recycle0, personally, I don't think it
> > > should be silently ignored under any configuration of other arguments.
> > >
> > > If all of the arguments are to go into effect, the question then becomes
> > > one of ordering, I think.
> > >
> > > Consider
> > >
> > >   paste(c("a", "b"), NULL, c("c", "d"), sep = " ", collapse = ",",
> > >         recycle0=TRUE)
> > >
> > > Currently that returns character(0), because the logic is
> > > essentially (in pseudo-code)
> > >
> > >   collapse(paste(c("a", "b"), NULL, c("c", "d"), sep = " ",
> > >   recycle0=TRUE), collapse = ", ", recycle0=TRUE)
> > >
> > >   -> collapse(character(0), collapse = ", ", recycle0=TRUE)
> > >
> > >   -> character(0)
> > >
> > > Now Bill Dunlap argued, fairly convincingly I think, that paste(...,
> > > collapse=<string>) should /always/ return a character vector of length
> > > exactly one. With recycle0, though, it will return "" via the progression
> > >
> > >   paste(c("a", "b"), NULL, c("c", "d"), sep = " ", collapse = ",",
> > >         recycle0=TRUE)
> > >
> > >   -> collapse(character(0), collapse = ", ")
> > >
> > >   -> ""
> > >
> > >
> > > because recycle0 is still applied to the sep-based operation which
> > > occurs before collapse, thus leaving a vector of length 0 to collapse.
> > >
> > > That is consistent but seems unlikely to be what the user wanted, imho.
> > > I think if it does this there should be at least a warning when paste
> > > collapses to "" this way, if it is allowed at all (i.e. if mixing
> > > collapse=<string> and recycle0=TRUE is not simply made an error).
> > >
> > > I would like to hear others' thoughts as well though. @Pages, Herve
> > > @William Dunlap is "" what you envision as the desired and
> > > useful behavior there?
> > >
> > > Best,
> > > ~G
> > >
> > >
> > >
> > > I have not yet made up my mind but would tend to agree to "you guys",
> > > but I think that other R Core members should chime in, too.
> > >
> > > Martin
> > >
> > > >> On Fri, May 15, 2020 at 11:05 AM Hervé Pagès <hpa...@fredhutch.org> wrote:
> > > >>
> > > >> Totally agree with that.
> > > >>
> > > >> H.
> > > >>
> > > >> On 5/15/20 10:34, William Dunlap via R-devel wrote:
> > > >> > I agree: paste(collapse="something", ...) should always return a single
> > > >> > character string, regardless of the value of recycle0. This would be
> > > >> > similar to when there are no non-NULL arguments to paste; collapse="."
> > > >> > gives a single empty string and collapse=NULL gives a zero-length character
> > > >> > vector.
> > > >> >> paste()
> > > >> > character(0)
> > > >> >> paste(collapse=", ")
> > > >> > [1] ""
> > > >> >
> > > >> > Bill Dunlap
> > > >> > TIBCO Software
> > > >> > wdunlap tibco.com
> > > >> >
> > > >> >
> > > >> > On Thu, Apr 30, 2020 at 9:56 PM suharto_anggono--- via R-devel <
> > > >> > r-devel@r-project.org> wrote:
> > > >> >
> > > >> >> Without 'collapse', 'paste' pastes (concatenates) its arguments
> > > >> >> elementwise (separated by 'sep', " " by default). New in R devel and R
> > > >> >> patched, specifying recycle0 = TRUE makes mixing zero-length and
> > > >> >> nonzero-length arguments result in length zero. The result of paste(n,
> > > >> >> "th", sep = "", recycle0 = TRUE) always has the same length as 'n'.
> > > >> >> Previously, the result is still as long as the longest argument, with the
> > > >> >> zero-length argument like "". If all of the arguments have length zero,
> > > >> >> 'recycle0' doesn't matter.
> > > >> >>
> > > >> >> As far as I understand, 'paste' with 'collapse' as a character string is
> > > >> >> supposed to put together elements of a vector into a single character
> > > >> >> string. I think 'recycle0' shouldn't change it.
> > > >> >>
> > > >> >> In current R devel and R patched, paste(character(0), collapse = "",
> > > >> >> recycle0 = TRUE) is character(0). I think it should be "", like
> > > >> >> paste(character(0), collapse="").
> > > >> >>
> > > >> >> paste(c("4", "5"), "th", sep = "", collapse = ", ", recycle0 = TRUE)
> > > >> >> is
> > > >> >> "4th, 5th".
> > > >> >> paste(c("4"     ), "th", sep = "", collapse = ", ", recycle0 = TRUE)
> > > >> >> is
> > > >> >> "4th".
> > > >> >> I think
> > > >> >> paste(c(        ), "th", sep = "", collapse = ", ", recycle0 = TRUE)
> > > >> >> should be
> > > >> >> "",
> > > >> >> not character(0).
> > > >> >>
> > > >> >> ______________________________________________
> > > >> >> R-devel@r-project.org mailing list
> > > >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> > > >>
> > > >> --
> > > >> Hervé Pagès
> > > >>
> > > >> Program in Computational Biology
> > > >> Division of Public Health Sciences
> > > >> Fred Hutchinson Cancer Research Center
> > > >> 1100 Fairview Ave. N, M1-B514
> > > >> P.O. Box 19024
> > > >> Seattle, WA 98109-1024
> > > >>
> > > >> E-mail: hpa...@fredhutch.org
> > > >> Phone:  (206) 667-5791
> > > >> Fax:    (206) 667-1319
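
For concreteness, here is a minimal R sketch of the two-step reading argued for in this thread: 'recycle0' governs only the elementwise 'sep' step, and a non-NULL 'collapse' always yields exactly one string. The helper name paste_two_step is hypothetical; this is not base R's implementation, only an emulation built on paste() without the recycle0 argument, so it should also run on R versions that predate it.

```r
# Hypothetical helper, not part of base R: emulates the "sep step first,
# collapse step second" semantics proposed in this thread.
paste_two_step <- function(..., sep = " ", collapse = NULL, recycle0 = FALSE) {
  args <- list(...)
  if (recycle0 && length(args) > 0L && any(lengths(args) == 0L)) {
    # Step 1 with recycle0 = TRUE: any zero-length argument gives a
    # zero-length elementwise result.
    pasted <- character(0)
  } else {
    # Step 1 otherwise: defer to paste()'s classic recycling.
    pasted <- do.call(paste, c(args, list(sep = sep)))
  }
  # Step 2: collapsing is unary, so recycle0 plays no role here.
  if (is.null(collapse)) pasted else paste(pasted, collapse = collapse)
}

# Hervé's SNP example, with and without recycle0 = TRUE.
collapse_snp_ids <- function(snp_ids, recycle0 = FALSE)
  paste_two_step("rs", snp_ids, sep = "", collapse = ",", recycle0 = recycle0)

snp_groups <- list(
  group1 = c(55, 22, 200),
  group2 = integer(0),
  group3 = c(99, 550)
)

old <- vapply(snp_groups, collapse_snp_ids, character(1))
new <- vapply(snp_groups, collapse_snp_ids, character(1), recycle0 = TRUE)
print(old)  # group2 is "rs", as in Hervé's output above
print(new)  # group2 is "" under the proposed semantics
```

Gabe's alternative could be sketched in the same helper by calling stop() (or warning()) whenever recycle0 is TRUE and collapse is non-NULL, instead of proceeding to step 2.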