The fact that strsplit() doesn't say anything about 'split' being longer than 'x' adds to the confusion:

  > strsplit(c("xAy", "xxByB", "xCyCCz"), split=c("A", "B", "C", "D"))
  [[1]]
  [1] "x" "y"

  [[2]]
  [1] "xx" "y"

  [[3]]
  [1] "x" "y" ""  "z"

A warning (or error) would go a long way in helping the user realize
they're doing something wrong.

No warning either when 'split' is shorter than 'x' but the length of
the latter is not a multiple of the length of the former:

  > strsplit(c("xAy", "xxByB", "xCyCCz"), split=c("A", "B"))
  [[1]]
  [1] "x" "y"

  [[2]]
  [1] "xx" "y"

  [[3]]
  [1] "xCyCCz"

Which is also unexpected given that most binary operations do issue a
warning in this case (e.g. 11:13 * 1:2).

H.

On 12/18/19 06:48, Duncan Murdoch wrote:
> On 18/12/2019 9:42 a.m., IAGO GINÉ VÁZQUEZ wrote:
>> Hi all,
>>
>> In the help of strsplit one can read
>>
>> split    character vector (or object which can be coerced to such)
>> containing regular expression(s) (unless fixed = TRUE) to use for
>> splitting. If empty matches occur, in particular if split has
>> length 0, x is split into single characters. If split has length
>> greater than 1, it is re-cycled along x.
>>
>> Taking into account that split is said to be a vector (not a length 1
>> vector) and the last claim above, I would expect that the output of
>>
>> strsplit("3:4", split = c(",",":"), fixed = TRUE)
>>
>> was the same as the output of
>>
>> strsplit("3:4", split = c(":"), fixed = TRUE)
>>
>> that is, splitting by "," (without effect in this example) and also by
>> ":"
>>
>> [[1]]
>> [1] "3" "4"
>>
>> But, instead, I get
>> [[1]]
>> [1] "3:4"
>>
>> Am I wrongly understanding the help? Is it an expected output?
>> I tried with R 3.6.1 for Windows (10).
>
> Yes, you are misunderstanding the help. Your input x has length 1, so
> only the first element of split will be used. If you wanted to use
> both, you would need a longer x. For example,
>
>  > strsplit(c("1:2", "3:4"), split=c(",", ":"), fixed=TRUE)
>  [[1]]
>  [1] "1:2"
>
>  [[2]]
>  [1] "3" "4"
>
> The first element is split using "," -- since there are none, there's no
> splitting done. The second element is split using ":".
>
> Duncan Murdoch
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319