Re: [R] Extracting File Basename without Extension
The S+ basename() function has an argument called suffix, and it will remove the suffix from the result. This was based on the Unix basename command, but I missed the special case in the Unix basename that doesn't remove the suffix if the removal would result in an empty string. The suffix must include any initial dot. Unlike the Unix basename, the suffix is a regular expression (but other regexpr arguments like ignore.case and fixed are not accepted - there ought to be a regular expression class so these things get attached to the expression instead of added to the call).

  > basename(c("foobar", "dir/foo.bar"), suffix=".bar")
  [1] "fo"  "foo"
  > basename(c("foobar", "dir/foo.bar"), suffix="\\.bar")
  [1] "foobar" "foo"

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting File Basename without Extension
Gabor Grothendieck wrote:
> On Fri, Jan 9, 2009 at 4:20 PM, Wacek Kusnierczyk wrote:
>> right; there's a straightforward fix to my solution that accounts for cases such as '.bashrc':
>>
>> names = c("foo.bar", ".zee")
>> sub("(.+)[.][^.]+$", "\\1", names)
>>
>> you could also use a lookbehind if possible (not in r, afaik).
>
> or:
>
> > sub(".*[.]", ".", names)
> [1] ".bar" ".zee"

it was "foo" that was desired...

vQ
Re: [R] Extracting File Basename without Extension
Gabor Grothendieck wrote:
> On Fri, Jan 9, 2009 at 4:28 PM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
>> Rau, Roland wrote:
>>> P.S. Any suggestions how to become more proficient with regular expressions? The O'Reilly book (Mastering...)? Whenever I tried anything more complicated than basic usage (things like ^ $ * . ) in R, I was way faster to write a new function (like above) instead of finding a regex solution.
>>
>> the book you mention is good. you may also consider http://www.regular-expressions.info/
>> regexes are usually well explained with lots of examples in perl books.
>>
>>> By the way: it might be still possible to *write* regular expressions, but what about code re-use? Are there people who can easily *read* complicated regular expressions?
>>
>> in some cases it is possible to write regular expressions in a way that facilitates reading them by a human. in perl, for example, you can use so-called readable regexes:
>>
>> /
>>   (.+)   # match and remember at least one arbitrary character
>>   [.]    # match a dot
>>   [^.]+  # match at least one non-dot character
>>   $      # end of string anchor
>> /x;
>>
>> you can also use within regex comments:
>>
>> /(.+)(?# one or more chars)[.](?# a dot)[^.]+(?# one or more non-dots)$(?# end of string)/
>>
>> nothing of the sorts in r, however.
>
> Supports that if you begin the regular expression with (?x) and use perl = TRUE. See ?regexp

cool, i see ?xism is supported. so the above can be written in r as:

names = c("foo.bar", ".zee")
sub("(?x)   # allow embedded comments
    (.+)    # match and remember at least one arbitrary character
    [.]     # match a dot
    [^.]+   # match at least one non-dot character
    $       # end of string anchor",
    "\\1", names, perl=TRUE)

is this what you wanted, roland?

vQ
Re: [R] Extracting File Basename without Extension
G'day all,

On Fri, 9 Jan 2009 08:12:18 -0200
Henrique Dallazuanna www...@gmail.com wrote:

> Try this also:
>
> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)

Or, in case that the extension has more than three letters or myfile is a vector of names:

R> myfile <- "path1/path2/myoutput.txt"
R> sapply(strsplit(basename(myfile),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
[1] "myoutput"
R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
R> sapply(strsplit(basename(myfile2),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
[1] "myoutput" "myoutput"
R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
R> sapply(strsplit(basename(myfile3),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
[1] "myoutput"   "myoutput"   "my.out.put"

HTH.

Cheers,

        Berwin

> On Fri, Jan 9, 2009 at 12:10 AM, Gundala Viswanath gunda...@gmail.com wrote:
>
>> Dear all,
>>
>> The basename() function returns the extension also:
>>
>> > myfile <- "path1/path2/myoutput.txt"
>> > basename(myfile)
>> [1] "myoutput.txt"
>>
>> Is there any other function where it just returns plain base: "myoutput", i.e. without 'txt'
>>
>> - Gundala Viswanath
>> Jakarta - Indonesia

=========================== Full address =============================
Berwin A Turlach                            Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability        +65 6516 6650 (self)
Faculty of Science                          FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7         e-mail: sta...@nus.edu.sg
Singapore 117546                    http://www.stat.nus.edu.sg/~statba
Re: [R] Extracting File Basename without Extension
Berwin A Turlach wrote:
> G'day all,
>
> On Fri, 9 Jan 2009 08:12:18 -0200
> Henrique Dallazuanna www...@gmail.com wrote:
>
>> Try this also:
>>
>> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
>
> Or, in case that the extension has more than three letters or myfile is a vector of names:
>
> R> myfile <- "path1/path2/myoutput.txt"
> R> sapply(strsplit(basename(myfile),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
> [1] "myoutput"
> R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
> R> sapply(strsplit(basename(myfile2),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
> [1] "myoutput" "myoutput"
> R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
> R> sapply(strsplit(basename(myfile3),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
> [1] "myoutput"   "myoutput"   "my.out.put"

or have sub do the job for you:

filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
(filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))

vQ
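As a runnable sketch of the sub() approach in the message above: the pattern removes the last dot and everything after it. The helper name strip_ext is made up for illustration and does not appear in the thread. tools::file_path_sans_ext(), from the tools package that ships with R, is also not mentioned in the thread; it does a similar job and leaves leading-dot names such as ".bashrc" alone.

```r
# Hypothetical helper (name invented for this sketch):
# drop the last ".something" from each file name.
strip_ext <- function(path) {
  sub("[.][^.]*$", "", basename(path))
}

files <- c("path1/path2/myoutput.txt",
           "path2/path3/myoutput.temp",
           "path4/path5/my.out.put.xls")

strip_ext(files)      # "myoutput" "myoutput" "my.out.put"

# Edge case raised later in the thread: a leading-dot name loses everything.
strip_ext(".bashrc")  # ""

# Alternative from the tools package bundled with R.
tools::file_path_sans_ext(basename(c(files, ".bashrc")))
```

Whether ".bashrc" should come back empty or unchanged depends on how "extension" is defined, so the choice between the two calls is a judgment about the file names at hand.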
Re: [R] Extracting File Basename without Extension
On Fri, Jan 9, 2009 at 6:52 AM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
> Berwin A Turlach wrote:
>> G'day all,
>>
>> On Fri, 9 Jan 2009 08:12:18 -0200
>> Henrique Dallazuanna www...@gmail.com wrote:
>>
>>> Try this also:
>>>
>>> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
>>
>> Or, in case that the extension has more than three letters or myfile is a vector of names:
>>
>> R> myfile <- "path1/path2/myoutput.txt"
>> R> sapply(strsplit(basename(myfile),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
>> [1] "myoutput"
>> R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
>> R> sapply(strsplit(basename(myfile2),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
>> [1] "myoutput" "myoutput"
>> R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
>> R> sapply(strsplit(basename(myfile3),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
>> [1] "myoutput"   "myoutput"   "my.out.put"
>
> or have sub do the job for you:
>
> filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
> (filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))

We can omit perl = TRUE here.
Re: [R] Extracting File Basename without Extension
Gabor Grothendieck wrote:
> On Fri, Jan 9, 2009 at 6:52 AM, Wacek Kusnierczyk wrote:
>> or have sub do the job for you:
>>
>> filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
>> (filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))
>
> We can omit perl = TRUE here.

or maybe not, depending on the actual task:

names = replicate(1, paste(sample(c(letters, "."), 100, replace=TRUE), collapse=""))
system.time(replicate(10, sub("[.][^.]*$", "", names, perl=TRUE)))
system.time(replicate(10, sub("[.][^.]*$", "", names)))

vQ
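A self-contained version of the timing comparison above might look like the following sketch. The sizes (1000 names of 100 characters, 10 repetitions), the seed, and the variable names are assumptions picked so that it runs quickly; no timings are claimed here, since they vary by machine and R version.

```r
set.seed(1)  # make the random names reproducible

# Sizes are assumptions for this sketch, not values from the thread.
n_names <- 1000
name_length <- 100
file_names <- replicate(
  n_names,
  paste(sample(c(letters, "."), name_length, replace = TRUE), collapse = "")
)

pattern <- "[.][^.]*$"

# Time the PCRE engine (perl = TRUE) against the default engine.
time_pcre <- system.time(
  for (i in 1:10) res_pcre <- sub(pattern, "", file_names, perl = TRUE)
)
time_default <- system.time(
  for (i in 1:10) res_default <- sub(pattern, "", file_names)
)

# Both engines must agree on the result; only the timing may differ.
stopifnot(identical(res_pcre, res_default))

print(c(perl = time_pcre[["elapsed"]], default = time_default[["elapsed"]]))
```

The identical() check is the part that matters for correctness: perl = TRUE is a performance choice for this pattern, not a change in meaning.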
Re: [R] Extracting File Basename without Extension
G'day Wacek,

On Fri, 09 Jan 2009 12:52:46 +0100
Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:

> Berwin A Turlach wrote:
>> G'day all,
>>
>> On Fri, 9 Jan 2009 08:12:18 -0200
>> Henrique Dallazuanna www...@gmail.com wrote:
>>
>>> Try this also:
>>>
>>> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
>>
>> Or, in case that the extension has more than three letters or myfile is a vector of names:
>>
>> R> myfile <- "path1/path2/myoutput.txt"
>> R> sapply(strsplit(basename(myfile),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
>> [1] "myoutput"
>> R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
>> R> sapply(strsplit(basename(myfile2),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
>> [1] "myoutput" "myoutput"
>> R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
>> R> sapply(strsplit(basename(myfile3),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
>> [1] "myoutput"   "myoutput"   "my.out.put"
>
> or have sub do the job for you:
>
> filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
> (filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))

Apparently also a possibility, I guess it can be made to work with the original example and my extensions. Though, it seems to require the knowledge of perl, or at least perl's regular expression.

Cheers,

        Berwin
Re: [R] Extracting File Basename without Extension
Berwin A Turlach wrote:
> G'day Wacek,
>
>>> Or, in case that the extension has more than three letters or myfile is a vector of names:
>>>
>>> R> myfile <- "path1/path2/myoutput.txt"
>>> R> sapply(strsplit(basename(myfile),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
>>> [1] "myoutput"
>>> R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
>>> R> sapply(strsplit(basename(myfile2),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
>>> [1] "myoutput" "myoutput"
>>> R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
>>> R> sapply(strsplit(basename(myfile3),"\\."), function(x) paste(x[1:(length(x)-1)], collapse="."))
>>> [1] "myoutput"   "myoutput"   "my.out.put"
>>
>> or have sub do the job for you:
>>
>> filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
>> (filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))

g'afternoon berwin,

> Apparently also a possibility, I guess it can be made to work with the original example and my extensions.

i guess it does work with the original example and your extensions.

> Though, it seems to require the knowledge of perl, or at least perl's regular expression.

oh my, sorry. it's so bad to go an inch out of the cosy world of r. but, as gabor pointed out, 'perl=TRUE' is inessential here, so you actually need to know just (very basic) regular expressions, with no 'perl' implied. having learnt this simple regex syntax you can avoid the need for looking up strsplit and paste in tfm, so i'd consider it worthwhile.

vQ
Re: [R] Extracting File Basename without Extension
G'day Wacek,

On Fri, 09 Jan 2009 14:22:19 +0100
Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:

>> Apparently also a possibility, I guess it can be made to work with the original example and my extensions.
>
> i guess it does work with the original example and your extensions.

And I thought that you would have known for sure.

>> Though, it seems to require the knowledge of perl, or at least perl's regular expression.
>
> oh my, sorry. it's so bad to go an inch out of the cosy world of r.

Well, if that's how you feel, don't do it. I regularly use other languages besides R. Mostly C and Fortran, occasionally Python. But I never found time to learn Perl or Java or awk or C++ or ...; some people do not have the time to learn all languages under the sun. Also, if one concentrates on a few, one can learn them really well.

> but, as gabor pointed, 'perl=TRUE' is inessential here,

I thought that your answer to Gabor indicated that, depending on the context, perl=TRUE was essential; though I must admit that I did not run that code.

> so you actually need to know just (very basic) regular expressions, with no 'perl' implied. having learnt this simple regex syntax you can avoid the need for looking up strsplit and paste in tfm, so i'd consider it worthwhile.

As people say, YMMV. I do not need to look up strsplit and/or paste; but I would have to look up the regular expression syntax or finally memorise it; something I did not consider worthwhile so far.

Cheers,

        Berwin
Re: [R] Extracting File Basename without Extension
Berwin A Turlach wrote:
> G'day Wacek,
>
> On Fri, 09 Jan 2009 14:22:19 +0100
> Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
>
>>> Apparently also a possibility, I guess it can be made to work with the original example and my extensions.
>>
>> i guess it does work with the original example and your extensions.
>
> And I thought that you would have known for sure.

i thought i did until you made that comment, which made me think you've just discovered it doesn't.

>>> Though, it seems to require the knowledge of perl, or at least perl's regular expression.
>>
>> oh my, sorry. it's so bad to go an inch out of the cosy world of r.
>
> Well, if that's how you feel, don't do it.

quite the opposite.

> I regularly use other languages besides R. Mostly C and Fortran, occasionally Python. But I never found time to learn Perl or Java or awk or C++ or ...; some people do not have the time to learn all languages under the sun. Also, if one concentrates on a few, one can learn them really well.

i think i did not suggest the original poster to learn perl. many responses on this list involve regular expressions, and regexes are so ubiquitous in code that has to do with parsing and processing text, be it filenames or loads of data, that a user of r may well want to learn a bit of this stuff in addition to the details about how real numbers are represented below the surface.

you may want to keep saying 'use r where applicable instead of worse tools', and i'd like to keep saying 'use regexes where they're applicable instead of worse tools'. same philosophy.

>> but, as gabor pointed, 'perl=TRUE' is inessential here,
>
> I thought that your answer to Gabor indicated that, depending on the context, perl=TRUE was essential; though I must admit that I did not run that code.

i see i may have expressed that wrongly. it should have been "but maybe you'd want to keep it", meaning this is not essential for the final result, but may improve the runtime.

>> so you actually need to know just (very basic) regular expressions, with no 'perl' implied. having learnt this simple regex syntax you can avoid the need for looking up strsplit and paste in tfm, so i'd consider it worthwhile.
>
> As people say, YMMV, I do not need to look up strsplit and/or paste; but I would have to look up what the regular expression syntax or finally memorise it; something I did not consider worthwhile so far.

well, the regex syntax is fairly standard, though variations exist among languages. once you learn it, or rather the ideas behind the syntax, you're well equipped for quite a range of tasks. on the other hand, the details of strsplit and paste are pretty r-specific, and you don't gain much by remembering them (except for freeing yourself from having to read tfm again).

i have seen quite a bunch of programs written by scientists who spent over one hundred lines of code on just parsing command line arguments; i wish they knew regexes exist (and better, getopt-like modules too). if you're doing serious programming without knowing regexes, you're rather lucky.

vQ
Re: [R] Extracting File Basename without Extension
Prof Brian Ripley wrote:
> Actually, that's a valid regex in any of the variants offered. A more conventional writing of it is the second of
>
> > f <- 'foo.bar.R'
> > sub("[.][^.]*$", "", f)
> [1] "foo.bar"
> > sub("\\.[^.]*$", "", f)
> [1] "foo.bar"

more conventional in r, perhaps. it's not portable, due to the 'escape the escape to have an escape' feature of r when it comes to regexes; in perl, for example, /\\.[^.]*$/ would hardly do the job.

vQ
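The 'escape the escape' point can be checked directly. In this sketch (variable names are illustrative), the R string literal "\\." contains just two characters, so the regex engine sees \. exactly as Perl would see it inside /.../. Raw strings, which need R 4.0.0 or later and so postdate this thread, let the pattern be written without doubling the backslash.

```r
f <- "foo.bar.R"

# The string literal "\\." holds two characters: a backslash and a dot.
nchar("\\.")

# cat() prints the characters that are actually passed to the regex engine.
cat("\\.[^.]*$", "\n")

# Bracket form and backslash form remove the same final extension.
with_brackets  <- sub("[.][^.]*$", "", f)
with_backslash <- sub("\\.[^.]*$", "", f)

# Raw string literal (R >= 4.0.0): no doubled backslash needed.
with_raw <- sub(r"(\.[^.]*$)", "", f)

print(c(with_brackets, with_backslash, with_raw))
```

The bracket form [.] sidesteps the escaping question entirely, which is one reason it copies cleanly between R and other regex dialects.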
Re: [R] Extracting File Basename without Extension
G'day Wacek,

On Fri, 09 Jan 2009 15:19:46 +0100
Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:

> i think i did not suggest the original poster to learn perl.

As I see it, you didn't suggest anything to the original poster, at least not directly. But, since these days you have to be subscribed to r-help to post IIRC, it is probably reasonable to assume that the original poster saw your posting.

> many responses on this list involve regular expressions, and regexes are so ubiquitous in code that has to do with parsing and processing text, be it filenames or loads of data, that a user of r may well want to learn a bit of this stuff in addition to the details about how real numbers are represented below the surface.

You should really get rid of that chip on your shoulder.

> you may want to keep saying 'use r where applicable instead of worse tools',

Now I am getting really confused: isn't your suggested solution using R too? So why would I say something like this? I suggest you get rid of the chip on the other shoulder too... :)

> well, the regex syntax is fairly standard, though variations exist among languages.

Exactly, and the existence of these variations, which can trip one up and require one to rtfm in anything but the simplest situations, is why regexps are not what first jumps to my mind.

> i have seen quite a bunch of programs written by scientists who spent over one hundred lines of code on just parsing command line arguments; i wish they knew regexes exist (and better, getopt-like modules too). if you're doing serious programming without knowing regexes, you're rather lucky.

Probably depends on what one calls serious.

Best wishes,

        Berwin
Re: [R] Extracting File Basename without Extension
Right,

Other option is:

substr(nameFile, 1, tail(unlist(gregexpr("\\.", nameFile)), 1) - 1)

On Fri, Jan 9, 2009 at 1:23 PM, Rau, Roland r...@demogr.mpg.de wrote:

> Hi,
>
> > [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique Dallazuanna
> >
> > Try this also:
> >
> > substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
>
> This, of course, assumes that the extensions are always 3 characters. Sometimes there might be more ("index.html"), sometimes less ("shellscript.sh").
>
> Although my solution is not as compact as the others (I wish I was proficient in 'mastering regular expressions'), I'd like to provide my little code-snippet which does not require any regular expressions (but expects a "." in the filename).
>
> ##
> x1 <- "roland.txt"
> x2 <- "roland.html"
> x3 <- "roland.sh"
>
> no.extension <- function(astring) {
>   if (substr(astring, nchar(astring), nchar(astring))==".") {
>     return(substr(astring, 1, nchar(astring)-1))
>   } else {
>     no.extension(substr(astring, 1, nchar(astring)-1))
>   }
> }
>
> no.extension(x1)
> no.extension(x2)
> no.extension(x3)
> ##
>
> Hope this helps a bit,
> Roland
>
> P.S. Any suggestions how to become more proficient with regular expressions? The O'Reilly book (Mastering...)? Whenever I tried anything more complicated than basic usage (things like ^ $ * . ) in R, I was way faster to write a new function (like above) instead of finding a regex solution.
>
> By the way: it might be still possible to *write* regular expressions, but what about code re-use? Are there people who can easily *read* complicated regular expressions?

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
Re: [R] Extracting File Basename without Extension
On 1/8/2009 9:10 PM, Gundala Viswanath wrote:
> Dear all,
>
> The basename() function returns the extension also:
>
> > myfile <- "path1/path2/myoutput.txt"
> > basename(myfile)
> [1] "myoutput.txt"
>
> Is there any other function where it just returns plain base: "myoutput", i.e. without 'txt'

I'm curious about something: does "file extension" have a standard definition? Most (all? I haven't tried them all) of the solutions presented in this thread would return an empty string for the "plain base" if given the filename ".bashrc".

Windows (where file extensions really mean something), though reluctant to create such a file, appears to agree that the extension is "bashrc", even though to me it appears clear that that file has no extension.

Duncan Murdoch
Re: [R] Extracting File Basename without Extension
Hi,

> [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique Dallazuanna
>
> Try this also:
>
> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)

This, of course, assumes that the extensions are always 3 characters. Sometimes there might be more ("index.html"), sometimes less ("shellscript.sh").

Although my solution is not as compact as the others (I wish I was proficient in 'mastering regular expressions'), I'd like to provide my little code-snippet which does not require any regular expressions (but expects a "." in the filename).

##
x1 <- "roland.txt"
x2 <- "roland.html"
x3 <- "roland.sh"

no.extension <- function(astring) {
  if (substr(astring, nchar(astring), nchar(astring))==".") {
    return(substr(astring, 1, nchar(astring)-1))
  } else {
    no.extension(substr(astring, 1, nchar(astring)-1))
  }
}

no.extension(x1)
no.extension(x2)
no.extension(x3)
##

Hope this helps a bit,
Roland

P.S. Any suggestions how to become more proficient with regular expressions? The O'Reilly book (Mastering...)? Whenever I tried anything more complicated than basic usage (things like ^ $ * . ) in R, I was way faster to write a new function (like above) instead of finding a regex solution.

By the way: it might be still possible to *write* regular expressions, but what about code re-use? Are there people who can easily *read* complicated regular expressions?

--
This mail has been sent through the MPI for Demographic Research. Should you receive a mail that is apparently from a MPI user without this text displayed, then the address has most likely been faked. If you are uncertain about the validity of this message, please check the mail header or ask your system administrator for assistance.
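The recursive function above expects a "." in the name and takes one string at a time. A regex-free variant that accepts a vector and leaves dotless or leading-dot names unchanged could be sketched as follows; the name strip_ext_fixed and the handling of the edge cases are assumptions for illustration, not something proposed in the thread.

```r
# Hypothetical helper: remove the last extension without regular expressions.
# fixed = TRUE makes gregexpr() search for a literal ".".
strip_ext_fixed <- function(x) {
  dot_positions <- gregexpr(".", x, fixed = TRUE)
  # Position of the last dot in each name; gregexpr() gives -1 for no match.
  last_dot <- vapply(dot_positions, function(p) p[length(p)], integer(1))
  # Strip only when the last dot is not the first character.
  ifelse(last_dot > 1, substr(x, 1, last_dot - 1), x)
}

strip_ext_fixed(c("roland.txt", "roland.html", "roland.sh",
                  "README", ".bashrc", "my.out.put.xls"))
```

Keeping ".bashrc" intact is a design choice in this sketch; the thread itself leaves open what the "right" answer is for such names.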
Re: [R] Extracting File Basename without Extension
On Fri, Jan 9, 2009 at 10:23 AM, Rau, Roland r...@demogr.mpg.de wrote:
> Hi,
>
> > [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique Dallazuanna
> >
> > Try this also:
> >
> > substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
>
> This, of course, assumes that the extensions are always 3 characters.

The regex solutions posted did not make this assumption.

> P.S. Any suggestions how to become more proficient with regular expressions? The O'Reilly book (Mastering...)? Whenever I tried anything more complicated than basic usage (things like ^ $ * . ) in R, I was way faster to write a new function (like above) instead of finding a regex solution.

See the links in the Links box at: http://gsubfn.googlecode.com

> By the way: it might be still possible to *write* regular expressions, but what about code re-use? Are there people who can easily *read* complicated regular expressions?
Re: [R] Extracting File Basename without Extension
Duncan Murdoch wrote:
> On 1/8/2009 9:10 PM, Gundala Viswanath wrote:
>> Dear all,
>>
>> The basename() function returns the extension also:
>>
>> > myfile <- "path1/path2/myoutput.txt"
>> > basename(myfile)
>> [1] "myoutput.txt"
>>
>> Is there any other function where it just returns plain base: "myoutput", i.e. without 'txt'
>
> I'm curious about something: does "file extension" have a standard definition? Most (all? I haven't tried them all) of the solutions presented in this thread would return an empty string for the "plain base" if given the filename ".bashrc".
>
> Windows (where file extensions really mean something), though reluctant to create such a file, appears to agree that the extension is "bashrc", even though to me it appears clear that that file has no extension.

I'm not sure what is clear about it, but the GNU utility agrees with you:

$ basename abc/.exe .exe
.exe
$ basename abc/1.exe .exe
1

Anyone want to contribute code for an optional suffix= argument for R's basename()?

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
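One possible shape for such a helper, written as a standalone function rather than a change to base R, is sketched below. The name basename_suffix is hypothetical. It treats the suffix as a literal string, as the GNU utility does, and follows the GNU rule shown above of leaving a name alone when the name is nothing but the suffix. It relies on endsWith(), which is available in base R from version 3.3.0.

```r
# Hypothetical helper, not part of base R: basename() with a literal suffix.
basename_suffix <- function(path, suffix = "") {
  base <- basename(path)
  if (!nzchar(suffix)) {
    return(base)
  }
  has_suffix <- endsWith(base, suffix)
  # GNU rule: do not strip when the whole name equals the suffix.
  strip <- has_suffix & nchar(base) > nchar(suffix)
  base[strip] <- substr(base[strip], 1, nchar(base[strip]) - nchar(suffix))
  base
}

basename_suffix(c("abc/.exe", "abc/1.exe", "dir/foo.bar"), ".exe")
# ".exe" "1" "foo.bar"
```

Because the suffix is literal here, ".bar" does not match the tail of "foobar"; a regex-based suffix would behave differently on that input.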
Re: [R] Extracting File Basename without Extension
on 01/09/2009 09:00 AM Duncan Murdoch wrote:
> On 1/8/2009 9:10 PM, Gundala Viswanath wrote:
>> Dear all,
>>
>> The basename() function returns the extension also:
>>
>> > myfile <- "path1/path2/myoutput.txt"
>> > basename(myfile)
>> [1] "myoutput.txt"
>>
>> Is there any other function where it just returns plain base: "myoutput", i.e. without 'txt'
>
> I'm curious about something: does "file extension" have a standard definition? Most (all? I haven't tried them all) of the solutions presented in this thread would return an empty string for the "plain base" if given the filename ".bashrc".
>
> Windows (where file extensions really mean something), though reluctant to create such a file, appears to agree that the extension is "bashrc", even though to me it appears clear that that file has no extension.

Duncan,

That is going to be highly OS and even OS version specific. More information here:

http://en.wikipedia.org/wiki/Filename
http://en.wikipedia.org/wiki/Filename_extension

There are relevant standard extensions for standard file formats:

http://en.wikipedia.org/wiki/List_of_file_formats

but that does not guarantee that user created filenames will adhere to them, especially for text files.

As you note, filenames beginning with a '.' will be common on Unixen/Linuxen as otherwise normally hidden system/config files. Such files would actually create problems if attempted to be opened on Windows with certain applications and I have even seen problems with such files when using SMB under Linux to access files on a server.

HTH,

Marc Schwartz
Re: [R] Extracting File Basename without Extension
Duncan Murdoch wrote:
> I'm curious about something: does "file extension" have a standard definition? Most (all? I haven't tried them all) of the solutions presented in this thread would return an empty string for the "plain base" if given the filename ".bashrc".

right; there's a straightforward fix to my solution that accounts for cases such as '.bashrc':

names = c("foo.bar", ".zee")
sub("(.+)[.][^.]+$", "\\1", names)

you could also use a lookbehind if possible (not in r, afaik).

vQ
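For comparison, here is a sketch of both variants side by side. The lookbehind form assumes perl = TRUE, which switches R to the PCRE engine, where lookbehind assertions are available; the vector name fnames and the third test name are additions for illustration.

```r
fnames <- c("foo.bar", ".zee", "my.out.put.xls")

# Capture-group version from the message above:
# at least one character must precede the final dot.
by_capture <- sub("(.+)[.][^.]+$", "\\1", fnames)

# Lookbehind version: (?<=.) asserts that some character precedes the dot
# without making it part of the match. Requires perl = TRUE.
by_lookbehind <- sub("(?<=.)[.][^.]+$", "", fnames, perl = TRUE)

print(by_capture)     # "foo" ".zee" "my.out.put"
print(by_lookbehind)  # "foo" ".zee" "my.out.put"
```

Both leave ".zee" untouched, because the only dot in it has nothing in front of it.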
Re: [R] Extracting File Basename without Extension
Rau, Roland wrote:
> P.S. Any suggestions how to become more proficient with regular expressions? The O'Reilly book (Mastering...)? Whenever I tried anything more complicated than basic usage (things like ^ $ * . ) in R, I was way faster to write a new function (like above) instead of finding a regex solution.

the book you mention is good. you may also consider http://www.regular-expressions.info/
regexes are usually well explained with lots of examples in perl books.

> By the way: it might be still possible to *write* regular expressions, but what about code re-use? Are there people who can easily *read* complicated regular expressions?

in some cases it is possible to write regular expressions in a way that facilitates reading them by a human. in perl, for example, you can use so-called readable regexes:

/
  (.+)   # match and remember at least one arbitrary character
  [.]    # match a dot
  [^.]+  # match at least one non-dot character
  $      # end of string anchor
/x;

you can also use within regex comments:

/(.+)(?# one or more chars)[.](?# a dot)[^.]+(?# one or more non-dots)$(?# end of string)/

nothing of the sorts in r, however.

vQ
Re: [R] Extracting File Basename without Extension
On Fri, Jan 9, 2009 at 4:20 PM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
> Duncan Murdoch wrote:
>> I'm curious about something: does "file extension" have a standard definition? Most (all? I haven't tried them all) of the solutions presented in this thread would return an empty string for the "plain base" if given the filename ".bashrc".
>
> right; there's a straightforward fix to my solution that accounts for cases such as '.bashrc':
>
> names = c("foo.bar", ".zee")
> sub("(.+)[.][^.]+$", "\\1", names)
>
> you could also use a lookbehind if possible (not in r, afaik).

or:

> sub(".*[.]", ".", names)
[1] ".bar" ".zee"
Re: [R] Extracting File Basename without Extension
On Fri, Jan 9, 2009 at 4:28 PM, Wacek Kusnierczyk waclaw.marcin.kusnierc...@idi.ntnu.no wrote:
> Rau, Roland wrote:
>> P.S. Any suggestions how to become more proficient with regular
>> expressions? The O'Reilly book (Mastering...)? Whenever I tried
>> anything more complicated than basic usage (things like ^ $ * . ) in R,
>> I was way faster to write a new function (like above) instead of
>> finding a regex solution.
>
> the book you mention is good. you may also consider
> http://www.regular-expressions.info/
> regexes are usually well explained with lots of examples in perl books.
>
>> By the way: it might be still possible to *write* regular expressions,
>> but what about code re-use? Are there people who can easily *read*
>> complicated regular expressions?
>
> in some cases it is possible to write regular expressions in a way that
> facilitates reading them by a human. in perl, for example, you can use
> so-called readable regexes:
>
>     /
>       (.+)    # match and remember at least one arbitrary character
>       [.]     # match a dot
>       [^.]+   # match at least one non-dot character
>       $       # end of string anchor
>     /x;
>
> you can also use within regex comments:
>
>     /(.+)(?# one or more chars)[.](?# a dot)[^.]+(?# one or more non-dots)$(?# end of string)/
>
> nothing of the sorts in r, however.

Supports that if you begin the regular expression with (?x) and use
perl = TRUE. See ?regexp
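As a rough illustration of the (?x) suggestion, the commented Perl pattern quoted above can be carried over to R along these lines. This is only a sketch: the names readable_pattern and strip_ext_readable are hypothetical, and it assumes the PCRE engine behind perl = TRUE treats whitespace and "#" comments in extended mode the same way Perl does.

```r
# Hypothetical names; the pattern itself is the one discussed in the thread.
readable_pattern <- "(?x)   # extended mode: whitespace and comments are ignored
  (.+)     # match and remember at least one arbitrary character
  [.]      # match a dot
  [^.]+    # match at least one non-dot character
  $        # end of string anchor
"

# perl = TRUE is required; the default engine does not understand (?x).
strip_ext_readable <- function(x) sub(readable_pattern, "\\1", x, perl = TRUE)

res <- strip_ext_readable(c("foo.bar", ".zee"))
print(res)
```

Keeping the pattern in a named variable is a design choice rather than a requirement; it lets the commented regex be reused and tested separately from the sub() call.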
[R] Extracting File Basename without Extension
Dear all,

The basename() function returns the extension also:

    myfile <- "path1/path2/myoutput.txt"
    basename(myfile)
    [1] "myoutput.txt"

Is there any other function where it just returns plain base:

    myoutput

i.e. without 'txt'

- Gundala Viswanath
Jakarta - Indonesia
Re: [R] Extracting File Basename without Extension
You can use 'sub' to get rid of the extensions:

    sub("^([^.]*).*", "\\1", 'filename.extension')
    [1] "filename"
    sub("^([^.]*).*", "\\1", 'filename.extension.and.more')
    [1] "filename"
    sub("^([^.]*).*", "\\1", 'filename without extension')
    [1] "filename without extension"

On Thu, Jan 8, 2009 at 9:10 PM, Gundala Viswanath gunda...@gmail.com wrote:
> Dear all,
>
> The basename() function returns the extension also:
>
>     myfile <- "path1/path2/myoutput.txt"
>     basename(myfile)
>     [1] "myoutput.txt"
>
> Is there any other function where it just returns plain base:
>
>     myoutput
>
> i.e. without 'txt'
>
> - Gundala Viswanath
> Jakarta - Indonesia

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
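Applied to the path from the original question, the suggestion above could look like the sketch below. It also contrasts cutting at the first dot (the pattern from the reply) with cutting at the last dot. The helper names first_dot and last_dot, the last-dot pattern, and the "archive.tar.gz" sample are assumptions added for illustration, not part of the thread.

```r
# Path from the original question.
myfile <- "path1/path2/myoutput.txt"

# Pattern from the reply above: keep everything before the FIRST dot.
first_dot <- function(x) sub("^([^.]*).*", "\\1", x)

# Hypothetical alternative (not from the thread): drop only the LAST ".ext".
last_dot <- function(x) sub("[.][^.]*$", "", x)

# Strip the directory part first, so dots in directory names cannot interfere.
base <- basename(myfile)

res_first <- first_dot(c(base, "archive.tar.gz"))
res_last  <- last_dot(c(base, "archive.tar.gz"))
print(res_first)
print(res_last)
```

The two helpers agree on "myoutput.txt" but differ on multi-dot names, so the choice depends on what should count as the extension. More recent versions of R also provide tools::file_path_sans_ext(), which may be worth checking before writing a custom pattern.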