Re: [R] Extracting File Basename without Extension

2009-01-08 Thread jim holtman
You can use 'sub' to get rid of the extensions:

> sub("^([^.]*).*", "\\1", 'filename.extension')
[1] "filename"
> sub("^([^.]*).*", "\\1", 'filename.extension.and.more')
[1] "filename"
> sub("^([^.]*).*", "\\1", 'filename without extension')
[1] "filename without extension"
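
Note that this pattern cuts at the first dot and leaves any directory part in place. A minimal sketch of the difference, assuming the extension is whatever follows the last dot (the helper names first_dot and last_dot are made up for this illustration):

```r
# The pattern from the post: keep everything before the FIRST dot.
first_dot <- function(x) sub("^([^.]*).*", "\\1", x)

# A variant that removes only the text after the LAST dot.
last_dot <- function(x) sub("\\.[^.]*$", "", x)

# basename() strips the directory part before the pattern is applied.
x <- basename(c("path1/path2/myoutput.txt", "path4/path5/my.out.put.xls"))

res_first <- first_dot(x)  # "myoutput" "my"
res_last  <- last_dot(x)   # "myoutput" "my.out.put"
```

Which of the two is wanted depends on whether names such as my.out.put.xls can occur.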


On Thu, Jan 8, 2009 at 9:10 PM, Gundala Viswanath  wrote:
> Dear all,
>
> The basename() function returns the extension also:
>
>> myfile <- "path1/path2/myoutput.txt"
>> basename(myfile)
> [1] "myoutput.txt"
>
>
> Is there any other function where it just returns
> plain base:
>
> "myoutput"
>
> i.e. without 'txt'
>
> - Gundala Viswanath
> Jakarta - Indonesia
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Henrique Dallazuanna
Try this also:

substr(basename(myfile), 1, nchar(basename(myfile)) - 4)

On Fri, Jan 9, 2009 at 12:10 AM, Gundala Viswanath wrote:

> Dear all,
>
> The basename() function returns the extension also:
>
> > myfile <- "path1/path2/myoutput.txt"
> > basename(myfile)
> [1] "myoutput.txt"
>
>
> Is there any other function where it just returns
> plain base:
>
> "myoutput"
>
> i.e. without 'txt'
>
> - Gundala Viswanath
> Jakarta - Indonesia
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O




Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Berwin A Turlach
G'day all,

On Fri, 9 Jan 2009 08:12:18 -0200
"Henrique Dallazuanna"  wrote:

> Try this also:
> 
> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)

Or, in case that the extension has more than three letters or "myfile"
is a vector of names:

R> myfile <- "path1/path2/myoutput.txt"
R> sapply(strsplit(basename(myfile),"\\."), function(x) 
paste(x[1:(length(x)-1)], collapse="."))
[1] "myoutput"
R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
R> sapply(strsplit(basename(myfile2),"\\."), function(x) 
paste(x[1:(length(x)-1)], collapse="."))
[1] "myoutput" "myoutput"
R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
R> sapply(strsplit(basename(myfile3),"\\."), function(x) 
paste(x[1:(length(x)-1)], collapse="."))
[1] "myoutput"   "myoutput"   "my.out.put"

HTH.

Cheers,

Berwin

> On Fri, Jan 9, 2009 at 12:10 AM, Gundala Viswanath
> wrote:
> 
> > Dear all,
> >
> > The basename() function returns the extension also:
> >
> > > myfile <- "path1/path2/myoutput.txt"
> > > basename(myfile)
> > [1] "myoutput.txt"
> >
> >
> > Is there any other function where it just returns
> > plain base:
> >
> > "myoutput"
> >
> > i.e. without 'txt'
> >
> > - Gundala Viswanath
> > Jakarta - Indonesia

=== Full address ===
Berwin A Turlach                            Tel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability        +65 6516 6650 (self)
Faculty of Science                          FAX : +65 6872 3919
National University of Singapore
6 Science Drive 2, Blk S16, Level 7         e-mail: sta...@nus.edu.sg
Singapore 117546                            http://www.stat.nus.edu.sg/~statba



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Wacek Kusnierczyk
Berwin A Turlach wrote:
> G'day all,
>
> On Fri, 9 Jan 2009 08:12:18 -0200
> "Henrique Dallazuanna"  wrote:
>
>   
>> Try this also:
>>
>> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
>> 
>
> Or, in case that the extension has more than three letters or "myfile"
> is a vector of names:
>
> R> myfile <- "path1/path2/myoutput.txt"
> R> sapply(strsplit(basename(myfile),"\\."), function(x) 
> paste(x[1:(length(x)-1)], collapse="."))
> [1] "myoutput"
> R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
> R> sapply(strsplit(basename(myfile2),"\\."), function(x) 
> paste(x[1:(length(x)-1)], collapse="."))
> [1] "myoutput" "myoutput"
> R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
> R> sapply(strsplit(basename(myfile3),"\\."), function(x) 
> paste(x[1:(length(x)-1)], collapse="."))
> [1] "myoutput"   "myoutput"   "my.out.put"
>
>   

or have sub do the job for you:

filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
(filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))
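
The post does not show any output. A short sketch of what these calls return, also applied to the three-file test vector from earlier in the thread (the names noext1 and noext2 are only for this illustration; perl=TRUE is left out because the pattern does not need it):

```r
filenames.ext <- c("foo.bar", basename("foo/bar/hello.dolly"))
noext1 <- sub("[.][^.]*$", "", filenames.ext)
noext1
# [1] "foo"   "hello"

# Test vector from earlier in the thread, including a name with several dots
myfile3 <- c("path1/path2/myoutput.txt",
             "path2/path3/myoutput.temp",
             "path4/path5/my.out.put.xls")
noext2 <- sub("[.][^.]*$", "", basename(myfile3))
noext2
# [1] "myoutput"   "myoutput"   "my.out.put"
```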



vQ



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Gabor Grothendieck
On Fri, Jan 9, 2009 at 6:52 AM, Wacek Kusnierczyk
 wrote:
> Berwin A Turlach wrote:
>> G'day all,
>>
>> On Fri, 9 Jan 2009 08:12:18 -0200
>> "Henrique Dallazuanna"  wrote:
>>
>>
>>> Try this also:
>>>
>>> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
>>>
>>
>> Or, in case that the extension has more than three letters or "myfile"
>> is a vector of names:
>>
>> R> myfile <- "path1/path2/myoutput.txt"
>> R> sapply(strsplit(basename(myfile),"\\."), function(x) 
>> paste(x[1:(length(x)-1)], collapse="."))
>> [1] "myoutput"
>> R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
>> R> sapply(strsplit(basename(myfile2),"\\."), function(x) 
>> paste(x[1:(length(x)-1)], collapse="."))
>> [1] "myoutput" "myoutput"
>> R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
>> R> sapply(strsplit(basename(myfile3),"\\."), function(x) 
>> paste(x[1:(length(x)-1)], collapse="."))
>> [1] "myoutput"   "myoutput"   "my.out.put"
>>
>>
>
> or have sub do the job for you:
>
> filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
> (filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))

We can omit perl = TRUE here.



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Wacek Kusnierczyk
Gabor Grothendieck wrote:
> On Fri, Jan 9, 2009 at 6:52 AM, Wacek Kusnierczyk
>
>   
>> or have sub do the job for you:
>>
>> filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
>> (filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))
>> 
>
> We can omit perl = TRUE here.
>
>   


or maybe not, depending on the actual task:

names = replicate(1, paste(sample(c(letters, "."), 100,
replace=TRUE), collapse=""))
system.time(replicate(10, sub("[.][^.]*$", "", names, perl=TRUE)))
system.time(replicate(10, sub("[.][^.]*$", "", names)))
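
A self-contained sketch of the same comparison. The number of strings (1000) and the variable names are assumptions made for this illustration, not figures from the post; timings depend on the machine and the R version, so none are shown:

```r
set.seed(1)  # reproducible random names

# 1000 random 100-character strings over the letters and "."
rnd_names <- replicate(1000,
                       paste(sample(c(letters, "."), 100, replace = TRUE),
                             collapse = ""))

# Both regex engines give the same result for this pattern ...
res_perl    <- sub("[.][^.]*$", "", rnd_names, perl = TRUE)
res_default <- sub("[.][^.]*$", "", rnd_names)
same_result <- identical(res_perl, res_default)

# ... so only the run time can differ.
t_perl    <- system.time(for (i in 1:10) sub("[.][^.]*$", "", rnd_names, perl = TRUE))
t_default <- system.time(for (i in 1:10) sub("[.][^.]*$", "", rnd_names))
```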

vQ



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Berwin A Turlach
G'day Wacek,

On Fri, 09 Jan 2009 12:52:46 +0100
Wacek Kusnierczyk  wrote:

> Berwin A Turlach wrote:
> > G'day all,
> >
> > On Fri, 9 Jan 2009 08:12:18 -0200
> > "Henrique Dallazuanna"  wrote:
> >
> >   
> >> Try this also:
> >>
> >> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
> >> 
> >
> > Or, in case that the extension has more than three letters or
> > "myfile" is a vector of names:
> >
> > R> myfile <- "path1/path2/myoutput.txt"
> > R> sapply(strsplit(basename(myfile),"\\."), function(x)
> > R> paste(x[1:(length(x)-1)], collapse="."))
> > [1] "myoutput"
> > R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
> > R> sapply(strsplit(basename(myfile2),"\\."), function(x)
> > R> paste(x[1:(length(x)-1)], collapse="."))
> > [1] "myoutput" "myoutput"
> > R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
> > R> sapply(strsplit(basename(myfile3),"\\."), function(x)
> > R> paste(x[1:(length(x)-1)], collapse="."))
> > [1] "myoutput"   "myoutput"   "my.out.put"
> >
> >   
> 
> or have sub do the job for you:
> 
> filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
> (filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))

Apparently also a possibility, I guess it can be made to work with the
original example and my extensions.

Though it seems to require knowledge of perl, or at least of perl's
regular expressions.

Cheers,

Berwin



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Wacek Kusnierczyk
Berwin A Turlach wrote:
> G'day Wacek,
>
>
>   
>>> Or, in case that the extension has more than three letters or
>>> "myfile" is a vector of names:
>>>
>>> R> myfile <- "path1/path2/myoutput.txt"
>>> R> sapply(strsplit(basename(myfile),"\\."), function(x)
>>> R> paste(x[1:(length(x)-1)], collapse="."))
>>> [1] "myoutput"
>>> R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
>>> R> sapply(strsplit(basename(myfile2),"\\."), function(x)
>>> R> paste(x[1:(length(x)-1)], collapse="."))
>>> [1] "myoutput" "myoutput"
>>> R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
>>> R> sapply(strsplit(basename(myfile3),"\\."), function(x)
>>> R> paste(x[1:(length(x)-1)], collapse="."))
>>> [1] "myoutput"   "myoutput"   "my.out.put"
>>>
>>>   
>>>   
>> or have sub do the job for you:
>>
>> filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
>> (filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))
>> 
>
>   

g'afternoon berwin,

> Apparently also a possibility, I guess it can be made to work with the
> original example and my extensions.
>   

i guess it does work with the original example and your extensions.

> Though, it seems to require the knowledge of perl, or at least perl's
> regular expression. 
>   

oh my, sorry.  it's so bad to go an inch out of the cosy world of r.
but, as gabor pointed out, 'perl=TRUE' is inessential here, so you actually
need to know just (very basic) regular expressions, with no 'perl'
implied.  having learnt this simple regex syntax you can avoid the need
for looking up strsplit and paste in tfm, so i'd consider it worthwhile.

vQ



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Berwin A Turlach
G'day Wacek,

On Fri, 09 Jan 2009 14:22:19 +0100
Wacek Kusnierczyk  wrote:

> > Apparently also a possibility, I guess it can be made to work with
> > the original example and my extensions.
> >   
> 
> i guess it does work with the original example and your extensions.

And I thought that you would have known for sure.

> > Though, it seems to require the knowledge of perl, or at least
> > perl's regular expression.
> 
> oh my, sorry.  it' so bad to go an inch out of the cosy world of r. 

Well, if that's how you feel, don't do it.

I regularly use other languages besides R.  Mostly C and Fortran,
occasionally Python.  But I never found time to learn Perl or Java or
awk or C++ or; some people do not have the time to learn all
languages under the sun. Also, if one concentrates on a few, one can
learn them really well.

> but, as gabor pointed, 'perl=TRUE' is inessential here,

I thought that your answer to Gabor indicated that, depending on the
context, perl=TRUE was essential; though I must admit that I did not
run that code.  

> so you actually need to know just (very basic) regular expressions,
> with no 'perl' implied.  having learnt this simple regex syntax you
> can avoid the need for looking up strsplit and paste in tfm, so i'd
> consider it worthwhile.

As people say, YMMV.  I do not need to look up strsplit and/or paste;
but I would have to look up the regular expression syntax or
finally memorise it, something I did not consider worthwhile so far.

Cheers,

Berwin



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Prof Brian Ripley

On Fri, 9 Jan 2009, Berwin A Turlach wrote:

> G'day Wacek,
>
> On Fri, 09 Jan 2009 12:52:46 +0100
> Wacek Kusnierczyk  wrote:
>
>> Berwin A Turlach wrote:
>>> G'day all,
>>>
>>> On Fri, 9 Jan 2009 08:12:18 -0200
>>> "Henrique Dallazuanna"  wrote:
>>>
>>>> Try this also:
>>>>
>>>> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
>>>
>>> Or, in case that the extension has more than three letters or
>>> "myfile" is a vector of names:
>>>
>>> R> myfile <- "path1/path2/myoutput.txt"
>>> R> sapply(strsplit(basename(myfile),"\\."), function(x)
>>> R> paste(x[1:(length(x)-1)], collapse="."))
>>> [1] "myoutput"
>>> R> myfile2 <- c(myfile, "path2/path3/myoutput.temp")
>>> R> sapply(strsplit(basename(myfile2),"\\."), function(x)
>>> R> paste(x[1:(length(x)-1)], collapse="."))
>>> [1] "myoutput" "myoutput"
>>> R> myfile3 <- c(myfile2, "path4/path5/my.out.put.xls")
>>> R> sapply(strsplit(basename(myfile3),"\\."), function(x)
>>> R> paste(x[1:(length(x)-1)], collapse="."))
>>> [1] "myoutput"   "myoutput"   "my.out.put"

using fixed = TRUE and not escaping '.' is slightly more efficient.

>> or have sub do the job for you:
>>
>> filenames.ext = c("foo.bar", basename("foo/bar/hello.dolly"))
>> (filenames.noext = sub("[.][^.]*$", "", filenames.ext, perl=TRUE))
>
> Apparently also a possibility, I guess it can be made to work with the
> original example and my extensions.
>
> Though, it seems to require the knowledge of perl, or at least perl's
> regular expression.

Actually, that's a valid regex in any of the variants offered.  A more
conventional writing of it is the second of

> f <- 'foo.bar.R'
> sub("[.][^.]*$", "", f)
[1] "foo.bar"
> sub("\\.[^.]*$", "", f)
[1] "foo.bar"

It is the last that is used at various points in R's own code
(although sometimes with restrictions on what an 'extension' is, e.g.

> sub("\\.[[:alnum:]]*$", "", f)

appears in the current SHLIB code.)
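
To make the two remarks above concrete, here is a small sketch. The second file name and the helper drop_last are invented for this illustration; drop_last assumes the name contains at least one dot:

```r
f <- c("foo.bar.R", "archive.tar.gz~")

# Any run of non-dot characters after the last dot counts as an extension.
any_ext <- sub("\\.[^.]*$", "", f)
any_ext
# [1] "foo.bar"     "archive.tar"

# Only alphanumeric extensions are removed, so "gz~" is left alone.
alnum_ext <- sub("\\.[[:alnum:]]*$", "", f)
alnum_ext
# [1] "foo.bar"         "archive.tar.gz~"

# strsplit() with fixed = TRUE takes "." literally; no escaping is needed.
drop_last <- function(x) paste(x[-length(x)], collapse = ".")
split_ext <- vapply(strsplit("my.out.put.xls", ".", fixed = TRUE),
                    drop_last, character(1))
split_ext
# [1] "my.out.put"
```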

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Wacek Kusnierczyk
Berwin A Turlach wrote:
> G'day Wacek,
>
> On Fri, 09 Jan 2009 14:22:19 +0100
> Wacek Kusnierczyk  wrote:
>
>   
>>> Apparently also a possibility, I guess it can be made to work with
>>> the original example and my extensions.
>>>   
>>>   
>> i guess it does work with the original example and your extensions.
>> 
>
> And I thought that you would have known for sure.
>   

i thought i did until you made that comment, which made me think you've
just discovered it doesn't.

>   
>>> Though, it seems to require the knowledge of perl, or at least
>>> perl's regular expression.
>>>   
>> oh my, sorry.  it' so bad to go an inch out of the cosy world of r. 
>> 
>
> Well, if that's how you feel, don't do it.
>   

quite the opposite.

> I regularly use other languages besides R.  Mostly C and Fortran,
> occasionally Python.  But I never found time to learn Perl or Java or
> awk or C++ or; some people do not have the time to learn all
> languages under the sun. Also, if one concentrates on a few, one can
> learn them really well.
>   

i don't think i suggested that the original poster learn perl.

many responses on this list involve regular expressions, and regexes are
so ubiquitous in code that has to do with parsing and processing text,
be it filenames or loads of data, that a user of r may well want to
learn a bit of this stuff in addition to the details about how real
numbers are represented below the surface.

you may want to keep saying 'use r where applicable instead of worse
tools', and i'd like to keep saying 'use regexes where they're
applicable instead of worse tools'.  same philosophy.


>   
>> but, as gabor pointed, 'perl=TRUE' is inessential here,
>> 
>
> I thought that your answer to Gabor indicated that, depending on the
> context, perl=TRUE was essential; though I must admit that I did not
> run that code.  
>   

i see i may have expressed that wrongly.  it should have been "but maybe
you'd want to keep it", meaning this is not essential for the final
result, but may improve the runtime.


>   
>> so you actually need to know just (very basic) regular expressions,
>> with no 'perl' implied.  having learnt this simple regex syntax you
>> can avoid the need for looking up strsplit and paste in tfm, so i'd
>> consider it worthwhile.
>> 
>
> As people say, YMMV, I do not need to look up strsplit and/or paste;
> but I would have to look up what the regular expression syntax or
> finally memorise it; something I did not consider worthwhile so far.
>   

well, the regex syntax is fairly standard, though variations exist among
languages.  once you learn it, or rather the ideas behind the syntax,
you're well equipped for quite a range of tasks.  on the other hand, the
details of strsplit and paste are pretty r-specific, and you don't gain
much by remembering them (except for freeing yourself from having to
read tfm again).

i have seen quite a bunch of programs written by scientists who spent
over one hundred lines of code on just parsing command line arguments; 
i wish they knew regexes exist (and better, getopt-like modules too). 
if you're doing serious programming without knowing regexes, you're
rather lucky.

vQ



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Wacek Kusnierczyk
Prof Brian Ripley wrote:
>
> Actually, that's a valid regex in any of the variants offered.  A more
> conventional writing of it is the second of
>
>> f <- 'foo.bar.R'
>> sub("[.][^.]*$", "", f)
> [1] "foo.bar"
>> sub("\\.[^.]*$", "", f)
> [1] "foo.bar"
>

more conventional in r, perhaps.  it's not portable, due to the 'escape
the escape to have an escape' feature of r when it comes to regexes; in
perl, for example, /\\.[^.]*$/ would hardly do the job.

vQ



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Berwin A Turlach
G'day Wacek,

On Fri, 09 Jan 2009 15:19:46 +0100
Wacek Kusnierczyk  wrote:

> i think i did not suggest the original poster to learn perl.

As I see it, you didn't suggest anything to the original poster, at
least not directly.  But, since these days you have to be subscribed
to r-help to post IIRC, it is probably reasonable to assume that the
original poster saw your posting.
 
> many responses on this list involve regular expressions, and regexes
> are so ubiquitous in code that has to do with parsing and processing
> text, be it filenames or loads of data, that a user of r may well
> want to learn a bit of this stuff in addition to the details about
> how real numbers are represented below the surface.

You should really get rid of that chip on your shoulder.

> you may want to keep saying 'use r where applicable instead of worse
> tools', 

Now I am getting really confused: isn't your suggested solution
using R too?  So why would I say something like this?  I suggest you get
rid of the chip on the other shoulder too... :)

> well, the regex syntax is fairly standard, though variations exist
> among languages.  

Exactly, and these variations, which can trip one up and require one
to rtfm in anything but the simplest situations, are why regexps are
not what first jumps to my mind.

> i have seen quite a bunch of programs written by scientists who spent
> over one hundred lines of code on just parsing command line
> arguments; i wish they knew regexes exist (and better, getopt-like
> modules too). if you're doing serious programming without knowing
> regexes, you're rather lucky.

Probably depends on what one calls serious.

Best wishes,

Berwin



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Henrique Dallazuanna
Right,

Another option is:



substr(nameFile, 1, tail(unlist(gregexpr("\\.", nameFile)), 1) - 1)
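
The line above leaves nameFile undefined. A sketch with an example value (not from the post) shows what it returns, plus a guarded single-string variant under the made-up name strip_ext for names without a dot:

```r
nameFile <- "path1/path2/my.out.put.xls"   # example value, not from the post

# Position of the last dot, then everything before it.
last_dot <- tail(unlist(gregexpr("\\.", nameFile)), 1)
res_path <- substr(nameFile, 1, last_dot - 1)
res_path
# [1] "path1/path2/my.out.put"

# Without a dot, gregexpr() reports -1 and the substr() call above
# would return "". A guarded version for a single string:
strip_ext <- function(x) {
  pos <- tail(unlist(gregexpr(".", x, fixed = TRUE)), 1)
  if (pos < 1) x else substr(x, 1, pos - 1)
}

strip_ext("README")        # "README"
strip_ext("myoutput.txt")  # "myoutput"
```

The directory part is kept here; wrap the argument in basename() if only the file name is wanted.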

On Fri, Jan 9, 2009 at 1:23 PM, Rau, Roland  wrote:

> Hi,
>
> > [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique
> > Dallazuanna
> >
> > Try this also:
> >
> > substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
> >
>
> This, of course, assumes that the extensions are always 3 characters.
> Sometimes there might be more ("index.html"), sometimes less
> ("shellscript.sh").
>
> Although my solution is not as compact as the others (I wish I was
> proficient in 'mastering regular expressions'), I'd like to provide my
> little code-snippet which does not require any regular expressions (but
> expects a . in the filename).
>
> ##
> x1 <- "roland.txt"
> x2 <- "roland.html"
> x3 <- "roland.sh"
>
> no.extension <- function(astring) {
>   if (substr(astring, nchar(astring), nchar(astring))==".") {
>     return(substr(astring, 1, nchar(astring)-1))
>   } else {
>     no.extension(substr(astring, 1, nchar(astring)-1))
>   }
> }
>
> no.extension(x1)
> no.extension(x2)
> no.extension(x3)
> ##
>
> Hope this helps a bit,
> Roland
>
> P.S. Any suggestions how to become more proficient with regular
> expressions? The O'Reilly book ("Mastering...")? Whenever I tried
> anything more complicated than basic usage (things like ^ $ * . ) in R,
> I was way faster to write a new function (like above) instead of finding
> a regex solution.
>
> By the way: it might be still possible to *write* regular expressions,
> but what about code re-use? Are there people who can easily *read*
> complicated regular expressions?
>
>
>


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O




Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Duncan Murdoch

On 1/8/2009 9:10 PM, Gundala Viswanath wrote:
> Dear all,
>
> The basename() function returns the extension also:
>
>> myfile <- "path1/path2/myoutput.txt"
>> basename(myfile)
> [1] "myoutput.txt"
>
>
> Is there any other function where it just returns
> plain base:
>
> "myoutput"
>
> i.e. without 'txt'

I'm curious about something: does "file extension" have a standard
definition?  Most (all?  I haven't tried them all) of the solutions
presented in this thread would return an empty string for the "plain
base" if given the filename ".bashrc".

Windows (where file extensions really mean something), though reluctant
to create such a file, appears to agree that the extension is bashrc,
even though to me it appears clear that that file has no extension.
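
The ".bashrc" case is easy to check. The sketch below compares the last-dot pattern from the thread with file_path_sans_ext() from the tools package in current R, which treats a leading dot as part of the name (variable names are only for this illustration):

```r
x <- c("myoutput.txt", ".bashrc")

# The last-dot pattern from the thread empties a dot-file name.
by_regex <- sub("\\.[^.]*$", "", x)
by_regex
# [1] "myoutput" ""

# tools::file_path_sans_ext() needs at least one non-dot character
# before the extension, so ".bashrc" comes back unchanged.
by_tools <- tools::file_path_sans_ext(x)
by_tools
# [1] "myoutput" ".bashrc"
```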


Duncan Murdoch



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Rau, Roland
Hi, 

> [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique 
> Dallazuanna
> 
> Try this also:
> 
> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
> 

This, of course, assumes that the extensions are always 3 characters.
Sometimes there might be more ("index.html"), sometimes less
("shellscript.sh").

Although my solution is not as compact as the others (I wish I was
proficient in 'mastering regular expressions'), I'd like to provide my
little code-snippet which does not require any regular expressions (but
expects a . in the filename).

##
x1 <- "roland.txt"
x2 <- "roland.html"
x3 <- "roland.sh"

no.extension <- function(astring) {
  if (substr(astring, nchar(astring), nchar(astring))==".") {
    return(substr(astring, 1, nchar(astring)-1))
  } else {
    no.extension(substr(astring, 1, nchar(astring)-1))
  }
}

no.extension(x1)
no.extension(x2)
no.extension(x3)
##
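
As far as I can tell, the recursive function above keeps calling itself on the empty string when the name has no dot, until R stops with an error. A regex-free sketch that returns such names unchanged (the name no_extension is made up here; it handles one string at a time):

```r
no_extension <- function(astring) {
  chars <- strsplit(astring, "")[[1]]     # one element per character
  dots <- which(chars == ".")             # positions of all dots
  if (length(dots) == 0) return(astring)  # no dot: nothing to strip
  substr(astring, 1, max(dots) - 1)       # keep text before the last dot
}

no_extension("roland.txt")   # "roland"
no_extension("roland.html")  # "roland"
no_extension("roland.sh")    # "roland"
no_extension("README")       # "README"
```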

Hope this helps a bit,
Roland

P.S. Any suggestions how to become more proficient with regular
expressions? The O'Reilly book ("Mastering...")? Whenever I tried
anything more complicated than basic usage (things like ^ $ * . ) in R,
I was way faster to write a new function (like above) instead of finding
a regex solution.

By the way: it might be still possible to *write* regular expressions,
but what about code re-use? Are there people who can easily *read*
complicated regular expressions?




Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Gabor Grothendieck
On Fri, Jan 9, 2009 at 10:23 AM, Rau, Roland  wrote:
> Hi,
>
>> [mailto:r-help-boun...@r-project.org] On Behalf Of Henrique
>> Dallazuanna
>>
>> Try this also:
>>
>> substr(basename(myfile), 1, nchar(basename(myfile)) - 4)
>>
>
> This, of course, assumes that the extensions are always 3 characters.

The regex solutions posted did not make this assumption.

>
> P.S. Any suggestions how to become more proficient with regular
> expressions? The O'Reilly book ("Mastering...")? Whenever I tried
> anything more complicated than basic usage (things like ^ $ * . ) in R,
> I was way faster to write a new function (like above) instead of finding
> a regex solution.

See the links in the Links box at:
http://gsubfn.googlecode.com



>
> By the way: it might be still possible to *write* regular expressions,
> but what about code re-use? Are there people who can easily *read*
> complicated regular expressions?
>
>



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Peter Dalgaard
Duncan Murdoch wrote:
> On 1/8/2009 9:10 PM, Gundala Viswanath wrote:
>> Dear all,
>>
>> The basename() function returns the extension also:
>>
>>> myfile <- "path1/path2/myoutput.txt"
>>> basename(myfile)
>> [1] "myoutput.txt"
>>
>>
>> Is there any other function where it just returns
>> plain base:
>>
>> "myoutput"
>>
>> i.e. without 'txt'
> 
> I'm curious about something: does "file extension" have a standard
> definition?  Most (all?  I haven't tried them all) of the solutions
> presented in this thread would return an empty string for the "plain
> base" if given the filename ".bashrc".
> 
> Windows (where file extensions really mean something), though reluctant
> to create such a file, appears to agree that the extension is bashrc,
> even though to me it appears clear that that file has no extension.

I'm not sure what is clear about it, but the GNU utility agrees with you:

>basename abc/.exe .exe
.exe
>basename abc/1.exe .exe
1

Anyone want to contribute code for an optional suffix= argument for R's
basename()?
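
One way such a helper might look, as a sketch only. basename_suffix is a hypothetical name, base R's basename() has no suffix argument, and the rule that a name equal to the suffix is left alone is taken from the two shell examples above. It uses endsWith(), which requires R 3.3.0 or later:

```r
# Hypothetical helper mimicking 'basename NAME SUFFIX' from the shell.
basename_suffix <- function(path, suffix = "") {
  b <- basename(path)
  # Strip only when the suffix is non-empty, the name ends with it,
  # and the name is not the suffix itself.
  strip <- nzchar(suffix) & endsWith(b, suffix) & b != suffix
  ifelse(strip, substr(b, 1, nchar(b) - nchar(suffix)), b)
}

basename_suffix("abc/.exe", ".exe")   # ".exe"
basename_suffix("abc/1.exe", ".exe")  # "1"
```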

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Marc Schwartz
on 01/09/2009 09:00 AM Duncan Murdoch wrote:
> On 1/8/2009 9:10 PM, Gundala Viswanath wrote:
>> Dear all,
>>
>> The basename() function returns the extension also:
>>
>>> myfile <- "path1/path2/myoutput.txt"
>>> basename(myfile)
>> [1] "myoutput.txt"
>>
>>
>> Is there any other function where it just returns
>> plain base:
>>
>> "myoutput"
>>
>> i.e. without 'txt'
> 
> I'm curious about something: does "file extension" have a standard
> definition?  Most (all?  I haven't tried them all) of the solutions
> presented in this thread would return an empty string for the "plain
> base" if given the filename ".bashrc".
> 
> Windows (where file extensions really mean something), though reluctant
> to create such a file, appears to agree that the extension is bashrc,
> even though to me it appears clear that that file has no extension.

Duncan,

That is going to be highly OS and even OS version specific. More
information here:

  http://en.wikipedia.org/wiki/Filename
  http://en.wikipedia.org/wiki/Filename_extension

There are relevant standard extensions for standard file formats:

  http://en.wikipedia.org/wiki/List_of_file_formats

but that does not guarantee that user created filenames will adhere to
them, especially for text files.

As you note, filenames beginning with a '.' are common on
Unixen/Linuxen, where they are normally hidden system/config files. Such
files can actually cause problems when opened on Windows with certain
applications, and I have even seen problems with such files when using
SMB under Linux to access files on a server.

HTH,

Marc Schwartz



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Wacek Kusnierczyk

> Duncan Murdoch wrote:
>   
>>
>> I'm curious about something: does "file extension" have a standard
>> definition?  Most (all?  I haven't tried them all) of the solutions
>> presented in this thread would return an empty string for the "plain
>> base" if given the filename ".bashrc".
>> 


right;  there's a straightforward fix to my solution that accounts for
cases such as '.bashrc':

names = c("foo.bar", ".zee")
sub("(.+)[.][^.]+$", "\\1", names)

you could also use a lookbehind if possible (not in r, afaik).
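
A minimal sketch of how that pattern behaves on a few more inputs. The helper name strip_ext and the sample paths are made up for illustration. It applies basename() first, since a dot in a directory name would otherwise be taken for the extension separator. The last call tries the lookbehind idea, on the assumption that R's PCRE engine (perl = TRUE) accepts (?<=...):

```r
# Hypothetical helper: strip the last extension from the file name only
strip_ext <- function(path) {
  # basename() first, so dots in directory names are never considered
  sub("(.+)[.][^.]+$", "\\1", basename(path))
}

paths <- c("foo.bar", ".zee", "a.tar.gz", "noext", "dir.d/file")

strip_ext(paths)
# [1] "foo"   ".zee"  "a.tar" "noext" "file"

# Without basename(), the dot in "dir.d" is mistaken for the separator
sub("(.+)[.][^.]+$", "\\1", "dir.d/file")
# [1] "dir"

# Lookbehind variant: the dot must be preceded by at least one character
sub("(?<=.)[.][^.]+$", "", basename(paths), perl = TRUE)
# [1] "foo"   ".zee"  "a.tar" "noext" "file"
```

Only the last extension is removed, so "a.tar.gz" becomes "a.tar"; whether that is the desired "plain base" depends on the definition of extension being used.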

vQ



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Wacek Kusnierczyk
Rau, Roland wrote:
>
> P.S. Any suggestions how to become more proficient with regular
> expressions? The O'Reilly book ("Mastering...")? Whenever I tried
> anything more complicated than basic usage (things like ^ $ * . ) in R,
> I was way faster to write a new function (like above) instead of finding
> a regex solution.
>   

the book you mention is good.
you may also consider http://www.regular-expressions.info/

regexes are usually well explained with lots of examples in perl books.

> By the way: it might be still possible to *write* regular expressions,
> but what about code re-use? Are there people who can easily *read*
> complicated regular expressions?
>   


in some cases it is possible to write regular expressions in a way that
facilitates reading them by a human.  in perl, for example, you can use
so-called readable regexes:

/
   (.+)# match and remember at least one arbitrary character
   [.] # match a dot
   [^.]+ # match at least one non-dot character
   $  # end of string anchor
/x;

you can also use within regex comments:

/(.+)(?# one or more chars)[.](?# a dot)[^.]+(?# one or more non-dots)$(?# end of string)/


nothing of the sorts in r, however.

vQ



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Gabor Grothendieck
On Fri, Jan 9, 2009 at 4:20 PM, Wacek Kusnierczyk
 wrote:
>
>> Duncan Murdoch wrote:
>>
>>>
>>> I'm curious about something: does "file extension" have a standard
>>> definition?  Most (all?  I haven't tried them all) of the solutions
>>> presented in this thread would return an empty string for the "plain
>>> base" if given the filename ".bashrc".
>>>
>
>
> right;  there's a straightforward fix to my solution that accounts for
> cases such as '.bashrc':
>
> names = c("foo.bar", ".zee")
> sub("(.+)[.][^.]+$", "\\1", names)
>
> you could also use a lookbehind if possible (not in r, afaik).
>

or:

> sub(".*[.]", ".", names)
[1] ".bar" ".zee"



Re: [R] Extracting File Basename without Extension

2009-01-09 Thread Gabor Grothendieck
On Fri, Jan 9, 2009 at 4:28 PM, Wacek Kusnierczyk
 wrote:
> Rau, Roland wrote:
>>
>> P.S. Any suggestions how to become more proficient with regular
>> expressions? The O'Reilly book ("Mastering...")? Whenever I tried
>> anything more complicated than basic usage (things like ^ $ * . ) in R,
>> I was way faster to write a new function (like above) instead of finding
>> a regex solution.
>>
>
> the book you mention is good.
> you may also consider http://www.regular-expressions.info/
>
> regexes are usually well explained with lots of examples in perl books.
>
>> By the way: it might be still possible to *write* regular expressions,
>> but what about code re-use? Are there people who can easily *read*
>> complicated regular expressions?
>>
>
>
> in some cases it is possible to write regular expressions in a way that
> facilitates reading them by a human.  in perl, for example, you can use
> so-called readable regexes:
>
> /
>   (.+)# match and remember at least one arbitrary character
>   [.] # match a dot
>   [^.]+ # match at least one non-dot character
>   $  # end of string anchor
> /x;
>
> you can also use within regex comments:
>
> /(.+)(?# one or more chars)[.](?# a dot)[^.]+(?# one or more
> non-dots)$(?# end of string)/
>
>
> nothing of the sorts in r, however.

R supports that if you begin the regular expression with (?x) and
use perl = TRUE.  See ?regexp
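
A small sketch of the inline-comment form quoted above, assuming the PCRE engine selected by perl = TRUE accepts (?# ...) comments. The variable names files and pattern are only for illustration. Building the pattern with paste0() is optional; it just keeps each piece on its own line:

```r
files <- c("foo.bar", ".zee")

# Each piece of the pattern carries its own (?# ...) comment
pattern <- paste0(
  "(.+)(?# one or more chars)",
  "[.](?# a dot)",
  "[^.]+(?# one or more non-dots)",
  "$(?# end of string)"
)

sub(pattern, "\\1", files, perl = TRUE)
# [1] "foo"  ".zee"

# Engine-independent alternative: plain pieces documented with R comments
plain <- paste0(
  "(.+)",   # one or more chars, remembered as \\1
  "[.]",    # a dot
  "[^.]+",  # one or more non-dots
  "$"       # end of string
)
sub(plain, "\\1", files)
# [1] "foo"  ".zee"
```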



Re: [R] Extracting File Basename without Extension

2009-01-11 Thread Wacek Kusnierczyk
Gabor Grothendieck wrote:
> On Fri, Jan 9, 2009 at 4:20 PM, Wacek Kusnierczyk
>
>   
>> right;  there's a straightforward fix to my solution that accounts for
>> cases such as '.bashrc':
>>
>> names = c("foo.bar", ".zee")
>> sub("(.+)[.][^.]+$", "\\1", names)
>>
>> you could also use a lookbehind if possible (not in r, afaik).
>>
>> 
>
> or:
>
>   
>> sub(".*[.]", ".", names)
>> 
> [1] ".bar" ".zee"
>   

it was "foo" that was desired...

vQ



Re: [R] Extracting File Basename without Extension

2009-01-11 Thread Wacek Kusnierczyk
Gabor Grothendieck wrote:
> On Fri, Jan 9, 2009 at 4:28 PM, Wacek Kusnierczyk
>  wrote:
>   
>> Rau, Roland wrote:
>> 
>>> P.S. Any suggestions how to become more proficient with regular
>>> expressions? The O'Reilly book ("Mastering...")? Whenever I tried
>>> anything more complicated than basic usage (things like ^ $ * . ) in R,
>>> I was way faster to write a new function (like above) instead of finding
>>> a regex solution.
>>>
>>>   
>> the book you mention is good.
>> you may also consider http://www.regular-expressions.info/
>>
>> regexes are usually well explained with lots of examples in perl books.
>>
>> 
>>> By the way: it might be still possible to *write* regular expressions,
>>> but what about code re-use? Are there people who can easily *read*
>>> complicated regular expressions?
>>>
>>>   
>> in some cases it is possible to write regular expressions in a way that
>> facilitates reading them by a human.  in perl, for example, you can use
>> so-called readable regexes:
>>
>> /
>>   (.+)# match and remember at least one arbitrary character
>>   [.] # match a dot
>>   [^.]+ # match at least one non-dot character
>>   $  # end of string anchor
>> /x;
>>
>> you can also use within regex comments:
>>
>> /(.+)(?# one or more chars)[.](?# a dot)[^.]+(?# one or more
>> non-dots)$(?# end of string)/
>>
>>
>> nothing of the sorts in r, however.
>> 
>
> Supports that if you begin the regular expression with (?x) and
> use perl = TRUE.  See ?regexp
>   

cool, i see ?xism is supported.  so the above can be written in r as:

names = c("foo.bar", ".zee")
sub("(?x)   # allow embedded comments
     (.+)   # match and remember at least one arbitrary character
     [.]    # match a dot
     [^.]+  # match at least one non-dot character
     $      # end of string anchor",
    "\\1", names, perl=TRUE)

is this what you wanted, roland?

vQ



Re: [R] Extracting File Basename without Extension

2009-01-14 Thread William Dunlap
The S+ basename() function has an argument called suffix and
it will remove the suffix from the result.  This was based on
the Unix basename command, but I missed the special case in the
Unix basename that doesn't remove the suffix if the removal
would result in an empty string.  The suffix must include any
initial dot.  Unlike the Unix basename, the suffix is a regular
expression (but other regexpr arguments like ignore.case and
fixed are not accepted - there ought to be a regular expression
class so these things get attached to the expression instead of
added to the call). 
  
  > basename(c("foobar", "dir/foo.bar"), suffix=".bar")
  [1] "fo"  "foo"
  > basename(c("foobar", "dir/foo.bar"), suffix="\\.bar")
  [1] "foobar" "foo"   
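
For comparison, a rough emulation of that behaviour in plain R. The function name basename_re and the allow_empty argument are hypothetical and are not the S+ implementation. The suffix is treated as a regular expression anchored at the end of the name, and allow_empty = FALSE adds the Unix special case mentioned above:

```r
# Hypothetical R emulation of a basename() with a regex suffix argument
basename_re <- function(path, suffix = NULL, allow_empty = TRUE) {
  base <- basename(path)
  if (is.null(suffix)) return(base)
  # anchor the suffix pattern at the end of the file name
  stripped <- sub(paste0("(?:", suffix, ")$"), "", base, perl = TRUE)
  if (!allow_empty) {
    # Unix-style rule: keep the name if stripping would leave nothing
    empty <- !nzchar(stripped)
    stripped[empty] <- base[empty]
  }
  stripped
}

basename_re(c("foobar", "dir/foo.bar"), suffix = ".bar")
# [1] "fo"  "foo"
basename_re(c("foobar", "dir/foo.bar"), suffix = "\\.bar")
# [1] "foobar" "foo"
basename_re("dir/.bar", suffix = "\\.bar")
# [1] ""
basename_re("dir/.bar", suffix = "\\.bar", allow_empty = FALSE)
# [1] ".bar"
```

The first call shows why the dot has to be escaped: as a regular expression, ".bar" also matches the "obar" at the end of "foobar".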
 
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 
