[R] Regular expressions & sub

2005-08-18 Thread Bernd Weiss
Dear all,

I am struggling with the use of regular expressions. I got

> as.character(test$sample.id)
 [1] "1.11"   "10.11"  "11.11"  "113.31" "114.2"  "114.3"  "114.8"  

and need

[1] "11"   "11"  "11"  "31" "2"  "3"  "8"

I.e., remove everything up to and including the ".".

TIA,

Bernd

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] regular expressions, sub

2006-01-27 Thread Christian Hoffmann
Hi,

I am trying to use sub, regexpr on expressions like

log(D) ~ log(N)+I(log(N)^2)+log(t)

being a model specification.

The aim is to produce:

"ln D ~ ln N + ln^2 N + ln t"

The variable names N, t may change, the number of terms too.

I succeeded only partially; the help on regular expressions is hard for me 
to understand, and examples for my case are rare. The help page on R-help 
for grep etc. and "regular expressions"

What I am doing:

(f <- log(D) ~ log(N)+I(log(N)^2)+log(t))
(ft <- sub("","",f))   # creates string with parts of formula; how to do it simpler?
(fu <- paste(ft[c(2,1,3)],collapse=" "))  # converts to one string

Then I want to use \1 for backreferences something like

(fv <- sub("log( [:alpha:] N  )^ [:alpha:)","ln \\1^\\2",fu))

to change "log(g)^7" to "ln^7 g",

and to eliminate I(): sub("I(blabla)","\\1",fv)  # I(xxx) -> xxx

The special characters are making trouble; sub accepts "(" and ")" only in 
pairs. Code for experimentation:

trysub <- function(s,t,e) {
  ii <- 0
  # try all 16 combinations of the four logical arguments
  for (i1 in c(TRUE,FALSE)) for (i2 in c(TRUE,FALSE))
  for (i3 in c(TRUE,FALSE)) for (i4 in c(TRUE,FALSE))
    print(paste(ii<-ii+1, ifelse(i1,"  "," ~"), "ext", ifelse(i2,"  "," ~"), "perl",
                ifelse(i3,"  "," ~"), "fixed ", ifelse(i4,"  "," ~"), "useBytes: ",
                try(sub(s,t,e, extended=i1, perl=i2, fixed=i3, useBytes=i4)), sep=""))
  invisible(0)
}

trysub("I(log(N)^2)","ln n^2",fu) # A: desired result for cases
                                  # 5,6,13..16, the rest unsubstituted

trysub("log(","ln ",fu)   # B: no substitutions; errors for
                          # cases 1..4,7..12
# typical errors:
# "3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement,
# x, ignore.case, useBytes) : \n\tinvalid regular expression 'log('\n"

trysub("log\(","ln ",fu)  # C: same as A

trysub("log\\(","ln ",fu) # D: no substitutions; errors for
                          # cases 15,16
# typical errors:
# "15 ~ext ~perl ~fixed   useBytes: Error in sub(pattern, replacement, x,
# ignore.case, extended, fixed, useBytes) : \n\tinvalid regular expression
# 'log\\('\n"

trysub("log\\(([:alpha:]+)\\)","ln \1",fu) # no substitutions, no errors
# E: typical errors:
# "3  ext  perl ~fixed   useBytes: Error in sub.perl(pattern, replacement,
# x, ignore.case, useBytes) : \n\tinvalid regular expression
# 'log\\(([:alpha:]+)\\)'\n"



Thanks for help
Christian

PS. The explanations in the documents
-- 
Dr. Christian W. Hoffmann,
Swiss Federal Research Institute WSL
Mathematics + Statistical Computing
Zuercherstrasse 111
CH-8903 Birmensdorf, Switzerland

Tel +41-44-7392-277  (office)   -111(exchange)
Fax +41-44-7392-215  (fax)
[EMAIL PROTECTED]
http://www.wsl.ch/staff/christian.hoffmann

International Conference 5.-7.6.2006 Ekaterinburg Russia
"Climate changes and their impact on boreal and temperate forests"
http://ecoinf.uran.ru/conference/



Re: [R] Regular expressions & sub

2005-08-18 Thread Tony Plate
 > x <- scan("clipboard", what="")
Read 7 items
 > x
[1] "1.11"   "10.11"  "11.11"  "113.31" "114.2"  "114.3"  "114.8"
 > gsub("[0-9]*\\.", "", x)
[1] "11" "11" "11" "31" "2"  "3"  "8"
 >




Re: [R] Regular expressions & sub

2005-08-18 Thread bogdan romocea
One solution is
test <- c("1.11","10.11","11.11","113.31","114.2","114.3")
id <-  unlist(lapply(strsplit(test,"[.]"),function(x) {x[2]}))
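
A variant of the same idea, as a minimal sketch only. It assumes every id contains exactly one dot, and it swaps in fixed = TRUE (so the dot needs no escaping) and vapply (so the result is guaranteed to be a character vector); neither is part of the solution above.

```r
test <- c("1.11", "10.11", "11.11", "113.31", "114.2", "114.3")

# Split each id at the literal dot; fixed = TRUE turns off regex matching,
# so "." is not treated as "any character".
parts <- strsplit(test, ".", fixed = TRUE)

# Take the second piece of every id; vapply checks that each piece is
# a single character string.
id <- vapply(parts, function(p) p[2], character(1))
id
# [1] "11" "11" "11" "31" "2"  "3"
```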




Re: [R] Regular expressions & sub

2005-08-18 Thread Dirk Eddelbuettel
Bernd Weiss writes:
> I am struggling with the use of regular expression. I got
> 
> > as.character(test$sample.id)
>  [1] "1.11"   "10.11"  "11.11"  "113.31" "114.2"  "114.3"  "114.8"  
> 
> and need
> 
> [1] "11"   "11"  "11"  "31" "2"  "3"  "8"
> 
> I.e. remove everything before the "." .

Define the dot as the hard separator, and allow for multiple digits before it:

> sample.id <- c("1.11", "10.11", "11.11", "113.31", "114.2", "114.3", "114.8")
> gsub("^[0-9]*\\.", "", sample.id)
[1] "11" "11" "11" "31" "2"  "3"  "8" 

Hope this helps,  Dirk



Re: [R] Regular expressions & sub

2005-08-18 Thread Peter Dalgaard
Dirk Eddelbuettel <[EMAIL PROTECTED]> writes:

> Bernd Weiss writes:
> > I am struggling with the use of regular expression. I got
> > 
> > > as.character(test$sample.id)
> >  [1] "1.11"   "10.11"  "11.11"  "113.31" "114.2"  "114.3"  "114.8"  
> > 
> > and need
> > 
> > [1] "11"   "11"  "11"  "31" "2"  "3"  "8"
> > 
> > I.e. remove everything before the "." .
> 
> Define the dot as the hard separator, and allow for multiple digits before it:
> 
> > sample.id <- c("1.11", "10.11", "11.11", "113.31", "114.2", "114.3",
> >                "114.8")
> > gsub("^[0-9]*\\.", "", sample.id)
> [1] "11" "11" "11" "31" "2"  "3"  "8" 

Or, more longwinded, but with fewer assumptions about what goes before
the dot:

> gsub("^.*\\.(.*)$","\\1",sample.id)
[1] "11" "11" "11" "31" "2"  "3"  "8"

or,

> gsub("^.*\\.([^.]*)$","\\1",sample.id)
[1] "11" "11" "11" "31" "2"  "3"  "8"
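
To see where the patterns in this thread differ, here is a small sketch on inputs that are not in the original data: an id with letters before the dot and an id with two dots (both made up for illustration). The last pattern, with sub and [^.]*, is an extra variant that was not proposed above.

```r
x <- c("114.8", "ab.11", "1.2.3")   # hypothetical ids, for illustration only

# Unanchored digits-then-dot: every "digits." run is removed.
a <- gsub("[0-9]*\\.", "", x)
a
# [1] "8"    "ab11" "3"

# Anchored digits-then-dot: only a leading run of digits is removed.
b <- gsub("^[0-9]*\\.", "", x)
b
# [1] "8"     "ab.11" "2.3"

# Greedy "anything up to the last dot": keeps what follows the last dot.
d <- gsub("^.*\\.(.*)$", "\\1", x)
d
# [1] "8"  "11" "3"

# Extra variant: drop everything up to the first dot only.
e <- sub("^[^.]*\\.", "", x)
e
# [1] "8"   "11"  "2.3"
```

On the ids in the original question all four give the same answer; they only diverge on inputs like the two added here.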


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Regular expressions & sub

2005-08-18 Thread Bernd Weiss
On 18 Aug 2005 at 21:17, Peter Dalgaard wrote:

> Dirk Eddelbuettel <[EMAIL PROTECTED]> writes:
> 
> > Bernd Weiss writes:
> > > I am struggling with the use of regular expression. I got
> > > 
> > > > as.character(test$sample.id)
> > >  [1] "1.11"   "10.11"  "11.11"  "113.31" "114.2"  "114.3"  "114.8"
> > > 
> > > and need
> > > 
> > > [1] "11"   "11"  "11"  "31" "2"  "3"  "8"
> > > 
> > > I.e. remove everything before the "." .
> > 
> > Define the dot as the hard separator, and allow for multiple digits
> > before it:
> > 
> > > sample.id <- c("1.11", "10.11", "11.11", "113.31", "114.2",
> > >                "114.3", "114.8")
> > > gsub("^[0-9]*\\.", "", sample.id)
> > [1] "11" "11" "11" "31" "2"  "3"  "8" 
> 
> Or, more longwinded, but with fewer assumptions about what goes before
> the dot:
> 
> > gsub("^.*\\.(.*)$","\\1",sample.id)
> [1] "11" "11" "11" "31" "2"  "3"  "8"

Wow, thanks a lot for all the valuable suggestions.

Bernd



Re: [R] regular expressions, sub

2006-01-27 Thread Prof Brian Ripley
Note that [:alpha:] is a pre-defined character class and should only be 
used inside [].  And metacharacters need to be quoted.  See ?regexp.

> f <- log(D) ~ log(N)+I(log(N)^2)+log(t)
> f1 <- deparse(f)
> f1
[1] "log(D) ~ log(N) + I(log(N)^2) + log(t)"

Now we have a string.

(f2 <- gsub("I\\((.*)\\) ", "\\1 ", f1))
[1] "log(D) ~ log(N) + log(N)^2 + log(t)"
(f3 <- gsub("(?U)log\\((.*)\\)", "ln \\1", f2, perl=TRUE))
[1] "ln D ~ ln N + ln N^2 + ln t"
(f4 <- gsub("ln ([[:alpha:]])\\^([[:digit:]])", "ln^\\2 \\1", f3))
[1] "ln D ~ ln N + ln^2 N + ln t"

That should give you some ideas to be going on with.
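
The three substitutions above could be wrapped up roughly as follows. This is only a sketch: the function name formula_to_ln is made up, and the patterns are tightened variants of the ones shown (they use [^()]* instead of .* so that no perl = TRUE is needed). It assumes that log() arguments contain no nested parentheses, that powers are written as I(log(x)^n) with an integer n, and that no other function name ends in a capital I.

```r
# Hypothetical helper, not from the thread: turn a model formula into a
# string such as "ln D ~ ln N + ln^2 N + ln t".
formula_to_ln <- function(f) {
  # Formula -> one string (long formulas may deparse to several pieces).
  s <- paste(deparse(f, width.cutoff = 500L), collapse = " ")

  # I(log(x)^n) -> log(x)^n
  s <- gsub("I\\((log\\([^()]*\\)\\^[0-9]+)\\)", "\\1", s)

  # log(x) -> ln x   (x must not contain parentheses)
  s <- gsub("log\\(([^()]*)\\)", "ln \\1", s)

  # ln x^n -> ln^n x
  gsub("ln ([[:alnum:]._]+)\\^([[:digit:]]+)", "ln^\\2 \\1", s)
}

res1 <- formula_to_ln(log(D) ~ log(N) + I(log(N)^2) + log(t))
res1
# [1] "ln D ~ ln N + ln^2 N + ln t"

res2 <- formula_to_ln(log(y) ~ I(log(x1)^3) + log(x2))
res2
# [1] "ln y ~ ln^3 x1 + ln x2"
```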

On Fri, 27 Jan 2006, Christian Hoffmann wrote:

> Hi,
>
> I am trying to use sub, regexpr on expressions like
>
>    log(D) ~ log(N)+I(log(N)^2)+log(t)
>
> being a model specification.
>
> The aim is to produce:
>
>    "ln D ~ ln N + ln^2 N + ln t"
>
> The variable names N, t may change, the number of terms too.
>
> I succeeded only partially; the help on regular expressions is hard for me
> to understand, and examples for my case are rare. The help page on R-help
> for grep etc. and "regular expressions"
>
> What I am doing:
>
> (f <- log(D) ~ log(N)+I(log(N)^2)+log(t))
> (ft <- sub("","",f))   # creates string with parts of formula; how to do it simpler?
> (fu <- paste(ft[c(2,1,3)],collapse=" "))  # converts to one string
>
> Then I want to use \1 for backreferences something like
>
> (fv <- sub("log( [:alpha:] N  )^ [:alpha:)","ln \\1^\\2",fu))
>
> to change "log(g)^7" to "ln^7 g",
>
> and to eliminate I(): sub("I(blabla)","\\1",fv)  # I(xxx) -> xxx
>
> The special characters are making trouble; sub accepts "(" and ")" only in
> pairs.

From ?regexp:

   Any metacharacter with special meaning may be quoted by preceding it
   with a backslash.  The metacharacters are '. \ | ( ) [ { ^ $ * +  ?'.
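
A short sketch of what this quoting looks like in practice, on a made-up input string. It shows three common ways to make "(" literal, plus a character class placed inside a bracket expression:

```r
x <- "log(N)"   # example input, chosen for illustration

# Escape the parenthesis with a backslash; inside an R string literal the
# backslash itself must be doubled, so the regex \( is written "\\(".
r1 <- sub("log\\(", "ln ", x)
r1
# [1] "ln N)"

# A one-character bracket expression also makes "(" literal.
r2 <- sub("log[(]", "ln ", x)

# fixed = TRUE switches regular expressions off entirely.
r3 <- sub("log(", "ln ", x, fixed = TRUE)

# A character class such as [:alpha:] goes inside a bracket expression.
r4 <- sub("log\\(([[:alpha:]]+)\\)", "ln \\1", x)
r4
# [1] "ln N"
```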



-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595



Re: [R] regular expressions, sub

2006-01-27 Thread Philippe Grosjean
Hello,

Here is what I got after playing a little bit with your problem:

# First of all, if you prefer 'ln' instead of 'log', why not define:
ln <- function(x) log(x)
ln2 <- function(x) log(x)^2
ln3 <- function(x) log(x)^3
ln4 <- function(x) log(x)^4
# ... as many functions as the powers you need

# Then, your formula is now closer to what you want
# which makes the whole code easier to read for you:

Form <- ln(D) ~ ln(N) + ln2(N) + ln(t) # Same as your original formula

# Here is the function to transform it in a more readable string:
formulaTransform <-
function(form, as.expression = FALSE) {
    if (!inherits(form, "formula"))
        stop("'form' must be a 'formula' object!")

    # Transform the formula into a string (is there a better way?)
    Res <- paste(as.character(form)[c(2, 1, 3)], collapse = " ")

    if (as.expression) { # Transform the formula into a nice expression
        # Change '~' into '%~~%'
        Res <- sub("~", "%~~%", Res) # How to do '~' in an expression?
        # Eliminate brackets
        Res <- gsub("[(]([A-Za-z0-9._]*)[)]", " ~ \\1", Res)
        # Transform powers
        Res <- gsub("ln([2-9])", "ln^\\1", Res)
        Res <- eval(parse(text = Res))
    } else { # Make a nicer string
        # Eliminate brackets
        Res <- gsub("[(]([A-Za-z0-9._]*)[)]", " \\1", Res)
        # Transform powers
        Res <- gsub("ln([2-9])", "ln^\\1", Res)
    }

    # Return the result
    return(Res)
}

# Here is a nicer presentation as a string
formulaTransform(Form)

# Here is an even nicer presentation (creating an expression)
plot(1:3, type = "n")
text(2, 2, formulaTransform(Form, TRUE))

# The latter form is really interesting when you use, for instance,
# Greek letters for variables, and so on...
Form2 <- ln(alpha) ~ ln(beta) + ln2(beta) + ln3(beta)

formulaTransform(Form2)
plot(1:3, type = "n")
text(2, 2, formulaTransform(Form2, TRUE))

# ... but this could be refined even more!
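
The two substitutions in the string branch of formulaTransform can be traced by hand. In this sketch the input string is typed in directly; it is assumed to be what paste(as.character(Form)[c(2, 1, 3)], collapse = " ") yields for the formula Form above.

```r
# Assumed string form of: ln(D) ~ ln(N) + ln2(N) + ln(t)
s <- "ln(D) ~ ln(N) + ln2(N) + ln(t)"

# Eliminate brackets: "(name)" becomes " name".
s1 <- gsub("[(]([A-Za-z0-9._]*)[)]", " \\1", s)
s1
# [1] "ln D ~ ln N + ln2 N + ln t"

# Transform powers: "ln2" becomes "ln^2" (single digits 2 to 9 only).
s2 <- gsub("ln([2-9])", "ln^\\1", s1)
s2
# [1] "ln D ~ ln N + ln^2 N + ln t"
```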

Best,

Philippe Grosjean

..<°}))><
  ) ) ) ) )
( ( ( ( (Prof. Philippe Grosjean
  ) ) ) ) )
( ( ( ( (Numerical Ecology of Aquatic Systems
  ) ) ) ) )   Mons-Hainaut University, Pentagone (3D08)
( ( ( ( (
..


Re: [R] regular expressions, sub

2006-01-27 Thread paul sorenson
There are some interactive regex tools around.  I use a python one 
sometimes.  You just then have to be careful re escaping and the style 
of regular expressions used in the tool you worked with and the target 
environment.
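
As an illustration of the escaping point, here is a sketch of how a pattern typed as log\( in such a tool would be written in R. It assumes R 4.0.0 or later for the raw-string form r"(...)"; in older versions only the doubled-backslash form is available.

```r
# The regex  log\(  as it would be typed in an external tool,
# written two ways in R:
escaped <- "log\\("      # ordinary string: the backslash is doubled
raw     <- r"(log\()"    # raw string (R >= 4.0.0): typed as-is

same <- identical(escaped, raw)   # TRUE: both hold the 5 characters log\(
n    <- nchar(escaped)            # 5

out <- sub(raw, "ln ", "log(N)")
out
# [1] "ln N)"
```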



Re: [R] regular expressions, sub

2006-01-27 Thread Gabor Grothendieck
In this post:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/30590.html

Thomas Lumley provided a function to traverse a formula recursively.
We can modify it as shown to transform ln(m)^n to ln^n(m), producing
proc2.  We then bundle everything up into proc3, which uses substitute
to translate log to ln and remove I, then calls proc2 to do the aforementioned
transformation, and finally uses simple character processing to clean up the
rest.

Although this is substantially longer in terms of lines of code,
we did not have to write many of them, because proc2 is actually
just a modification of the code in the indicated post and the
character processing becomes extremely simple.  It is also more
powerful, being able to handle expressions like:

log(D) ~ log(log(N)^2)^3




proc2 <- function(formula) {
  process <- function(expr) {
    if (length(expr) == 1)
      return(expr)
    if (length(expr) == 2) {
      expr[[2]] <- process(expr[[2]])
      return(expr)
    }
    if (expr[[1]] == as.name("^") && length(expr[[2]]) == 2 &&
        expr[[2]][[1]] == as.name("ln") &&
        class(idx <- expr[[3]]) == "numeric") {
      expr <- as.call(list(as.name(paste("ln", idx, sep = "^")),
                           expr[[2]][[2]]))
      expr[[2]] <- process(expr[[2]])
      return(expr)
    }
    expr[[2]] <- process(expr[[2]])
    expr[[3]] <- process(expr[[3]])
    return(expr)
  }
  formula[[3]] <- process(formula[[3]])
  formula
}

proc3 <- function(f) {

  # replace log with ln
  result <- do.call("substitute", list(f, list(log = as.name("ln"))))

  # remove I (I(x) becomes (x))
  result <- do.call("substitute", list(result, list(I = as.name("("))))

  # transform ln(m)^n to ln^n(m)
  result <- proc2(result)

  # now clean up using simple character substitutions
  result <- deparse(result)

  # ( -> space
  result <- gsub("[(]", " ", result)

  # remove ", ` and )  (deparse may quote the name ln^n with either " or `)
  gsub("[\"`)]", "", result)
}

# tests

proc3( log(D) ~ log(N)+I(log(N)^2)+log(t) ) # "ln D ~ ln N +  ln^2 N + ln t"

proc3( log(D) ~ log(log(N)^2)^3)   # "ln D ~ ln^3 ln^2 N"
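
For readers new to this style, the idea behind the recursive traversal can be sketched on a smaller task. The helper rename_calls below is hypothetical (it is not from the post cited above); it relies only on the fact that a formula is a nested call whose first element is the function name. It assumes no argument is NULL or missing, which holds for ordinary model formulas.

```r
# Hypothetical helper: rename every call to `from` into a call to `to`.
rename_calls <- function(expr, from, to) {
  if (is.call(expr)) {
    # The head of a call is its function name.
    if (identical(expr[[1]], as.name(from)))
      expr[[1]] <- as.name(to)
    # Recurse into the arguments (everything after the head).
    for (i in seq_along(expr)[-1])
      expr[[i]] <- rename_calls(expr[[i]], from, to)
  }
  expr
}

e <- quote(log(D) ~ log(N) + I(log(N)^2))

head_name <- as.character(e[[1]])   # "~": the formula operator is the head
lhs       <- deparse(e[[2]])        # "log(D)"

out <- deparse(rename_calls(e, "log", "ln"))
out
# [1] "ln(D) ~ ln(N) + I(ln(N)^2)"
```

Working on the call tree rather than on the string means nested parentheses need no special handling, which is why this approach copes with expressions that defeat a single regular expression.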


