Re: [R] subset arg in (modified) evalq

2007-05-18 Thread Vadim Ogranovich
Sorry, I didn't explain myself clearly enough. I knew about the select arg in 
subset(). My question was, given the expression expression(summary(x+y)), how 
to extract all names that will be looked up during its evaluation. 
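
For readers of the archive, a minimal sketch of one way to get those names, assuming base R's all.vars() and all.names() are acceptable. The data frame `dat` and its `junk` column are made up for illustration. Note that all.vars() also reports names that are not columns (globals, for instance), hence the intersect().

```r
# Expression whose variables we want to know before evaluating it.
ex <- expression(summary(x + y))

# all.vars() lists variable names only; all.names() also lists the
# functions that appear in the expression.
vars <- all.vars(ex)
funs_and_vars <- all.names(ex)

# Hypothetical data with one column the expression never touches.
dat <- data.frame(x = c(-1, 2, 3), y = c(10, 20, 30), junk = c("a", "b", "c"))

# Keep the rows with x > 0 and only the columns the expression mentions.
cols <- intersect(vars, names(dat))
needed <- dat[dat$x > 0, cols, drop = FALSE]
result <- eval(ex[[1]], needed)
```

The same column vector could be passed as the select argument of subset(); plain indexing is used here only to keep the sketch free of non-standard evaluation.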

As to checking performance assumptions, you are right, in most cases the 
overhead is negligible, but sometimes I work with really big data sets. 

Thanks a lot for your help, 
Vadim 


- Original Message - 
From: "Gabor Grothendieck" <ggrothendieck@gmail.com> 
To: "Vadim Ogranovich" <vogranovich@jumptrading.com> 
Cc: r-help@stat.math.ethz.ch 
Sent: Friday, May 18, 2007 9:53:26 AM (GMT-0600) America/Chicago 
Subject: Re: [R] subset arg in (modified) evalq 

I would check your performance assumption with an actual test before 
concluding such but at any rate subset does have a select argument. See 
?subset 

On 5/18/07, Vadim Ogranovich <vogranovich@jumptrading.com> wrote: 
> Thanks Gabor! This does exactly what I wanted. 
> 
> One follow-up question, how to extract the var names, in this case y, z, 
> from the expression? The subset function creates a new object and this may 
> be expensive when the data has a lot of irrelevant columns. So I thought 
> that I could reduce this to the columns I actually need. 
> 
> Thanks, 
> Vadim 
> 
> 
> 
> - Original Message - 
> From: "Gabor Grothendieck" <ggrothendieck@gmail.com> 
> To: "Vadim Ogranovich" <vogranovich@jumptrading.com> 
> Cc: r-help@stat.math.ethz.ch 
> Sent: Friday, May 18, 2007 9:19:49 AM (GMT-0600) America/Chicago 
> Subject: Re: [R] subset arg in (modified) evalq 
> 
> Try this: 
> 
> with(subset(data, x > 0), summary(y + z)) 
> 
> 
> On 5/18/07, Vadim Ogranovich <vogranovich@jumptrading.com> wrote: 
> > Hi, 
> > 
> > When using evalq to evaluate expressions within, say, a data.frame context I 
> > often wish there was a 'subset' argument, much like in lm() or any other 
> > advanced regression model. I would be grateful for a tip how to do this. 
> > 
> > Here is an illustration of what I want: 
> > 
> > n <- 100 
> > data <- data.frame(x=rnorm(n), y=rnorm(n), z=rnorm(n)) 
> > 
> > # this works 
> > evalq({ i <- 0 
> > # I want to do the above w/o explicit subscripting, e.g. 
> > myevalq(summary(y + z), subset=0 
> > Thanks, 
> > Vadim 
> > 
> 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset arg in (modified) evalq

2007-05-18 Thread Vadim Ogranovich
Thanks Gabor! This does exactly what I wanted. 

One follow-up question, how to extract the var names, in this case y, z, from 
the expression? The subset function creates a new object and this may be 
expensive when the data has a lot of irrelevant columns. So I thought that I 
could reduce this to the columns I actually need. 

Thanks, 
Vadim 


- Original Message - 
From: "Gabor Grothendieck" <[EMAIL PROTECTED]> 
To: "Vadim Ogranovich" <[EMAIL PROTECTED]> 
Cc: r-help@stat.math.ethz.ch 
Sent: Friday, May 18, 2007 9:19:49 AM (GMT-0600) America/Chicago 
Subject: Re: [R] subset arg in (modified) evalq 

Try this: 

with(subset(data, x > 0), summary(y + z)) 
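
A sketch of how the requested helper could be built on the same idea. The name `myevalq` comes from the original question; the argument order and the column pruning via all.vars() are assumptions made here for illustration, not anything from base R.

```r
# Hypothetical helper: evaluate 'expr' inside 'data', restricted to the
# rows where 'subset' is TRUE. Both are captured unevaluated.
myevalq <- function(expr, data, subset) {
  expr <- substitute(expr)
  if (!missing(subset)) {
    keep <- eval(substitute(subset), data, parent.frame())
    keep <- keep & !is.na(keep)
    # Copy only the columns the expression actually refers to.
    cols <- intersect(all.vars(expr), names(data))
    data <- data[keep, cols, drop = FALSE]
  }
  eval(expr, data, parent.frame())
}

set.seed(1)
n <- 100
dat <- data.frame(x = rnorm(n), y = rnorm(n), z = rnorm(n))
res <- myevalq(summary(y + z), dat, subset = x > 0)
```

Variables that are not columns of the data are still found in the caller's environment, as with evalq().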


On 5/18/07, Vadim Ogranovich <[EMAIL PROTECTED]> wrote: 
> Hi, 
> 
> When using evalq to evaluate expressions within, say, a data.frame context I 
> often wish there was a 'subset' argument, much like in lm() or any other 
> advanced regression model. I would be grateful for a tip how to do this. 
> 
> Here is an illustration of what I want: 
> 
> n <- 100 
> data <- data.frame(x=rnorm(n), y=rnorm(n), z=rnorm(n)) 
> 
> # this works 
> evalq({ i <- 0 
> # I want to do the above w/o explicit subscripting, e.g. 
> myevalq(summary(y + z), subset=0 
> Thanks, 
> Vadim 
> 



[R] subset arg in (modified) evalq

2007-05-18 Thread Vadim Ogranovich
Hi, 

When using evalq to evaluate expressions within, say, a data.frame context I 
often wish there was a 'subset' argument, much like in lm() or any other 
advanced regression model. I would be grateful for a tip how to do this. 

Here is an illustration of what I want: 

n <- 100 
data <- data.frame(x=rnorm(n), y=rnorm(n), z=rnorm(n)) 

# this works 
evalq({ i <- 0


[R] convert text to expression good for lm arguments

2007-05-03 Thread Vadim Ogranovich
Hi, 

I ran into a problem converting a text representation of an expression into 
a parsed expression to be further evaluated inside lm(). 

> n <- 100 
> data <- data.frame(x=rnorm(n), y=rnorm(n)) 
> data.lm <- lm(y ~ x, data=data) 
> 
> ## this works 
> update(data.lm, subset=x<0) 

Call: 
lm(formula = y ~ x, data = data, subset = x < 0) 

Coefficients: 
            (Intercept)                        x  
-0.07864094193322170023  -0.14596982635007796358  

> 
> ## this doesn't work 
> ## text representation of subset 
> subset <- "x<0" 
> update(data.lm, subset=parse(text=subset)) 
Error in `[.data.frame`(list(y = c(-0.601925958140825, -0.111931189071517, : 
invalid subscript type 

What is the correct way to convert "x<0" into a valid subset argument? 
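
One possible approach, sketched under the assumption that the text holds a single R expression: parse() returns an expression vector, so take its first element (the call x < 0) and splice it into the update() call before evaluation. The names `dat`, `fit` and `subset_text` below are illustrative only.

```r
set.seed(1)
n <- 100
dat <- data.frame(x = rnorm(n), y = rnorm(n))
fit <- lm(y ~ x, data = dat)

subset_text <- "x<0"
# parse() gives an expression vector; [[1]] extracts the call x < 0.
subset_call <- parse(text = subset_text)[[1]]

# bquote() substitutes the call for .(subset_call), so update() sees
# 'subset = x < 0' exactly as if it had been typed.
fit_sub <- eval(bquote(update(fit, subset = .(subset_call))))

# Reference fit written out by hand, for comparison.
fit_ref <- update(fit, subset = x < 0)
```

The splice matters because update() records its extra arguments unevaluated: passing parse(text=subset) directly puts that call, not x < 0, into the refitted model call.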

Thanks, 
Vadim 



Re: [R] extracting intercept from ppr fit

2007-04-17 Thread Vadim Ogranovich
Sorry for triple-posting: I seem to have a problem w/ my mail client. 

Hi, 

Is there a way, documented or not, to extract the intercept term (the alpha_0 
in the MASS book) from a ppr() (Projection Pursuit Regression) fit? 

Thanks, 
Vadim 

## Example: 
n <- 1000 

data <- data.frame(x=rnorm(n), y=rnorm(n)) 

a <- 10 
data$z <- evalq(a + atan(x + y) + rnorm(n), data) 

data.ppr <- ppr(z ~ x + y, data=data, nterms=1) 

## how to extract a = 10 from data.ppr? 
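
A hedged sketch of one candidate answer: as far as I can tell from ?ppr, the fitted object carries a component `yb`, described there as the weighted mean of the response, which ppr() removes before fitting the ridge terms. Treating it as the intercept is an assumption of this sketch, reasonable here because atan(x + y) is symmetric around zero.

```r
set.seed(1)
n <- 1000
dat <- data.frame(x = rnorm(n), y = rnorm(n))
a <- 10
dat$z <- a + atan(dat$x + dat$y) + rnorm(n)

fit <- ppr(z ~ x + y, data = dat, nterms = 1)

# 'yb' is the (weighted) mean of the response stored in the fit; with a
# single response it is a length-one numeric.
intercept <- as.numeric(fit$yb)
```

With unit weights this should agree with mean(dat$z), so it estimates a only up to sampling noise.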



[R] approx with ties = 'ordered'

2007-04-03 Thread Vadim Ogranovich
Hi,

I am a bit surprised how approx resolves ties when ties = 'ordered'. In the
following two examples I'd expect the first expression to return 1 (not 2).

The documentation reads that "'f=0' is right-continuous and 'f=1' is
left-continuous" so one would expect the 'f' argument to matter when resolving
ties.

Thanks,
Vadim

->  approx(c(1,1), seq(2), 1, method = "const", rule = 2, f = 1, yleft = NA,
           ties = "ordered")$y
[1] 2
->  approx(c(1,1), seq(2), 1, method = "const", rule = 2, f = 0, yleft = NA,
           ties = "ordered")$y
[1] 2
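
A small sketch of one way to make the tie handling explicit rather than relying on 'ordered': approx() also accepts a function for `ties`, which is applied to the y values that share an x. This does not explain the behaviour reported above; it only shows how to choose the value directly.

```r
x <- c(1, 1)
y <- seq(2)

# Collapse the tied x values with an explicit rule before interpolating.
y_min <- approx(x, y, xout = 1, method = "constant", rule = 2, ties = min)$y
y_max <- approx(x, y, xout = 1, method = "constant", rule = 2, ties = max)$y
```

Here ties = min keeps the smaller of the two tied y values and ties = max the larger.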


-> version
               _
platform       i486-pc-linux-gnu
arch           i486
os             linux-gnu
system         i486, linux-gnu
status
major          2
minor          4.1
year           2006
month          12
day            18
svn rev        40228
language       R
version.string R version 2.4.1 (2006-12-18)



[R] Snow vs Rmpi

2007-02-14 Thread Vadim Ogranovich
Hi, 

I have a few high-level questions about the Snow and Rmpi packages. I understand 
that Snow uses Rmpi as one of its possible transport layers, yet my questions are 
about user experience, not technical details: 

1. Does Snow install and work well in Windows? 
2. Interruptibility. I understand that currently it is impossible to interrupt 
a running top-level command in Snow (Ctrl-C or the like); the only way to kill 
slave processes is to kill the master R process. Is this accurate? What about 
Rmpi? Is there any difference between Windows and Linux? 
3. When the master process dies, is it guaranteed that the slaves will die 
too? How reliable is this (I've seen some applications, not related to R, that 
were flaky about killing slaves)? 

Thank you very much for your help, 
Vadim 





Re: [R] Redirect console to file

2005-06-09 Thread Vadim Ogranovich
Will work, but as you mentioned there ought to be an easier way :-) 

> -Original Message-
> From: Mike R [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, June 09, 2005 10:29 PM
> To: Vadim Ogranovich
> Subject: Re: [R] Redirect console to file
> 
> On 6/9/05, Mike R <[EMAIL PROTECTED]> wrote:
> > On 6/9/05, Vadim Ogranovich <[EMAIL PROTECTED]> wrote:
> > > Hi,
> > >
> > > Is it possible to redirect the stuff that normally goes to the R
> > > console window into a file. sink() does this for stderr and stdout.
> > > But I need something that redirects "everything" that appears in the
> > > console window (including the top-level commands).
> > >
> > > I want to achieve the same effect as that of the following command
> > > under sh:
> > > R < foo.R &> foo.Rt
> > >
> > > only I want the redirection to be activated within the foo.R script.
> > > (The reason being that the name of the file is computed inside
> > > foo.R)
> > 
> > 
> > since foo.Rt will be an ascii file, write the name of the file to a 
> > generic name, say foo.Rt
> > 
> > then run your R session with a shell script.   after running R,
> > have the shell script use grep or sed to grab the specific filename 
> > from the contents of the generic file (foo.Rt). then have the shell 
> > script rename the generic file.
> > 
> > .. that being said, there has got to be an easier way 
> 
> let me restate that in english (LOL)
> 
> Since the output file (foo.Rt) will be an ascii file, have 
> your R-code write the computed name of the file to the 
> generically named output file (foo.Rt) along with everything else.
> 
> Then run your R session with a shell script.   After running R,
> have the shell script use grep or sed to grab the computed 
> filename from the contents of the generic file (foo.Rt). Then 
> have the shell script rename the generic file to the computed name.
>



[R] Redirect console to file

2005-06-09 Thread Vadim Ogranovich
Hi,
 
Is it possible to redirect the stuff that normally goes to the R console
window into a file? sink() does this for stderr and stdout. But I need
something that redirects "everything" that appears in the console window
(including the top-level commands).
 
I want to achieve the same effect as that of the following command under
sh:
R < foo.R &> foo.Rt
 
only I want the redirection to be activated within the foo.R script.
(The reason being that the name of the file is computed inside foo.R)
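
One possible approach from inside R, sketched with made-up file names: compute the log name, sink() both output and messages to it, and run the real work through source(echo = TRUE) so that the commands are echoed into the same file. This assumes the body of the job can live in a file (or be sourced from a small driver) and is not an exact equivalent of the shell redirection.

```r
# Stand-in for the real work; in practice this would be the body of foo.R.
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1 + 1", "print(x)"), script)

# The log file name is computed inside R (a hypothetical naming scheme).
log_file <- file.path(tempdir(), paste0("run-", Sys.getpid(), ".Rt"))
con <- file(log_file, open = "wt")

sink(con)                     # regular output
sink(con, type = "message")   # messages, warnings, errors
invisible(tryCatch(
  source(script, echo = TRUE, max.deparse.length = Inf),
  finally = {
    # Always restore the console, even if the script fails.
    sink(type = "message")
    sink()
    close(con)
  }
))

log_lines <- readLines(log_file)
```

After this runs, log_lines holds both the echoed commands and their printed results.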
 
Thanks,
Vadim



RE: [R] R annoyances

2005-05-19 Thread Vadim Ogranovich
I guess it depends on what kind of data analysis one does. R is designed
and best suited for the analysis that starts with a data frame which
fits in 1/10th of your computer RAM. R programming is then mostly
limited to writing small convenience functions for better presentation,
visualization, etc. Or alternatively one implements a new fitting
procedure/algorithm and applies it to the data.

Now things begin to look harder when you have 200G of data and 8G of RAM
and still need to find "structure" in the data. You need to pre-process
the data, recover from *unexpected* failures, store and retrieve
intermediate data sets, etc. This requires qualities of a good
general-purpose programming language. Note, we do not use R to program a
system, we do data analysis so we should be considered R *users*.
In my view, and the experience of my colleague confirms it, R has
a long way to go to become a wrinkle-free general purpose language.

To your specific question, why should good (C++) programmers not
struggle with R? Because they have the skills to plan sizeable programs
in any wrinkle-free language.

Hope this makes my earlier comments clearer,
Vadim

> -Original Message-
> From: Berton Gunter [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, May 19, 2005 10:55 AM
> To: Vadim Ogranovich; 'Thomas Lumley'; 'Rod Montgomery'
> Cc: r-help@stat.math.ethz.ch
> Subject: RE: [R] R annoyances
> 
> Vadim et.al:
> 
> I do not care to comment one way or the other about R's 
> "irregularities."
> But I am puzzled by your statement that a "good C++ 
> programmer is struggling with R." Why should they not 
> struggle?! R is primarily a language for data analysis, 
> statistics, and graphics. I do not understand why someone who is a
> C++ programmer would be expected to have the knowledge and 
> experience to be
> a "data miner" and would not therefore struggle to deal with 
> the statistical and data analysis issues that are 
> deliberately at the heart of many of R's programming conventions.
> 
> Is there something here that I am missing, or is this yet 
> another example of Frank Harrell's "instant brain surgeon" commentary?
> 
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
>  
> "The business of the statistician is to catalyze the 
> scientific learning process."  - George E. P. Box
>  
>  
> 
> > -Original Message-
> > From: [EMAIL PROTECTED] 
> > [mailto:[EMAIL PROTECTED] On Behalf Of Vadim 
> > Ogranovich
> > Sent: Thursday, May 19, 2005 10:40 AM
> > To: Thomas Lumley; Rod Montgomery
> > Cc: r-help@stat.math.ethz.ch
> > Subject: RE: [R] R annoyances
> > 
> > I think the flaw in this reasoning is that programmers are not 
> > considered users. IMO, making a better language is beneficial for 
> > users.
> > 
> > I am now watching how a new colleague of mine, a very good C++ 
> > programmer turning into a data miner, is struggling w/ many 
> > "irregularities" of R.
> > 
> > > -Original Message-
> > > From: [EMAIL PROTECTED] 
> > > [mailto:[EMAIL PROTECTED] On Behalf Of 
> Thomas Lumley
> > > Sent: Thursday, May 19, 2005 9:39 AM
> > > To: Rod Montgomery
> > > Cc: r-help@stat.math.ethz.ch
> > > Subject: Re: [R] R annoyances
> > > 
> > > On Thu, 19 May 2005, Rod Montgomery wrote:
> > > > Thomas Lumley wrote:
> > > >> This one is actually a FAQ,
> > > >> mtx[,1,drop=FALSE]
> > > >> 
> > > >> -thomas
> > > >> 
> > > > I wonder whether there is, or should be, a way to set FALSE
> > > as the default?
> > > >
> > > 
> > > There shouldn't be (apart from editing the code), because 
> you really 
> > > don't want something this basic to be unpredictable.
> > > 
> > > There have been discussions at several times about whether 
> > > drop=FALSE or drop=TRUE should be the default. The decision has 
> > > always been that programmers can cope either way, but that users 
> > > probably don't expect mtx[,1] to be a vector, and that they 
> > > definitely don't expect mtx[1,1] to be a matrix.
> > > 
> > >   -thomas
> > > 



RE: [R] R annoyances

2005-05-19 Thread Vadim Ogranovich
I think the flaw in this reasoning is that programmers are not
considered users. IMO, making a better language is beneficial for users.

I am now watching how a new colleague of mine, a very good C++
programmer turning into a data miner, is struggling w/ many
"irregularities" of R.

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Thomas Lumley
> Sent: Thursday, May 19, 2005 9:39 AM
> To: Rod Montgomery
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] R annoyances
> 
> On Thu, 19 May 2005, Rod Montgomery wrote:
> > Thomas Lumley wrote:
> >> This one is actually a FAQ,
> >> mtx[,1,drop=FALSE]
> >> 
> >> -thomas
> >> 
> > I wonder whether there is, or should be, a way to set FALSE 
> as the default?
> >
> 
> There shouldn't be (apart from editing the code), because you 
> really don't want something this basic to be unpredictable.
> 
> There have been discussions at several times about whether 
> drop=FALSE or drop=TRUE should be the default. The decision 
> has always been that programmers can cope either way, but 
> that users probably don't expect mtx[,1] to be a vector, and 
> that they definitely don't expect mtx[1,1] to be a matrix.
> 
>   -thomas
> 
>
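
For readers unfamiliar with the FAQ entry quoted above, a short illustration of what the drop argument changes. The matrix `mtx` here is made up.

```r
mtx <- matrix(1:6, nrow = 2)

v <- mtx[, 1]                # default drop = TRUE: plain vector, no dim
m <- mtx[, 1, drop = FALSE]  # dimensions kept: a 2 x 1 matrix
s <- mtx[1, 1]               # single element, also dropped to a vector
```

Code that must keep working when a selection shrinks to one column can pass drop = FALSE explicitly.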



RE: [R] time zones, daylight saving etc.

2005-05-12 Thread Vadim Ogranovich
Works for me on Linux:
> Sys.time()
[1] "2005-05-12 10:22:31 PDT"
> Sys.putenv(TZ="GMT")
> Sys.time()
[1] "2005-05-12 17:22:37 GMT"

I extensively use the reset of TZ to parse times. 
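
A sketch of the "set it, do the processing, set it back" helper asked about in the quoted message below. The name `with_tz` is made up. It uses Sys.setenv(), which replaced Sys.putenv() in later R versions, and whether a TZ change takes effect inside a running session is platform dependent, as the Windows report below shows.

```r
# Hypothetical helper: evaluate 'expr' with TZ temporarily set to 'tz'.
with_tz <- function(tz, expr) {
  old <- Sys.getenv("TZ", unset = NA)
  # Restore the previous state on exit, including "TZ was not set".
  on.exit(
    if (is.na(old)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old),
    add = TRUE
  )
  Sys.setenv(TZ = tz)
  force(expr)  # the promise is evaluated only now, under the new TZ
}

# Format the Unix epoch as seen from GMT.
epoch_gmt <- with_tz("GMT", format(.POSIXct(0), "%Y-%m-%d %H:%M"))
```

Because R evaluates arguments lazily, the expression passed in is not computed until force(expr), which is what lets the helper change TZ first.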

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Gabor 
> Grothendieck
> Sent: Thursday, May 12, 2005 6:18 AM
> To: Prof Brian Ripley
> Cc: Carla Meurk; r-help@stat.math.ethz.ch
> Subject: Re: [R] time zones, daylight saving etc.
> 
> I have tried this but on Windows XP R 2.1.0 found I had to 
> set it outside of R prior to starting R. 
> 
> 1. unsuccessful
> 
> > Sys.time()
> [1] "2005-05-12 09:08:03 Eastern Daylight Time"
> > Sys.putenv(TZ="GMT")
> > Sys.time() # no change
> [1] "2005-05-12 09:08:12 Eastern Daylight Time"
> 
> 2. OK
> 
> C:\>set tz=GMT
> 
> C:\>start "" "\Program Files\R\rw2010\bin\r.exe"
> 
> R : Copyright 2005, The R Foundation for Statistical 
> Computing Version 2.1.0 Patched (2005-04-18), ISBN 3-900051-07-0
> 
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
> 
>   Natural language support but running in an English locale
> 
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and 'citation()' 
> on how to cite R or R packages in publications.
> 
> Type 'demo()' for some demos, 'help()' for on-line help, or 
> 'help.start()' for a HTML browser interface to help.
> Type 'q()' to quit R.
> 
> > Sys.time()
> [1] "2005-05-12 13:10:58 GMT"
> 
> I assume it could be set in .Renviron but it would be nice if 
> one could set it right from within R so that one can write a 
> function that sets it, does processing and then sets it back. 
>  Don't know if this is possible.
> 
> On 5/12/05, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
> > Would it not just be easier to set the timezone to GMT for the 
> > duration of the calculations?  I don't see an OS mentioned 
> here, but 
> > on most TZ=GMT for the session will do it.
> > 
> > On Thu, 12 May 2005, Rich FitzJohn wrote:
> > 
> > > Hi,
> > >
> > > seq.dates() in the chron package does not allow creating sequences 
> > > by minutes, so you'd have to roll your own sequence generator.
> > >
> > > Looks like the tzone attribute of the times is lost when using 
> > > min(), max() and seq().  You can apply it back manually, but it does not 
> > > affect the calculation, since POSIXct times are stored as seconds 
> > > since 1/1/1970 (?DateTimeClasses).
> > >
> > > ## These dates/times just span the move from NZDT to NZST:
> > > dt.dates <- paste(rep(15:16, c(5,7)), "03", "2003", sep="/")
> > > dt.times <- paste(c(19:23, 0:6), "05", sep=":")
> > > dt <- paste(dt.dates, dt.times)
> > >
> > > ## No shift in times, or worrying about daylight savings; appropriate
> > > ## iff the device doing the recording was not itself adjusting for
> > > ## daylight savings, presumably.
> > > datetime <- as.POSIXct(strptime(dt, "%d/%m/%Y %H:%M"), "GMT")
> > >
> > > ## Create two objects with all the times in your range, one with the
> > > ## tzone attribute set back to GMT (to match datetimes), and one
> > > ## without this.
> > > mindata1 <- mindata2 <- seq(from=min(datetime), to=max(datetime),
> > >                             by="mins")
> > > attr(mindata2, "tzone") <- "GMT"
> > >
> > > fmt <- "%Y %m %d %H %M"
> > > ## These both do the matching correctly:
> > > match(format(datetime, fmt), format(mindata1, fmt, tz="GMT"))
> > > match(format(datetime, fmt), format(mindata2, fmt, tz="GMT"))
> > >
> > > ## However, the first of these will not, as it gets the timezone all
> > > ## wrong, since it's neither specified in the call to format(), or as
> > > ## an attribute of the POSIXct object.
> > > match(format(datetime, fmt), format(mindata1, fmt))
> > > match(format(datetime, fmt), format(mindata2, fmt))
> > >
> > > ## It is also possible to run match() directly off the POSIXct object,
> > > ## but I'm not sure how this will interact with things like leap
> > > ## seconds:
> > > match(datetime, mindata1)
> > >
> > > Time zones do my head in, so you probably want to check this all 
> > > pretty carefully.  Looks like there's lots of gotchas (e.g. 
> > > subsetting a POSIXct object strips the tzone attribute).
> > >
> > > Cheers,
> > > Rich
> > >
> > > On 5/12/05, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> > >> You could use the chron package.  It represents date times without 
> > >> using time zones so you can't have this sort of problem.
> > >>
> > >> On 5/10/05, Carla Meurk <[EMAIL PROTECTED]> wrote:
> > >>> Hi,  I have a whole bunch of data, which looks like:
> > >>>
> > >>> 15/03/2003   10:20  1
> > >>> 15/03/2003   10:21  0
> > >>> 15/03/2003   12:02  0
> > >>> 16/03/2003   06:10  0
> > >>> 16/03/2003   06:20  0.5
> > >>> 16/03/2003   06:30  0
> > >>> 16/03/2003   06:40  0
> > >>> 16/0

[R] distance between distributions

2005-05-05 Thread Vadim Ogranovich
Hi,
 
This is more of a general stat question. I am looking for an easily
computable measure of a distance between two empirical distributions.
Say I have two samples x and y drawn from X and Y. I want to compute a
statistic rho(x,y) which is zero if X = Y and grows as X and Y become
less similar.
 
Kullback-Leibler distance is the most "official" choice, however it
needs estimation of the density. The estimation of the density requires
one to choose a family of the distributions to fit from or to use some
sort of non-parametric estimation. I have no intuition whether the
resulting KL distance will be sensitive to the choice of the family of
the distribution or of the fitting method.
 
Any suggestion of an alternative measure or insight into sensitivity of
the KL distance will be highly appreciated.
 
The distributions I deal with are those of stock returns and
qualitatively close to the normal dist with much fatter tails. The tails
in general should be modeled non-parametrically.
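
One candidate that needs no density estimate, offered only as a sketch and not as something proposed in this thread: the two-sample Kolmogorov-Smirnov distance, the largest gap between the two empirical CDFs. The function name `ks_distance` and the simulated samples are made up; being a sup-norm on CDFs, this measure is not very sensitive to differences far out in the tails, which may matter for return data.

```r
# Largest absolute difference between the empirical CDFs of x and y,
# evaluated at every observed point of either sample.
ks_distance <- function(x, y) {
  grid <- sort(c(x, y))
  max(abs(ecdf(x)(grid) - ecdf(y)(grid)))
}

set.seed(1)
x <- rnorm(200)          # stand-in for one sample
y <- rt(300, df = 3)     # fatter-tailed stand-in for the other

d_xy <- ks_distance(x, y)
d_xx <- ks_distance(x, x)
```

The same number is reported as the statistic of ks.test(x, y).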
 
Thanks,
Vadim



RE: [R] congratulations to the JGR developers

2005-05-02 Thread Vadim Ogranovich
Well put, Gabor!

P.S. Sorry for wasting the bandwidth. Just couldn't resist, so true
it is. 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Gabor 
> Grothendieck
> Sent: Monday, May 02, 2005 8:06 AM
> To: Andy Bunn
> Cc: R-Help
> Subject: Re: [R] congratulations to the JGR developers
> 
> On 5/2/05, Andy Bunn <[EMAIL PROTECTED]> wrote:
> > > Just want to offer my congratulations to the JGR developers as the 
> > > recipients of the 2005 Chambers Award.  Great job, guys!!
> > > http://stats.math.uni-augsburg.de/JGR/
> > 
> > This feels like the future of R to me. It's simple, powerful, and 
> > elegant just like R. As soon as the binary that works with 2.1 is 
> > released I'll use it exclusively on Linux and Windows. I'm deeply 
> > impressed.
> 
> I have not tried JGR but regarding your three adjectives 
> describing R, R is very powerful but I am not sure I would 
> characterize it as simple and elegant -- complex and 
> practical seem nearer to the mark to me.
> Some parts of R may be simple and elegant but when I think of 
> simple and elegant languages I think of ones that are 
> organized around a single concept like APL (arrays), Smalltalk 
> (objects), etc. The underlying Lisp roots of R may have a certain 
> simplicity to them and S3 (though not S4) is relatively simple 
> but not R as a whole.  On the other 
> hand, the fact that it is practical, powerful and free with a 
> broad set of builtin and addon functionality and has become a 
> de facto standard for statistical research have been 
> sufficient reason for me to do all my computing using R.
> 



RE: [R] ANNOUNCE: S-PLUS 7.0

2005-04-11 Thread Vadim Ogranovich
David,

From the white paper, the BIG DATA THING looks quite impressive.
IMHO, it addresses the biggest limitation the S family has had so far. I
could, of course, think of a few features that I wish to see there, but
the existing functionality looks fairly complete. Congratulations!

An R folk,
Vadim

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of David Smith
> Sent: Monday, April 11, 2005 11:26 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] ANNOUNCE: S-PLUS 7.0
> 
> Insightful is proud to announce a major update to S-PLUS available
> today: S-PLUS 7. S-PLUS 7 was designed to enable 
> statisticians to create targeted statistical applications 
> with large data sets that can be deployed to business users, 
> researchers, analysts and other end users who do not have 
> special expertise in statistical methods.
> 
> S-PLUS 7 is the result of scores of interviews with S-PLUS 
> users which drove the design and development of the new 
> features.  S-PLUS 7 has also benefited from an extensive beta 
> test program involving many participants on s-news and 
> r-help, to whom I offer sincere thanks for their feedback 
> during the development process.
> 
> The S-PLUS 7 release includes a new member of the S-PLUS 
> product family, S-PLUS Enterprise Developer, that provides 
> additional new features to S-PLUS, including:
> 
> * PIPELINE ARCHITECTURE and BIG DATA LIBRARY: S-PLUS 7 Enterprise
>   Developer introduces a new pipeline architecture by making it
>   possible to process gigabyte-sized data sets, even on machines with
>   modest amounts of RAM. With the new "big data" library, S-PLUS
>   programmers can import or create extremely large data objects by
>   using out-of-memory processing techniques. Instead of holding a
>   large data set entirely in memory, the pipeline architecture caches
>   the data file on disk and uses specialized streaming algorithms to
>   process the data, reading only a small portion of the data into
>   memory at a time.
> 
> * S-PLUS WORKBENCH INTEGRATED DEVELOPMENT ENVIRONMENT: an integrated
>   environment for S code development, based on the Eclipse
>   framework. This release offers the core functionality of code
>   editing, syntax error detection, project and task management,
>   interfaces with source code control systems, and interaction with
>   the S language engine.
> 
> You can read about the new features of S-PLUS, including a 
> link to a detailed white paper about the new big data library at:
> 
>www.insightful.com/products/splus/s7_features.asp
> 
> You can also learn more about the new capabilities of S-PLUS 
> 7 at a webinar I will be giving on April 19. More info at:
> 
>www.insightful.com/news_events/webcasts/2005/04splus
> 
> Finally, my thanks to all the members of the S community -- 
> including R folk -- who have provided such great discussion 
> and debate over the years, which has helped make S-PLUS what 
> it is today.
> 
> # David Smith
> 
> --
> David M Smith <[EMAIL PROTECTED]>
> Senior Product Manager, Insightful Corp, Seattle WA
> Tel: +1 (206) 802 2360
> Fax: +1 (206) 283 6310
> 
> New S-PLUS 7! Create advanced statistical applications with 
> large data sets.
> www.insightful.com/splus
> 



RE: [R] How to search an element in matrix ?

2005-04-10 Thread Vadim Ogranovich
A matrix is a vector as well (it is stored by columns), so it has two
ways of indexing [i,j] and [i]. It may be easier for you to use the
latter, thus
which(x == 1) returns all indexes where the matrix x is equal to 1.
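
A small sketch of both indexing styles, using a made-up 3x3 matrix (the name x and its values are only for illustration). By default which() gives linear, column-major indexes; with arr.ind = TRUE it gives row/column positions instead:

```r
# Hypothetical example matrix; values chosen only for illustration
x <- matrix(c(1, 0, 2,
              0, 1, 0,
              3, 0, 1), nrow = 3, byrow = TRUE)

lin <- which(x == 1)                  # linear (column-major) indexes
rc  <- which(x == 1, arr.ind = TRUE)  # matrix with columns "row" and "col"

x[lin]  # elements picked by linear index
x[rc]   # the same elements picked by (row, col) pairs
```

Either result can be used directly as a subscript of x, as the last two lines show.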

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of tong wang
> Sent: Sunday, April 10, 2005 9:37 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] How to search an element in matrix ?
> 
> Hi you guys,
>  I know this might be too simple a question to post, but 
> i searched a lot still couldn't find it.
> Just want to find an element in matrix and return its 
> index , i think there should be some matrix version of 
> "match" which only works for vector to me.
>  thanks in advance for your help.
> 
> best,
> tong
> 


RE: [R] Survey of "moving window" statistical functions - still looking for fast mad function

2005-04-01 Thread Vadim Ogranovich
Hi,
 
First, let me thank Jaroslaw for making this survey. I find it quite
illuminating.
 
Now the questions:
 
* the #1 solution below (based on cumsum) is numerically unstable.
Specifically if you do the runmean on a positive vector you can easily
get negative numbers due to rounding errors. Does anyone see a
modification which is free of this deficiency?
* is it possible to optimize the algorithm of the filter function,
solution #2 below, for the case of the  rep(1/k,k) kernel?
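
The first question can be probed with a small experiment. This is only a sketch: the test vector (a few large values followed by tiny ones) is an assumption chosen to provoke cancellation, and the direct window mean via embed() is used as the reference, not as a proposed fix.

```r
# cumsum-based running mean, as in solution #1 of the survey
runmean <- function(x, k) {
  n <- length(x)
  y <- x[k:n] - x[c(1, 1:(n - k))]  # difference from the previous window
  y[1] <- sum(x[1:k])               # first window sum
  cumsum(y) / k
}

# reference: mean of every complete window, computed independently
direct <- function(x, k) rowMeans(embed(x, k))

set.seed(1)
x <- c(rep(1e8, 10), runif(1000, 0, 1e-8))  # strictly positive input
k <- 5

fast <- runmean(x, k)
slow <- direct(x, k)

min(fast)              # may come out negative because of rounding in cumsum
min(slow)              # cannot be negative for positive input
max(abs(fast - slow))  # size of the accumulated error
```

Since each window is summed on its own in direct(), its rounding error does not carry over from one window to the next, which is what the cumsum version gives up for speed.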
 
Thanks,
Vadim

[R] Survey of "moving window" statistical functions - still looking for
fast mad function



From: Tuszynski, Jaroslaw W. <JAROSLAW.W.TUSZYNSKI_at_saic.com>
Date: Sat 09 Oct 2004 - 06:30:32 EST



Hi, 

Lately I ran into a problem: my R code was spending hours
performing simple moving-window statistical operations. As a result I
searched the archives for alternative (faster) ways of performing mean,
max, median and mad operations over a moving window (size 81) on a vector
with about 30K points. I then timed several of the ways that
were suggested, plus a few that I came up with. The purpose of this email
is to share some of my findings and ask for more suggestions (especially
about a moving mad function). 

The sum over a moving window can be computed in many different ways. Here
are some, sorted from the fastest to the slowest: 

1.  runmean = function(x, k) {
      n = length(x)
      y = x[ k:n ] - x[ c(1,1:(n-k)) ]  # this is a difference from the previous cell
      y[1] = sum(x[1:k]);               # find the first sum
      y = cumsum(y)                     # apply precomputed differences
      return(y/k)                       # return mean not sum
    }
2.  filter(x, rep(1/k,k), sides=2, circular=T) - (stats package) 
3.  kernapply(x, kernel("daniell", m), circular=T) 
4.  apply(embed(x,k), 1, mean) 
5.  mywinfun <- function(x, k, FUN=mean, ...) {  # suggested in news group
      n <- length(x)
      A <- rep(x, length=k*(n+1))
      dim(A) <- c(n+1, k)
      sapply(split(A, row(A)), FUN, ...)[1:(n-k+1)]
    }
6.  rollFun(x, k, FUN=mean) - (fSeries package) 
7.  rollMean(x, k) - (fSeries package) 
8.  SimpleMeanLoop = function(x, k) {   # simple-minded loop used as a baseline
      n = length(x)
      y = rep(0, n)
      k = k%/%2;
      for (i in (1+k):(n-k)) y[i] = mean(x[(i-k):(i+k)])
      return(y)
    }
9.  running(x, fun=mean, width=k) - (gtools package) 

Some of the above functions return results that are the same length as x and
some return arrays with length n-k+1. The relative speeds (on a Windows
machine) were as follows: 0.01, 0.09, 1.2, 8.1, 11.2, 13.4, 27.3, 63,
345. As one can see, there are about 5 orders of magnitude between the
fastest and the slowest. 
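
The difference in result length can be seen by putting two of the listed approaches side by side. This is a sketch under assumptions: it uses the default circular = FALSE (the survey used circular=T, which wraps around instead of padding), random test data, and an odd window width.

```r
set.seed(1)
x <- rnorm(200)
k <- 5                                 # odd window width, as sides = 2 expects

# centred moving average; the first and last k %/% 2 values are NA
via_filter <- as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))

# one row per complete window, so the result has length n - k + 1
via_embed <- rowMeans(embed(x, k))

inner <- !is.na(via_filter)            # positions with a complete window
all.equal(via_filter[inner], via_embed)
```

Once the NA-padded edges are dropped, the two results line up element by element, so timings of the two approaches compare like with like.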

Maximum over a moving window can be computed as follows, in order of speed: 

1.  runmax = function(x, k) {
      n = length(x)
      y = rep(0, n)
      m = k%/%2;
      a = 0;
      for (i in (1+m):(n-m)) {
        if (a==y[i-1]) y[i] = max(x[(i-m):(i+m)])  # calculate max of the window
        else y[i] = max(y[i-1], x[i+m]);           # max of the window is =y[i-1]
        a = x[i-m]                                  # point that will be removed from the window
      }
      return(y)
    }
2.  apply(embed(x,k), 1, max) 
3.  SimpleMaxLoop(x, k) - similar to SimpleMeanLoop above 
4.  mywinfun(x, k, FUN=max) - see above 
5.  rollFun(x, k, FUN=max) - fSeries package 
6.  rollMax(x, k) - fSeries package 
7.  running(x, fun=max, width=k) - gtools package 

The relative speeds were: <0.01, 3, 3.4, 5.3, 7.5, 7.7, 15.3 

Median over a moving window can be computed as follows: 

1.  runmed(x, k) - from stats package 
2.  SimpleMedLoop(x, k) - similar to SimpleMeanLoop above 
3.  apply(embed(x,k), 1, median) 
4.  mywinfun(x, k, FUN=median) - see above 
5.  rollFun (x, k, FUN=median) - fSeries package 
6.  running(x, fun=median, width=k) - gtools package 

Speeds: <0.01, 3.4, 9, 15, 29, 165 

Mad over moving window can be done as follows: 

1.  runmad = function(x, k) {
      n = length(x)
      A = embed(x,k)
      A = abs(A - rep(apply(A, 1, median), k))
      dim(A) = c(n-k+1, k)
      apply(A, 1, median)
    }
2.  apply(embed(x,k), 1, mad) 
3.  mywinfun(x, k, FUN=mad) - see above 
4.  SimpleMadLoop(x, k) - similar to SimpleMeanLoop above 
5.  rollFun(x, k, FUN=mad) - fSeries package 
6.  running(x, fun=mad, width=k) - gtools package 

Speeds: 11, 18, 25, 50, 50, 400 
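
One detail worth checking when comparing the first two entries: runmad returns the raw median absolute deviation, while stats::mad() multiplies by its default constant of 1.4826. A short sketch on assumed random test data:

```r
# runmad as listed above: raw MAD over every complete window
runmad <- function(x, k) {
  n <- length(x)
  A <- embed(x, k)                            # one row per window
  A <- abs(A - rep(apply(A, 1, median), k))   # deviations from the row medians
  dim(A) <- c(n - k + 1, k)
  apply(A, 1, median)
}

set.seed(1)
x <- rnorm(100)
k <- 7

raw    <- runmad(x, k)
scaled <- apply(embed(x, k), 1, mad)          # mad() uses constant = 1.4826

all.equal(raw * 1.4826, scaled)
```

Passing constant = 1 to mad() would be the other way to put both on the same scale.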

Some thoughts about those results: 

*   All functions from Stats package (runmed, filter, kernapply) are
fast and hard to improve on

RE: [R] how modify object in parent.env

2005-03-09 Thread Vadim Ogranovich
Thank you to Gabor and Mark Schwartz for the answers. Both of them
solved the problem I posted, but my actual problem, as I now see, is a
little bit more involved. Let me try again.

I have a vector 'x'. I want to compute its entries in a loop (yes, I
know...). Say

x = seq(3)

for (i in seq(length(x))) {
x0 = someValue
x[i] = x0
} 

There are two problems with the above code:
1. x0 pollutes the global environment (not to mention possible
over-write of an existing x0). Therefore I thought I'd wrap it with
local().
2. x0 is not a good name from a readability perspective. I'd rather call
it x to emphasize it's an entry in an outer vector 'x'. (In this small
example it doesn't really matter, but I have much more involved scripts
where consistent naming is important)

Gabor's solution solves 1 but not 2. Maybe there is a simple way around
this restriction?
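
One possible way to get both properties, offered only as a sketch: let local() compute and return the value, and do the indexed assignment outside it. The computation i * 10L is a placeholder for someValue.

```r
x <- seq(3)

for (i in seq_along(x)) {
  x[i] <- local({
    x <- i * 10L  # hypothetical computation; this 'x' exists only inside local()
    x             # value handed back to the outer assignment
  })
}

x
```

The inner x never reaches the calling environment, and the outer x[i] <- ... is an ordinary assignment, so nothing needs to be modified from inside the local block.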

Thanks,
Vadim



> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Gabor 
> Grothendieck
> Sent: Tuesday, March 08, 2005 4:06 PM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] how modify object in parent.env
> 
> 
> You can use "<<-" like this:
> 
> x <- 1:3
> local(x[1] <<- x[1]+1)
> 
> Vadim Ogranovich  evafunds.com> writes:
> 
> : 
> : Assign() re-binds the value, not modifies it (the latter is what I
> : needed)
> : 
> : > -Original Message-
> : > From: McGehee, Robert [mailto:Robert.McGehee  
> geodecapital.com]
> : > Sent: Tuesday, March 08, 2005 3:48 PM
> : > To: Vadim Ogranovich; r-help  stat.math.ethz.ch
> : > Subject: RE: [R] how modify object in parent.env
> : >
> : > This isn't an environment problem. Assigning something to a
> : > get call doesn't make any sense. Use assign.
> : >
> : > > a <- 5
> : > > get("a") <- 10
> : > Error: couldn't find function "get<-"
> : >
> : > And from the ?assign help page, you can pick what environment
> : > you want to make the assignment. Just pick the parent environment.
> : >
> : >
> : > -Original Message-
> : > From: Vadim Ogranovich [mailto:vograno  evafunds.com]
> : > Sent: Tuesday, March 08, 2005 6:36 PM
> : > To: r-help  stat.math.ethz.ch
> : > Subject: [R] how modify object in parent.env
> : >
> : >
> : > Hi,
> : >
> : > Is it possible to modify an object in the parent.env (as 
> opposed to
> : > re-bind)? Here is what I tried:
> : >
> : > > x = 1:3
> : > # try to modify the first element of x from within a new 
> environment
> : > > local(get("x", parent.env(environment()))[1] <- NA)
> : > Error in eval(expr, envir, enclos) : Target of assignment 
> expands to
> : > non-language object
> : >
> : > # On the other hand retrieval works just fine
> : > > local(get("x", parent.env(environment()))[1])
> : > [1] 1
> : >
> : > Thanks,
> : > Vadim
> : >


RE: [R] how modify object in parent.env

2005-03-08 Thread Vadim Ogranovich
Assign() re-binds the value, not modifies it (the latter is what I
needed) 

> -Original Message-
> From: McGehee, Robert [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, March 08, 2005 3:48 PM
> To: Vadim Ogranovich; r-help@stat.math.ethz.ch
> Subject: RE: [R] how modify object in parent.env
> 
> This isn't an environment problem. Assigning something to a 
> get call doesn't make any sense. Use assign.
> 
> > a <- 5
> > get("a") <- 10
> Error: couldn't find function "get<-"
> 
> And from the ?assign help page, you can pick what environment 
> you want to make the assignment. Just pick the parent environment.
> 
> 
> -Original Message-
> From: Vadim Ogranovich [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, March 08, 2005 6:36 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] how modify object in parent.env
> 
> 
> Hi,
>  
> Is it possible to modify an object in the parent.env (as opposed to
> re-bind)? Here is what I tried:
>  
> > x = 1:3
> # try to modify the first element of x from within a new environment
> > local(get("x", parent.env(environment()))[1] <- NA)
> Error in eval(expr, envir, enclos) : Target of assignment expands to
> non-language object
> 
> # On the other hand retrieval works just fine
> > local(get("x", parent.env(environment()))[1])
> [1] 1
> 
> Thanks,
> Vadim
> 


[R] how modify object in parent.env

2005-03-08 Thread Vadim Ogranovich
Hi,
 
Is it possible to modify an object in the parent.env (as opposed to
re-bind)? Here is what I tried:
 
> x = 1:3
# try to modify the first element of x from within a new environment
> local(get("x", parent.env(environment()))[1] <- NA)
Error in eval(expr, envir, enclos) : Target of assignment expands to
non-language object

# On the other hand retrieval works just fine
> local(get("x", parent.env(environment()))[1])
[1] 1
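
For reference, a sketch of one form that does work. A replacement call needs a name (or a chain ending in a name) as its target, so the parent environment is first bound to a variable; the name enclosing is purely illustrative.

```r
x <- 1:3

local({
  enclosing <- parent.env(environment())  # environment local() was called from
  enclosing$x[1] <- NA                    # replacement call on the outer binding
})

x
```

Whether R copies the vector internally is a separate matter; from the caller's point of view the outer x now has its first element replaced.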

Thanks,
Vadim



[R] total variation penalty

2005-03-02 Thread Vadim Ogranovich
Hi,
 
I was recently plowing through the docs of the quantreg package by Roger
Koenker and came across the total variation penalty approach to
1-dimensional spline fitting. I googled around a bit and have found some
papers originating in the image processing community, but (apart from
Roger's papers) no paper that discusses its statistical aspects.
 
I have a couple of questions in this regard:
* Is it more natural to consider the total variation penalty in the
context of quantile regression than in the context of OLS? 
* Could someone please point to a good overview paper on the subject?
Ideally something that compares merits of different penalty functions.
 
There seems to be an ongoing effort to generalize this approach to 2d,
but at this time I am more interested in 1-d smoothing.
 
Thanks,
Vadim



RE: [R] Finding "runs" of TRUE in binary vector

2005-01-27 Thread Vadim Ogranovich
Untested:

c(TRUE, b[-1] != b[-length(b)]) gives you the (logical) indexes of the
beginnings of the runs
c(b[-1] != b[-length(b)], TRUE) gives you the (logical) indexes of the
ends of the runs
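
Spelled out on the vector from the question (a sketch; the names starts, ends and regions are illustrative), the two expressions combine as follows:

```r
b <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE)

change <- b[-1] != b[-length(b)]  # TRUE where the value switches
starts <- which(c(TRUE, change))  # first index of every run
ends   <- which(c(change, TRUE))  # last index of every run

keep    <- b[starts]              # keep only the runs of TRUE
regions <- cbind(start = starts[keep], end = ends[keep])
regions
```

Using !b[starts] for keep would select the runs of FALSE instead.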

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Sean Davis
> Sent: Thursday, January 27, 2005 2:14 PM
> To: r-help
> Subject: [R] Finding "runs" of TRUE in binary vector
> 
> I have a binary vector and I want to find all "regions" of 
> that vector that are runs of TRUE (or FALSE).
> 
>  > a <- rnorm(10)
>  > b <- a<0.5
>  > b
>   [1]  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE
> 
> My function would return something like a list:
> region[[1]] 1,3
> region[[2]] 5,5
> region[[3]] 7,10
> 
> Any ideas besides looping and setting start and ends directly?
> 
> Thanks,
> Sean
> 


[R] recursive penalized regression

2005-01-18 Thread Vadim Ogranovich
Hi,
 
A few days ago I posted a question to r-sig-finance, which I thought would
be an easy one. To my surprise I have received no replies, which makes
me think that it is either harder than I thought, or that it makes no
sense. I am reposting the message (with some modifications) on
R-help in the hope of getting some leads, suggestions for alternatives, etc.
My apologies to those who have seen this on r-sig-finance.
 
 
I want to do a univariate no frills autoregression. The major
non-standard requirements are:
1. When estimating model parameters at time t the algorithm can only use
data up to time t.
2. The weights of the past observations should decay with time, e.g.
exponentially.
3. The ability to apply some penalty, e.g. L2 (ridge), L1 (lasso), etc., to
model coefficients (or any other regularization technique).
 
Thanks in advance for your help,
Vadim



[R] how to get to interesting part of pattern match

2004-11-18 Thread Vadim Ogranovich
Hi,

I am looking for a way to extract an "interesting" part of the match to
a regular expression. For example the pattern "[./](*.)" matches a
substring that begins with either "." or "/" followed by anything. I am
interested in this "anything" w/o the "." or "/" prefix. If say I match
the pattern against "abc/foo" I want to get "foo", not "/foo". In Perl
one can simply wrap the "interesting" part in () and get it out of the
match. Is it possible to do a similar thing in R?

There seems to be a way to refer to the match, see below, but I couldn't
figure out how to make gsub return it.
> gsub("[./](*.)", "\\1", "abc/foo")
[1] "abcfoo"
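
A sketch of two ways to pull out just the parenthesised part, assuming the intended group is "(.*)". The first anchors the pattern so that the whole string is replaced by the group; the second uses regexec() and regmatches(), which are available in later versions of R.

```r
s <- "abc/foo"

# 1. Match the entire string, then replace it with the captured group
tail1 <- sub("^.*[./](.*)$", "\\1", s)

# 2. Ask for the match and its groups explicitly
m     <- regmatches(s, regexec("[./](.*)", s))[[1]]
tail2 <- m[2]  # m[1] is the whole match, m[2] the first group

tail1
tail2
```

The unanchored gsub() call in the question replaces only the matched "/foo" with "foo", which is why the "abc" prefix survives there.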


Thanks,
Vadim



RE: [R] Enormous Datasets

2004-11-18 Thread Vadim Ogranovich
It is very unlikely that R will be able to handle this. The problems are:

* the data set may simply not fit into memory
* it will take forever to read from the ASCII file
* any meaningful analysis of a dataset in R typically requires 5 - 10
times more memory than the size of the dataset (unless you are a real
insider and know all the knobs)


Your best strategy is probably to partition the file into meaningful
sub-categories and work with them. To save time on conversion from ASCII
you can read the sub-files into a data frame and then save the data
frame in an .rda file using save(). Subsequent loading of .rda files is much
faster than reading ASCII.
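
The read-once, save, reload cycle might look like the sketch below. The file names are temporary files and the tiny data frame stands in for one sub-file; both are assumptions for illustration only.

```r
# Stand-in for one partition of the large ASCII file
csv <- tempfile(fileext = ".csv")
rda <- tempfile(fileext = ".rda")
write.csv(data.frame(id = 1:5, value = (1:5)^2), csv, row.names = FALSE)

part <- read.csv(csv)   # slow step: parse the text file once
save(part, file = rda)  # keep a binary copy
rm(part)

load(rda)               # later sessions: restores the object named 'part'
str(part)
```

Note that load() restores objects under the names they were saved with, so each partition is best saved under a distinct, predictable name.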

Another strategy which is often advocated on the list is to put the data
into a database and draw random samples of manageable size from the
database. I have no experience with this approach.

HTH,
Vadim

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Thomas 
> W Volscho
> Sent: Thursday, November 18, 2004 12:11 PM
> To: [EMAIL PROTECTED]
> Subject: [R] Enormous Datasets
> 
> Dear List,
> I have some projects where I use enormous datasets.  For 
> instance, the 5% PUMS microdata from the Census Bureau.  
> After deleting cases I may have a dataset with 7 million+ 
> rows and 50+ columns.  Will R handle a datafile of this size? 
>  If so, how?
> 
> Thank you in advance,
> Tom Volscho
> 
> 
> Thomas W. Volscho
> Graduate Student
> Dept. of Sociology U-2068
> University of Connecticut
> Storrs, CT 06269
> Phone: (860) 486-3882
> http://vm.uconn.edu/~twv1
> 


RE: [R] R on 64-bit Linux machine

2004-11-15 Thread Vadim Ogranovich
Thanks to everyone for the info. It is very valuable. I am a little bit
uneasy about conflicting reports regarding RHEL 3, but I guess at this
point I just need to try and see. It's also very soothing to know that
there is an "official" 64-bit build on CRAN.

Thanks again for taking time to answer,
Vadim



[R] R on 64-bit Linux machine

2004-11-12 Thread Vadim Ogranovich
Hi,
 
We are planning to buy a 64-bit Linux machine which will mainly run R.
There was an interesting thread on 64-bits on r-help back in April that
basically confirmed that the 64-bit R is fine as long as the length of
an atomic object is less than 2^31 - 1.
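
The bound quoted here is the largest value of R's integer type, which can be checked directly (a trivial sketch; the variable name is illustrative):

```r
max_int <- .Machine$integer.max  # largest representable R integer
max_int == 2^31 - 1
```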
 
My specific question is on which 64-bit Linux distros (SUSE or RedHat)
and processors R is *known* to build out-of-box and run well. Ease of
maintenance is essential here. We have RedHat 7.3 on other (32-bit)
machines and would try not to proliferate the OS-s.
 
 
Your information will be highly appreciated,
 
Thanks,
Vadim



RE: [R] R code debugging

2004-11-09 Thread Vadim Ogranovich
You might want to check out the "debug" package on CRAN:

debug: MVB's debugger for R

Debugger for R functions, with code display, graceful error recovery,
line-numbered conditional breakpoints, access to exit code, flow
control, and full keyboard input.
Version:    1.0.1
Depends:    R (>= 1.8), mvbutils, tcltk
Date:       18/2/2004
Author:     Mark V. Bravington
Maintainer: Mark V. Bravington
License:    GPL version 2 or later

 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Prof 
> Brian Ripley
> Sent: Tuesday, November 09, 2004 4:16 AM
> To: Timur Elzhov
> Cc: [EMAIL PROTECTED]
> Subject: Re: [R] R code debugging
> 
> On Tue, 9 Nov 2004, Timur Elzhov wrote:
> 
> > it's quite difficult for me to find `Error:'s in my R code, 
> because R 
> > does say about error itself, but say nothing about its 
> location, say, 
> > string number and file with an error (which may be 
> `source'd from another file).
> > Are there any option for turning of the similar feature, or 
> R can not 
> > do such a thing at all?
> 
> R code can be created dynamically by R code (called 
> `computing on the language'), and in most cases the source 
> code is not retained (and it is not used for execution).
> 
> traceback() will always tell you the function in which the 
> error occurred.
> If you write reasonably modular code that should suffice, but 
> if not, use debug() on that function and single-step through 
> it to find where the error occurs.  Or set a suitable error 
> handler: have you explored recover(), for example?
> 
> -- 
> Brian D. Ripley,  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
> 


[R] computing distribution function online

2004-10-25 Thread Vadim Ogranovich
Hi,
 
I am looking for a means to compute empirical distribution function for
a very large data set and evolution of that edf with time.
 
Here are some specifics. Each day I have an estimate of a distribution
function and a new sample of about 1e4 points from the distribution in
question. I want to update my estimate to include the new observations
(with some aging coefficient to adapt to the changes of the df w/ time).
Is there any R (or even non-R) code that can do this? Any relevant
references will be appreciated as well.
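
To make the request concrete, here is one naive scheme, offered as a sketch and not as an established method: keep the estimate as values on a fixed grid and blend in each day's empirical distribution function with an aging weight. The function name update_edf, the grid, the weight alpha and the simulated normal data are all assumptions.

```r
# Blend yesterday's estimate with the edf of today's sample
update_edf <- function(F_old, new_sample, grid, alpha = 0.1) {
  F_new <- ecdf(new_sample)(grid)      # today's edf evaluated on the grid
  (1 - alpha) * F_old + alpha * F_new  # exponential aging of older days
}

set.seed(42)
grid  <- seq(-4, 4, length.out = 201)
F_est <- ecdf(rnorm(1e4))(grid)        # initial estimate

for (day in 1:5) {
  F_est <- update_edf(F_est, rnorm(1e4), grid)
}

range(F_est)
```

A convex combination of distribution functions is again a distribution function, so the estimate stays monotone and within [0, 1]; the price is that everything between grid points is lost.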
 
Thanks,
Vadim
 



[R] help.search intersection

2004-10-19 Thread Vadim Ogranovich
Hi,
 
Is it possible to search for help pages that meet more than one criterion
at a time? Say I want to search for all help pages that mention
"cross-validation" AND "bootstrap". How do I do this?
 
Thanks,
Vadim



[R] 2d approx

2004-10-14 Thread Vadim Ogranovich
Hi,
 
I am looking for a function that generalizes 'approx' to two (or more)
dimensions. The references on the approx help page point toward splines,
but a) splines is what I am trying to avoid in the first place and b)
splines (except for mgcv splines) seem to be one dimensional.
 
Here is a more detailed account. Using mgcv:gam I fit an additive model
xy.gam according to the formula y ~ s(x), which is a spline under the
hood. If I now wish to compute model prediction for new data I could use
predict.gam(xy.gam, newdata). However newdata will first be expanded
into a large matrix of coefficients with respect to the spline basis
functions. For example if the length of newdata is 1e6 and the size of
the basis is 100 then the matrix of coefficients is 100*1e6, i.e. huge.
The predict.gam recognizes the problem and works around it by doing a
piece-meal prediction, but this turns out to be too slow for my needs.
 
One way around is to tabulate s(x) on a fine enough grid and use approx
for prediction. Something like this (pseudo-code)
 
x.grid <- seq(min(newdata), max(newdata), length=1000)
y.grid <- predict.gam(xy.gam, data.frame(x=x.grid))
 
y.newdata <- approx(x.grid, y.grid, newdata)$y
 
 
I didn't test this, but I expect it to be dramatically faster than
predict.gam.
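
A self-contained version of the 1-D idea, as a sketch only: smooth.spline() from the stats package stands in for the gam fit so that nothing beyond base R is needed, and the simulated data, the grid size of 1000 and the rule = 2 setting are assumptions.

```r
set.seed(1)
x <- runif(500)
y <- sin(2 * pi * x) + rnorm(500, sd = 0.1)
fit <- smooth.spline(x, y)  # stand-in for the fitted smooth s(x)

x.new <- runif(1e5)         # large prediction set

# tabulate the smooth once on a fine grid ...
x.grid <- seq(min(x.new), max(x.new), length.out = 1000)
y.grid <- predict(fit, x.grid)$y

# ... and interpolate linearly for all new points
y.fast <- approx(x.grid, y.grid, xout = x.new, rule = 2)$y

# compare with direct prediction to judge the grid resolution
y.full <- predict(fit, x.new)$y
max(abs(y.fast - y.full))
```

The last line gives a direct measure of what the tabulation costs in accuracy; if it is too large, a finer grid is the obvious knob.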
 
Unfortunately I don't know how to extend it into 2D. Your suggestions
are very welcome!
 
 
Thanks,
Vadim



RE: [R] Statistical analysis of a large database

2004-10-14 Thread Vadim Ogranovich
I thought that maybe authors of books on R should be allowed (encouraged ?) to 
announce availability/revisions of their books via the R-packages list?
For example I'd be very interested to have another look at Dr. Torgo's book when it 
becomes more complete and I'd appreciate a revision notice via the list.

Just a suggestion. Thanks, Vadim


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Luis Torgo
> Sent: Wednesday, October 13, 2004 12:03 PM
> To: Prof Brian Ripley
> Cc: Vito Ricci; [EMAIL PROTECTED]
> Subject: Re: [R] Statistical analysis of a large database
> 
> On Tue, 2004-10-12 at 08:36, Prof Brian Ripley wrote:
> > > Luís Torgo, Data Mining with R. Learning by case studies, Maggio 
> > > 2003 http://www.liacc.up.pt/~ltorgo/DataMiningWithR/
> > 
> > Please note that that reference is not about large 
> datasets, nor about 
> > `data mining' in the generally used sense.  It has two studies, one 
> > incomplete, on linear regression (with 200 samples) and on 
> time series.
> 
> I would like to add a few information on these incomplete 
> comments on the book I'm writing. The book is unfinished as 
> mentioned on its Web page. It has currently two reasonably 
> finished chapters: an introduction to R and MySQL and a case 
> study. As mentioned in the book, the first case study is 
> small by data mining standards (200 observations) and has the 
> goal of illustrating techniques that are shared by data 
> mining and other disciplines as well as smoothly introducing 
> the reader to R and its power. It addresses data 
> pre-processing techniques, data visualization, model 
> construction (yes, linear regression but also regression 
> trees), and model evaluation, selection and combination, so I 
> think it is a bit incorrect to say that it is about linear 
> regression that corresponds to 5 of the 50 pages of that chapter.
>  
> The third (unfinished) chapter (2nd case study) is about 
> financial trading. It includes topics like connections to 
> data bases as well as many other components of a knowledge 
> discovery process. Among those components it includes model 
> construction that involves obviously time series models given 
> the nature of the data. The chapter will include other steps 
> like issues concerning moving from predictions into actions, 
> creation of variables from the original time series, etc.. It 
> is currently being re-written and I expect to upload soon a 
> new revised version of this chapter.
> 
> The book will include at least two further cases studies that 
> will be larger. Still, I would note that the financial 
> trading case study is potentially very large, as it is a 
> problem where data is constantly growing. The final version 
> of that chapter addresses this issue of having a system that 
> is online in the sense that it is receiving new data in real 
> time (also known as mining data streams in the data mining field).
> 
> I'm sorry for being so long, but I think it is dangerous to 
> try to resume around 200 pages of an unfinished work in two 
> lines of text.
> 
> Still, all comments on this on going project are very well 
> welcome and I would like to take this opportunity to thank 
> all people that have been sending me encouraging comments/emails.
> 
> Luis Torgo
> 
> --
> Luis Torgo
>   FEP/LIACC, University of Porto   Phone : (+351) 22 607 88 30
>   Machine Learning Group   Fax   : (+351) 22 600 36 54
>   R. Campo Alegre, 823 email : [EMAIL PROTECTED]
>   4150 PORTO   -  PORTUGAL WWW   : 
> http://www.liacc.up.pt/~ltorgo
> 


RE: [R] memory in R

2004-10-11 Thread Vadim Ogranovich
Setting A <- NULL doesn't immediately release the memory; the memory is
actually released in gc(), which R calls for you at some "random" time.
In situations like this I explicitly call gc() and do not wait for R to
do this, e.g.
A <- NULL; gc()
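
Inside a loop this might look like the sketch below; the matrix size and the iteration count are arbitrary stand-ins for the real MCMC state.

```r
for (iter in 1:3) {
  A <- matrix(rnorm(1e6), 1000, 1000)  # large temporary object (about 8 MB)
  col_means <- colMeans(A)             # keep only the small summary
  A <- NULL                            # drop the reference to the big matrix
  invisible(gc())                      # collect now instead of waiting
}

length(col_means)
```

gc() also returns a small table of memory in use, which can be printed instead of discarded to watch whether usage really stays flat across iterations.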

Hope this helps,
Vadim

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of yyan liu
> Sent: Monday, October 11, 2004 11:55 AM
> To: [EMAIL PROTECTED]
> Subject: [R] memory in R
> 
> Hi:
>   I am doing a MCMC algorithm which is well known to consume 
> much computer memory. And I have a problem everytime I run my 
> R program. It stopped at certain iteration and says "can not 
> allocate a vector of 19 kb".
> It seems that the computer's memory has been exhausted. 
> However, it is said that after each iteration the objects 
> (such as a huge matrix) can be set to NULL. And the memory 
> will be released so the program will consume as much memory 
> as before. I wonder how to do that, that is, set the object 
> to be NULL. Say, A is a matrix in each iteration. I just need 
> to write A<-NULL ?? And this approach really works?
>   Thank you very much!
> 
> liu 
> 
> 
>   


[R] private on site R training solicited

2004-09-27 Thread Vadim Ogranovich
Hi,
 
R is catching up even with the hardest Excelists. At our company we are
exploring the possibility of having some of the R&D staff introduced to
R during a short on-site class in San Francisco, California. The number
of people in the group will be about 5 - 7. If you are interested in
delivering the class please e-mail me the info for consideration.
 
Thanks,
Vadim Ogranovich
Evnine & Vaughan Assoc.



RE: [R] passing formula arg to mgcv::gam

2004-09-27 Thread Vadim Ogranovich
This is a self-response :-).

It was indeed a problem with environments. One way to get around is to "reset" the 
environment, e.g. inside callGam do
formula <- as.formula(unclass(formula)) 


Not too aesthetic, but works. Is there a less kludgy way to do this?
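
One alternative, sketched here with lm() standing in for gam() so that the example needs only base R (the assumption being that both look up the subset argument in the data and then in the formula's environment): point the formula's environment at the function's own frame.

```r
callLm <- function(formula) {
  idx <- seq(10)
  environment(formula) <- environment()  # make 'idx' visible to model.frame()
  lm(formula,
     data = data.frame(x = rnorm(100), y = rnorm(100)),
     subset = idx)
}

fit <- callLm(y ~ x)
length(residuals(fit))  # number of rows actually used in the fit
```

The trade-off is that any variable the formula referred to in the caller's environment but not in the data would no longer be found directly, only through the usual scoping chain from inside the function.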

BTW, forgot to mention. This is R-1.9.1 on RH-7.3.

Thanks,
Vadim

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Vadim 
> Ogranovich
> Sent: Monday, September 27, 2004 3:24 PM
> To: [EMAIL PROTECTED]
> Subject: [R] passing formula arg to mgcv::gam 
> 
> Hi,
> 
> I have a function, callGam, that fits a gam model to a subset 
> of a dataframe. The argument to callGam is a formula, the 
> subset is determined inside the function itself. My naïve 
> approach generates an error, see below. I guess this is 
> because 'idx' is looked up in the environment of 'formula', 
> but I am too ignorant about environments to be able to tell 
> for sure. Could someone please suggest a way around?
> 
> Thanks,
> Vadim
> 
> > library("mgcv")
> > 
> > callGam <- function(formula) {
> +   idx <- seq(10)
> +   gam(formula, data=data.frame(x=rnorm(100), y=rnorm(100)), 
> + subset=idx) }
> > 
> > gam.fit <- callGam(y ~ x)
> Error in eval(expr, envir, enclos) : Object "idx" not found
> >
> 


[R] passing formula arg to mgcv::gam

2004-09-27 Thread Vadim Ogranovich
Hi,

I have a function, callGam, that fits a gam model to a subset of a dataframe. The 
argument to callGam is a formula, the subset is determined inside the function itself. 
My naïve approach generates an error, see below. I guess this is because 'idx' is 
looked up in the environment of 'formula', but I am too ignorant about environments 
to be able to tell for sure. Could someone please suggest a way around?

Thanks,
Vadim

> library("mgcv")
> 
> callGam <- function(formula) {
+   idx <- seq(10)
+   gam(formula, data=data.frame(x=rnorm(100), y=rnorm(100)), subset=idx)
+ }
> 
> gam.fit <- callGam(y ~ x)
Error in eval(expr, envir, enclos) : Object "idx" not found
>



[R] how to debug a sudden exit in non-interactive mode

2004-09-03 Thread Vadim Ogranovich
Hi,
 
I have a piece of R code that calls mgcv::gam. The code runs fine in the
interactive mode, but terminates R w/o a single message when run
non-interactively. Though I think I should be able to locate the problem
by brute force, I'd appreciate advice on how to do it more intelligently
using R debugging tools.
 
At this time I only know that it has something to do with me loading my
custom library, vor, in .Rprofile.
 
I use R-1.9.1 on RH7.3.
 
Following the posting guide I include an example (note that it may work
for you fine since you don't have my .Rprofile). 
 
This is debug.R file:
#=
dataLength <- 1e3
y <-rnorm(dataLength)
x <- rnorm(dataLength)
 
library("mgcv")
 
cat("before\n")
  
xy.gam <- gam(y ~ s(x), knots=list(place.knots(x, 25)), fit=FALSE)
 
cat("after\n")

 
 
# Here I run it non-interactively from the shell. Note that the last
# line, cat("after\n"), doesn't get executed (it does get executed in
# interactive mode or with --no-init-file).
~% R --no-save --no-restore --silent < debug.R
.First.lib of vor 
> dataLength <- 1e3
> y <-rnorm(dataLength)
> x <- rnorm(dataLength)
> 
> library("mgcv")
This is mgcv 1.0-9 
> 
> cat("before\n")
before
>   
> xy.gam <- gam(y ~ s(x), knots=list(place.knots(x, 25)), fit=FALSE)
~% 
 
 
Thanks,
Vadim



[R] how to efficiently bootstrap mgcv::gam

2004-08-26 Thread Vadim Ogranovich
Hi,
 
I repeatedly estimate the same gam model (package mgcv) using different
subsets of the data. A naive approach is, of course, to estimate the
model from scratch for each of the subsets. However I wonder if there
are some computations that can be "factored out" for the sake of
efficiency?
 
To be specific here is a stylized example of what I do now (which I'd
like to make more efficient):
 
gamA <- vector("list", 100)
 
for (i in seq(length(gamA))) {
  gamA[[i]] <- gam(y ~ x2 + s(x1, by=x2), data=myData,
subset=sample(nrow(myData), 1))
}
 
 
Thanks,
Vadim



RE: [R] More precision problems in testing with Intel compilers

2004-08-20 Thread Vadim Ogranovich
For what it's worth. I had this very problem, i.e. the diff, this
morning (I reported it to r-devel). I was using gcc, but because my
$CFLAGS env variable was set to some value, the compilation flags were
different from the ones presumably used to produce the Rout. Once I
unset CFLAGS it worked w/o a hitch (thanks to Peter Dalgaard)

The compiler options that lead to the failure were (note that
optimization is turned off):
gcc -D__NO_MATH_INLINES -mieee-fp -DNO_PURE -Wchar-subscripts -Wformat
-Wimplicit -Wreturn-type -Wswitch -Wreorder -Wwrite-strings
-Woverloaded-virtual -Wshadow -Wno-ctor-dtor-privacy -m486 -fPIC
-DOSRELMAJOR=2 -DOSRELMINOR=4

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Martin Maechler
> Sent: Friday, August 20, 2004 3:39 PM
> To: Samuelson, Frank*
> Cc: '[EMAIL PROTECTED] '
> Subject: RE: [R] More precision problems in testing with 
> Intel compilers
> 
> > "FrankSa" == Samuelson, Frank* 
> > on Thu, 19 Aug 2004 16:22:11 -0400 writes:
> 
> FrankSa> The Intel compiled version also fails the below test:
> 
> here you give the desired output.
> What does your 'Intel compiled R' return instead?
> 
> >> ### Very big and very small
> >> umach <- unlist(.Machine)[paste("double.x", c("min","max"), sep='')]
> >> xmin <- umach[1]
> >> xmax <- umach[2]
> >> tx <- unique(outer(-1:1,c(.1,1e-3,1e-7)))# 7 values  (out of 9)
> >> tx <- unique(sort(c(outer(umach,1+tx))))# 11 values  (out of 14)
> >> tx <- tx[is.finite(tx)] #-- all kept
> >> (txp <- tx[tx >= 1])#-- Positive exponent -- 4 values
>  [1] 1.617924e+308 1.795895e+308 1.797693e+308 1.797693e+308
> >> (txn <- tx[tx <1])#-- Negative exponent -- 7 values
> [1] 2.002566e-308 2.222849e-308 2.225074e-308 
> 2.225074e-308 2.225074e-308 2.227299e-308 2.447581e-308
> 
> FrankSa> Does anyone really care about being correct to 1
> FrankSa> unit of machine precision?  If you do, you have a
> FrankSa> bad algorithm.  ??
> 
> We have had these tests there for a long time now and haven't 
> heard of failures before..  so this is interesting.
> DIG(7) makes us only look at 7 digits which is less than half 
> machine precision, but then there's cancellation of another 7 
> digits in some of those above which gets in the region of 
> machine precision, (but still leaves a factor of ~= 45).
> 
> Can you upload the full print-test.Rout file somewhere?
> 
> Regards,
> Martin
> 
> 
> FrankSa> -Original Message-
> FrankSa> From: Samuelson, Frank* [mailto:[EMAIL PROTECTED] 
> FrankSa> Sent: Thursday, August 19, 2004 12:11 PM
> FrankSa> To: '[EMAIL PROTECTED] '
> FrankSa> Subject: [R] precision problems in testing with 
> Intel compilers
> 
> 
> FrankSa> I compiled the 1.9.1 src.rpm with the standard gnu tools and it works.
> FrankSa> I tried compiling the 1.9.1 src.rpm with the Intel 8 C and FORTRAN
> FrankSa> compilers and it bombs out during the testing phase:
> 
> FrankSa> comparing 'd-p-q-r-tests.Rout' to './d-p-q-r-tests.Rout.save' ...267c267
> FrankSa> < df = 0.5[1] "Mean relative  difference: 5.001647e-10"
> FrankSa> ---
> >> df = 0.5[1] TRUE
> FrankSa> make[3]: *** [d-p-q-r-tests.Rout] Error 1
> FrankSa> make[3]: Leaving directory `/usr/src/redhat/BUILD/R-1.9.1/tests'
> FrankSa> make[2]: *** [test-Specific] Error 2
> FrankSa> make[2]: Leaving directory `/usr/src/redhat/BUILD/R-1.9.1/tests'
> FrankSa> make[1]: *** [test-all-basics] Error 1
> FrankSa> make[1]: Leaving directory `/usr/src/redhat/BUILD/R-1.9.1/tests'
> FrankSa> make: *** [check-all] Error 2
> FrankSa> error: Bad exit status from /var/tmp/rpm-tmp.63044 (%build)
> FrankSa> ...
> 


RE: [R] Error messages and C

2004-08-20 Thread Vadim Ogranovich
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Ross Boylan
> Sent: Friday, August 20, 2004 11:35 AM
> To: r-help
> Subject: [R] Error messages and C
> 
> I am calling a C (C++ really) function via the .C interface.
> Sometimes when things go wrong I want to return an error message.
> 
> 1.  R provides C functions error and warning which look about right. 
> But exactly how does this exit, and in particular what 
> happens with cleaning up, calling C++ destructors, and 
> unwinding the stack?  Will I get memory leaks?
> 

I've run across this issue and I couldn't find a satisfactory solution
to the problem. error() is effectively a long jump. Moreover, if your
C++ function calls an R API function, e.g allocVector(), and the latter
calls error() ... you already know what will happen. Same for
interrupts.



RE: [R] Object oriented programming resources

2004-08-12 Thread Vadim Ogranovich
The Bioconductor project posts a short tutorial "A guide to using S4
Objects" under "Developer Page" frame. I've found it useful. 

Note that R's S4-class approach to OOP is very different from that
of C++ or Java. Still, you will find member variables; they are called slots.
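
To make the slot idea concrete, here is a small sketch. The class name Account, its slots, and the deposit generic are all invented for illustration and come from neither the tutorial nor the original question.

```r
library(methods)

# A class with two slots, roughly the counterpart of C++ member variables.
setClass("Account", representation(owner = "character", balance = "numeric"))

acct <- new("Account", owner = "Ann", balance = 100)
acct@balance            # slots are read with '@'

# Methods live outside the class and are attached to a generic function.
setGeneric("deposit", function(account, amount) standardGeneric("deposit"))
setMethod("deposit", "Account", function(account, amount) {
  account@balance <- account@balance + amount
  account               # objects are copied, so return the modified one
})

acct <- deposit(acct, 50)
```

Note that there is no 'private:' section: slots are accessible to any code holding the object, and because of copy semantics a method has to return the updated object rather than modify it in place.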

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Matthew Walker
> Sent: Thursday, August 12, 2004 7:56 PM
> To: [EMAIL PROTECTED]
> Subject: [R] Object oriented programming resources
> 
> Hi,
> 
> I'm looking for resources to read about the object-oriented 
> features of R.
> 
> I have looked through the "Manuals" page on r-project.org.  
> The most useful of the documents seemed to be the "draft of 
> the R language definition".  However it had only about 6 
> pages on the topic. 
> 
> I have also used Google, but my problem here is that "R" appears in a
> *lot* of webpages!  I tried limiting the search by using 
> "site:r-project.org", but didn't find anything very useful.
> 
> Specifically, I'm trying to find information on "member 
> variables" (I think that's the correct term), as I'd like to 
> copy this concept from C++:
> 
> class a {
>   ...
> private:
>   int x;  // I think the term for this is a member variable
> };
> 
> Thanks for your thoughts,
> 
> Matthew
> 


[R] on.exit() inside local()

2004-08-09 Thread Vadim Ogranovich
Hi,

Since I routinely open files in a loop I've developed a habit of using
on.exit() to close them. Since on.exit() needs to be called within a
function I use eval() as a surrogate. For example:

for (fileName in c("a", "b")) eval({
con <- file(fileName);
on.exit(close(con))
}) 

and con will be closed no matter what.


However it stopped working once I wrapped the loop in local():
> local(
+   for (foo in seq(2)) eval({
+ on.exit(cat(foo, "\n"))
+   })
+ )
Error in cat(foo, "\n") : Object "foo" not found


Without local() it works just fine:
>   for (foo in seq(2)) eval({
+ on.exit(cat(foo, "\n"))
+   })
1 
2 

The reason I wanted the local() is to keep 'foo' from interfering with
the existing environments, but somehow this breaks the thing.
At this point I am stuck. Could someone please tell what's going on?
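
In the meantime, a sketch of an alternative that does not rely on how eval() and local() interact with on.exit(): put the per-file work in a small function, where on.exit() is well defined, and keep only the loop inside local(). The function name read_first_line and the temporary files are made up for this example.

```r
# on.exit() is registered inside a real function, so the connection is
# closed whether the body returns normally or signals an error.
read_first_line <- function(path) {
  con <- file(path, open = "r")
  on.exit(close(con), add = TRUE)
  readLines(con, n = 1)
}

# Two throwaway input files for the demonstration.
paths <- c(tempfile(), tempfile())
writeLines("alpha", paths[1])
writeLines("beta", paths[2])

# local() keeps the loop variable and accumulator out of the workspace.
first_lines <- local({
  out <- character(0)
  for (p in paths) out <- c(out, read_first_line(p))
  out
})
```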

Thanks,
Vadim



[R] error when calling debugger()

2004-08-09 Thread Vadim Ogranovich
Hi,

I am getting an error message when I am trying to run the debugger() on
the last.dump. The debugger() stops after I make a selection. Could
someone please suggest what it might mean? The R log is included below.
This is R-1.8.1 on RH 7.3.

Thanks, Vadim

> load("last.dump.rda")
> debugger(last.dump)
Message:  Error in split(x, f) : Group length is 0 but data length > 0
Available environments had calls:
1: try({
2: local(for (ticker in univ$ticker) {
3: eval.parent(substitute(eval(quote(expr), envir)))
4: eval(expr, p)
5: eval(expr, envir, enclos)
6: eval(quote(for (ticker in univ$ticker) {
7: eval(expr, envir, enclos)
8: local(for (i in l) {
9: eval.parent(substitute(eval(quote(expr), envir)))
10: eval(expr, p)
11: eval(expr, envir, enclos)
12: eval(quote(for (i in l) {
13: eval(expr, envir, enclos)
14: accumulate(accu, lapply(split(seq(length(key)), key), function(i) {
15: lapply(split(seq(length(key)), key), function(i) {
16: split(seq(length(key)), key)
17: split.default(seq(length(key)), key)

Enter an environment number, or 0 to exit  Selection: 1
Error in get(.obj, envir = dump[[.selection]]) : 
recursive default argument reference



RE: [R] naive question

2004-07-01 Thread Vadim Ogranovich
Richard,

 Thank you for the analysis. I don't think there is an inconsistency
between the factor of 4 you've found in your example and 20 - 50 I found
in my data. I guess the major cause of the difference lies with the
structure of your data set. Specifically, your test data set differs
from mine in two respects:
* you have fewer lines, but each line contains many more fields (12500 *
800 in your case and 3.8M * 10 in mine)
* all of your data fields are doubles, not strings. I have a mixture of
doubles and strings.

I posted a more technical message to r-devel where I discussed possible
reasons for the IO slowness. One of them is that R is slow at making
strings. So if you try to read your data as strings,
colClasses=rep("character", 800), I'd guess you will see a very
different timing. Even a simple reshaping of your matrix, say to
(12500*80) rows by 10 columns, will worsen it considerably.
Please let me know the results if you try any of the above.
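
A rough way to try the string-versus-double comparison on a small scale is sketched below. The file is generated on the fly and its size (2000 x 50) is arbitrary, chosen only so the example finishes quickly; the helper name read_as is invented. No particular timings are claimed.

```r
# Generate a modest whitespace-separated file of integer-valued numbers.
n_rows <- 2000
n_cols <- 50
path <- tempfile(fileext = ".dat")
write.table(matrix(round(runif(n_rows * n_cols) * 1e9), n_rows, n_cols),
            path, row.names = FALSE, col.names = FALSE)

# Read the same file with every column forced to one class.
read_as <- function(cls) {
  read.table(path, header = FALSE, quote = "",
             colClasses = rep(cls, n_cols), nrows = n_rows,
             comment.char = "")
}

t_num <- system.time(x_num <- read_as("numeric"))
t_chr <- system.time(x_chr <- read_as("character"))

# user / system / elapsed times side by side
print(rbind(numeric = t_num, character = t_chr)[, 1:3])
```

Scaling n_rows and n_cols up toward the sizes discussed in this thread should give a more meaningful comparison.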

In my message to r-devel you may also find some timing that supports my
estimates.

Thanks,
Vadim

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Richard A. O'Keefe
> Sent: Thursday, July 01, 2004 5:22 PM
> To: [EMAIL PROTECTED]
> Subject: RE: [R] naive question
> 
> As part of a continuing thread on the cost of loading large 
> amounts of data into R,
> 
> "Vadim Ogranovich" <[EMAIL PROTECTED]> wrote:
>   R's IO is indeed 20 - 50 times slower than that of 
> equivalent C code
>   no matter what you do, which has been a pain for some of us.
> 
> I wondered to myself just how bad R is at reading, when it is 
> given a fair chance.  So I performed an experiment.
> My machine (according to "Workstation Info") is a SunBlade 
> 100 with 640MB of physical memory running SunOS 5.9 Generic, 
> according to fpversion this is an Ultra2e with the CPU clock 
> running at 500MHz and the main memory clock running at 84MHz 
> (wow, slow memory).  R.version is platform sparc-sun-solaris2.9
> arch sparc   
> os   solaris2.9  
> system   sparc, solaris2.9   
> status   
> major1   
> minor9.0 
> year 2004
> month04  
> day  12  
> language R   
> and although this is a 64-bit machine, it's a 32-bit 
> installation of R.
> 
> The experiment was this:
> (1) I wrote a C program that generated 12500 rows of 800 columns, the
> numbers were integers 0..999,999,999 generated using drand48().
> These numbers were written using printf().  It is possible to do
> quite a bit better by avoiding printf(), but that would ruin the
> spirit of the comparison, which is to see what can be done with
> *straightforward* code using *existing* library functions.
> 
> 21.7 user + 0.9 system = 22.6 cpu seconds; 109 real seconds.
> 
> The sizes were chosen to get 100MB; the actual size was
> 12500 (lines) 1000 (words) 100012500 (bytes)
> 
> (2) I wrote a C program that read these numbers using 
> scanf("%d"); it
> "knew" there were 800 numbers per row and 12500 numbers in all.
> Again, it is possible to do better by avoiding scanf(), but the
> point is to look at *straightforward* code.
> 
> 18.4 user + 0.6 system = 19.0 cpu seconds; 100 real seconds.
> 
> (3) I started R, played around a bit doing other things, then 
> issued this
> command:
> 
> > system.time(xx <- read.table("/tmp/big.dat", header=FALSE, quote="",
> + row.names=NULL, colClasses=rep("numeric",800), nrows=12500,
> + comment.char=""))
> 
> So how long _did_ it take to read 100MB on this machine?
> 
> 71.4 user + 2.2 system = 73.5 cpu seconds; 353 real seconds.
> 
> The result:  the R/C ratio was less than 4, whether you 
> measure cpu time or real time.  It certainly wasn't anywhere 
> near 20-50 times slower.
> 
> Of course, *binary* I/O in C *would* be quite a bit faster:
> (1') generate same integers but write a row at a time using fwrite():
>  5 seconds cpu, 25 seconds real; 40 MB.
> 
> (2') read same integers a row at a time using fread()
>  0.26 seconds cpu, 1 second real.
> 
> This would appear to more than justify "20-50 times slower", 
> but reading binary data and reading data in a textual 
> representation are different things, "less than 4 times 
> slower" is the fairer measure.  However, it does emphasise 
> the usefulness of problem-specific bulk reading techniques.
> 
> I thought 

RE: [R] naive question

2004-06-29 Thread Vadim Ogranovich
 R's IO is indeed 20 - 50 times slower than that of equivalent C code no
matter what you do, which has been a pain for some of us. It does,
however, help to read the Import/Export tips, as without them the ratio gets much
worse. As Gabor G. suggested in another mail, if you use the file
repeatedly you can convert it into R's internal format: read.table once into
R and save using save()... This is much faster.

In my experience R is not so good at large data sets, where large is
roughly 10% of your RAM.
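
The read-once-then-save() idea can be sketched as follows. The file names are temporary placeholders and the data are randomly generated just to have something to read.

```r
txt_path <- tempfile(fileext = ".txt")   # stands in for the slow text file
rda_path <- tempfile(fileext = ".rda")   # binary copy in R's own format

# Make a small text file to play the role of the big one.
n_rows <- 1000
n_cols <- 10
write.table(matrix(rnorm(n_rows * n_cols), n_rows, n_cols),
            txt_path, row.names = FALSE, col.names = FALSE)

# Pay the text-parsing cost once, giving read.table all the hints it can use.
big <- read.table(txt_path, header = FALSE, quote = "",
                  colClasses = rep("numeric", n_cols),
                  nrows = n_rows, comment.char = "")
save(big, file = rda_path)

# In later sessions, skip the text file entirely.
rm(big)
load(rda_path)   # recreates the object 'big'
```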



[R] setGeneric / standardGeneric when args are not "literals" - corrected

2004-06-18 Thread Vadim Ogranovich
This is a correction to my previous message, I forgot to swap two lines
in the body of setMakeGenericMethod. Sorry about that. The correct (full
message) reads like this:

Hi,
 
This works

> setGeneric("clear", function(obj) standardGeneric("clear"))
[1] "clear"
 
but this doesn't. Why?

> funName <- "clear"
> setGeneric(funName, function(obj) standardGeneric(funName))
Error in .recursiveCallTest(body, fname) : 
 (converted from warning) The body of the generic function for "clear"
calls standardGeneric to dispatch on a different name ("funName")!
 

This is R-1.8.1 on RH-7.3
 
 
I came across it while trying to write a helper function that would
"safely" create generics when a function with such a name already
exists. Here is what I adapted from S4Objects but it doesn't work
because of the above-mentioned problem. Any suggestion how to make it
work, please?
 
setMakeGenericMethod <- function(methodName, className, fun) {
  # sets a method and creates the generics if necessary
  if (!isGeneric(methodName)) {
    if (is.function(methodName)) {
      fun.default <- get(methodName)
    }
    else {
      assign(methodName, methodName)

      browser()
      fun.default <- function(object) standardGeneric(methodName)
    }

    setGeneric(methodName, fun.default)
  }

  setMethod(methodName, className, fun)
}
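
One possible workaround, sketched under the assumption that the check only objects to the call to standardGeneric() not containing the literal name: substitute the name into the function body before handing it to setGeneric(). The names makeGeneric, Box and clearBox are invented for this illustration.

```r
library(methods)

# Build 'function(object) standardGeneric("<name>")' with the literal
# string spliced into the body, then register it as a generic.
makeGeneric <- function(name) {
  def <- eval(substitute(function(object) standardGeneric(NAME),
                         list(NAME = name)))
  setGeneric(name, def)
}

# Small demonstration class and method.
setClass("Box", representation(items = "numeric"))

funName <- "clearBox"
makeGeneric(funName)
setMethod(funName, "Box", function(object) {
  object@items <- numeric(0)
  object
})

emptied <- clearBox(new("Box", items = c(1, 2, 3)))
```

When an ordinary function with that name already exists, calling setGeneric(name) with no definition is documented to turn the existing function into the default method, which may cover the other branch of the helper.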

Thanks,
Vadim



[R] setGeneric / standardGeneric when args are not "literals"

2004-06-18 Thread Vadim Ogranovich
Hi,
 
This works

> setGeneric("clear", function(obj) standardGeneric("clear"))
[1] "clear"
 
but this doesn't. Why?

> funName <- "clear"
> setGeneric(funName, function(obj) standardGeneric(funName))
Error in .recursiveCallTest(body, fname) : 
 (converted from warning) The body of the generic function for "clear"
calls standardGeneric to dispatch on a different name ("funName")!
 

This is R-1.8.1 on RH-7.3
 
 
I came across it while trying to write a helper function that would
"safely" create generics when a function with such a name already
exists. Here is what I adapted from S4Objects but it doesn't work
because of the above-mentioned problem. Any suggestion how to make it
work, please?
 
setMakeGenericMethod <- function(methodName, className, fun) {
  # sets a method and creates the generics if necessary
  if (!isGeneric(methodName)) {
    if (is.function(methodName)) {
      fun.default <- get(methodName)
    }
    else {
      fun.default <- function(object) standardGeneric(methodName)
    }
  }
 
  setGeneric(methodName, fun.default)
 
  setMethod(methodName, className, fun)
}


Thanks,
Vadim



RE: [R] mkChar can be interrupted

2004-06-14 Thread Vadim Ogranovich
Thank you. This gives hope, I am looking forward to the next major
release.

May I make a wish that the future try/finally mechanism will include
support for C++ exceptions? For example by allowing the programmer to
set an error handler that raises a C++ exception. It should be easy in
error(), but might be a problem in interrupt handlers (I am not an
expert though).

Thanks,
Vadim

> -Original Message-
> From: Luke Tierney [mailto:[EMAIL PROTECTED] 
> Sent: Monday, June 14, 2004 7:46 PM
> To: Vadim Ogranovich
> Cc: R-Help
> Subject: RE: [R] mkChar can be interrupted
> 
> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
> 
> > This is disappointing. How on Earth can mkChar know when it is safe or 
> > not to make a long jump? For example if I just opened a file how am I 
> > supposed to close it after the long jump? I am not even talking about
> > C++ where long jumps are simply devastating... (and this is the language
> > I am coding in :-( )
> >
> > Ok. A practical question: is it possible to somehow block 
> > R_CheckUserInterrupt? I am ready to put up with out-of-memory errors, 
> > but Ctrl-C is too common to be ignored.
> 
> Interrupts are not the issue.  The issue is making sure that 
> cleanup actions occur even if there is a non-local exit.  A 
> solution that addresses that issue will work for any 
> non-local exit, whether it comes from an interrupt or an 
> exception.  So you don't have to put up with anything if you 
> approach this the right way,
> 
> Currently there is no user accessible C level try/finally 
> mechanism for ensuring that cleanup code is executed during a 
> non-local exit.
> We should make such a mechanism available; maybe one will 
> make it into the next major release.
> 
> For now you have two choices:
> 
>     You can create an R level object and attach a finalizer to the object
>     that will arrange for the GC to close the file at some point in the
>     future if a non-local exit occurs.  Search developer.r-project.org for
>     finalization and weak references for some info on this.
> 
>     One other option is to use the R_ToplevelExec function.  This has some
>     drawbacks since it effectively makes invisible all other error
>     handlers, but it is an option.  It is also not officially documented
>     and subject to change.
> 
> > And I think it makes relevant again the question I asked in another 
> > related thread: how does memory allocated by Calloc() and R_alloc() 
> > stand up against long jumps?
> 
> R_alloc is stack-based; the stack is unwound on a non-local 
> exit, so this is released on regular exits and non-local 
> ones.  It uses R allocation, so it could itself cause a 
> non-local exit.
> 
> Calloc is like calloc but will never return NULL.  If the 
> allocation fails, then an error is signaled, which will 
> result in a non-local exit.  If the allocation succeeds, you 
> are responsible for calling Free.
> 
> luke
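
The finalizer route mentioned above can be illustrated at the R level with reg.finalizer(). This is only a sketch of the idea, not the C-level code under discussion; the names close_quietly and open_managed are invented, and the handle-environment design is one assumption among several possible ones.

```r
# The finalizer must tolerate a connection that is already closed.
close_quietly <- function(handle) {
  tryCatch(close(handle$con), error = function(e) NULL)
  invisible(NULL)
}

# Wrap the connection in an environment and attach the finalizer to it.
# Once the environment becomes unreachable, the garbage collector is
# allowed to run close_quietly() on it at some later point.
open_managed <- function(path) {
  handle <- new.env(parent = emptyenv())
  handle$con <- file(path, open = "w")
  reg.finalizer(handle, close_quietly, onexit = TRUE)
  handle
}

path <- tempfile()
n_before <- nrow(showConnections())   # user connections open beforehand

h <- open_managed(path)
writeLines("some output", h$con)

rm(h)             # drop the only reference, as a non-local exit might
invisible(gc())   # request a collection so pending finalizers can run
n_after <- nrow(showConnections())
```

The cleanup happens at an unspecified later time rather than at the point of the jump, which is the main limitation of this approach compared with a real try/finally.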
> 
> > > -Original Message-
> > > From: Luke Tierney [mailto:[EMAIL PROTECTED]
> > > Sent: Monday, June 14, 2004 5:43 PM
> > > To: Vadim Ogranovich
> > > Cc: R-Help
> > > Subject: RE: [R] mkChar can be interrupted
> > > 
> > > On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
> > > 
> > > > I am confused. Here is an excerpt from R-exts:
> > > > 
> > > > "As from R 1.8.0 no port of R can be interrupted whilst running long
> > > > computations in compiled code,..."
> > > > 
> > > > Doesn't it imply that the primitive functions like allocVector, 
> > > > mkChar, etc., which are likely to occur in any compiled code 
> > > > called via .Call, are not supposed to handle interrupts in any way?
> > > 
> > > No it does not.  Read the full context.  It says that if you write a 
> > > piece of C code that may run a long time and you want to guarantee 
> > > that users will be able to interrupt your code then you should 
> > > ensure that R_CheckUserInterrupt is called periodically.  If your 
> > > code already periodically calls other R code that checks for 
> > > interrupts then you may not need to do this yourself, but in general 
> > > you do.
> > > 
> > > Prior to 1.8.0 on Unix-like systems the asynchronous signal handler 
> > > for SIGINT would longjmp to the nearest top level or browser 
> > > context, which meant that on these systems any code was interruptible 
> > > at any point u

RE: [R] mkChar can be interrupted

2004-06-14 Thread Vadim Ogranovich
This is disappointing. How on Earth can mkChar know when it is safe or
not to make a long jump? For example if I just opened a file how am I
supposed to close it after the long jump? I am not even talking about
C++ where long jumps are simply devastating... (and this is the language
I am coding in :-( )


Ok. A practical question: is it possible to somehow block
R_CheckUserInterrupt? I am ready to put up with out-of-memory errors,
but Ctrl-C is too common to be ignored.

And I think it makes relevant again the question I asked in another
related thread: how does memory allocated by Calloc() and R_alloc() stand
up against long jumps?

Thanks,
Vadim 



> -Original Message-
> From: Luke Tierney [mailto:[EMAIL PROTECTED] 
> Sent: Monday, June 14, 2004 5:43 PM
> To: Vadim Ogranovich
> Cc: R-Help
> Subject: RE: [R] mkChar can be interrupted
> 
> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
> 
> > I am confused. Here is an excerpt from R-exts:
> > 
> > "As from R 1.8.0 no port of R can be interrupted whilst running long 
> > computations in compiled code,..."
> > 
> > Doesn't it imply that the primitive functions like allocVector, 
> > mkChar, etc., which are likely to occur in any compiled code called 
> > via .Call, are not supposed to handle interrupts in any way?
> 
> No it does not.  Read the full context.  It says that if you 
> write a piece of C code that may run a long time and you want 
> to guarantee that users will be able to interrupt your code 
> then you should ensure that R_CheckUserInterrupt is called 
> periodically.  If your code already periodically calls other 
> R code that checks for interrupts then you may not need to do 
> this yourself, but in general you do.
> 
> Prior to 1.8.0 on Unix-like systems the asynchronous signal 
> handler for SIGINT would longjmp to the nearest top level or 
> browser context, which meant that on these systems any code 
> was interruptible at any point unless it was explicitly 
> protected by a construct that suspended interrupts.  Allowing 
> interrupts at any point meant that inopportune interrupts 
> could and did crash R, which is why this was changed.
> 
> Unless there is explicit documentation to the contrary you 
> should assume that every function in the R API might allocate 
> and might cause a non-local exit (i.e. a longjmp) when an 
> exception is raised (and an interrupt is one of, but only one 
> of, the exceptions that might occur).
> 
> luke
> 
> > Thanks,
> > Vadim
> > 
> > 
> > > From: Luke Tierney [mailto:[EMAIL PROTECTED]
> > > 
> > > On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
> > > 
> > > > > From: Luke Tierney [mailto:[EMAIL PROTECTED]
> > ...
> > > > > 
> > > > > Not sure why you think this suggest mkChar can be interrupted.
> > > > > 
> > ...
> > > > > by calls to this function.  I don't believe there are any such safe
> > > > > points in mkChar, but there are several potential ones within your
> > > > > example.
> > > > 
> > > > Apart from mkChar I am only calling SET_STRING_ELT. Is this what you
> > > > mean?
> > > 
> > > You are printing, you have an assignment expression, all of those 
> > > contain points where an interrupt could be checked for.
> > 
> > These are not relevant since Ctrl-C is pressed when the code is inside
> >   for (i=0; i<n; i++) {
> >     SET_STRING_ELT(resSexp, i, mkChar("foo"));
> >   }
> > 
> > Just look at the way I deliver the signal.
> > 
> 
> --
> Luke Tierney
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
>Actuarial Science
> 241 Schaeffer Hall  email:  [EMAIL PROTECTED]
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
> 
> 
>



RE: [R] mkChar can be interrupted

2004-06-14 Thread Vadim Ogranovich
I am confused. Here is an excerpt from R-exts:

"As from R 1.8.0 no port of R can be interrupted whilst running long
computations in compiled code,..."

Doesn't it imply that the primitive functions like allocVector, mkChar,
etc., which are likely to occur in any compiled code called via .Call,
are not supposed to handle interrupts in any way?

Thanks,
Vadim


> From: Luke Tierney [mailto:[EMAIL PROTECTED] 
> 
> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
> 
> > > From: Luke Tierney [mailto:[EMAIL PROTECTED]
...
> > > 
> > > Not sure why you think this suggest mkChar can be interrupted.
> > > 
...
> > > by calls to this function.  I don't believe there are any such safe 
> > > points in mkChar, but there are several potential ones within your 
> > > example.
> > 
> > Apart from mkChar I am only calling SET_STRING_ELT. Is this what you 
> > mean?
> 
> You are printing, you have an assignment expression, all of 
> those contain points where an interrupt could be checked for.

These are not relevant since Ctrl-C is pressed when the code is inside
  for (i=0; i<n; i++) {
    SET_STRING_ELT(resSexp, i, mkChar("foo"));
  }

Just look at the way I deliver the signal.


RE: [R] mkChar can be interrupted

2004-06-14 Thread Vadim Ogranovich
 

> -Original Message-
> From: Luke Tierney [mailto:[EMAIL PROTECTED] 
> Sent: Monday, June 14, 2004 1:30 PM
> To: Vadim Ogranovich
> Cc: R-Help
> Subject: Re: [R] mkChar can be interrupted
> 
> Not sure why you think this suggest mkChar can be interrupted.
> 
> If you want to figure out how interrupt handling works on 
> unix, run under gdb and single step from the signal to the 
> next point where R_CheckUserInterrupt is called.  You should 
> find that the signal handler sets a flag and that flag is 
> checked at various safe points by calls to this function.  I 
> don't believe there are any such safe points in mkChar, but 
> there are several potential ones within your example.

Apart from mkChar I am only calling SET_STRING_ELT. Is this what you
mean?

To make sure, I am not trying to enable or handle interrupts. On the
contrary, I want them to be disabled for the duration of .Call, which is
what I thought R was supposed to do for me. I am surprised it didn't.

As to why I singled out mkChar, well, strictly speaking it is
'SET_STRING_ELT(resSexp, i, mkChar("foo"))' where the interrupt somehow
goes through. But SET_STRING_ELT is much faster than mkChar so I guessed
that it must be mkChar.
Anyway, be it SET_STRING_ELT or mkChar, the interrupt should have been
blocked.


As an additional check I tried to interrupt right before and right after
  for (i=0; i<n; i++) ...

> As mentioned in a reply in another thread, interrupt handling 
> As mentioned in a reply in another thread, interrupt handling 
> is one aspect of R internals that is still evolving.  Among 
> other things, we will need to make changes as we improve 
> support for other event loops.
> [In applications with graphical interfaces signals are not 
> the right way to deal with user interruption (in particular 
> on operating systems that don't support proper signals)].
> 
> Best,
> 
> luke
> 
> On Mon, 14 Jun 2004, Vadim Ogranovich wrote:
> 
> > Hi,
> >  
> > As was discussed earlier in another thread and as 
> documented in R-exts
> > .Call() should not be interruptible by Ctrl-C. However the 
> following 
> > code, which spends most of its time inside mkChar, turned out to be 
> > interruptible on RH-7.3 R-1.8.1 gcc-2.96:
> >  
> >  
> > #include <R.h>
> > #include <Rinternals.h>
> > 
> > SEXP foo0(const SEXP nSexp) {
> >   int i, n;
> >   SEXP resSexp;
> > 
> >   if (!isInteger(nSexp))
> >     error("wrong arg type\n");
> > 
> >   n = asInteger(nSexp);
> >   resSexp = PROTECT(allocVector(STRSXP, n));
> > 
> >   Rprintf("!!!time to interrup!!!\n");
> >   for (i=0; i<n; i++) {
> >     SET_STRING_ELT(resSexp, i, mkChar("foo"));
> >   }
> > 
> >   Rprintf("end mkChar\n");
> >   UNPROTECT(1);
> > 
> >   return R_NilValue;
> > }
> > 
> > 
> > 
> > # invoke 'foo0' and give it an argument large enough to let you type Ctrl-C
> > # double the argument if you see "end mkChar" and do it again :-)
> > > x <- .Call("foo0", as.integer(1e7))
> > !!!time to interrup!!!
> > 
> > > 
> > > version
> >          _
> > platform i686-pc-linux-gnu
> > arch     i686
> > os       linux-gnu
> > system   i686, linux-gnu
> > status
> > major    1
> > minor    8.1
> > year     2003
> > month    11
> > day      21
> > language R
> > 
> > 
> > Thanks,
> > Vadim
> > 
> > 
> 
> --
> Luke Tierney
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
>Actuarial Science
> 241 Schaeffer Hall  email:  [EMAIL PROTECTED]
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
> 
> 
>

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] mkChar can be interrupted

2004-06-14 Thread Vadim Ogranovich
Hi,
 
As was discussed earlier in another thread and as documented in R-exts
.Call() should not be interruptible by Ctrl-C. However the following
code, which spends most of its time inside mkChar, turned out to be
interruptible on RH-7.3 R-1.8.1 gcc-2.96:
 
 
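#include <R.h>
#include <Rinternals.h>

SEXP foo0(const SEXP nSexp) {
  int i, n;
  SEXP resSexp;

  if (!isInteger(nSexp))
    error("wrong arg type\n");

  n = asInteger(nSexp);
  resSexp = PROTECT(allocVector(STRSXP, n));

  Rprintf("!!!time to interrup!!!\n");
  for (i=0; i<n; i++) {
    SET_STRING_ELT(resSexp, i, mkChar("foo"));
  }

  Rprintf("end mkChar\n");
  UNPROTECT(1);

  return R_NilValue;
}

# invoke 'foo0' and give it an argument large enough to let you type Ctrl-C
# double the argument if you see "end mkChar" and do it again :-)
> x <- .Call("foo0", as.integer(1e7))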
!!!time to interrup!!!

> 
> version
         _
platform i686-pc-linux-gnu
arch     i686
os       linux-gnu
system   i686, linux-gnu
status
major    1
minor    8.1
year     2003
month    11
day      21
language R


Thanks,
Vadim



[R] memory allocation and interrupts

2004-06-11 Thread Vadim Ogranovich
Hi,
 
A recent discussion on the list about tryCatch and signals made me think
about memory allocation and signals in C extension modules. What happens
to the memory allocated by R_alloc and Calloc if the user presses Ctrl-C
during the call? R-ext doesn't seem to discuss this. I'd guess that
R_alloc is interrupt-safe while Calloc is not, but I am not sure. In any
case a paragraph in R-ext on signals would be helpful.
 
While looking around for interrupts handling in the code I came across
BEGIN_SUSPEND_INTERRUPTS macro in Defn.h file. Unfortunately, it is not
available via R.h or Rinternals.h. Am I missing something? If not, could
future releases of R make it available via, say, Rinternals.h?
 
Thanks,
Vadim



RE: [R] fast mkChar

2004-06-09 Thread Vadim Ogranovich
Thank you for the lead, Peter. It may be useful for other packages I
write.

As to the strings, I think I have to take what is already there. I agree
that strings would be better managed in malloc-style fashion (probably
with reference counter) and not by gc(). However I don't want to have a
system with two different string classes, such close relatives seldom
coexist peacefully.

BTW, the slowness of mkChar explains why R is so slow when it needs to
compute names for long vectors.

Thank you for an interesting discussion,
Vadim 

> -Original Message-
> From: Peter Dalgaard [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, June 08, 2004 3:35 PM
> To: Vadim Ogranovich
> Cc: R-Help
> Subject: Re: [R] fast mkChar
> 
> "Vadim Ogranovich" <[EMAIL PROTECTED]> writes:
> 
> > I am no expert in memory management in R so it's hard for 
> me to tell 
> > what is and what is not doable. From reading the code of 
> allocVector() 
> > in memory.c I think that the critical part is to vectorize 
> > CLASS_GET_FREE_NODE and use the vectorized version along 
> the lines of 
> > the code fragment below (taken from memory.c).
> > 
> > if (node_class < NUM_SMALL_NODE_CLASSES) {
> > CLASS_GET_FREE_NODE(node_class, s);
> > 
> > If this is possible then the rest is just a matter of code 
> refactoring.
> > 
> > By vectorizing I mean writing a macro 
> CLASS_GET_FREE_NODE2(node_class, 
> > s, n) which in one go allocates n little objects of class 
> node_class 
> > and "inscribes" them into the elements of vector s, which 
> is assumed 
> > to be long enough to hold these objects.
> > 
> > If this is doable then the only missing piece would be a 
> new function 
> > setChar(CHARSXP rstr, const char * cstr) which copies 
> 'cstr' into 'rstr'
> > and (re)allocates the heap memory if necessary. Here the setChar() 
> > macro is safe since s[i]-s are all brand new and thus are 
> not shared 
> > with any other object.
> 
> I had a similar idea initially, but I don't think it can fly: 
> First, allocating n objects at once is not likely to be much 
> faster than allocating them one-by-one, especially when you 
> consider the implications of having to deal with 
> near-out-of-memory conditions.
> Second, you have to know the string lengths when allocating, 
> since the structure of a vector object (CHARSXP) is a header 
> immediately followed by the data.
> 
> A more interesting line to pursue is that - depending on what 
> it really is that you need - you might be able to create a 
> different kind of object that could "walk and quack" like a 
> character vector, but is stored differently internally. E.g. 
> you could set up a representation that is just a block of 
> pointers, pointing to strings that are being maintained in 
> malloc-style.
> 
> Have a look at External pointers and finalization.
> 
> 
> -- 
>O__   Peter Dalgaard Blegdamsvej 3  
>   c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
>  (*) \(*) -- University of Copenhagen   Denmark  Ph: 
> (+45) 35327918
> ~~ - ([EMAIL PROTECTED]) FAX: 
> (+45) 35327907
> 
>



RE: [R] fast mkChar

2004-06-08 Thread Vadim Ogranovich
I am no expert in memory management in R so it's hard for me to tell
what is and what is not doable. From reading the code of allocVector()
in memory.c I think that the critical part is to vectorize
CLASS_GET_FREE_NODE and use the vectorized version along the lines of
the code fragment below (taken from memory.c).

if (node_class < NUM_SMALL_NODE_CLASSES) {
CLASS_GET_FREE_NODE(node_class, s); 

If this is possible then the rest is just a matter of code refactoring.

By vectorizing I mean writing a macro CLASS_GET_FREE_NODE2(node_class,
s, n) which in one go allocates n little objects of class node_class and
"inscribes" them into the elements of vector s, which is assumed to be
long enough to hold these objects.

If this is doable then the only missing piece would be a new function
setChar(CHARSXP rstr, const char * cstr) which copies 'cstr' into 'rstr'
and (re)allocates the heap memory if necessary. Here the setChar() macro
is safe since s[i]-s are all brand new and thus are not shared with any
other object.



> -Original Message-
> From: Peter Dalgaard [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, June 08, 2004 1:23 PM
> To: Vadim Ogranovich
> Cc: R-Help
> Subject: Re: [R] fast mkChar
> 
> "Vadim Ogranovich" <[EMAIL PROTECTED]> writes:
> 
> > Hi,
> >  
> > To speed up reading of large (few million lines) CSV files I am 
> > writing custom read functions (in C). By timing various 
> approaches I 
> > figured out that one of the bottlenecks in reading 
> character fields is 
> > the mkChar() function which on each call incurs a lot of 
> > garbage-collection-related overhead.
> >  
> > I wonder if there is a "vectorized" version of mkChar, say 
> > mkChar2(char **, int length) that converts an array of C 
> strings to a 
> > string vector, which somehow amortizes the gc overhead over 
> the entire array?
> >  
> > If no such function exists, I'd appreciate any hint as to 
> how to write 
> > it.
> 
> The real issue here is that character vectors are implemented 
> as generic vectors of little R objects (CHARSXP type) that 
> each hold one string. Allocating all those objects is 
> probably what does you in.
> 
> The reason behind the implementation is probably that doing 
> it that way allows the mechanics of the garbage collector to 
> be applied directly (CHARSXPs are just vectors of bytes), but 
> it is obviously wasteful in terms of total allocation. If you 
> can think up something better, please say so (but remember 
> that the memory management issues are nontrivial).
> 
> -- 
>O__   Peter Dalgaard Blegdamsvej 3  
>   c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
>  (*) \(*) -- University of Copenhagen   Denmark  Ph: 
> (+45) 35327918
> ~~ - ([EMAIL PROTECTED]) FAX: 
> (+45) 35327907
> 
>



[R] fast mkChar

2004-06-08 Thread Vadim Ogranovich
Hi,
 
To speed up reading of large (few million lines) CSV files I am writing
custom read functions (in C). By timing various approaches I figured out
that one of the bottlenecks in reading character fields is the mkChar()
function which on each call incurs a lot of garbage-collection-related
overhead.
 
I wonder if there is a "vectorized" version of mkChar, say mkChar2(char
**, int length) that converts an array of C strings to a string vector,
which somehow amortizes the gc overhead over the entire array?
 
If no such function exists, I'd appreciate any hint as to how to write
it.
 
Thanks,
Vadim



[R] C-level try-catch

2004-06-01 Thread Vadim Ogranovich
Hi,
 
Is there a C-level equivalent of tryCatch? I want to be able to
intercept errors generated inside functions like allocVector(),
SET_STRING_ELT(), etc. They all are probably generated by error(char *,
...), but this is just a guess.
 
Thanks,
Vadim



RE: [R] Using R in C++

2004-05-26 Thread Vadim Ogranovich
Look at "Writing R Extensions" guide. It covers both R-from-C and C-from-R.

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Jörgen 
> Wallerman
> Sent: Wednesday, May 26, 2004 3:09 AM
> To: '[EMAIL PROTECTED]'
> Subject: [R] Using R in C++
> 
> 
> Hello,
> 
> Is it possible to use R functions (in my case: ks.test()) 
> from C++ -applications? That is, I get the impression R can 
> execute C/C++ code, but is there any possibility to do the 
> opposite? Where can I find help?
> 
> 
> ---
> Ph. D. Jörgen Wallerman
> Swedish University of Agricultural Sciences
> Remote Sensing Laboratory
> S901 83 UMEÅ
>  
>



RE: [R] ifelse when test is shorter than yes/no

2004-05-20 Thread Vadim Ogranovich
It does work as documented. My question was why it was designed to work
this way. I can not think of a practical situation when someone might
want to ifelse() on a 'test' that is shorter than yes/no w/o expecting
'test' to recycle (therefore I was asking for a warning).

I find this behavior inconsistent with the (spirit of) R's recycling
rules. For example if 'test', 'yes', 'no' are all of the same length
then the following two expressions are equivalent:

1.
x <- ifelse(test, yes, no)

2.
x <- no; x[test] <- yes[test]

This equivalence breaks when 'test' is shorter than yes/no: in the
second case 'test' will be recycled. And I don't see a good reason for
having them behave differently.
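
To make the difference concrete, here is a small sketch with made-up
vectors (the values of test, yes and no are arbitrary illustrations, not
taken from any real data). With a 'test' of length 2 and yes/no of
length 4, ifelse() returns a result as long as 'test', while the indexing
form recycles 'test' over the full length:

```r
# Illustrative inputs: 'test' is shorter than 'yes' and 'no'.
test <- c(TRUE, FALSE)
yes  <- 1:4
no   <- -(1:4)

# Form 1: the result has the length of 'test'; 'yes' and 'no' are cut to fit.
a <- ifelse(test, yes, no)

# Form 2: the logical index 'test' is recycled over all of 'b'.
b <- no
b[test] <- yes[test]

print(a)
print(b)
```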

If I had to implement ifelse() I'd probably do:

ifelse2 <- function(test, yes, no) {
    x <- rep(no, length.out=max(length(test), length(yes), length(no)))
    x[test] <- yes[test]

    x
}

(If there is interest I can extend it to take care of NA-s and submit as
a (trivial) patch)


Here is a simple test:
> ifelse2(c(TRUE, FALSE), seq(10), -seq(5))
 [1]  1 -2  3 -4  5 -1  7 -3  9 -5



Maybe it will help if I tell how I stumbled upon this problem. I had two
m*n matrices, 'yes' and 'no', and a 'test' vector of length m. I wanted
to create a m*n matrix which has 'yes' rows where test==TRUE and 'no'
rows otherwise. So I did

x <- matrix(ifelse(test, yes, no), nrow(yes), ncol(yes))

priding myself for doing it the "whole object way" ... and 'test' did
not recycle (in full accordance with the help page) w/o a warning.
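
For this row-selection case, indexing by rows sidesteps the recycling
question altogether. A minimal sketch, assuming small made-up matrices
(the names and values are illustrative only); the second variant keeps
ifelse() but expands 'test' explicitly, relying on the column-major
layout of matrices:

```r
# Illustrative data: two 3 x 2 matrices and one flag per row.
yes  <- matrix(1:6, nrow = 3)
no   <- -yes
test <- c(TRUE, FALSE, TRUE)

# Variant 1: start from 'no' and overwrite the selected rows.
x <- no
x[test, ] <- yes[test, ]

# Variant 2: repeat 'test' once per column so it lines up with every element.
x2 <- matrix(ifelse(rep(test, times = ncol(yes)), yes, no),
             nrow(yes), ncol(yes))

print(x)
```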


Thanks,
Vadim




> -Original Message-
> From: Liaw, Andy [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, May 20, 2004 2:20 PM
> To: Vadim Ogranovich; R-Help
> Subject: RE: [R] ifelse when test is shorter than yes/no
> 
> 
> > From: Vadim Ogranovich
> > 
> > Hi,
> >  
> > It turns out that the 'test' vector in ifelse(test, yes, no) is not 
> > recycled if it is shorter than the other arguments, e.g.
> >  
> > > ifelse(TRUE, seq(10), -seq(10))
> > [1] 1
> > 
> >  
> > Is there any particular reason it is not recycled? If there is one 
> > indeed a warning message might be in order when someone 
> calls ifelse 
> > with a shorter 'test'.
> 
> ?ifelse says:
> 
> Value:
> 
>  A vector of the same length and attributes (including class) as
>  'test' and data values from the values of 'yes' or 'no'.  ...
> 
> Seems to me it works as documented.  Why do you expected otherwise?
> 
> Andy
>   
> > This is R1.8.1 on RH-7.3
> >  
> > Thanks,
> > Vadim
> > 
> > 
> > 
> 
> 
> 
>



[R] ifelse when test is shorter than yes/no

2004-05-20 Thread Vadim Ogranovich
Hi,
 
It turns out that the 'test' vector in ifelse(test, yes, no) is not
recycled if it is shorter than the other arguments, e.g.
 
> ifelse(TRUE, seq(10), -seq(10))
[1] 1

 
Is there any particular reason it is not recycled? If there is one
indeed a warning message might be in order when someone calls ifelse
with a shorter 'test'.
 
This is R1.8.1 on RH-7.3
 
Thanks,
Vadim



RE: [R] recover should send messages to stderr, not stdout

2004-05-12 Thread Vadim Ogranovich


> -Original Message-
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, May 11, 2004 11:21 PM
> To: Vadim Ogranovich
> Cc: R-Help
> Subject: Re: [R] recover should send messages to stderr, not stdout
> 
...
> 
> Note that some of us consider recover() to be designed for 
> interactive-only use, and use something like

Unfortunately, R help doesn't reflect the apparent diversity of
opinions. Regarding recover it says


 The use of 'recover' largely supersedes 'dump.frames' as an error
 option, unless you really want to wait to look at the error.  If
 'recover' is called in non-interactive mode, it behaves like
 'dump.frames'.  <...>

> options(error=expression(if(interactive()) recover() else 
> dump.calls()))

This is useful. Thank you very much for the tip!

> On Tue, 11 May 2004, Vadim Ogranovich wrote:
> 
> > recover() sends all its messages, which I consider to be error 
> > messages, to stdout. I think they more properly belong to stderr.
> >  
> > This is an important difference for those of us who use R in batch 
> > mode to generate ASCII files.
> 
> Only to the subset who believe that recover() is a useful 
> error option in 
> non-interactive use.

This subset is likely to include everyone who carefully reads the
documentation, see the above excerpt from a help page.


Thanks,
Vadim



[R] recover should send messages to stderr, not stdout

2004-05-11 Thread Vadim Ogranovich
Hi,
 
recover() sends all its messages, which I consider to be error messages,
to stdout. I think they more properly belong to stderr.
 
This is an important difference for those of us who use R in batch mode
to generate ASCII files.
 
Thanks,
Vadim



[R] test for end of file on connection

2004-05-10 Thread Vadim Ogranovich
Hi,
 
I am looking for a function to test for end-of-file on a connection.
Apparently this question was already asked a couple of years ago and
then P. Dalgaard suggested to look at help(connections),
help(readLines). Unfortunately, I couldn't find such a function on those
pages, maybe I am missing something.
 
Did anyone figure this out?
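
One common idiom, offered here as a sketch rather than an official
end-of-file test: readLines() returns a zero-length character vector once
nothing is left on the connection, so a read loop can stop on that
condition. The temporary file and the chunk size of 2 are arbitrary
choices for illustration.

```r
# Write a small file to read back.
path <- tempfile()
writeLines(c("a", "b", "c"), path)

con <- file(path, open = "r")
n_read <- 0L
repeat {
  chunk <- readLines(con, n = 2L)
  if (length(chunk) == 0L) break   # nothing left: treat as end of file
  n_read <- n_read + length(chunk)
}
close(con)
unlink(path)

print(n_read)
```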
 
Thanks,
Vadim
 
 
 



[R] dump.frames in non-interactive mode

2004-05-07 Thread Vadim Ogranovich
Hi,
 
recover() resorts to dump.frames() when called in non-interactive
sessions. However dump.frames() still puts the dump into the last.dump
object and not into a file, which raises the question of how to get hold
of the object once the session terminates.
 
I guess the supposed answer is that last.dump is saved in .RData once R
exits. However I find it more convenient on a day-to-day basis to use
the --no-save --no-restore options and avoid .RData. So my question is
how best to set a hook, in say .Rprofile, so that in non-interactive mode
(and only in it) last.dump (and only it) will be saved in a file upon
exit from R.
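
One possible approach, sketched under the assumption that the installed
version of R provides the 'to.file' argument of dump.frames(): install an
error handler that writes the dump to a file at the time of the error
when the session is not interactive. The names batch_error_handler and
install_batch_error_handler are made up for this example.

```r
# Error handler: interactive sessions get recover(); batch sessions write
# the frames to "last.dump.rda" in the working directory and then quit.
batch_error_handler <- quote(
  if (interactive()) {
    recover()
  } else {
    dump.frames(dumpto = "last.dump", to.file = TRUE)
    q(save = "no", status = 1)
  }
)

# Install the handler; returns the previous option value (invisibly) so
# that it can be restored with options().
install_batch_error_handler <- function() {
  options(error = batch_error_handler)
}
```

Calling install_batch_error_handler() from .Rprofile would set the hook.
The dump could later be inspected in an interactive session with
load("last.dump.rda") followed by debugger(last.dump).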
 
Thanks,
Vadim



RE: [R] error in file(file, "r"): all connections are in use

2004-05-04 Thread Vadim Ogranovich
If you open a connection within a function it is often a good idea to
set an "on.exit" expression that will close the connection. This will be
called even if your function terminates via stop(). Here is an example:

 con <- file("foo")
 open(con)
 on.exit(close(con), add=TRUE)
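
Wrapped in a function, the pattern might look like the sketch below.
read_first_line is a made-up helper name; the point is only that
close(con) runs whether the function returns normally or fails.

```r
read_first_line <- function(path) {
  con <- file(path, open = "r")
  # Registered right after opening, so the connection is released on
  # normal return and on error alike.
  on.exit(close(con), add = TRUE)
  readLines(con, n = 1L)
}

# Usage with a temporary file
tmp <- tempfile()
writeLines(c("first", "second"), tmp)
first <- read_first_line(tmp)
unlink(tmp)

print(first)
```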

HTH,
Vadim

> -Original Message-
> From: Lei Jiang [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, May 04, 2004 2:02 PM
> To: [EMAIL PROTECTED]
> Subject: [R] error in file(file, "r"): all connections are in use
> 
> 
> Hi, there.
> 
> I am trying to read multiple files into R, but I got following message
> 
> Error in file(file, "r"): All connections are in use.
> 
> I clean up memory everytime I read in one file. Do i have to 
> somehow release file connection everytime i read in one??
> 
> Thanks.
> 
> Lei
> 
> Department of Chemsitry
> University of Washington
> Box 351700
> Seattle, WA 98195
> Phone: 206-616-6882
> Fax: 206-685-8665
> 
>



[R] seek(..., origin='current') doesn't work on gzfile

2004-05-03 Thread Vadim Ogranovich
Hi,
 
It seems that seek(..., origin='current') doesn't properly work on
gzfile connections, while it does work properly on ordinary files.
 
# open a connection and repeatedly increment current position by 13
# chars. Note that the read position does not move
> con <- gzfile("foo.gz"); open(con)
> seek(con, 13, origin='current')
[1] 0
> seek(con, 13, origin='current')
[1] 13
> seek(con, 13, origin='current')
[1] 13
> seek(con, 13, origin='current')
[1] 13
 
 
# and the same thing on ordinary file works as expected (and as
# documented)
> close(con); con <- file("foo"); open(con)
> seek(con, 13, origin='current')
[1] 0
> seek(con, 13, origin='current')
[1] 13
> seek(con, 13, origin='current')
[1] 26
> seek(con, 13, origin='current')
[1] 39
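
A possible workaround while relative seeks misbehave on compressed
connections is to read and discard the bytes instead of seeking. This is
only a sketch: skip_bytes is a hypothetical helper, and it assumes the
connection was opened in binary mode ("rb") so that readBin() can be
used.

```r
# Advance a binary-mode connection by 'n' bytes by reading and discarding.
# Returns the number of bytes actually skipped (less than 'n' at end of file).
skip_bytes <- function(con, n, block = 65536L) {
  skipped <- 0
  while (skipped < n) {
    got <- length(readBin(con, what = "raw", n = min(block, n - skipped)))
    if (got == 0L) break
    skipped <- skipped + got
  }
  skipped
}

# Small demonstration on a freshly written gzip file
path <- tempfile(fileext = ".gz")
out <- gzfile(path, open = "wb")
writeBin(charToRaw("A;20030110;P\nB;20030110;P\nB;20030110;T\n"), out)
close(out)

con <- gzfile(path, open = "rb")
n_skipped <- skip_bytes(con, 13)   # first line plus its newline
second <- rawToChar(readBin(con, what = "raw", n = 12))
close(con)
unlink(path)

print(second)
```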

 
This is R-1.8.1 on RH-7.3
 
Thanks,
Vadim
 
 
P.S. For the sake of completeness this is the 'foo' file:
 
A;20030110;P
B;20030110;P
B;20030110;P
B;20030110;T
B;20030110;T
B;20030110;T
B;20030110;T
B;20030110;T
B;20030110;T
C;20030110;T




RE: [R] skip lines on a connection

2004-05-02 Thread Vadim Ogranovich
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
> Sent: Saturday, May 01, 2004 11:44 PM
> You will be telling us next you think the default nmax=-1 
> means to read a negative number of lines!

No, I won't. Your extrapolation is inaccurate.


> ...  So reading no 
> lines would mean not calling scan at all, and what would be 
> the point of that?


It would mean skipping the number of lines specified in the skip
argument thus advancing the read point on the connection to where I want
it to be. I guess you wouldn't argue that seek(con, where) has no
meaning.


> 
> nmax <= 0 and nlines <= 0 are ignored.
> 
> Note carefully what nmax actually means, and it is not what `nlines' 
> means!


I had noted that. If one reads no "data value" one reads no line, so the
two should have the same effect in the case at hand.


> Do read the documentation for scan, too, please.


I had. For your convenience this is what it says about nmax.

nmax: the maximum number of data values to be read, or if 'what' is
  a list, the maximum number of records to be read.  If omitted
  (and 'nlines' is not set to a positive value), 'scan' will
  read to the end of 'file'.

It is hard to see from the text that nmax=0 is ignored since "omitted"
means leaving it set to -1.

BTW, the paragraph regarding 'nlines' doesn't mention that nlines=0 is a
special case either.

  nlines: the maximum number of lines of data to be read.


> Note that to read *lines* you do need to read every byte on 
> the file to 
> find the EOL marker(s) so readLines() or scan() with NULL in 
> "what" are as 
> good as anything.  You can use them in blocks of lines, in a loop.


This is a very nice trick indeed! Just what I've been looking for.
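
The block-wise loop suggested above could be sketched as follows.
skip_lines and its 'block' parameter are made-up names for illustration;
each pass reads at most 'block' lines and throws them away, so memory use
stays bounded.

```r
# Skip 'n' lines on an open connection, reading at most 'block' at a time.
# Returns (invisibly) the number of lines actually skipped.
skip_lines <- function(con, n, block = 10000L) {
  skipped <- 0L
  while (skipped < n) {
    got <- length(readLines(con, n = min(block, n - skipped)))
    if (got == 0L) break   # ran out of lines before reaching 'n'
    skipped <- skipped + got
  }
  invisible(skipped)
}

# Demonstration on a ten-line temporary file
path <- tempfile()
writeLines(paste("line", 1:10), path)
con <- file(path, open = "r")
n_skipped <- skip_lines(con, 7, block = 3L)
next_line <- readLines(con, n = 1L)
close(con)
unlink(path)

print(next_line)
```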

Thank you very much,
Vadim



RE: [R] skip lines on a connection

2004-05-01 Thread Vadim Ogranovich
Andy,

It is surprising that scan() attempts to read anything at all: note that
I set nmax=0, which AFAIK means read no lines.

Thank you for a reference to replicate(). I didn't know about it.

Thanks,
Vadim

-Original Message-
From: Liaw, Andy [mailto:[EMAIL PROTECTED] 
Sent: Saturday, May 01, 2004 5:28 PM
To: Vadim Ogranovich; [EMAIL PROTECTED]
Subject: RE: [R] skip lines on a connection


Your scan() call doesn't work because default argument what=0; i.e., it
expects numeric data.  You probably can just use what="".

The other alternative is to just loop readLines() n times, reading one
line at a time.  It probably won't be too bad in terms of time, and
surely will save on memory usage.

(Try using replicate().)

HTH,
Andy

> From: Vadim Ogranovich
> 
> Unfortunately, seek only works in terms of bytes not lines and I only 
> know how many lines I need to skip, but not bytes.
> 
> 
> -Original Message-
> From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
> Sent: Saturday, May 01, 2004 3:44 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [R] skip lines on a connection
> 
> 
> 
> 
> ?seek
> 
> Vadim Ogranovich  evafunds.com> writes:
> 
> :
> : Hi,
> : 
> : I am looking for an efficient way of skipping big chunks of 
> lines on a
> : connection (not necessarily at the beginning of the file). 
> One way is
> to
> : use read lines, e.g. readLines(1e6), but a) this incurs the overhead
> of
> : construction of the return char vector and b) has a (fairly remote)
> : potential to blow up the memory.
> : 
> : Another way would be to use scan(), e.g. 
> : 
> : scan(con, skip=1e6, nmax=0)
> : 
> : but somehow this doesn't work:
> : 
> : > scan(con, skip=10, nmax=0)
> : Error in scan(con, skip = 10, nmax = 0) : 
> :  "scan" expected a real, got "A;12;0;"
> : 
> : I can stick to readLines, but am curious if there is a better way.
> : 
> : I use R-1.8.1 on RH-7.3.
> : 
> : Thanks,
> : Vadim
> 
> 
> 





RE: [R] skip lines on a connection

2004-05-01 Thread Vadim Ogranovich
Unfortunately, seek only works in terms of bytes not lines and I only
know how many lines I need to skip, but not bytes.


-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Saturday, May 01, 2004 3:44 PM
To: [EMAIL PROTECTED]
Subject: Re: [R] skip lines on a connection




?seek

Vadim Ogranovich  evafunds.com> writes:

: 
: Hi,
: 
: I am looking for an efficient way of skipping big chunks of lines on a
: connection (not necessarily at the beginning of the file). One way is
to
: use read lines, e.g. readLines(1e6), but a) this incurs the overhead
of
: construction of the return char vector and b) has a (fairly remote)
: potential to blow up the memory.
: 
: Another way would be to use scan(), e.g. 
: 
: scan(con, skip=1e6, nmax=0)
: 
: but somehow this doesn't work:
: 
: > scan(con, skip=10, nmax=0)
: Error in scan(con, skip = 10, nmax = 0) : 
:  "scan" expected a real, got "A;12;0;"
: 
: I can stick to readLines, but am curious if there is a better way.
: 
: I use R-1.8.1 on RH-7.3.
: 
: Thanks,
: Vadim



[R] skip lines on a connection

2004-05-01 Thread Vadim Ogranovich
Hi,
 
I am looking for an efficient way of skipping big chunks of lines on a
connection (not necessarily at the beginning of the file). One way is to
use readLines(), e.g. readLines(con, 1e6), but a) this incurs the overhead
of constructing the returned character vector and b) it has a (fairly
remote) potential to exhaust memory.
 
Another way would be to use scan(), e.g. 
 
scan(con, skip=1e6, nmax=0)
 
but somehow this doesn't work:
 
> scan(con, skip=10, nmax=0)
Error in scan(con, skip = 10, nmax = 0) : 
 "scan" expected a real, got "A;12;0;"

 
I can stick to readLines, but am curious if there is a better way.
 
 
I use R-1.8.1 on RH-7.3.
 
 
Thanks,
Vadim



RE: [R] time zones in POSIXt

2004-04-23 Thread Vadim Ogranovich
Thank you for the lead, Dirk! Indeed this works on my machine too:

> as.POSIXct("2000-05-10 10:15:00",  tz="PST8PDT") -
as.POSIXct("2000-05-10 10:15:00",  tz="GMT")
Time difference of 7 hours


However when I replace POSIXct by POSIXlt it breaks (this looks like a
bug to me):

> as.POSIXlt("2000-05-10 10:15:00",  tz="PST8PDT") -
as.POSIXlt("2000-05-10 10:15:00",  tz="GMT")
Time difference of 0 secs



Now a couple of new questions:
* how can I learn the appropriate names for time zones? For example
I was using "PST" whereas it seems I had to use either "PST8" or
"PST8PDT". Why was "PST" not good? Is it documented anywhere?

* there seems to be no difference between GMT and BST on my machine
though PST8 and PST8PDT are treated properly:

# PST8 is not identical to PST8PDT
> ISOdatetime(2003, seq(12), 1, 10, 0, 0, tz="PST8") - ISOdatetime(2003,
seq(12), 1, 10, 0, 0, tz="PST8PDT")
Time differences of 0, 0, 0, 0, 3600, 3600, 3600, 3600, 3600, 3600, 0, 0 secs

# GMT0 is identical to BST
> ISOdatetime(2003, seq(12), 1, 10, 0, 0, tz="GMT0") - ISOdatetime(2003,
seq(12), 1, 10, 0, 0, tz="BST")
Time differences of 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 secs

Why is there such a dichotomy?

Thanks,
Vadim
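
A small sketch of the POSIXct approach from this thread, written with an Olson-style zone name instead of a POSIX-style string such as "PST8PDT". It assumes the system time zone database knows "America/Los_Angeles" and that OlsonNames() is available (it is in recent versions of R, not in 1.8.1). As far as I understand, a name the system does not recognise is typically treated as UTC without any warning, which would explain why "PST" and "BST" behaved like GMT.

```r
# The same wall-clock time interpreted in two different zones.
local_time <- "2000-05-10 10:15:00"
pacific <- as.POSIXct(local_time, tz = "America/Los_Angeles")
utc     <- as.POSIXct(local_time, tz = "UTC")

# POSIXct stores absolute instants, so the subtraction is meaningful.
gap <- difftime(pacific, utc, units = "hours")
print(gap)  # Time difference of 7 hours (daylight saving is in effect in May)

# Check that a zone name is known before relying on it.
known <- "America/Los_Angeles" %in% OlsonNames()
```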






> -Original Message-
> From: Dirk Eddelbuettel [mailto:[EMAIL PROTECTED] 
> Sent: Friday, April 23, 2004 1:52 PM
> To: Vadim Ogranovich
> Cc: R-Help
> Subject: Re: [R] time zones in POSIXt
> 
> 
> 
> On Fri, Apr 23, 2004 at 11:30:19AM -0700, Vadim Ogranovich wrote:
> > Hi,
> >  
> > I have two data sources. One records time in PST time zone, 
> the other 
> > in GMT. I want to compute the difference between the two, but don't 
> > see how. Here is an example where I compute time difference between 
> > identical times each (meant to be) relative to its time zone.
> >  
> > > as.POSIXlt("2000-05-10 10:15:00",  "PST") -  
> as.POSIXlt("2000-05-10
> > 10:15:00",  "GMT")
> > Time difference of 0 secs
> > 
> > I was expecting to see 8hrs (which is the time difference between 
> > London and San-Francisco). Why is it so and what is the 
> correct way of 
> > doing it?
> 
> Seems to work with POSIXct in 1.8.1 and 1.9.0:
> 
> > as.POSIXct("2000-05-10 10:15:00",  "PST8PDT") -  
> > as.POSIXct("2000-05-10
> 10:15:00", tz="UTC")
> Time difference of 7 hours
> 
> 
> Dirk
> 
> >  
> >  
> > I use R-1.8.1 on RH-7.3.
> >  
> > Thanks,
> > Vadim
> >  
> >  
> > 
> > 
> 
> -- 
> The relationship between the computed price and reality is as 
> yet unknown.  
>  -- From the 
> pac(8) manual page
>



[R] time zones in POSIXt

2004-04-23 Thread Vadim Ogranovich
Hi,
 
I have two data sources. One records time in PST time zone, the other in
GMT. I want to compute the difference between the two, but don't see
how. Here is an example where I compute time difference between
identical times each (meant to be) relative to its time zone.
 
> as.POSIXlt("2000-05-10 10:15:00",  "PST") -  as.POSIXlt("2000-05-10
10:15:00",  "GMT")
Time difference of 0 secs

I was expecting to see 8hrs (which is the time difference between London
and San-Francisco). Why is it so and what is the correct way of doing
it?
 
 
I use R-1.8.1 on RH-7.3.
 
Thanks,
Vadim
 
 



RE: [R] Doing SQL GROUP BY in R

2004-04-02 Thread Vadim Ogranovich
You might want to look at ?table and ?xtabs. See also ?tapply for use of
general functions (other than just COUNT) with GROUP BY.

HTH,
Vadim
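
To make the pointers above concrete, here is a sketch with a made-up data frame tbl whose columns A, B and C mirror the SQL in the question. table() counts every combination of levels, including those that never occur, so the zero rows are dropped to match GROUP BY. aggregate() with length gives the same counts directly; note that length counts NA values too, whereas SQL COUNT(C) skips NULLs.

```r
# Made-up example data standing in for TBL.
tbl <- data.frame(
  A = c("a1", "a1", "a2", "a2", "a2"),
  B = c("b1", "b1", "b1", "b2", "b2"),
  C = 1:5
)

# Cross-tabulate A by B, then flatten to one row per combination.
counts <- as.data.frame(table(A = tbl$A, B = tbl$B), responseName = "n")
counts <- counts[counts$n > 0, ]  # GROUP BY never returns empty groups

# The same idea with a general function applied per group.
agg <- aggregate(C ~ A + B, data = tbl, FUN = length)
```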

> -Original Message-
> From: JFRI (Jesper Frickmann) [mailto:[EMAIL PROTECTED] 
> Sent: Friday, April 02, 2004 1:06 PM
> To: [EMAIL PROTECTED]
> Subject: [R] Doing SQL GROUP BY in R
> 
> 
> I want a list of the number of times some factor levels 
> appear together, similar to the following SQL statement:
> 
> SELECT A, B, COUNT(C) FROM TBL GROUP BY A, B
> 
> How do I do that with a data.frame in R?
> 
> Thanks,
> Jesper Frickmann
> Statistician, Quality Control
> Novozymes North America Inc.
> 77 Perry Chapel Church Road
> Franklinton, NC 27525
> USA
> Tel. +1 919 494 3266
> 
> 
>   [[alternative HTML version deleted]]
> 
>



RE: [R] binding vectors or matrix using their names

2004-03-24 Thread Vadim Ogranovich
?get to convert names into objects
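
A short sketch of that suggestion, using the variable names from the question. mget() is the vectorised relative of get(): it returns a named list of the objects, which do.call() can then hand to cbind() as separate arguments. This assumes the objects live in the environment the code is run from.

```r
x  <- runif(10)
x2 <- runif(10)
my.names <- c("x", "x2")

# Look the objects up by name, then bind them column-wise.
# The list names become the column names of the result.
m <- do.call(cbind, mget(my.names))
```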

> -Original Message-
> From: Stephane DRAY [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, March 24, 2004 11:41 AM
> To: [EMAIL PROTECTED]
> Subject: [R] binding vectors or matrix using their names
> 
> 
> Hello list,
> I have two vectors x and x2:
> 
> x=runif(10)
> x2=runif(10)
> 
> and one vectors with their names :
> 
> my.names=c("x","x2")
> 
> I would like to cbind these two vectors using their names 
> contained in the 
> vector my.names.
> I can create a string with comma
> ncomma=paste(my.names,collapse=",")
> 
> and now, I just need a function to transform this string into 
> a adequate 
> argument for cbind:
> 
> cbind(afunction(ncomma))
> 
> Is there in R a function that can do the job ? If not, how 
> can I do it ??
> 
> Thanks in advance,
> Sincerely.
> 
> 
> Stéphane DRAY
> --
>  
> 
> Département des Sciences Biologiques
> Université de Montréal, C.P. 6128, succursale centre-ville 
> Montréal, Québec H3C 3J7, Canada
> 
> Tel : 514 343 6111 poste 1233
> E-mail : [EMAIL PROTECTED]
> --
>  
> 
> Web  
> http://www.steph280.freesurf.fr/
> 
> 
>



[R] how to customize .First.lib

2004-03-24 Thread Vadim Ogranovich
Hi,
 
I am looking for an elegant solution to the following problem. When I
load a package, let's say ROracle, I want some custom actions to be done
on top of what package's .First.lib does. In this specific example I
want to open a connection to the only database I have around.
And I don't see how this could be done: there seem to be no hooks in the
couple of .First.lib functions I checked out, and there is no
on.library.load() function to register a call-back with.
 
It's just a nice to have, but I am curious to hear what people might
suggest.
 
Thanks,
Vadim
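
One possible route, assuming a version of R that provides setHook() and packageEvent() (later releases do; this has not been checked against 1.8.x): register a user hook that runs whenever a given package is attached. In the sketch below the splines package merely stands in for ROracle, and the body of the hook is a placeholder for opening the database connection.

```r
# Flag used only to show that the hook really ran.
opened <- FALSE

# Register a call-back for the "attach" event of one package.
# Hook functions receive the package name and its installed path.
setHook(packageEvent("splines", "attach"), function(pkgname, pkgpath) {
  # Placeholder: a real hook might open a database connection here.
  opened <<- TRUE
})

library(splines)  # attaching the package triggers the hook
```

Putting the setHook() call into a startup file such as .Rprofile would make it apply to every session.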



RE: [R] why-s of method dispatching

2004-03-18 Thread Vadim Ogranovich
I see. Thank you very much! 

Does R-Core have any plan to promote data.frame to an S4 class? In
general, is there any "road-map" (formal or informal) for phasing out S3
classes?

Thanks,
Vadim

> -Original Message-
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, March 18, 2004 12:24 AM
> To: Vadim Ogranovich
> Cc: R Help List
> Subject: Re: [R] why-s of method dispatching
> 
> 
> On Wed, 17 Mar 2004, Vadim Ogranovich wrote:
> 
> > I am having a problem to understand why as.data.frame 
> method doesn't 
> > dispatch properly on my class:
> > 
> > > setClass("Foo", "character")
> > [1] "Foo"
> > > as.data.frame(list(foo=new("Foo", .Data="a")))
> > Error in as.data.frame.default(x[[i]], optional = TRUE) :
> >  can't coerce Foo into a data.frame
> > 
> > I was expecting that this would call as.data.frame.character.
> 
> You have set an S4 class and as.data.frame is an S3 generic.
> 
> > list(foo=new("Foo", .Data="a"))
> $foo
> An object of class "Foo"
> [1] "a"
> 
> and what as.data.frame sees is
> 
> > attributes(list(foo=new("Foo", .Data="a"))$foo)
> $class
> [1] "Foo"
> attr(,"package")
> [1] ".GlobalEnv"
> 
> so thinks this is an S3 class it knows nothing about.
> 
> > Another puzzle. If I explicitly call as.data.frame.character() it 
> > would fail but for a different reason:
> > 
> > > as.data.frame.character(list(foo=new("Foo", .Data="a")))
> > Error in unique.default(x) : unique() applies only to vectors
> > 
> > I was under an impression that an instance of "Foo" would 
> be welcome 
> > anywhere a "character" was, but it seems to be more subtle. 
> What am I 
> > missing?
> 
> The difference between S3 and S4 classes.
> 
> -- 
> Brian D. Ripley,  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
> 
>



[R] why-s of method dispatching

2004-03-17 Thread Vadim Ogranovich
Hi,

I am having a problem to understand why as.data.frame method doesn't
dispatch properly on my class:

> setClass("Foo", "character")
[1] "Foo"
> as.data.frame(list(foo=new("Foo", .Data="a")))
Error in as.data.frame.default(x[[i]], optional = TRUE) : 
 can't coerce Foo into a data.frame

I was expecting that this would call as.data.frame.character.


Another puzzle. If I explicitly call as.data.frame.character() it would
fail but for a different reason:

> as.data.frame.character(list(foo=new("Foo", .Data="a")))
Error in unique.default(x) : unique() applies only to vectors


I was under an impression that an instance of "Foo" would be welcome
anywhere a "character" was, but it seems to be more subtle. What am I
missing?


Thanks,
Vadim
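
A sketch of one workaround that follows from the S3/S4 explanation in the replies to this thread: hand the S3 code the plain character data rather than the S4 object. For a class that extends a basic type, the .Data slot holds the underlying vector. The class name Foo is the one from the question.

```r
library(methods)

setClass("Foo", contains = "character")
obj <- new("Foo", .Data = "a")

# S4 inheritance holds, but S3 code only looks at the class attribute.
inherits_s4 <- is(obj, "character")

# Strip the S4 wrapper before calling S3-based functions.
plain <- obj@.Data
df <- data.frame(foo = plain, stringsAsFactors = FALSE)
```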



[R] dyn.unload fails after some C++ code.

2004-03-10 Thread Vadim Ogranovich
Hi,
 
I have a shared library that I can successfully load/unload until I
execute some non-trivial code on the library. After that the same
dyn.unload() has no effect. Does anyone have an idea what kind of C/C++
code could lock a .so library so that it becomes unloadable? (My code
links with a third-party library so I have no way to narrow down the
search).
I thought it could be memory allocation on heap, but a little experiment
showed this was not the case. I was able to unload the library even
after I allocated (and not released) memory on the heap. I figure that
.so has its own heap, but it's just a guess.
 
I use RedHat 7.3.
 
Thanks,
Vadim



[R] how to continue develop package

2004-03-09 Thread Vadim Ogranovich
Hi,
 
I have completed a prototype of a package, say FOO, and now I want to
start using it as an ordinary R package, i.e. attach it via
library("FOO"). On the other hand I will be adding functionality and
fixing bugs so the code is going to change a lot. There are a couple of
problems that I don't know how to solve:
 
1. To be able to use library() the package must be INSTALLED. The
installation creates a copy of the code. So it seems each time I change
the code I need to repeat R CMD INSTALL. This is not too bad, but maybe
there is a better way.
2. (This is probably more of an ESS question) Suppose I have attached
FOO via library("FOO"), which is now at the second position of the search
list. Now if I use ESS to modify the definition of some function goo()
from the package "FOO" and send the new definition to R, it will NOT
replace the old definition; rather, it creates a new function at the first
position of the search list. The new function will of course overshadow
the old one so I'll achieve the effect I want, but I'd feel better if I
could replace the old definition rather than overshadow it. Does ESS
allow this sort of thing?
 
These are my specific questions, however if you can suggest a different
way of on-going development of a package I'd love to hear from you.
 
Thanks,
Vadim



RE: [R] error() and C++ destructors

2004-03-09 Thread Vadim Ogranovich
One shortcoming of Erik's solution is that it can only catch the exceptions of type 
error_exception. For example it won't work if my code calls some third party library 
that can throw exceptions of some other types.

In case it's of interest to someone here is the boilerplate that I ended up using 
(note the similarity between my errMsg and error_message from Erik's solution):

extern "C" SEXP foo(SEXP x) {
    int hasFailed = 0;
    char errMsg[2048] = "";

    // NO C++ objects above this line
    try {
        // do the work
        ...
    }
    catch (const std::exception &e) {  // catch by reference to avoid slicing
        hasFailed = 1;
        strncpy(errMsg, e.what(), sizeof(errMsg) - 1);  // keep the final '\0'
    }
    catch (const OtherException &e) {
        hasFailed = 1;
        ...
    }

    // NO C++ objects below this line

    if (hasFailed) {
        error("%s", errMsg);  // pass the message as data, not as a format string
    }

    return x;
}

One thing I don't like about my solution is that I need to remember to set hasFailed 
in each and every catch block (and "I often remember to forget these sort of things" 
:-)).

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, March 09, 2004 4:59 AM
> To: [EMAIL PROTECTED]
> Subject: RE: [R] error() and C++ destructors
> 
> 
> And maybe I stressed Erik a bit, because he corrected himself 
> some minutes later (when I was no longer looking over his 
> shoulder). Again on behalf of Erik.
> 
> - Lennart
> 
> 
> -Original Message-
> From: Källen, Erik 
> Sent: 9 mars 2004 13:40
> To: Borgman, Lennart
> Subject: RE: [R] error() and C++ destructors
> 
> 
> Whoops.
> 
> This code leaks memory for the exception.
> It would be better to throw a pointer:
> 
> void my_error(const string &str) {
>   throw new error_exception(str);
> }
> 
> int my_method(int a, char *b) {
>   try {
>   return real_my_method(a, b);
>   }
>   catch (error_exception *pe) {
>   static char error_msg[SOME_LARGE_NUMBER];
>   strncpy(error_msg, pe->msg.c_str(), sizeof(error_msg));
>   delete pe;
>   error(error_msg);
>   }
> }
> 
> -Original Message-
> From: Borgman, Lennart 
> Sent: 9 mars 2004 13:35
> To: '[EMAIL PROTECTED]'
> Subject: RE: [R] error() and C++ destructors
> 
> 
> I am sending this reply on behalf of Erik (who is not a 
> member of this list).
> 
> - Lennart
> 
> 
> -Original Message-
> From: Källen, Erik 
> Sent: 9 mars 2004 11:37
> To: Borgman, Lennart
> Subject: RE: [R] error() and C++ destructors
> 
> 
> I would do something like:
> 
> class error_exception {
> public:
>   error_exception(const string &str) : msg(str) {}
>   string msg;
> };
> 
> void my_error(const string &str) {
>   throw error_exception(str);
> }
> 
> int real_my_method(int a, char *b) {
>   /*
>   some code...
>   */
>   return 0;
> }
> 
> // this is the public method:
> int my_method(int a, char *b) {
>   try {
>   return real_my_method(a, b);
>   }
>   catch (error_exception &e) {
>   error(e.msg);
>   }
> }
> 
> You could probably even create a macro like:
> #define R_METHOD_IMPL(rettype, name, paramlist) \
> rettype real_##name paramlist; \
> rettype name paramlist { \
>   try { \
>   return real_##name paramlist; \
>   } \
>   catch (error_exception &e) { \
>   error(e.msg); \
>   } \
> } \
> rettype real_##name paramlist
> 
> 
> You would use this macro like:
> R_METHOD_IMPL(int, my_method, (int a, char *b)) {
>   // source code here
> }
> 
> I think it would work, but I'm not sure (untested).
> 
> 
> /Erik Källén
> 
> 
> -Original Message-
> From: Vadim Ogranovich [mailto:[EMAIL PROTECTED]
> Sent: 2 mars 2004 22:00
> To: R Help List
> Subject: [R] error() and C++ destructors
> 
> 
> Hi,
>  
> I am writing C++ functions that are to be called via .Call() 
> interface. I'd been using error() (from R.h) to return to R 
> if there is an error, but then I realized that this might be 
> not safe as supposedly error() doesn't throw an exception and 
> therefore some destructors do not get called and some memory 
> may leak. Here is a simple example
>  
> extern "C" void foo() {
> string str = "hello";
> error("message");
> }
>  
> The memory allocated for str is leaked.
>  
> Did anyone think about 

[R] .Call: is new attribute of protected object auto-protected

2004-03-05 Thread Vadim Ogranovich
Hi,
 
I have an SEXP obj in a C function called via .Call(). The obj is
protected (in fact it is an argument to .Call and therefore
automatically protected). If I set an attribute of obj does the
attribute become protected too? Here is an example
 
SEXP foo(SEXP obj) {
SET_NAMES(obj, NEW_CHARACTER(3));  /* are names protected or not? */
...
}
 
 
Thanks,
Vadim



[R] error() and C++ destructors

2004-03-02 Thread Vadim Ogranovich
Hi,
 
I am writing C++ functions that are to be called via .Call() interface.
I'd been using error() (from R.h) to return to R if there is an error,
but then I realized that this might be not safe as supposedly error()
doesn't throw an exception and therefore some destructors do not get
called and some memory may leak. Here is a simple example
 
extern "C" void foo() {
string str = "hello";
error("message");
}
 
The memory allocated for str is leaked.
 
Did anyone think about this and find a way to work around the problem?
 
Thanks,
Vadim



[R] passing a string from .C()

2004-03-01 Thread Vadim Ogranovich
Hi,
 
Could someone please point to an example of passing strings from .C()
calls back to R? I want to be able to do something like this:
 
str <- .C("return_foo_string", str=character(1))$str
 
void return_foo_string(char ** str) {
*str = "foo";
}
 
The above code has at least two memory allocation "concerns": 
1) How to properly allocate "foo". I should probably use R_alloc, e.g.
 
char foo[] = "foo";
*str = R_alloc(sizeof(foo), 1);
 
2) I don't know if the string pointed to by *str before the
re-assignment, which now becomes dangling, will be properly reclaimed.
 
Thanks,
Vadim



[R] when .Call can safely modify its arguments

2004-02-28 Thread Vadim Ogranovich
Hi,
 
"Writing R Extensions Guide" clearly states that a C-function interfaced
via .Call() should not modify any of its arguments. However I wonder if
there are exceptions to this rule, i.e. when .Call can safely modify the
arguments. For example when a function creates a list that is then
populated by a call to a C function:
 
getData <- function() {
data <- list(a=double(2), b=character(3))
 
# now populate_list modifies data
.Call("populate_list", data) 
 
data
}
 
 
What can go wrong in this example?
 
And while we are here I wonder what happens to 'data' when getData()
returns it. Is it copied or some more efficient mechanism is used?
 
Thanks,
Vadim



[R] books: "Programming with Data: A Guide to the S Language" vs." S Programming"

2004-02-25 Thread Vadim Ogranovich
Hi,
 
Could someone please compare "Programming with Data: A Guide to the S
Language" by J. Chambers and "S Programming" by W. Venables and B.
Ripley? Ideally, I need a "guide" for writing R OO-style packages that
intensively interact with C/C++ libraries.
 
The specific project I have in mind is to write a thin and limited
DB-connectivity package that would interact with Oracle via its OCCI
interface (it's an intention subject to time availability and project
complexity). If you've been there or somehow think this is a daunting
task I'd love to hear from you.
 
Thanks,
Vadim



[R] Running R remotely in Windows Environment? - Xemacs and ssh

2004-01-29 Thread Vadim Ogranovich
Hi,

While we are on the topic of "Running R remotely in Windows Environment"
maybe someone could help with the following specific problem. I run R on
a Linux box from my WindowsXP laptop. I do so via Exceed, which for some
reason is inconvenient for me.
As an alternative I tried to ssh into the Linux machine and then run R.
This worked fine from Cygwin's bash window, but not from under XEmacs
(native Windows port). After starting ssh XEmacs complained:
"Pseudo-terminal will not be allocated because stdin is not a terminal"
and didn't show the prompt. Did anyone figure out how to remotely run R
from under (X)Emacs on Windows using ssh?

Thanks,
Vadim



RE: [R] building r-patch

2003-11-06 Thread Vadim Ogranovich
Thank you to Dirk Eddelbuettel and Prof. Ripley for pointing me to
tools/rsync-recommended. Maybe it is worth mentioning in the R
Administration guide and in the INSTALL file?

Thanks,
Vadim

> -Original Message-
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, November 05, 2003 11:10 PM
> To: [EMAIL PROTECTED]
> Cc: R-Help
> Subject: Re: [R] building r-patch
> 
> 
> The first is more or less what you should expect.
> 
> On Wed, 5 Nov 2003, Vadim Ogranovich wrote:
> 
> > Hi,
> >  
> > I am building r-patch from the sources (rsync-ed today).
> >  
> > make check produced the following message:
> >  
> > running tests of Internet and socket functions
> >   expect some differences
> > make[3]: Entering directory `/usr/evahome/vograno/R/tests' running 
> > code in 'internet.R' ... OK comparing 'internet.Rout' to 
> > './internet.Rout.save' ...18c18 < Content type `text/plain; 
> > charset=iso-8859-1' length 134991 bytes
> > ---
> > > Content type `text/plain; charset=iso-8859-1' length 124178 bytes
> > 22,23c22,23
> > < .. .. .. .
> > < downloaded 131Kb
> > ---
> > > .. .. .
> > > downloaded 121Kb
> > 25c25
> > < [1] 273
> > ---
> > > [1] 251
> > 60,61d59
> > < Error in url("http://foo.bar", "r") : unable to open connection
> > < In addition: Warning message:
> > 62a61
> > > Error in url("http://foo.bar", "r") : unable to open connection
> > 365,370c364
> > <  Login: root  Name: root
> > < Directory: /root Shell: /bin/tcsh
> > < Last login Wed Nov  5 13:34 (PST) on pts/1 from 
> > verdi.irisfinancial.com < New mail received Wed Nov  5 04:02 2003 
> > (PST)
> > <  Unread since Fri Oct 24 04:02 2003 (PDT )
> > < No Plan.
> > ---
> > > Error in make.socket(host, port) : Socket not established
> >  OK
> > 
> >  
> >  
> > I noticed that I had to expect some differences so my 
> question is how 
> > to tell whether it's harmless or not?
> >  
> >  
> > Other questions are related to building of recommended packages:
> > * The src/library/Recommended directory was empty. Is it expected?
> 
> No.  You forgot to run tools/rsync-recommended in the sources.
> 
> > If
> > yes, how to download the entire bundle of recommended 
> packages (I know 
> > I can get them one by one)? Is install.packages() the 
> recommended way?
> > * make check tried to test MASS and survival (and failed 
> because the 
> > packages were not there), but it didn't try to test the other 
> > recommended  packages. Why only these two?
> 
> It did not test those packages, it tried to make use of them. 
>  make check-all would have tested the recommended packages.
> 
> -- 
> Brian D. Ripley,  [EMAIL PROTECTED]
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
>



RE: [R] building r-patch

2003-11-05 Thread Vadim Ogranovich
I thought about the "OK". However the MASS and the survival tests
printed "OK" too, though, as far as I could tell, the packages were not
installed at all (well, maybe this was the reason for calling it "OK")

Thank you for the answer. Now I think I can move forward,
Vadim

> -Original Message-
> From: Jason Turner [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, November 05, 2003 8:50 PM
> To: [EMAIL PROTECTED]
> Cc: R-Help
> Subject: Re: [R] building r-patch
> 
> 
> Vadim Ogranovich wrote:
> ...
> > I am building r-patch from the sources (rsync-ed today).
> >  
> > make check produced the following message:
> >  
> > running tests of Internet and socket functions
> >   expect some differences
> ...  assorted error messages, then ...
> >  OK
> > 
> >  
> >  
> > I noticed that I had to expect some differences so my 
> question is how 
> > to tell whether it's harmless or not?
> 
> The "OK".  If make exited without an error, the regression tests were 
> passed.  In this case, the differences between the 
> maintainers' results 
> and yours were due to local login and network setup differences.
> 
> Hope that helps
> 
> Jason
> -- 
> Indigo Industrial Controls Ltd. 
> http://www.indigoindustrial.co.nz 64-21-343-> 545 
> [EMAIL PROTECTED]
> 
>



[R] building r-patch

2003-11-05 Thread Vadim Ogranovich
Hi,
 
I am building r-patch from the sources (rsync-ed today).
 
make check produced the following message:
 
running tests of Internet and socket functions
  expect some differences
make[3]: Entering directory `/usr/evahome/vograno/R/tests'
running code in 'internet.R' ... OK
comparing 'internet.Rout' to './internet.Rout.save' ...18c18
< Content type `text/plain; charset=iso-8859-1' length 134991 bytes
---
> Content type `text/plain; charset=iso-8859-1' length 124178 bytes
22,23c22,23
< .. .. .. .
< downloaded 131Kb
---
> .. .. .
> downloaded 121Kb
25c25
< [1] 273
---
> [1] 251
60,61d59
< Error in url("http://foo.bar", "r") : unable to open connection
< In addition: Warning message: 
62a61
> Error in url("http://foo.bar", "r") : unable to open connection
365,370c364
<  Login: root  Name: root
< Directory: /root Shell: /bin/tcsh
< Last login Wed Nov  5 13:34 (PST) on pts/1 from
verdi.irisfinancial.com
< New mail received Wed Nov  5 04:02 2003 (PST)
<  Unread since Fri Oct 24 04:02 2003 (PDT )
< No Plan.
---
> Error in make.socket(host, port) : Socket not established
 OK

 
 
I noticed that I had to expect some differences so my question is how to
tell whether it's harmless or not?
 
 
Other questions are related to building of recommended packages:
* The src/library/Recommended directory was empty. Is it expected? If
yes, how to download the entire bundle of recommended packages (I know I
can get them one by one)? Is install.packages() the recommended way?
* make check tried to test MASS and survival (and failed because the
packages were not there), but it didn't try to test the other
recommended  packages. Why only these two?
 
 
Thanks,
Vadim



RE: [R] read.table: check.names arg - feature request

2003-09-04 Thread Vadim Ogranovich
I admit I should have been clearer in my original posting. Let me try again (and I 
do know that by default read.table discards everything after '#', which is why I use 
comment.char=""; my bad for not mentioning this).


Here is a typical example of my data file:

#key value
foo 1.2
boo 1.3

As you see the header line begins with '#' and then lists the column names, however 
make.names will convert the raw names  c("#key", "value") to c(".key", "value") while 
I need c("key", "value"), i.e. no dot before key. So I am asking to give us a hook to 
specify the function that will handle this situation.



I am not sure I understand how having this hook can result in an invalid data frame? 
It can return invalid names, but check.names=FALSE can too.

Thanks,
Vadim
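
Until such a hook exists, one workaround is to parse the header line by hand and pass the cleaned names through col.names. This is only a sketch under the assumptions stated in this message: a single header line that starts with '#' and whitespace-separated fields. The temporary file reproduces the example above.

```r
# Recreate the example file.
tmp <- tempfile()
writeLines(c("#key value", "foo 1.2", "boo 1.3"), tmp)

# Read the header separately and drop the leading comment sign.
header <- readLines(tmp, n = 1)
col_names <- strsplit(sub("^#", "", header), "[[:space:]]+")[[1]]

# Read the body, skipping the header and supplying the cleaned names.
dat <- read.table(tmp, skip = 1, col.names = col_names,
                  comment.char = "", stringsAsFactors = FALSE)
unlink(tmp)
```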

-Original Message-
From: Martin Maechler [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 04, 2003 1:28 AM
To: Vadim Ogranovich
Cc: R-Help (E-mail)
Subject: Re: [R] read.table: check.names arg - feature request


>>>>> "Vadim" == Vadim Ogranovich <[EMAIL PROTECTED]>
>>>>> on Wed, 3 Sep 2003 14:29:25 -0700 writes:

Vadim> Hi, I thought it would be convenient if the
Vadim> check.names argument to read.table, which currently
Vadim> can only be TRUE/FALSE, could take a function value
Vadim> as well. If the function is supplied it should be
Vadim> used instead of the default make.names.

One could, but it's not necessary in your case (see below), and
it's a potential pit to fall in..  We want read.table() to
return valid  data frames.

Vadim> Here is an example where it can come in handy. I tend
Vadim> to keep my data in comma-separated files with a header
Vadim> line. The header line is prefixed with a comment sign
Vadim> '#' to simplify identification of these lines. Now
Vadim> when I read.table the files the '#' is converted to
Vadim> '.' while I want it to be discarded.

Hmm, are you using a very old version of R,
or haven't you seen the `comment.char = "#"' argument of
read.table()?

Reading "?read.table", also note the note about
`blank.lines.skip' , and then realize that the default for
blank.lines.skip is  ` !fill ' and that `fill = TRUE' for all
the read.csv* and read.delim* incantations of read.table().

In sum, it's very easy to use current read.table() for your
situation!

Vadim> P.S. I don't know if r-help is the right place for
Vadim> feature requests. If it's not please let me know
Vadim> where the right one is.

Since your proposal can be interpreted as "How do I use
read.table() when my file has comment lines?",
r-help has been very appropriate.

Otherwise, and particularly if the proposal is more technical,
R-devel would be better suited.

Regards,
Martin Maechler <[EMAIL PROTECTED]> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16    Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich SWITZERLAND
phone: x-41-1-632-3408  fax: ...-1228   <><



[R] read.table: check.names arg - feature request

2003-09-03 Thread Vadim Ogranovich
Hi,

I thought it would be convenient if the check.names argument to read.table, which 
currently can only be TRUE/FALSE, could take a function value as well. If the function 
is supplied it should be used instead of the default make.names.

Here is an example where it can come in handy. I tend to keep my data in 
comma-separated files with a header line. The header line is prefixed with a comment 
sign '#' to simplify identification of these lines. Now when I read.table the files 
the '#' is converted to '.' while I want it to be discarded.

Thanks,
Vadim


P.S. I don't know if r-help is the right place for feature requests. If it's not 
please let me know where the right one is.



RE: [R] lm with an arbitrary number of terms

2003-04-02 Thread Vadim Ogranovich
I think you can do it like this

lm(y~., data=data.frame) # note the dot to the right of ~
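
A small sketch of how this could be wrapped in the function asked for below. It assumes the response is passed separately and puts it into the same data frame, so that '.' expands to every column except the response. The name `fit_all` and the simulated data are hypothetical.

```r
# Hypothetical data: a response plus an arbitrary number of predictors
set.seed(1)
df <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
y  <- 1 + 2 * df$x1 - df$x3 + rnorm(20, sd = 0.1)

fit_all <- function(y, data) {
  # Put the response in the same data frame so that '.' means
  # "every column except the response"
  lm(y ~ ., data = cbind(y = y, data))
}

fit <- fit_all(y, df)
names(coef(fit))
```

If only some of the columns should enter the model, reformulate(c("x1", "x3"), response = "y") builds the corresponding formula from a character vector of names.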

> -Original Message-
> From: Richard Nixon [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, April 02, 2003 8:50 AM
> To: [EMAIL PROTECTED]
> Subject: [R] lm with an arbitrary number of terms 
> 
> 
> Hello folks,
> 
> Any ideas how to do this?
> 
> data.frame is a data frame with column names "x1",...,"xn"
> y is a response variable of length dim(data.frame)[1]
> 
> I want to write a function
> 
> function(y, data.frame){
> lm(y~x1+...+xn)
> }
> 
> This would be easy if n was always the same.
> If n is arbitrary how could I feed the x1+...+xn terms into 
> lm(response~terms)?
> 
> Thanks
> Richard
> 
> --
> Dr. Richard Nixon
> MRC Biostatistics Unit, Cambridge, UK
> http://www.mrc-bsu.cam.ac.uk/personal/richard
> Tel: +44 (0)1223 330382, Fax: +44 (0)1223 33038
> 

-- 
DISCLAIMER\ This e-mail, and any attachments thereto, is intende... {{dropped}}



RE: [R] Some more general questions

2003-03-19 Thread Vadim Ogranovich
The answer to your first question is yes, you can. Under the hood, a top-level call
"summary(data[[5]])" involves an implicit call to print() on the value
returned by summary(). This print produces the output you see. So you need
to suppress the default print and call one of your own, e.g.

> x <- summary(data[[5]])   # no print since the returned value is assigned
> my.custom.print(x)

A caveat: R will try to print the value returned by my.custom.print. To
suppress this, the return value should be made invisible, e.g. put
invisible(NULL) at the end of my.custom.print.
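
A sketch of what such a function might look like, assuming the goal is one name=value line per statistic. my.custom.print is not an existing R function, and the input vector is made up; the labels are whatever names summary() attaches, not the min/q1/... labels from the question.

```r
# Hypothetical helper: print a numeric summary as name=value lines
my.custom.print <- function(x) {
  # summary() of a numeric vector carries names such as "Min." and "Median"
  cat(paste0(names(x), "=", as.numeric(x)), sep = "\n")
  # Return invisibly so that R does not auto-print a second time
  invisible(NULL)
}

x <- summary(c(0, 0, 120, 310, 10290))  # assignment, so nothing is printed
my.custom.print(x)
```

To get labels such as min or q1, replace names(x) with a character vector of your own of the same length.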

> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, March 19, 2003 11:28 AM
> To: R Help List
> Subject: [R] Some more general questions
> 
> 
> Hi,
> 
> Some general questions. I want to build a web page with 
> numerical analysis
> generated by R. I have a few questions:
> 
> - Can I control the output of a function? For example, if I do:
> 
> > summary(data[[5]])
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
>     0.0     0.0   120.0   193.3   310.0 10290.0 
> 
> can I control the output to be something like
> 
> min=0
> q1=0.0
> q2=120.0
> q3=193.3
> max=10290.0
> 
> in order to parse with an external program?
> 
> - Yet another question on histograms: can I produce them with 
> character
>   strings? I'm guessing I need to map each character value to 
> a numerical
>   one and use that instead.
> 
> Thanks,
> 
> L
> 
> -- 
> Laurent Duperval <[EMAIL PROTECTED]>
> 
> "My doctor told me to stop having intimate dinners for four.  
> Unless there are
> three other people." 
> - Orson Welles
> 



RE: [R] fitting a curve according to a custom loss function

2003-02-19 Thread Vadim Ogranovich
Andy,

Here is a toy example where such a model might make sense. Suppose y is the
total income of an individual over the last two years and x_1 and x_2 are
the taxes he paid in each of the two years. If taxes were linear in income
then y ~ a*(x_1 + x_2). With a progressive tax system it is
y ~ f(x_1) + f(x_2)
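
When both terms enter with unit weight, as in this toy example, one possible way to estimate the shared f() is to expand it in a spline basis: the model is then linear in the basis coefficients and lm() suffices. This is only a sketch under stated assumptions: the data are simulated, the "true" f (a square root) and the knots are made up, and f is identified only up to an additive constant, which the intercept absorbs. It does not cover the general case with unequal a_1 and a_2.

```r
library(splines)

set.seed(42)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
f_true <- function(x) sqrt(x)            # hypothetical f, for simulation only
y  <- f_true(x1) + f_true(x2) + rnorm(n, sd = 0.05)

# One B-spline basis with fixed knots, so the same f applies to x1 and x2
basis <- function(x) bs(x, knots = c(0.25, 0.5, 0.75), Boundary.knots = c(0, 1))

# f(x1) + f(x2) = (basis(x1) + basis(x2)) %*% b, which is linear in b
B   <- unclass(basis(x1)) + unclass(basis(x2))
fit <- lm(y ~ B)

# Estimated f on a grid, up to an additive constant
grid  <- seq(0.05, 0.95, by = 0.05)
f_hat <- as.vector(basis(grid) %*% coef(fit)[-1])
```

Plotting f_hat against grid shows the estimated shape; comparing it with f_true(grid) is only possible here because the data were simulated.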

Hope it makes more sense now,
Vadim

> -Original Message-
> From: Liaw, Andy [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, February 19, 2003 5:33 AM
> > Vadim Ogranovich wrote:
> > 
> > >Dear R-Users,
> > >
> > >I need to find a smooth function f() and coefficients a_i 
> > that give the best
> > >fit to
> > >
> > >y ~ a_0 + a_1*f(x_1) + a_2*f(x_2)
> > >
> 
> The model is very strange (to me, at least).  It's not 
> obvious to me that
> it's even identifiable.  (Sorry that I don't have anything 
> constructive to
> add.)
> 
> Andy




[R] fitting a curve according to a custom loss function

2003-02-18 Thread Vadim Ogranovich
Dear R-Users,

I need to find a smooth function f() and coefficients a_i that give the best
fit to

y ~ a_0 + a_1*f(x_1) + a_2*f(x_2)

Note that it is the same non-linear transformation f() that is applied to
both x_1 and x_2.

So my first question is how can I do it in R?

A more general question is this: suppose I have a utility function U(a_i,
f()), where f() is, say, a spline. Is there a general optimizer that could
find an extremum of such a U()? If not, how easy would it be to hack up
something like this? Would it become easier if U() depended on f() only,
i.e. no a_i terms?

Thanks, Vadim




[R] using locator with xyplot() result

2003-02-14 Thread Vadim Ogranovich
Dear R-Users,

Is there a way to interactively get the location of a point on a graph produced
by xyplot() from the lattice package (similar to what locator() does with a
regular plot)?

Thanks, Vadim




[R] modeling interaction of continuous vars

2003-02-13 Thread Vadim Ogranovich
Dear R-Users,

I wonder what methods are available for modeling interaction of continuous
variables. Specifically I am interested in fitting a regression

y ~ f(w) * x

where y and x are vectors and f(w) is a smooth function of a continuous
parameter w (so it is f() that needs to be estimated). I can further assume
that f(w) is positive, but I cannot take logarithms as both y and x can
be negative.
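
To make the model concrete, here is one possible formulation, not necessarily the best method: expand f(w) in a B-spline basis, so that y ~ f(w) * x becomes linear in the basis coefficients and can be fitted with lm(). The data are simulated, the "true" f (an exponential) is made up, and the positivity of f is not enforced.

```r
library(splines)

set.seed(7)
n <- 300
w <- runif(n)
x <- rnorm(n)
f_true <- function(w) exp(w)             # hypothetical positive f
y <- f_true(w) * x + rnorm(n, sd = 0.1)

# Spline basis for f(w); intercept = TRUE so that constants are representable
B <- bs(w, df = 6, intercept = TRUE, Boundary.knots = c(0, 1))

# Scale row i of the basis by x[i]: y = sum_k b_k * B_k(w) * x + noise
Z   <- unclass(B) * x
fit <- lm(y ~ 0 + Z)

# Estimated f on a grid, using the same knots as the fitted basis
w_grid <- seq(0.05, 0.95, by = 0.05)
f_hat  <- as.vector(predict(B, w_grid) %*% coef(fit))
```

No logarithms are needed, so negative y and x are not a problem; if the fitted f_hat dips below zero, a constrained fit would be required instead.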


Thanks, Vadim



