Re: [Rd] R CMD check: non source files in src on (2.3.0 RC (2006-04-19 r37860))
Simon Urbanek writes:

> On Apr 20, 2006, at 1:23 PM, Henrik Bengtsson (max 7Mb) wrote:
> Is it a general consensus on R-devel that *.tar.gz distributions should only be treated as a distribution for *building* packages and not for developing them?

[Actually, distributing so that they can be installed and used.]

> I don't know whether this is a general consensus, but it is definitely an important distinction. Some authors put their own Makefiles in src although they are not needed and are in fact harmful, preventing the package from building on other systems - only because they are too lazy to use the R building mechanism for development and don't make the above distinction.

Right :-)

Henrik, as I think I mentioned the last time you asked about this: of course you can basically do everything you want. But it comes at a price. For external sources, you need to write a Makefile of your own, so as to make it clear that you provide a mechanism which is different from the standard one.

And, as Simon said, the gain in flexibility comes at a price. Personally and as one of the CRAN maintainers, I'd be very unhappy if package maintainers started flooding their source .tar.gz packages with full development environment material. (I am also rather unhappy about shipping large data sets which are only used for instructional purposes [rather than providing the data set on its own].) It is simply not true that bandwidth does not matter.

If there is need, we could start having developer-package repositories. However, I'd prefer a different approach. We're currently in the process of updating the CRAN server infrastructure, and should be able to start deploying an R-forge project hosting service eventually (hopefully, we can set things up during the summer). This should provide us with an ideal infrastructure for sharing developer resources, in particular as we could add QC testing et al. to the standard community services.
Best -k __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
[Rd] plot.default 'ylim' error message is wrong (PR#8784)
Full_Name: Lutz Prechelt
Version: 2.2.1
OS: WinXP SP2
Submission from: (NULL) (130.133.8.114)

This command

    plot(0, 0, xlim=c(3, 5), ylim=c(0, 10, 17))

should complain about 'ylim' (because it has three elements). However, it does in fact (in my German error message at least) complain about 'xlim' instead:

    Fehler in plot.window(xlim, ylim, log, asp, ...) : ungültiger 'xlim' Wert
    [English: Error in plot.window(xlim, ylim, log, asp, ...) : invalid 'xlim' value]

I spent about fifteen minutes debugging my trivial xlim argument before I found this. Shows my general trust in R, I guess. It is a great system.

Version 2.2.1 (2005-12-20 r36812)

Lutz Prechelt
Re: [Rd] plot.default 'ylim' error message is wrong (PR#8784)
On Fri, 21 Apr 2006, [EMAIL PROTECTED] wrote:

> Full_Name: Lutz Prechelt
> Version: 2.2.1
> OS: WinXP SP2
> Submission from: (NULL) (130.133.8.114)
>
> This command
>     plot(0, 0, xlim=c(3, 5), ylim=c(0, 10, 17))
> should complain about 'ylim' (because it has three elements). However, it does in fact (in my German error message at least) complain about 'xlim' instead:

On 2.3.0RC it does

    plot(0, 0, xlim=c(3, 5), ylim=c(0, 10, 17))
    Fehler in plot.window(xlim, ylim, log, asp, ...) : ungültiger 'ylim' Wert
    [English: Error in plot.window(xlim, ylim, log, asp, ...) : invalid 'ylim' value]

so it seems it was already solved.

>     Fehler in plot.window(xlim, ylim, log, asp, ...) : ungültiger 'xlim' Wert
>
> I spent about fifteen minutes debugging my trivial xlim argument before I found this. Shows my general trust in R, I guess. It is a great system.
>
> Version 2.2.1 (2005-12-20 r36812)
>
> Lutz Prechelt

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
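For readers who want the offending argument named before any plotting call is made, here is a minimal sketch of an up-front check. `check_limits` is a hypothetical helper written for illustration; it is not part of R and is not how plot.window() validates its arguments internally.

```r
# Hypothetical helper (not part of R): validate axis limits before plotting,
# so that the error message names the argument that is actually wrong.
check_limits <- function(xlim = NULL, ylim = NULL) {
  lims <- list(xlim = xlim, ylim = ylim)
  for (name in names(lims)) {
    lim <- lims[[name]]
    if (is.null(lim)) next                      # argument not supplied
    if (!is.numeric(lim) || length(lim) != 2L || any(!is.finite(lim)))
      stop(sprintf("invalid '%s' value: expected two finite numbers", name),
           call. = FALSE)
  }
  invisible(TRUE)
}

# The call from the bug report: xlim is fine, ylim has three elements.
msg <- tryCatch(check_limits(xlim = c(3, 5), ylim = c(0, 10, 17)),
                error = function(e) conditionMessage(e))
print(msg)
```

Calling the helper first means the message does not depend on the R version or on the translated message catalogue in use.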
Re: [Rd] R_PAPERSIZE and LC_PAPER
Prof Brian Ripley wrote:

> [snipped]
> It does not exist in any system at present (not even mine): just an idea in my head to represent the default found at configure time. In some sense LC_PAPER is set on FC3:
>
>     gannet% locale -ck LC_PAPER
>     LC_PAPER
>     height=297
>     width=210
>     paper-codeset=ISO-8859-1
>
> and it is unsupported on Solaris 8. And
>
>     Sys.getlocale("LC_PAPER")
>     [1] "en_GB"
>
> in the system I am prototyping just now.

Just to chip in on the discussion - on FC5,

    $ locale -ck LC_PAPER
    LC_PAPER
    height=279
    width=216
    paper-codeset=UTF-8
    $ echo $LANG
    en_US.UTF-8

(hmm, I must have made a mistake somewhere - would prefer en_GB).

R 2.2.1 is currently on my system (built on FC4 before I upgraded to FC5):

    Sys.getlocale("LC_PAPER")
    Error in Sys.getlocale("LC_PAPER") : invalid 'category' argument

HT
Re: [Rd] R_PAPERSIZE and LC_PAPER
On Fri, 21 Apr 2006, Hin-Tak Leung wrote:

> Prof Brian Ripley wrote:
> [snipped]
> It does not exist in any system at present (not even mine): just an idea in my head to represent the default found at configure time. In some sense LC_PAPER is set on FC3:
>
>     gannet% locale -ck LC_PAPER
>     LC_PAPER
>     height=297
>     width=210
>     paper-codeset=ISO-8859-1
>
> and it is unsupported on Solaris 8. And
>
>     Sys.getlocale("LC_PAPER")
>     [1] "en_GB"
>
> in the system I am prototyping just now.
>
> Just to chip in on the discussion - on FC5,
>
>     $ locale -ck LC_PAPER
>     LC_PAPER
>     height=279
>     width=216
>     paper-codeset=UTF-8
>     $ echo $LANG
>     en_US.UTF-8
>
> (hmm, I must have made a mistake somewhere - would prefer en_GB).
>
> R 2.2.1 is currently on my system (built on FC4 before I upgraded to FC5):
>
>     Sys.getlocale("LC_PAPER")
>     Error in Sys.getlocale("LC_PAPER") : invalid 'category' argument

Yes, that only works in R-devel (2.4.0 to be).

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
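Since the category is accepted by some R versions and rejected by others, code that has to run on both could guard the call. The following is a minimal sketch; `safe_getlocale` is a hypothetical wrapper name, and the choice to return NA on failure is an assumption made for illustration.

```r
# Hypothetical wrapper: query a locale category and return NA (rather than
# stopping) when the running R build does not accept the category name.
safe_getlocale <- function(category) {
  tryCatch(Sys.getlocale(category),
           error = function(e) NA_character_)
}

paper_locale <- safe_getlocale("LC_PAPER")   # may be NA or "" on some systems
ctype_locale <- safe_getlocale("LC_CTYPE")   # a standard category
bogus_locale <- safe_getlocale("LC_BOGUS")   # not a category: yields NA
```

Callers can then fall back to another source, such as the R_PAPERSIZE environment variable discussed in this thread, when the result is NA or empty.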
Re: [Rd] R_PAPERSIZE and LC_PAPER
Marc Schwartz wrote:

> On Thu, 2006-04-20 at 20:56 +0100, Prof Brian Ripley wrote:
> [snipped]
>     gannet% locale -ck LC_PAPER
>     LC_PAPER
>     height=297
>     width=210
>     paper-codeset=ISO-8859-1
>
> BTW, on my FC4 system:
>
>     $ locale -ck LC_PAPER
>     LC_PAPER
>     height=279
>     width=216
>     paper-codeset=UTF-8

297/210 is A4 and 279/216 is letter (in mm). Europe/US difference. (Where I am, I should have en_GB instead of en_US on FC5, so I had better go and fix my box.)

HT
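The mapping from the reported dimensions to a paper name can be sketched as follows. This is only an illustration: `field_mm` and `paper_name` are hypothetical helpers, and the input is the `locale -ck LC_PAPER` output quoted above, hard-coded as a character vector rather than read from the system.

```r
# Output of `locale -ck LC_PAPER` as quoted in the thread (FC3 example).
locale_output <- c("LC_PAPER", "height=297", "width=210",
                   "paper-codeset=ISO-8859-1")

# Hypothetical helper: extract an integer "key=value" field from the lines.
field_mm <- function(lines, key) {
  hit <- grep(paste0("^", key, "="), lines, value = TRUE)
  as.integer(sub("^[^=]*=", "", hit[1]))
}

# Hypothetical helper: map dimensions in mm to a paper name.
# Only the two sizes mentioned in the thread are recognised.
paper_name <- function(height, width) {
  if (height == 297L && width == 210L) {
    "a4"
  } else if (height == 279L && width == 216L) {
    "letter"
  } else {
    "unknown"
  }
}

paper <- paper_name(field_mm(locale_output, "height"),
                    field_mm(locale_output, "width"))
```

On a live system the vector could instead come from something like `system("locale -ck LC_PAPER", intern = TRUE)`, on platforms where that command exists.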
Re: [Rd] R CMD check: non source files in src on (2.3.0 RC (2006-04-19 r37860))
On Fri, 21 Apr 2006, Kurt Hornik wrote:

> Simon Urbanek writes:
> On Apr 20, 2006, at 1:23 PM, Henrik Bengtsson (max 7Mb) wrote:
> Is it a general consensus on R-devel that *.tar.gz distributions should only be treated as a distribution for *building* packages and not for developing them?
>
> [Actually, distributing so that they can be installed and used.]
>
> I don't know whether this is a general consensus, but it is definitely an important distinction. Some authors put their own Makefiles in src although they are not needed and are in fact harmful, preventing the package from building on other systems - only because they are too lazy to use the R building mechanism for development and don't make the above distinction.
>
> Right :-)
>
> Henrik, as I think I mentioned the last time you asked about this: of course you can basically do everything you want. But it comes at a price. For external sources, you need to write a Makefile of your own, so as to make it clear that you provide a mechanism which is different from the standard one.
>
> And, as Simon said, the gain in flexibility comes at a price. Personally and as one of the CRAN maintainers, I'd be very unhappy if package maintainers started flooding their source .tar.gz packages with full development environment material. (I am also rather unhappy about shipping large data sets which are only used for instructional purposes [rather than providing the data set on its own].) It is simply not true that bandwidth does not matter.

The GPL, for GPL-licensed packages, requires that the source be made available and defines the source as the preferred form of the work for making modifications to it. If the .tar.gz packages are not the source, then, under the GPL, CRAN is required to provide the source as well, which doesn't seem to be an improvement.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
[EMAIL PROTECTED]
University of Washington, Seattle
Re: [Rd] R CMD check: non source files in src on (2.3.0 RC (2006-04-19 r37860))
Kurt Hornik wrote:

> Simon Urbanek writes:
> On Apr 20, 2006, at 1:23 PM, Henrik Bengtsson (max 7Mb) wrote:
> Is it a general consensus on R-devel that *.tar.gz distributions should only be treated as a distribution for *building* packages and not for developing them?
>
> [Actually, distributing so that they can be installed and used.]
>
> I don't know whether this is a general consensus, but it is definitely an important distinction. Some authors put their own Makefiles in src although they are not needed and are in fact harmful, preventing the package from building on other systems - only because they are too lazy to use the R building mechanism for development and don't make the above distinction.
>
> Right :-)
>
> Henrik, as I think I mentioned the last time you asked about this: of course you can basically do everything you want. But it comes at a price. For external sources, you need to write a Makefile of your own, so as to make it clear that you provide a mechanism which is different from the standard one.
>
> And, as Simon said, the gain in flexibility comes at a price. Personally and as one of the CRAN maintainers, I'd be very unhappy if package maintainers started flooding their source .tar.gz packages with full development environment material. (I am also rather unhappy about shipping large data sets which are only used for instructional purposes [rather than providing the data set on its own].) It is simply not true that bandwidth does not matter.

I can see the problem with large packages, but the current system does nothing about that AFAIC. And as Simon indicated, his biggest problem is the one set of files that we are allowed - so the argument is that the current approach is neither necessary nor sufficient, and it imposes a structure on people that seems to be unnecessarily restrictive.

I don't see how excluding README (or anything else that a package maintainer has put there) makes life better, but maybe I am missing something here. These are precisely the sorts of things that have helped me to figure out what was intended when it didn't work. So this approach is regressive, IMHO. If the size is not large, who cares what is in a package, and things related to source should be in src.

I see that a similar approach is being taken with the R directory (and probably other directories). This is, in my opinion, unfortunate: imposing restrictions that don't solve the problem mentioned in some general way is not useful.

For BioC, we manually check the size etc. and ask people to reduce and remove. You could easily do the same at CRAN (and even automate it). BioC packages can be enormous relative to those on CRAN and I don't think we have ever had a serious complaint about it. But then the data sets tend to be large, so maybe people are just more forgiving.

As for the difference between source packages and built packages, yes, it would be nice at some time to enter into a discussion on that topic. There are lots of things that can be done at build time (that are not currently being done) that would speed up package installation etc. But they come at the price that Henrik has mentioned: the built package is no longer suitable for development. And hence we may usefully consider another format (something between source and binary, .Rgz?).

best wishes
Robert

> If there is need, we could start having developer-package repositories. However, I'd prefer a different approach. We're currently in the process of updating the CRAN server infrastructure, and should be able to start deploying an R-forge project hosting service eventually (hopefully, we can set things up during the summer). This should provide us with an ideal infrastructure for sharing developer resources, in particular as we could add QC testing et al. to the standard community services.
>
> Best
> -k

--
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
[EMAIL PROTECTED]
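The kind of automated size check suggested above might look roughly like the sketch below. The function names, the per-entry grouping and the 1 MB threshold are all assumptions for illustration; this does not describe anything CRAN or BioC actually runs.

```r
# Hypothetical helper: total size in bytes of each top-level entry
# (file or directory) of an unpacked package source tree.
dir_sizes <- function(pkg_dir) {
  files <- list.files(pkg_dir, recursive = TRUE, all.files = TRUE)
  if (length(files) == 0L) return(numeric(0))
  sizes <- file.info(file.path(pkg_dir, files))$size
  top <- sub("/.*$", "", files)            # first path component
  sort(tapply(sizes, top, sum), decreasing = TRUE)
}

# Hypothetical helper: entries larger than `limit` bytes
# (the default of 1e6 is an arbitrary choice).
oversized <- function(pkg_dir, limit = 1e6) {
  s <- dir_sizes(pkg_dir)
  s[s > limit]
}
```

For example, `oversized("mypkg")` (with "mypkg" standing for a hypothetical unpacked source directory) would list the top-level entries a maintainer might be asked to reduce, leaving the decision about what is in them to a human.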
Re: [Rd] R CMD check: non source files in src on (2.3.0 RC (2006-04-19 r37860))
Robert Gentleman wrote:

> ...
> As for the difference between source packages and built packages, yes, it would be nice at some time to enter into a discussion on that topic. There are lots of things that can be done at build time (that are not currently being done) that would speed up package installation etc. But they come at the price that Henrik has mentioned: the built package is no longer suitable for development. And hence we may usefully consider another format (something between source and binary, .Rgz?).

I didn't get any time before the feature freeze to work on a new package mechanism, but I think that is what is needed rather than tweaking and complicating the existing collection of perl and shell scripts. I believe it is time that we now move the package system to R code and use real OOP to allow for all these differences. Now that I am not teaching 5 days a week, I will have time to work on this and so get the frameworks (packages that provide compiled/native routines for other packages to use) as a first example. Hopefully sometime in (northern hemisphere) summer.

> best wishes
> Robert
>
> If there is need, we could start having developer-package repositories. However, I'd prefer a different approach. We're currently in the process of updating the CRAN server infrastructure, and should be able to start deploying an R-forge project hosting service eventually (hopefully, we can set things up during the summer). This should provide us with an ideal infrastructure for sharing developer resources, in particular as we could add QC testing et al. to the standard community services.
>
> Best
> -k

--
Duncan Temple Lang, [EMAIL PROTECTED]
Department of Statistics, work: (530) 752-4782
4210 Mathematical Sciences Building, fax: (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA
[Rd] Questions on version arg to setClass and serialized instances
I have a few questions and thoughts regarding class versioning and serialized S4 class instances.

How is the version argument to setClass intended to work? It appears to want an externalptr, but that seems odd to me.

    setClass("FOO", representation(x="numeric"), version="1.2.3")
    Error in validObject(.Object) : invalid class "classRepresentation" object: invalid object for slot "versionKey" in class "classRepresentation": got class "character", should be or extend class "externalptr"

The use case I'm interested in is: a user has a serialized instance foo of class FOO in old.rda. The class definition lives in FooPkg. Suppose the class definition of FOO in FooPkg changes. When the user loads the new version of FooPkg and then loads their old.rda file, how can the user identify the instance as one belonging to an old class definition version?

Some thoughts:

* A related issue is how to load the appropriate class definition when loading a serialized S4 instance. It makes sense to me that a side-effect of load("foo.rda") is that FooPkg gets loaded so that the instance makes sense. More generally, it would be cool to have hooks into serialization and deserialization. That could allow much more robust handling of serialized instances of old definitions (even if it would have to happen on a class-by-class basis).

* A class definition version id might have to be an instance slot. One possible simple-minded implementation is to force a slot with some mangled name like '.class_def_version_string'. But perhaps there is a more elegant approach.

One more question: Is there a convenient way to introspect an _instance_? The slotNames() method seems to get data from the class definition and doesn't report the right info for an instance that is out of date w.r.t. the current class definition (i.e., was deserialized after a class definition update).

Thanks for listening.

+ seth
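The "version id as an instance slot" idea from the second bullet can be sketched as follows. Everything here is illustrative: the slot name `classDefVersion`, the constant `current_version` and both helper functions are hypothetical, and reading slots via attributes() is a low-level workaround that relies on how S4 instances are currently stored, not a documented introspection API.

```r
library(methods)

# Hypothetical class that records its definition version in an ordinary slot.
setClass("FOO",
         representation(x = "numeric", classDefVersion = "character"),
         prototype(classDefVersion = "1.2.3"))

# Version that the currently loaded definition expects (illustrative constant).
current_version <- "1.2.3"

# TRUE when an instance was created under an older class definition.
is_outdated <- function(obj) {
  package_version(obj@classDefVersion) < package_version(current_version)
}

foo <- new("FOO", x = c(1.5, 2.5))                  # picks up the prototype
old <- new("FOO", x = 1, classDefVersion = "1.0.0") # stands in for old.rda

# Slots actually stored on the instance, read from the object itself rather
# than from the class definition (assumption: slots are kept as attributes).
instance_slots <- function(obj) setdiff(names(attributes(obj)), "class")
```

A package could call something like `is_outdated()` on freshly loaded objects and convert or warn accordingly; this does not address the separate question of loading FooPkg as a side effect of load().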
[Rd] optim CG bug w/patch proposal (PR#8786)
Dear R team,

when using optim with method "CG" I got the wrong $value for the reported $par.

Example:

    f <- function(p) {
        if (!all(p > -.7)) return(2)
        if (!all(p < .7)) return(2)
        sin((p[1])^2)*sin(p[2])
    }

    optim(c(0.1,-0.1), f, method="CG", control=list(trace=0,type=1))
    $par    19280.68 -10622.32
    $value  -0.2346207   # should be 2!

    optim(c(0.1,-0.1), f, method="CG", control=list(trace=0,type=2))
    $par    3834.021 -2718.958
    $value  -0.0009983175   # should be 2!

Fix:

    --- optim.c (Revision 37878)
    +++ optim.c (Arbeitskopie)
    @@ -970,7 +970,8 @@
                 if (!accpoint) {
                     steplength *= stepredn;
                     if (trace) Rprintf("*");
    -            }
    +            } else
    +                *Fmin = f;
             }
         } while (!(count == n || accpoint));
         if (count < n) {

After fix:

    optim(c(0.1,-0.1), f, method="CG", control=list(trace=0,type=1))
    $par    0.6993467 -0.4900145
    $value  -0.2211150

    optim(c(0.1,-0.1), f, method="CG", control=list(trace=0,type=2))
    $par    3834.021 -2718.958
    $value  2

Wishlist:

1. Please make type=3 the default in optim (it is more robust).

2. The $par reported for type=2 is still not satisfactory. I found out that this can be improved by limiting G3 to a maximum of about 2000 (maybe even smaller). However, I'm not a CG expert and can live with a suboptimal result.

    --- optim.c (Revision 37878)
    +++ optim.c (Arbeitskopie)
    @@ -946,6 +946,8 @@
             G3 = G1 / G2;
         else
             G3 = 1.0;
    +    if (G3 > 2e3)
    +        G3 = 2e3;
         gradproj = 0.0;
         for (i = 0; i < n; i++) {
             t[i] = t[i] * G3 - g[i];

Andreas

--
Andreas Westfeld, 0432 01CC F511 9E2B 0B57 5993 0B22 98F8 4AD8 EEEA
[EMAIL PROTECTED] http://www.inf.tu-dresden.de/~aw4
TU Dresden Fakultät Informatik, Institut für Systemarchitektur
Datenschutz und Datensicherheit, Tel. +49-351-463-37918
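The symptom reported here, a $value that does not match the objective evaluated at $par, can be checked mechanically. The sketch below reuses the test function from the report (with the comparison operators restored as an assumption about what the original message contained) and compares the two numbers; the name `consistent` is introduced only for this illustration, and no particular optimum is claimed.

```r
# Test function from the report: 2 outside the open box (-0.7, 0.7)^2,
# sin(p1^2) * sin(p2) inside it.
f <- function(p) {
  if (!all(p > -0.7)) return(2)
  if (!all(p < 0.7)) return(2)
  sin(p[1]^2) * sin(p[2])
}

res <- optim(c(0.1, -0.1), f, method = "CG",
             control = list(trace = 0, type = 1))

# In a build without the reported bug, the value returned by optim()
# should agree with the objective re-evaluated at the returned parameters.
consistent <- isTRUE(all.equal(res$value, f(res$par)))
print(consistent)
```

Running the same check with `type = 2` and `type = 3` exercises the other conjugate-gradient update formulas that the report compares.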