[Rd] Structure of the object: list() and attr()

2006-08-01 Thread Gorjanc Gregor
Hello!

I am writing code in which I define objects of a new class. When I started,
each object was a simple data.frame with attributes, but the design is
becoming more involved, and I would like to hear the pros and cons of moving
to a list structure, where one slot would be a data.frame, while the other
slots would take over the role of the attributes.

With regards,
Gregor Gorjanc

--
University of Ljubljana     PhD student
Biotechnical Faculty        URI: http://www.bfro.uni-lj.si/MR/ggorjan
Zootechnical Department     mail: gregor.gorjanc at bfro.uni-lj.si
Groblje 3                   tel: +386 (0)1 72 17 861
SI-1230 Domzale             fax: +386 (0)1 72 17 888
Slovenia, Europe
--
One must learn by doing the thing; for though you think you know it,
 you have no certainty until you try. Sophocles ~ 450 B.C.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Any interest in merge and by implementations specifically for so

2006-08-01 Thread tshort

Kevin,

Whether or not the R core developers want to merge these functions in base
R, they would make a great little package on CRAN. That way others could
easily use them, and for yourself, the package automatically gets updated
with new versions of R. It sounds like you're done with the hard parts. All
that you need to do is add some documentation along with a couple of
configuration files, and you're done.

- Tom



Re: [Rd] Structure of the object: list() and attr()

2006-08-01 Thread Seth Falcon
Gorjanc Gregor [EMAIL PROTECTED] writes:

> Hello!
>
> I am writing code where I define objects with new class. When I started,
> it was a simple data.frame with attributes, but it is getting more evolved
> and I would like to hear any pros and cons to go for list structure,
> where one slot would be a data.frame, while other slots would take over
> role of attributes.

I would suggest using S4 classes for representing more complex
classes.

I don't think a list structure is much different (morally) from a
data.frame with lots of attributes hanging off of it.  Either way, the
slots and their types don't share a common definition that can be
checked.
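
As a rough illustration of what a common, checkable definition could look
like, here is a minimal S4 sketch. The class name MyData, its slot names
and the validity rule are invented for this example; they are not taken
from the original poster's code.

```r
library(methods)

# Hypothetical class: a data.frame payload plus one typed metadata slot.
setClass("MyData",
         representation(data = "data.frame", my.attr = "numeric"),
         validity = function(object) {
             if (length(object@my.attr) != 1L)
                 return("'my.attr' must have length 1")
             TRUE
         })

# The class wraps a data.frame rather than extending it, so dim() has to
# be forwarded explicitly.
setMethod("dim", "MyData", function(x) dim(x@data))

obj <- new("MyData", data = data.frame(a = 1:10), my.attr = 33)
dim(obj)

# A slot of the wrong type is rejected when the object is created.
bad <- tryCatch(new("MyData", data = data.frame(a = 1:10), my.attr = "33"),
                error = function(e) conditionMessage(e))
```

The slot types are declared once in setClass() and checked by new(), which
is the kind of shared definition that neither plain attributes nor a plain
list provide.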

+ seth



Re: [Rd] Structure of the object: list() and attr()

2006-08-01 Thread Gabor Grothendieck
The key issue is inheritance. If you use a data frame with attributes then
you can inherit data frame methods without further definition,
e.g.

   x <- structure(data.frame(a = 1:10), my.attr = 33,
                  class = c("myclass", "data.frame"))
   dim(x)  # inherits the data.frame dim method

but if you use a list instead, then you need to define your own method
for each generic you want:

   x <- structure(list(.Data = data.frame(a = 1:10), my.attr = 33),
                  class = "myclass")
   dim.myclass <- function(x) dim(x$.Data)
   dim(x)




Re: [Rd] Any interest in merge and by implementations specifically for so

2006-08-01 Thread Kevin B. Hendricks
Hi Tom,

> Whether or not the R core developers want to merge these functions in base
> R, they would make a great little package on CRAN. That way others could
> easily use them, and for yourself, the package automatically gets updated
> with new versions of R. It sounds like you're done with the hard parts. All
> that you need to do is add some documentation along with a couple of
> configuration files, and you're done.

Thomas Lumley recommended the same thing last night.  I have just  
finished debugging the routines and validating them for use without  
NAs.  I had to fix a number of typos in my code for the other  
functions but now they all work properly.  I still need to test,
debug, and validate them for use with NAs, with na.rm set to both TRUE
and FALSE.

Once I have validated them (i.e., checked that they return exactly the
same results as unlist(lapply(split(x,i),FUNCTION)) does), I will put
together an external package and make it available on CRAN.
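
The kind of check described above can be sketched as follows. The igroup
functions are not part of base R, so this sketch uses base R's rowsum()
purely as a stand-in for a faster grouped routine; the data sizes and the
seed are arbitrary.

```r
set.seed(1)
x <- rnorm(2000)
i <- rep(1:1000, 2)

# Reference result: split by group, apply sum(), flatten.
ref <- unlist(lapply(split(x, i), sum))
names(ref) <- NULL

# Stand-in for the faster grouped routine (an assumption of this sketch).
fast <- as.vector(rowsum(x, i))

# Compare up to numerical tolerance rather than bit for bit.
all.equal(ref, fast)
```

all.equal() is used rather than identical() because the two routes may add
the values in a different order.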

I am in the stupid position of knowing how to add the functions  
internally to R with no problems, but I still have to learn how to  
build and add external packages.

So something else to learn!

Thanks,

Kevin



Re: [Rd] R | vnc | X11 fonts

2006-08-01 Thread Hin-Tak Leung
Evan Cooch wrote:
> Quick followup - works fine with fluxbox (and, as noted, default twm).
> Simply can't get it to work with the gnome desktop, which ultimately I
> would like to.

The difference between twm and metacity (or another gnome window
manager) is that twm uses X11 core fonts whereas gnome
is xft/fontconfig-aware, and as far as I know R's X11() uses
the core font APIs and is not xft-aware.

You haven't said anything about your xorg setup - specifically,
whether you are using a font server (it is the default on FC5, so
unless you have changed it, you are using one). If that's the case,
changing this line in /etc/X11/xorg.conf
  FontPath "unix/:7100"
to use *real* font paths may help.

HTL



Re: [Rd] Install R-patched_2006-07-13 on i386-pc-solaris2.10 with Sun Studio 11

2006-08-01 Thread Latchezar Dimitrov
Dear R-developers:

If anybody has installed R-patched_2006-07-13 on i386-pc-solaris2.10
with Sun Studio 11, I need your help/advice please.

Thank you very much
Latchezar Dimitrov

> -Original Message-
> From: Latchezar Dimitrov
> Sent: Wednesday, July 26, 2006 4:48 PM
> To: 'Prof Brian Ripley'
> Cc: r-devel@stat.math.ethz.ch
> Subject: RE: [Rd] Install R-patched_2006-07-13 on
> i386-pc-solaris2.10 with Sun Studio 11
>
> Dear Prof. Ripley and R-developers:
>
> Thank you very much for the reply. Please see below.
>
> > -Original Message-
> > From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
> > Sent: Wednesday, July 26, 2006 2:37 AM
> > To: Latchezar Dimitrov
> > Cc: r-devel@stat.math.ethz.ch
> > Subject: Re: [Rd] Install R-patched_2006-07-13 on
> > i386-pc-solaris2.10 with Sun Studio 11
> >
> > On Wed, 26 Jul 2006, Latchezar Dimitrov wrote:
> >
> > > Dear R-developers:
> > >
> > > I'm trying to build a 64-bit R-patched_2006-07-24 on a SunFire V40z
> > > running Solaris 10 with a 64-bit kernel, using the Sun Studio 11
> > > compilers. Everything runs OK until it gets to building package
> > > tools (all.R), where it fails. Below is how I tried it (I can
> > > provide any other additional info if needed). Any help please?
> >
> > You seem to have gcc 4.1.1 in your paths, so can you try that instead.
>
> True (that's why I gave the env); however, that is C/C++ only.
> I cannot build gcc fortran right way and since R is on top of
> my list I switched to Sun Studio 11. I included the libraries
> and the path in order to use makeinfo and readline, which I
> compiled with that gcc (I built most essential GNU utils with
> it too, as well as some applications, and everything seems OK).
> To avoid misuse of it I explicitly specified all the programs
> that I knew could be mistaken (CC, CXX, etc.); GNU ld was
> renamed. I am as sure as one can be that there was no (mis)use
> of /usr/local/ except for readline and makeinfo (I believe
> the build did not go that far, though). One additional thing
> that I missed in my previous e-mail is that the R binary was built
> seemingly properly, i.e., it starts OK, however with
> complaints about missing base parts (obviously).
>
> Now, to make it clear that it has nothing to do with that gcc, I
> removed all the paths from the environment, and to make sure
> nothing leaks through I renamed /usr/local so it stays out of
> the way. Then I set --with-readline=no and started clean (as
> you may have already noticed, I build in a directory separate
> from src (obj-R), completely empty at the beginning). It failed the
> very same way.
>
> Further, I decided to give a 32-bit build a try. To make things
> more interesting I restored /usr/local and the path to
> /usr/local/bin in order to be able to use makeinfo (since the
> only readline I had was a 64-bit one, I gave it up). And
> voila, it built like a charm. The only tests that I noticed
> failed were those involving tcltk (I had only 64-bit ones installed).
>
> The only remaining problem is that I could not care less about
> the 32-bit version.
>
> Please find attached the log of the 64-bit try and a little bit
> of the 32-bit success at the end.
>
> So would someone please help me find out what is wrong and
> build my favorite 64-bit R?
>
> Thank you very much,
> Latchezar Dimitrov
>
> > This looks like a fairly fundamental problem with your current build,
> > possibly a mis-compile.
> >
> > > Thank you very much
> > >
> > > Latchezar Dimitrov
> > > Wake Forest Univ. School of Medicine
> > >
> > > [EMAIL PROTECTED] #   echo   $PATH
> > > /opt/SUNWspro/bin:/usr/sbin:/usr/bin:/usr/openwin/bin:/usr/ccs/bin:/usr/openwin/bin:/usr/dt/bin:/usr/platform/i86pc/sbin:/opt/SUNWvts/bin:/opt/SUNWexplo/bin:/usr/local/bin
> > > [EMAIL PROTECTED] #   echo   $CC
> > > cc
> > > [EMAIL PROTECTED] #   echo   $CXX
> > > CC
> > > [EMAIL PROTECTED] #   echo   $CFLAGS
> > > -xarch=amd64 -xmodel=medium
> > > [EMAIL PROTECTED] #   echo   $CXXFLAGS
> > > -xarch=amd64 -xmodel=medium
> > > [EMAIL PROTECTED] #   echo   $FCFLAGS
> > > -xarch=amd64 -xmodel=medium
> > > [EMAIL PROTECTED] #   echo   $FFLAGS
> > > -xarch=amd64 -xmodel=medium
> > > [EMAIL PROTECTED] #   echo   $LDFLAGS
> > > -xarch=amd64 -xmodel=medium
> > > [EMAIL PROTECTED] #   echo   $R_BROWSER
> > > /usr/sfw/bin/mozilla
> > > [EMAIL PROTECTED] #   echo   $r_arch
> > > amd64
> > > [EMAIL PROTECTED] #   echo   $LD_LIBRARY_PATH
> > > /usr/openwin/lib:/usr/local/gcc-4.1.1-x86-bootstrap/lib/gcc/i386-pc-solaris2.10/4.1.1/amd64:/usr/local/lib:/usr/local/gcc-4.1.1-x86-bootstrap/lib/gcc/i386-pc-solaris2.10/4.1.1/amd64:/usr/local/lib:/usr/local/gcc-4.1.1-x86-bootstrap/lib/gcc/i386-pc-solaris2.10/4.1.1/amd64:/usr/local/lib
> > > [EMAIL PROTECTED] #
> > >
> > > ../src/R-patched_2006-07-24/configure
> > >   --prefix=/opt/R-2.3.1-patched_2006-07-24-Sun_Studio_11
> > >   --with-readline --disable-mbcs R_PAPERSIZE=letter --disable-rpath
> > >   --with-bzlib --with-zlib --with-spcre --with-tcltk
> > >   --disable-R-profiling --disable-nls
> > >
> > > Error in parseNamespaceFile(package, package.lib, mustExist =

Re: [Rd] R | vnc | X11 fonts

2006-08-01 Thread Evan Cooch
Thanks very much for the useful summary of some key points.



Re: [Rd] compiling R | multi-Opteron | BLAS source

2006-08-01 Thread Prof Brian Ripley
The R-devel version of R provides a pluggable BLAS, which makes such tests 
fairly easy (although building the BLAS themselves is not).  On dual 
Opterons, using multiple threads is often not worthwhile and can be 
counter-productive (Doug Bates has found some dramatic examples, and you 
can see them in my timings below).

So timings for FC3, gcc 3.4.6, dual Opteron 252, 64-bit build of R. ACML 
3.5.0 is by far the easiest to install (on R-devel all you need to do is 
to link libacml.so to lib/libRblas.so) and pretty competitive, so that is 
what I normally use.

These timings are not very repeatable: to a few % only even after 
averaging quite a few runs.

set.seed(123)
X <- matrix(rnorm(1e6), 1000)
system.time(for(i in 1:25) X%*%X)
system.time(for(i in 1:25) solve(X))
system.time(for(i in 1:10) svd(X))

internal BLAS (-O3)
> system.time(for(i in 1:25) X%*%X)
[1] 96.939  0.341 97.375  0.000  0.000
> system.time(for(i in 1:25) solve(X))
[1] 110.316   1.652 112.006   0.000   0.000
> system.time(for(i in 1:10) svd(X))
[1] 165.550   1.131 166.806   0.000   0.000

Goto 1.03, 1 thread
> system.time(for(i in 1:25) X%*%X)
[1] 12.949  0.191 13.143  0.000  0.000
> system.time(for(i in 1:25) solve(X))
[1] 23.201  1.449 24.652  0.000  0.000
> system.time(for(i in 1:10) svd(X))
[1] 43.318  1.016 44.361  0.000  0.000

Goto 1.03, dual CPU
> system.time(for(i in 1:25) X%*%X)
[1] 15.038  0.244  8.488  0.000  0.000
> system.time(for(i in 1:25) solve(X))
[1] 26.569  2.239 19.814  0.000  0.000
> system.time(for(i in 1:10) svd(X))
[1] 59.912  1.799 50.350  0.000  0.000

ACML 3.5.0 (single-threaded)
> system.time(for(i in 1:25) X%*%X)
[1] 13.794  0.368 14.164  0.000  0.000
> system.time(for(i in 1:25) solve(X))
[1] 22.990  1.695 24.710  0.000  0.000
> system.time(for(i in 1:10) svd(X))
[1] 48.267  1.373 49.662  0.000  0.000

ATLAS 3.6.0, single-threaded
> system.time(for(i in 1:25) X%*%X)
[1] 16.164  0.404 16.572  0.000  0.000
> system.time(for(i in 1:25) solve(X))
[1] 26.200  1.704 27.907  0.000  0.000
> system.time(for(i in 1:10) svd(X))
[1] 50.150  1.462 51.619  0.000  0.000

ATLAS 3.6.0, multi-threaded
> system.time(for(i in 1:25) X%*%X)
[1] 17.657  0.468  9.775  0.000  0.000
> system.time(for(i in 1:25) solve(X))
[1] 38.388  2.353 30.141  0.000  0.000
> system.time(for(i in 1:10) svd(X))
[1] 95.611  3.039 88.917  0.000  0.000
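
Since single runs of timings like these are repeatable only to a few
percent, one way to get a feel for the spread is to repeat each timing
several times. The following is only a sketch: the matrix is deliberately
much smaller than the 1000 x 1000 case so that it finishes quickly, the
repeat counts are arbitrary, and it assumes a current version of R in
which system.time() returns a named result with an "elapsed" component.

```r
set.seed(123)
X <- matrix(rnorm(200 * 200), 200)   # small on purpose, not the 1e6 case

# Time the same expression five times and keep the elapsed time of each run.
elapsed <- replicate(5, system.time(for (i in 1:5) X %*% X)[["elapsed"]])

mean(elapsed)   # average elapsed time per run
range(elapsed)  # smallest and largest run, to judge repeatability
```

With a multi-threaded BLAS it is the elapsed time, not the user time, that
reflects the wall-clock gain, which is why the sketch keeps only that
component.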


On Sun, 23 Jul 2006, Evan Cooch wrote:

> Greetings -
>
> A quick perusal of some of the posts to this mailing list suggests the
> level of the questions is probably beyond someone working at my level, but
> at the risk of looking foolish publicly (something I find I get
> increasingly comfortable with as I get older), here goes:
>
> My research group recently purchased a multi-Opteron system (bunch of
> 880 chips), running 64-bit RHEL 4 (which we have site licensed at our
> university, so it cost us nothing - good price) with SMP support built
> into the kernel (perhaps obviously, for a multi-pro system). Several of
> our users use [R], which I've only used on a few occasions. However, it
> is part of my task to get [R] installed for folks using this system.
>
> While the simple, basic compile sequence (./configure, make, make check,
> make install) went smoothly, it's pretty clear from our benchmarks that
> the [R] code isn't running as 'rocket-fast' as it should for a system
> like this. So, I dug a bit deeper. Most of the jobs we want to run could
> benefit from BLAS support (lots of array manipulations and other bits of
> linear algebra), and a few other compilation optimizations - and here is
> where I seek advice.
>
> 1) Looks like there are 3-4 flavours: LAPACK, ATLAS, ACML
> (AMD-chips...), and Goto. In reading what I can find, it seems that
> there are reasons not to use ACML (single-thread) despite the AMD chips,
> reasons to avoid ATLAS (some hassles compiling on RHEL 4 boxes), reasons
> to avoid LAPACK (ibid), but apparently no problems with Goto BLAS.
>
> Is that a reasonable summary? At the risk of starting a larger
> discussion, I'm simply looking to get BLAS support, yielding the fastest
> [R] code with the minimum of hassles (while tweaking lines of configure
> files, weird linker sequences and all that used to appeal when I was a
> student, I don't have time at this stage). So, any quick recommendation
> for *which* BLAS library? My quick assessment suggests Goto BLAS, but
> I'm hoping for some confirmation.
>
> 3) compilation of BLAS - I can compile for 32-bit, or 64-bit.
> Presumably, given we've invested in 64-bit chips, and a 64-bit OS, we'd
> like to consider a 64-bit compilation. Which, also presumably, means
> we'd need 64-bit compilation for [R]. While I've read the short blurb on
> CRAN concerning 64-bit vs 32-bit compilations (data size vs speed), I'd
> be happy to have both on our machine. But, I'm not sure how one
> specifies 64-bits in the [R] compilation - what flags do I need to set
> during ./configure, or what config file do I need to edit?
>
> Thanks very much in advance - and, again, apologies

Re: [Rd] compiling R | multi-Opteron | BLAS source

2006-08-01 Thread Evan Cooch
Thanks very much - I followed your advice, and have tried a variety of 
permutations (using ACML, and LAPACK). For the most part, I'm still 
'playing' with multiple threads, but given the performance I'm getting 
(quad Opteron 880, 16 GB RAM, 64-bit FC5), I'll stick with that for now 
(but based on your examples, worth considering a single-thread build for 
comparisons - the svd test is pretty compelling). Here are some 'average 
values' from my machine for the benchmarks you posted:

ACML3.5.0 - multi-threaded (compiled with gcc 4.0.1 and gfortran):

system.time(for(i in 1:25) X%*%X)
 11.75   0.335  3.900  0.000  0.000

system.time(for(i in 1:25) solve(X))
22.410   2.621   13.481  0.000 0.000

system.time(for(i in 1:10) svd(X))
67.384   4.28   38.585   0.000   0.000


Needless to say, on this level of system, most things run pretty fast - 
except the svd benchmark which lags, consistent with what you showed in 
your results. What is somewhat intriguing is why the svd example varies 
so much between (say) internal BLAS (165) and goto BLAS (for example; 
43), for a single-thread compilation.

But, it does look as if ACML is holding its own.

Cheers...

[Rd] Artefacts in (screen viewed) PDF output

2006-08-01 Thread Roger Bivand
This issue is probably to do with on-screen viewing of PDF files written
from R (2.3.1, Windows XP, RHEL 4), not with how the files are produced.  
So the question is mainly to ask whether others have seen similar
behaviour, and whether a remedy is known.

When neighbouring polygons are written with the same fill colour, and with
no border line colouring, PDF files show traces of probably unstroked
lines or probably interstices when viewed on-screen in at least acroread
(7.0) on both Windows XP and RHEL 4 (though not xpdf 3.0 on RHEL 4). This
is intrusive when many neighbouring polygons share fill colour, for
example on election party share maps, where borders are suppressed for
clarity. An example is:

library(maps)
us <- map("state", fill=TRUE, plot=FALSE)
pdf("borders.pdf")
plot(us, type="n", axes=FALSE, asp=1)
polygon(us, col="blue", border=NA)
dev.off()

Using polygon(us, col="blue", border="transparent") gives the same result. 
Curiously, the same is also observed with postscript() and external 
conversion to PDF (epstopdf), although viewing the EPS file on RHEL 4 in 
ggv does not show any artefacts up to 400%.
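
For anyone who wants to check their own viewer without installing the maps
package, the situation described here can be reduced to two adjacent
polygons drawn with base graphics. This is only a sketch of the test
condition: the file name and the coordinates are arbitrary, and whether a
seam appears along the shared edge depends on the viewer and zoom level.

```r
# Write the test file to a temporary directory (arbitrary file name).
out <- file.path(tempdir(), "seam-test.pdf")
pdf(out)
plot(c(0, 2), c(0, 1), type = "n", axes = FALSE,
     xlab = "", ylab = "", asp = 1)
# Two polygons sharing the edge x = 1, same fill colour, no border.
polygon(c(0, 1, 1, 0), c(0, 0, 1, 1), col = "blue", border = NA)
polygon(c(1, 2, 2, 1), c(0, 0, 1, 1), col = "blue", border = NA)
dev.off()
```

Opening the resulting file in different viewers and at different
magnifications shows whether the line at x = 1 is a rendering effect of
the viewer.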

My feeling is that the output files are correct but that acroread is 
introducing interstices in rendering to screen - I do not have a printer 
with high enough resolution to check properly, but I believe that 
acroread-printed output does not have the artefacts. They are however 
visible when acroread is used in presentation mode.

Any insight would be very useful.

Roger

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]



Re: [Rd] Any interest in merge and by implementations specifically for so

2006-08-01 Thread Kevin B. Hendricks
Hi,

My last word on this topic until I get a working external R package ...

The igroup code has now been validated both with and without NAs and  
with and without removing them.  Thanks to Bill, Tom, Thomas, and  
everyone for your helpful comments and hints.

The results for my validation run are here in case anyone is interested.
So my code now officially works.  If anyone wants patches against the  
latest development version of R to play around with (do your own  
timings, etc), please just let me know and I will send the patches  
privately.

I will start to work on an external package next week when I have  
more time.

Hope this helps,

Kevin


> x <- rnorm(2e6)
> i <- rep(1:1e6,2)
> y <- runif(2e6)
> is.na(x[y > 0.8]) <- TRUE
 
> suma = unlist(lapply(split(x,i),sum,na.rm=T))
> names(suma) <- NULL
> sumb = igroupSums(x,i,na.rm=T)
> all.equal(suma,sumb)
[1] TRUE


> suma = unlist(lapply(split(x,i),sum,na.rm=F))
> names(suma) <- NULL
> sumb = igroupSums(x,i,na.rm=F)
> all.equal(suma,sumb)
[1] TRUE


> maxa = unlist(lapply(split(x,i),max,na.rm=T))
There were 50 or more warnings (use warnings() to see the first 50)
> names(maxa) <- NULL
> maxb <- igroupMaxs(x,i,na.rm=T)
> all.equal(maxa, maxb)
[1] TRUE


> maxa = unlist(lapply(split(x,i),max,na.rm=F))
> names(maxa) <- NULL
> maxb <- igroupMaxs(x,i,na.rm=F)
> all.equal(maxa, maxb)
[1] TRUE


> mina = unlist(lapply(split(x,i),min,na.rm=T))
There were 50 or more warnings (use warnings() to see the first 50)
> names(mina) <- NULL
> minb <- igroupMins(x,i,na.rm=T)
> all.equal(mina, minb)
[1] TRUE


> mina = unlist(lapply(split(x,i),min,na.rm=F))
> names(mina) <- NULL
> minb <- igroupMins(x,i,na.rm=F)
> all.equal(mina, minb)
[1] TRUE


> meana = unlist(lapply(split(x,i),mean,na.rm=T))
> names(meana) <- NULL
> meanb <- igroupMeans(x,i,na.rm=T)
> all.equal(meana, meanb)
[1] TRUE

> meana = unlist(lapply(split(x,i),mean,na.rm=F))
> names(meana) <- NULL
> meanb <- igroupMeans(x,i,na.rm=F)
> all.equal(meana, meanb)
[1] TRUE


> proda = unlist(lapply(split(x,i),prod,na.rm=T))
> names(proda) <- NULL
> prodb <- igroupProds(x,i,na.rm=T)
> all.equal(proda, prodb)
[1] TRUE

> proda = unlist(lapply(split(x,i),prod,na.rm=F))
> names(proda) <- NULL
> prodb <- igroupProds(x,i,na.rm=F)
> all.equal(proda, prodb)
[1] TRUE


> cnta <- unlist(lapply(split(x,i),length))
> names(cnta) <- NULL
> cntb <- igroupCounts(x,i,na.rm=F)
> all.equal(cnta,cntb)
[1] TRUE
 
 
> anya <- unlist(lapply(split((x>1.0),i),any,na.rm=T))
> names(anya) <- NULL
> anyb <- igroupAnys((x>1.0),i,na.rm=T)
> all.equal(anya,anyb)
[1] TRUE


> anya <- unlist(lapply(split((x>1.0),i),any,na.rm=F))
> names(anya) <- NULL
> anyb <- igroupAnys((x>1.0),i,na.rm=F)
> all.equal(anya,anyb)
[1] TRUE


> alla <- unlist(lapply(split((x>1.0),i),all,na.rm=T))
> names(alla) <- NULL
> allb <- igroupAlls((x>1.0),i,na.rm=T)
> all.equal(alla,allb)
[1] TRUE


> alla <- unlist(lapply(split((x>1.0),i),all,na.rm=F))
> names(alla) <- NULL
> allb <- igroupAlls((x>1.0),i,na.rm=F)
> all.equal(alla,allb)
[1] TRUE



[Rd] read.table with more cols than headers

2006-08-01 Thread Gordon Smyth
I am trying to understand the behaviour of read.table() reading 
delimited files (with header=TRUE and fill=TRUE) when there are more 
(possibly spurious) columns than headings.  I give below four small 
data files, all of which have one or two extra columns added to one 
line.  Reading the first file produces an error message, the second 
produces a column of NA, the third adds an extra row, the fourth 
ignores the extra columns with no message and no NA.  Most 
unintuitive!  Here are my attempts to understand this, with questions 
interpolated.

The behaviour on the first file seems self-explanatory.  The number 
of headings determines the number of columns, and extra data columns 
are not allowed.  (On the other hand, the help ?read.table says that 
the number of columns is determined from the first five rows, which 
suggests that the header line is not the only determiner.  If 
headers, when present, are indeed the only determiner, perhaps this 
should be mentioned in the help.  Are headers actually equivalent to 
specifying the same set of names using the col.names argument?)

For the second file, the first column is being taken as row 
names.  This agrees with the help which says if the header line has 
one less entry than the number of columns, the first column is taken 
to be the row names.  OK, perhaps not the ideal solution for this 
data file, but clearly documented behaviour.

In the third file, the extra columns are being taken to be a new 
row.  This seems wrong, because the help says that cases correspond 
to lines.  There is no suggestion in the documentation that a line of 
the file could contain multiple cases.  This is the result I have 
most trouble with.  I guess I could prevent this behaviour with flush=TRUE.

File 4 is curious.  Here the number of columns has been determined, 
using the first 5 rows of the file, to be two.  The extra column on 
line 6 can't change this, so the first column doesn't become row 
names.  But in that case, shouldn't the extra column found on line 6 
produce an error message, same as for file 1?
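
One way to see what a column-count rule based on the first five lines has
to work with is to count the fields on each line with count.fields(). The
sketch below recreates the third data file in a temporary directory (the
path is an assumption of this example) and compares the field counts of
the first five lines with those of the whole file.

```r
# Recreate the third test file (extra fields on line 6) in a temp dir.
f3 <- file.path(tempdir(), "test3.txt")
writeLines(c("X,Y", "a,2", "b,4", "c,6", "d,8", "e,10,,", "f,12"), f3)

# Number of comma-separated fields on each line, header included.
n <- count.fields(f3, sep = ",")
n            # 2 2 2 2 2 4 2

max(n[1:5])  # 2: what the first five lines suggest
max(n)       # 4: what the whole file contains
```

The extra fields only show up on line 6, so anything that looks at just
the first five lines sees a two-column file.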

Specifying colClasses to be a vector of length more than 2 when 
reading file 3 will produce a result similar to file 4, but with a 
warning.  It is not clear to me why colClasses should have an 
influence, since it doesn't change the determination of the number of 
columns.  Why a warning here, but an error for file 1 and no message 
for file 4?

Any comments gratefully received.
Gordon

test1.txt:
X,Y
a,2
b,4,,
c,6

test2.txt:
X,Y
a,2
b,4,
c,6

test3.txt:
X,Y
a,2
b,4
c,6
d,8
e,10,,
f,12

test4.txt:
X,Y
a,2
b,4
c,6
d,8
e,10,
f,12

> read.csv("test1.txt")
Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
        more columns than column names
> read.csv("test2.txt")
  X  Y
a 2 NA
b 4 NA
c 6 NA
> read.csv("test3.txt")
  X  Y
1 a  2
2 b  4
3 c  6
4 d  8
5 e 10
6   NA
7 f 12
> read.csv("test4.txt")
  X  Y
1 a  2
2 b  4
3 c  6
4 d  8
5 e 10
6 f 12
> read.csv("test3.txt",colClasses=c(NA,NA))
  X  Y
1 a  2
2 b  4
3 c  6
4 d  8
5 e 10
6   NA
7 f 12
> read.csv("test3.txt",colClasses=c(NA,NA,NA,NA))
  X  Y
1 a  2
2 b  4
3 c  6
4 d  8
5 e 10
6 f 12
Warning message:
cols = 2 != length(data) = 4 in: read.table(file = file, header = header, sep = sep, quote = quote,

> sessionInfo()
R version 2.4.0 Under development (unstable) (2006-07-25 r38698)
i386-pc-mingw32

locale:
LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

attached base packages:
[1] methods   stats     graphics  grDevices utils     datasets  base
