Re: [R] ps or pdf

2008-04-04 Thread Tribo Laboy
> The graphics devices are very similar (they share a lot of code).  One
> small difference is that PostScript has an arc primitive, and PDF does
> not.


Sorry for interjecting, but I have a burning question. It is a bit off
topic, so I apologize in advance.

What is the stance of the R developers regarding this missing arc
primitive in PDF? Because of the missing primitive, all circles are
represented as O characters. I have run into problems when trying to
import R-produced PDF plots into Inkscape for some additional
post-processing and beautification. As a workaround I currently use the
Cairo device to export to PDF (and SVG), but this is a bit heavy. It
would be nice to be able to save Inkscape-editable PDFs directly from
the plot window.
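
For readers who hit the same Inkscape problem, here is a minimal sketch
of one possible workaround. It assumes an R version whose pdf() device
accepts the useDingbats argument (this may not hold for the R version
discussed in this thread), and the output file name is made up for
illustration.

```r
# Hypothetical output file; any writable path would do.
plain_pdf <- file.path(tempdir(), "points-no-dingbats.pdf")

# useDingbats = FALSE asks pdf() to draw small circles as paths
# rather than as glyphs taken from a symbol font.
# Assumption: the installed R supports this argument.
pdf(plain_pdf, useDingbats = FALSE)
plot(1:10, pch = 1)   # open circles, the case that caused trouble
dev.off()

# Confirm that a non-empty file was written.
file.exists(plain_pdf) && file.size(plain_pdf) > 0
```

Whether the resulting file edits cleanly in Inkscape still depends on
the Inkscape version, so this is something to try, not a guarantee.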

Some other far more important issues that could occur have been raised
in a past thread:
http://www.nabble.com/pdf%28%29-device-uses-fonts-to-represent-points---data-alteration--td13034770.html
For example the unintentional misrepresentation of data on the plot if
font substitution occurs and the points are shifted from their
original location.

Is this considered a bug or a feature?

Regards,
TL

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ps or pdf

2008-04-01 Thread Joerg van den Hoff
I have not followed the thread completely, but:

have you tried `bitmap' with `type = "pdfwrite"' (or "psrgb")
for comparison? At least with `pdf' there are some issues which
can be avoided by using Ghostscript via `bitmap'.
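
A minimal sketch of that suggestion, assuming a Ghostscript executable
that R can locate (tools::find_gs_cmd() is used here only as a guard,
and the output path is made up for illustration):

```r
# Hypothetical output path, for illustration only.
out_file <- file.path(tempdir(), "via-ghostscript.pdf")

# bitmap() needs Ghostscript; find_gs_cmd() returns "" if none is found.
have_gs <- nzchar(tools::find_gs_cmd())

if (have_gs) {
  # R writes PostScript and hands it to Ghostscript's pdfwrite device.
  bitmap(out_file, type = "pdfwrite")
  plot(1:10)
  dev.off()   # the output file is completed once the device is closed
} else {
  message("Ghostscript not found; skipping the bitmap() example")
}
```

The same pattern should work with the other `type' values listed in
?bitmap, subject to what the installed Ghostscript supports.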

joerg



[R] ps or pdf

2008-03-31 Thread Francois Pepin
Hi everyone,

I have been making a fair amount of figures in R recently that I've
been touching up with Illustrator and I've found a difference between
pdf and ps files and I was wondering if someone could enlighten me
about them.

While the figures look the same, the ps version tends to have
truncated strings. The last character of short strings tends to be on
a string of its own, located right beside the rest. This makes it a bit 
awkward to manipulate, especially if scaling is involved. Is there a
reason for this difference?

There also seems to be somewhat arbitrary grouping of the last column 
cells in heatmaps in ps files.

I used to prefer the ps because they embed more easily in latex
documents (although pdf are not difficult and conversions are trivial
anyhow), but I'm curious if there are other reasons why one format might
be preferred over the other in this context.

This is with R 2.6 on linux, and I've seen this behavior with older R
versions also.

Francois

sessionInfo()
R version 2.6.0 (2007-10-03)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] rcompgen_0.1-15



Re: [R] ps or pdf

2008-03-31 Thread Prof Brian Ripley
On Mon, 31 Mar 2008, Francois Pepin wrote:

> Hi everyone,
>
> I have been making a fair amount of figures in R recently that I've
> been touching up with Illustrator and I've found a difference between
> pdf and ps files and I was wondering if someone could enlighten me
> about them.
>
> While the figures look the same, the ps version tends to have
> truncated strings. The last character of short strings tends to be on
> a string of its own, located right beside the rest. This makes it a bit
> awkward to manipulate, especially if scaling is involved. Is there a
> reason for this difference?

Please see the footer of this message.  Neither postscript() nor pdf() 
graphics devices split up strings they are passed (by e.g. text()), so 
this is being done either by the code used to create the plot (and we have 
no idea what that is) or by the viewer.  I suspect the problem is rather 
in the viewer, but without the example we asked for it is impossible to 
know.

> There also seems to be somewhat arbitrary grouping of the last column
> cells in heatmaps in ps files.

Again, we need an example.

> I used to prefer the ps because they embed more easily in latex
> documents (although pdf are not difficult and conversions are trivial
> anyhow), but I'm curious if there are other reasons why one format might
> be preferred over the other in this context.

The graphics devices are very similar (they share a lot of code).  One 
small difference is that PostScript has an arc primitive, and PDF does 
not.
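
To make the arc remark concrete: a format without an arc operator can
still draw a circle by approximating it with four cubic Bezier
segments. The sketch below is not taken from R's device code. It only
illustrates the standard construction on one quarter of the unit
circle, and the names kappa and bezier_points are made up for the
example.

```r
# Control-point offset commonly used for a quarter circle of radius 1.
kappa <- 4 / 3 * (sqrt(2) - 1)

# Evaluate a cubic Bezier curve at parameters t (one row per t value).
bezier_points <- function(p0, p1, p2, p3, t) {
  outer((1 - t)^3, p0) +
    outer(3 * (1 - t)^2 * t, p1) +
    outer(3 * (1 - t) * t^2, p2) +
    outer(t^3, p3)
}

# Quarter circle from (1, 0) to (0, 1).
t <- seq(0, 1, length.out = 101)
quarter <- bezier_points(c(1, 0), c(1, kappa), c(kappa, 1), c(0, 1), t)

# Distance of each curve point from the true circle of radius 1.
radial_error <- sqrt(rowSums(quarter^2)) - 1
max(abs(radial_error))
```

The deviation from a true circle stays below 0.03% of the radius,
which is why the missing primitive is only a small difference in
practice.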

> This is with R 2.6 on linux, and I've seen this behavior with older R
> versions also.

Nothing has changed at that level for a long time -- not even in 
current versions of R (and 2.6.0 is obsolete).


> Francois
>
> sessionInfo()
> R version 2.6.0 (2007-10-03)
> x86_64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] rcompgen_0.1-15



-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] ps or pdf

2008-03-31 Thread Francois Pepin
Prof Brian Ripley wrote:
> Please see the footer of this message.

Sorry, here is an example. For some reason, I cannot reproduce it 
without using actual gene names.

set.seed(1)
## The row names were originally obtained using the hgug4112a library
## from Bioconductor. I set them manually for people who don't have it
## installed.
## library(hgug4112a); row <- sample(na.omit(unlist(as.list(hgug4112aSYMBOL))), 50)
row <- c("BDNF", "EMX2", "ZNF207", "HELLS", "PWP1", "PDXDC1", "BTD",
         "NETO1", "SLCO4C1", "FZD7", "NICN1", "TMSB4Y", "PSMB7", "CADM2",
         "SIRT3", "ADH6", "TM6SF1", "AARS", "TMEM88", "CP110", "ADORA2A",
         "ATAD3A", "VAPA", "NXPH3", "IL27RA", "NEBL", "FANCF", "PTPRG",
         "HSU79275", "CCDC34", "EPDR1", "FBLN1", "PCAF", "AP1B1", "TXNRD2",
         "MUC20", "MBNL1", "STAU2", "STK32C", "PPIAL4", "TGFBR2", "DPY19L2P3",
         "TMEM50B", "ENY2", "MAN2A2", "ZFYVE26", "TECTA", "CD55", "LOC400794",
         "SLC19A3")
## Write the heatmap to a PostScript file with the gene names as row labels.
postscript('/tmp/heatmap.ps', paper='letter', horizontal=FALSE)
heatmap(matrix(rnorm(2500), 50), labRow=row)
dev.off()

> Neither postscript() nor pdf()
> graphics devices split up strings they are passed (by e.g. text()), so
> this is being done either by the code used to create the plot (and we
> have no idea what that is) or by the viewer.  I suspect the problem is
> rather in the viewer, but without the example we asked for it is
> impossible to know.

Example of row names that are truncated in Illustrator (* denoting 
truncation):
CCDC3*4 (2nd row)
MUC2*0 (3rd row)
MBNL*1 (8th row)
...

It is likely that Illustrator (CS 3, OS X version) is at fault.  I do 
not see any truncation if I look at the ps file by hand (lines 4801 and 
4802):

540.22 545.88 (MUC20) 0 0 0 t
540.22 553.90 (CCDC34) 0 0 0 t
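
The same check can be scripted instead of done by hand. This is a
minimal sketch on a smaller heatmap, with a made-up file name and a
subset of the labels above. It assumes an R version whose postscript()
accepts useKerning; I believe kerned text may be written in several
pieces in such versions, so kerning is switched off here to keep each
label in one string.

```r
# A few of the gene names from the example above.
labels <- c("MUC20", "CCDC34", "MBNL1", "TXNRD2", "BDNF")

# Hypothetical output file for the check.
ps_file <- file.path(tempdir(), "heatmap-check.ps")

set.seed(1)
# Assumption: useKerning exists in the installed R version.
postscript(ps_file, paper = "letter", horizontal = FALSE, useKerning = FALSE)
heatmap(matrix(rnorm(25), 5), labRow = labels)
dev.off()

# Look for each label as one intact PostScript string, e.g. "(MUC20)".
ps_lines <- readLines(ps_file, warn = FALSE)
intact <- vapply(
  labels,
  function(lab) any(grepl(paste0("(", lab, ")"), ps_lines, fixed = TRUE)),
  logical(1)
)
print(intact)
```

If every entry is TRUE, the file itself holds whole strings, and any
splitting seen later comes from the program that opens it.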

> > There also seems to be somewhat arbitrary grouping of the last column
> > cells in heatmaps in ps files.
>
> Again, we need an example.

The top right cell (26, TXNRD2) is grouped with the cell just below it 
(26, CCDC34). It's more of a curiosity than anything else.

> > I used to prefer the ps because they embed more easily in latex
> > documents (although pdf are not difficult and conversions are trivial
> > anyhow), but I'm curious if there are other reasons why one format might
> > be preferred over the other in this context.
>
> The graphics devices are very similar (they share a lot of code).  One
> small difference is that PostScript has an arc primitive, and PDF does not.

This is what I thought at first, which is why I found these differences 
surprising. I think your idea of blaming the viewer is correct. I 
thought that Adobe of all people could deal with Postscript files 
properly, but I guess I was overly trusting.

Thanks for the help,

Francois
