Prof Brian Ripley wrote:

On Wed, 24 Nov 2004 [EMAIL PROTECTED] wrote:

On 24-Nov-04 Witold Eryk Wolski wrote:

Hi,
I want to draw a scatter plot with 1M or more points
and save it as PDF.
This makes the PDF file large.
So I tried saving the plot as PNG first and then converting
it to PDF. This looks OK when printed, but when viewed on screen,
e.g. in Acrobat as a document figure, the quality is bad.

Does anyone know a way to reduce the size but keep the quality?


If you want the PDF file to preserve the info about all the
1M points then the problem has no solution. The png file
will already have suppressed most of this (which is one
reason for poor quality).

I think you should give thought to reducing what you need
to plot.

Think about it: suppose you plot with a resolution of
200 points per inch, i.e. a spacing of 1/200 inch (about
the limit at which the eye begins to see rough edges).
Then you have 40000 points
per square inch. If your 1M points are separate but as
closely packed as possible, this requires 25 square inches,
or a 5x5 inch (= 12.7x12.7 cm) square. And this would be
solid black!
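
The arithmetic can be checked directly in R. This is only a back-of-envelope sketch; it assumes 200 distinguishable points per linear inch (which is what the figure of 40000 per square inch implies), and the variable names are illustrative.

```r
# Back-of-envelope check of the figures quoted above.
dpi         <- 200                     # assumed: distinguishable points per linear inch
n_points    <- 1e6                     # points in the scatter plot
per_sq_inch <- dpi^2                   # 40000 points per square inch
area_sq_in  <- n_points / per_sq_inch  # 25 square inches
side_in     <- sqrt(area_sq_in)        # 5 inches per side
side_cm     <- side_in * 2.54          # 12.7 cm per side
```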

Presumably in your plot there is a very large number of
points which are effectively indistinguishable from other
points, so these could be eliminated without spoiling
the plot.

I don't have an obviously best strategy for reducing what
you actually plot, but perhaps one line to think along
might be the following:

1. Multiply the data by some factor and then round the
  results to integers (to avoid problems in step 2).
  Choose the factor so that the result of (4) below is
  satisfactory.

2. Eliminate duplicates in the result of (1).

3. Divide by the factor you used in (1).

4. Plot the result; save plot to PDF.
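
The four steps above might be sketched in R roughly as follows. This is an illustration under stated assumptions, not a tested recipe: the data are simulated, and the factor f and the output file name are hypothetical choices that would need tuning for real data.

```r
# Assumed example data (not from the original post).
set.seed(1)
n <- 1e5
x <- rnorm(n)
y <- x + rnorm(n)

f <- 50   # hypothetical scaling factor; larger keeps more points

# Steps 1 and 2: scale, round to integers, drop duplicate (x, y) pairs.
grid <- unique(data.frame(x = round(x * f), y = round(y * f)))

# Step 3: undo the scaling.
thin <- grid / f

# Step 4: plot the thinned points and save to PDF.
out <- tempfile(fileext = ".pdf")
pdf(out)
plot(thin$x, thin$y, pch = ".", xlab = "x", ylab = "y")
dev.off()
```

Comparing nrow(thin) with n shows how many points were dropped; the size of the PDF should shrink roughly in proportion.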

As to how to do it in R: the critical step is (2),
which with so many points could be very heavy unless
done by a well-chosen procedure. I'm not expert enough
to advise about that, but no doubt others are.


unique will eat that for breakfast

x <- runif(1e6)
system.time(xx <- unique(round(x, 4)))

[1] 0.55 0.09 0.64 0.00 0.00

length(xx)

[1] 10001




?table -> reduces the data
and
?image -> shows it.
This does exactly what I need. (Not my idea but Thomas Unternäher's.) Thanks, Thomas.
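
A minimal sketch of the table/image idea, for readers who want to try it. The post gives no code, so everything here is an assumption for illustration: the simulated data, the bin count nbins, the use of cut() to form the bins, and the log1p() scaling of the counts.

```r
# Assumed example data (not from the original post).
set.seed(1)
n <- 1e5
x <- rnorm(n)
y <- x + rnorm(n)

nbins <- 200   # hypothetical number of bins per axis

# Bin boundaries spanning the data range.
bx <- seq(min(x), max(x), length.out = nbins + 1)
by <- seq(min(y), max(y), length.out = nbins + 1)

# table() reduces the points to an nbins x nbins matrix of counts.
counts <- table(cut(x, breaks = bx, include.lowest = TRUE),
                cut(y, breaks = by, include.lowest = TRUE))

# image() shows the counts; log1p() keeps sparse cells visible.
out <- tempfile(fileext = ".pdf")
pdf(out)
image(bx, by, log1p(unclass(counts)), xlab = "x", ylab = "y")
dev.off()
```

The PDF then contains one cell per bin rather than one mark per point, so its size depends on nbins and not on the number of observations.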



/E

--
Dipl. bio-chem. Witold Eryk Wolski
MPI-Moleculare Genetic
Ihnestrasse 63-73 14195 Berlin
tel: 0049-30-83875219                 __("<    _
http://www.molgen.mpg.de/~wolski      \__/    'v'
http://r4proteomics.sourceforge.net    ||    /   \
mail: [EMAIL PROTECTED]    ^^     m m
     [EMAIL PROTECTED]

______________________________________________
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
