Summary: It's possible that a PDF created from a document may be larger, 
perhaps much larger, than the original. I'll look at a few reasons why this 
might be.

 

PDF (Portable Document Format) files are a common and popular way to distribute 
documents. Their primary "feature" is simply that they look pretty much the 
same on just about any computer.

And of course a PDF file typically mimics the layout and feel of a printed 
document, only displayed electronically.

Why might it be larger than the original word processing file or other source 
document? I can think of a few possibilities.

Compress me Once
Adobe's PDF creation tools have compression options that control how 
aggressively - or not - images are compressed in the PDF documents they create.

That actually makes sense since an uncompressed image can be large, and a good 
compression algorithm can reduce the size required to represent the image 
significantly, even more significantly if you're willing to trade off some of 
the image quality.

A potential problem, however, is that attempting to compress something that's 
already efficiently compressed can make it larger.

It's possible that if a document contains a large number of images, perhaps 
".jpg" photos, which are by definition already compressed, the process of 
creating the PDF might actually cause those photographs to become somewhat 
larger. From what you say, that might well be the issue that you're facing.
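
To see the effect for yourself, here's a minimal Python sketch. It's only an 
illustration, not what any PDF tool actually does: it uses Python's zlib 
module, and random bytes stand in for already-compressed image data (that 
stand-in is my assumption, since well-compressed data looks much like random 
noise to a compressor).

```python
import os
import zlib

# Repetitive plain text: lots of redundancy for the compressor to remove.
text = b"It's possible that a PDF created from a document may be larger. " * 1500

# Random bytes: an assumed stand-in for data that is already compressed.
noise = os.urandom(len(text))

for label, data in (("plain text", text), ("already-compressed stand-in", noise)):
    packed = zlib.compress(data, 9)  # 9 = most aggressive compression level
    print(f"{label}: {len(data)} -> {len(packed)} bytes")
```

The text shrinks dramatically, while the stand-in comes out a few bytes larger 
than it went in, because the compressor finds nothing to squeeze and still has 
to add its own bookkeeping.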

Recommendation: check and experiment with the compression settings of your PDF 
creation utility.

Fonts: Here, but not There
Fonts and typefaces can be fairly confusing. We're all familiar with nearly 
ubiquitous fonts like Times New Roman, Arial, and even (dare I say it?) Comic 
Sans.

But what happens if you use a font in your document that most people don't 
have? When you print it out on paper it looks great, because that all happens 
on your computer where the font is present. Take the original document to a 
machine where that font isn't installed, though, and it'll look different, 
because a substitute font gets used in its place.

PDF attempts to solve this problem by including fonts within the document. My 
belief is that it embeds only non-standard fonts - those which can't be assumed 
to be on most machines - though the rules may be more complex than that.

As a test, I created a small Microsoft Word document consisting of two 
sentences, 25 words total, all in the default font Times New Roman. Changing 
one word in the document to the font "Algerian" took the generated PDF from 
around 2,000 bytes to over 10,000.

Recommendation: examine your font usage, and see if you can reduce the number 
of non-standard fonts in your document.
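
If you'd like a rough way to check whether a PDF carries embedded fonts, the 
Python sketch below may help. It's a heuristic only: it scans the raw bytes for 
the /FontFile, /FontFile2 and /FontFile3 keys that PDF font descriptors use to 
point at embedded font data. The function name is my own invention, and a PDF 
that stores its objects in compressed streams can hide these keys, so a count 
of zero doesn't prove that nothing is embedded.

```python
import re
from collections import Counter

# Keys a PDF font descriptor uses to point at an embedded font program.
FONT_FILE_KEY = re.compile(rb"/(FontFile[23]?)\b")

def embedded_font_markers(pdf_bytes):
    """Count /FontFile, /FontFile2 and /FontFile3 keys found in raw PDF bytes."""
    names = (match.decode("ascii") for match in FONT_FILE_KEY.findall(pdf_bytes))
    return dict(Counter(names))
```

You'd call it with the contents of your own file, for example 
embedded_font_markers(open("report.pdf", "rb").read()), where "report.pdf" is 
just a placeholder name.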

Size Doesn't Matter (or So They Say)
PDF is relatively efficient, but creating a small file actually isn't its 
primary goal. That, as its name implies, is to be a Portable Document - one 
that looks pretty much the same everywhere, and one that can be viewed on a 
wide variety of machines. If achieving that goal means the file gets bigger, 
then so be it.

One of the apparent design decisions in the format is that a lot of information 
in the document is stored as "plain text", which presumably is easier for that 
"wide variety of machines" to understand.

If you ever open a .pdf file in Notepad, or display it with the "type" command 
at the Windows Command Prompt, you'll see a lot of plain text - text you can 
read and make some sense of (even if what it's saying is obscure).

Now, plain text isn't the most efficient way to store information from a space 
perspective. If you want proof, go grab a large plain text document and zip it. 
I'll use the Project Gutenberg copy of Tolstoy's War and Peace as an example. 
The plain text version of this book, known for its length, weighs in at a 
little over 3 megabytes. Zipped with 7-Zip, the result is less than a third 
the size of the original. That smaller version contains exactly the same 
information, albeit in an unreadable form. All you need to do is decompress it 
to recover an exact copy of the original.

Recommendation: try zipping your PDF. Yes, you might be re-compressing 
compressed or even doubly-compressed pictures, per the earlier point, but it's 
worth experimenting with. In a text-heavy document zipping the file for 
distribution might make a fair amount of sense.
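
If you'd rather measure before you bother, here's a small Python sketch that 
zips a file's contents in memory and reports the resulting size. It uses the 
standard zipfile module rather than 7-Zip, so the numbers won't match 7-Zip's 
exactly, and the function name and default file name are placeholders of mine.

```python
import io
import zipfile

def zipped_size(data, name="document.pdf"):
    """Return the size in bytes of a zip archive holding `data` as one file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(name, data)  # store the bytes under the given name
    return len(buffer.getvalue())
```

Read your PDF with open("report.pdf", "rb").read() (again, a placeholder 
name), pass the bytes in, and compare the result with the length of the 
original. If the two numbers are close, zipping isn't buying you much.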

There's Probably More
I've probably just scratched the surface of reasons that a PDF file might end 
up being larger than its original. The big takeaway from my perspective is 
that small size isn't really a primary goal for PDF, and as a result some of 
the things it needs to do might well end up increasing the size of the result.

And zipping the file is always a quick and easy thing to try, often with good 
results.

 

Warm Regards
MohammadWaseemKhan