I'm asking because I've been on servers where applications did everything 
in memory, causing both the dreaded JVM out-of-memory error and the server 
to bog down.  I'm not as concerned about marginal performance benefits as I 
am about being a good server citizen and keeping my resource consumption 
down. I'm not a Java guru and am unfamiliar with what goes on under the 
covers.

I quite agree that one needs to look at the whole system, not just a piece 
of it.  Yes, I am also optimizing our Oracle database for speed, using 
stored procedures, indexing, etc.

So....

Is my assumption wrong about the PDFReader taking the whole thing into 
memory?  If it is taking the whole thing in, then I may as well create the 
PDF in memory in the first place and hook both passes together (assuming 
this is possible).
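
A minimal sketch of what "hooking both passes together" in memory could look like, using only java.io and java.nio. The names TwoPassInMemory, firstPass, secondPass and run are hypothetical, and the two pass methods are stand-ins for the PDFWriter and PDFStamper steps rather than real iText calls. It only shows the plumbing: pass one fills a ByteArrayOutputStream, pass two reads those bytes back and writes the final file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class TwoPassInMemory {

    // Hypothetical stand-in for pass 1 (PDF creation): writes a small payload.
    static void firstPass(OutputStream out) throws IOException {
        out.write("body".getBytes(StandardCharsets.US_ASCII));
    }

    // Hypothetical stand-in for pass 2 (stamping): wraps the input in a header and a footer.
    static void secondPass(InputStream in, OutputStream out) throws IOException {
        out.write("header|".getBytes(StandardCharsets.US_ASCII));
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        out.write("|footer".getBytes(StandardCharsets.US_ASCII));
    }

    // Runs both passes; only the final result touches the disk.
    static void run(Path target) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        firstPass(buffer);
        // toByteArray() copies the buffer, so two copies briefly live on the heap.
        byte[] intermediate = buffer.toByteArray();
        try (InputStream in = new ByteArrayInputStream(intermediate);
             OutputStream out = Files.newOutputStream(target)) {
            secondPass(in, out);
        }
    }
}
```

The catch with this arrangement is that the whole intermediate document sits on the heap, plus a second copy while toByteArray() runs, so heap use grows with document size.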

If PDFReader doesn't do that, then I'm leaning more towards the File side of 
things so that if I get a large output from the DB I won't bog down the 
server.  I expect that most PDFs will be a few pages but my users have been 
known to make strange requests.
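
One possible middle ground between the two approaches, sketched here under assumptions: an output stream that stays in memory for small documents and spills to a temporary file once a size threshold is crossed. SpillingOutputStream and its methods are hypothetical names, not part of iText or the JDK, and the threshold would need tuning for the server in question.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class SpillingOutputStream extends OutputStream {
    private final int threshold;                      // max bytes kept on the heap
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream current = memory;            // where writes currently go
    private Path file;                                // null until we spill to disk
    private long written;

    SpillingOutputStream(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        // Switch to a temp file before the in-memory buffer would exceed the threshold.
        if (file == null && written + len > threshold) {
            spill();
        }
        current.write(buf, off, len);
        written += len;
    }

    private void spill() throws IOException {
        file = Files.createTempFile("spill", ".tmp");
        OutputStream out = new BufferedOutputStream(Files.newOutputStream(file));
        memory.writeTo(out);   // move what was buffered so far onto disk
        memory = null;         // let the heap copy be garbage collected
        current = out;
    }

    @Override
    public void flush() throws IOException {
        current.flush();
    }

    @Override
    public void close() throws IOException {
        current.close();
    }

    boolean isInMemory() {
        return file == null;
    }

    // Temp file holding the data, or null if everything stayed in memory.
    Path getFile() {
        return file;
    }

    byte[] toByteArray() {
        if (memory == null) {
            throw new IllegalStateException("data was spilled to " + file);
        }
        return memory.toByteArray();
    }
}
```

The first pass would write into such a stream; afterwards the caller checks isInMemory() and hands either the byte array or the temp file to the second pass, deleting the temp file when done.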


----- Original Message ----- 
From: "Mike Marchywka" <marchy...@hotmail.com>
To: <itext-questions@lists.sourceforge.net>
Sent: Friday, March 19, 2010 4:47 PM
Subject: Re: [iText-questions] Perfomance Question - ByteArray vs Files


>
>
> ________________________________
>> From:
>> To: itext-questions@lists.sourceforge.net
>> Date: Fri, 19 Mar 2010 16:16:14 -0500
>> Subject: [iText-questions] Perfomance Question - ByteArray vs Files
>>
>>
>> I'm creating a PDF in two passes with my goal to end up with it as a
>> file on the server. The first pass creates the PDF and the second adds
>> things like headers, footers, etc. using PDFStamper. The PDF is being
>> generated from a database so there is a possibility that it could get
>> to be large (a few hundred pages?).
>
> This really has nothing to do with iText, but some people have discussed
> performance issues, and indeed the inner iText implementations may want
> to vary depending on what the user can say a priori about some sizes etc.
> (for large tasks, spending some time up front picking a strategy or
> specific implementation can pay off). And, of course, I'm a perennial
> complainer about the resources related to the PDF file versus
> alternatives.
>
> First, it may really help if you profile whatever you have. If there is
> anything slower than something called PDF, a highly loaded DB could be
> it. Do you keep requesting the same (static) data from it? Etc.
>
> Of course, trying to do everything "in memory" sounds faster until you
> find out that your "memory" is virtual and you keep thrashing. If
> you want to rely on the OS, great, but if you think you can do better
> you may benefit from reading/writing to disk the stuff you want
> instead of making a huge heap and letting the VM system deal with it.
> Once you are all in physical memory, then you want to try to keep
> locality and stay in a lower-level memory cache (hard with Java).
> On some large data sets in other settings, I have used a sort (yes,
> another slow thing) to stop memory thrashing, and the speed improvement
> was an order of magnitude (from essentially unusable to quite tolerable).
>
>
> So, I guess the most authoritative answer is, it depends.
>
>
>>
>>
>>
>> Right now I have the PDFWriter directing the output
>> to a FileOutputStream. Once that is done, the PDFReader picks it up and
>> PDFStamper uses it to process the PDF and write it to the server
>> using another FileOutputStream.
>>
>>
>>
>> It occurred to me that I might be doing this
>> wrong. If PDFReader brings the whole PDF into memory, wouldn't it be
>> better to have PDFWriter put the PDF out as a ByteArrayOutputStream
>> which (I think) PDFReader can pick up? Or does PDFReader only bring it
>> in as it needs it? Or is there some other issue I'm missing....
>>
>>
>>
>> I'm not real clear on what the tradeoffs are
>> between running everything out to files and accepting the I/O or keeping
>> everything in memory.
>>
>>
>>
>> Can anyone give me some guidance?
>>
>>
>>
>> Thanks!
>>
>>
>>
>> Warren
>
> ------------------------------------------------------------------------------
> Download Intel® Parallel Studio Eval
> Try the new software tools for yourself. Speed compiling, find bugs
> proactively, and fine-tune applications for parallel performance.
> See why Intel Parallel Studio got high marks during beta.
> http://p.sf.net/sfu/intel-sw-dev
> _______________________________________________
> iText-questions mailing list
> iText-questions@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> Buy the iText book: http://www.1t3xt.com/docs/book.php
> Check the site with examples before you ask questions: 
> http://www.1t3xt.info/examples/
> You can also search the keywords list: 
> http://1t3xt.info/tutorials/keywords/
> 

