Re: [iText-questions] Guidance Requested - Generating multipage output with header/footer and pg 1 layout
Hello Daniel,

On Apr 28, 2010, at 1:32 AM, Daniel Cane wrote:

> Interesting - could you show me how you define your spot function? The idea of simply assembling a 'top' spot, 'left' spot and 'right' spot is appealing, but I'm a little lost on how to implement it. Are spots = paragraphs or templates or what? Also, how would this work with a different header / footer on all pages after the first page? I very much appreciate your suggestion!

The Spot is a class, or better yet an enum type, with a corresponding stored configuration in your preferred format (e.g. XML, properties, plain text). The client code refers to Spots by the appropriate enum instance. For example, spot.properties:

    TOP_LEFT=23,0,repeat
    TOP_RIGHT=250,0,repeat
    etc.

The client code would look like this:

    IBuilder builder = ...
    builder.addPdfTemplate(Spot.TOP_LEFT, load(Parts.HEADER));

Note that the client code does not know the details behind TOP_LEFT ...

The Spot type could also have a small gap constant, possibly externalized in configuration as well, to support calls like this:

    Spot.TOP_LEFT.right().right()

so you would have finer control over the coordinate positions. Again, though, I would avoid any algorithm that moves things around, because of:

- complexity
- possible loss of predictability of the visual output

Of all the ideas I considered for implementing the layout, this was the simplest one ...

HTH,
Best regards,
Giovanni

--
___
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions

Buy the iText book: http://www.itextpdf.com/book/
Check the site with examples before you ask questions: http://www.1t3xt.info/examples/
You can also search the keywords list: http://1t3xt.info/tutorials/keywords/
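The Spot idea described in this message could be sketched roughly as follows. This is only an illustration under assumptions: the names Position, SpotConfig, GAP and the gap value of 5 are hypothetical, and the sketch keeps the coordinates in a separate value object, so the chained call reads config.of(Spot.TOP_LEFT).right().right() rather than Spot.TOP_LEFT.right().right().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.EnumMap;
import java.util.Map;
import java.util.Properties;

// Labels only; the coordinates live in external configuration.
enum Spot { TOP_LEFT, TOP_RIGHT }

// Immutable coordinate plus the "repeat on every page" flag (hypothetical type).
final class Position {
    static final float GAP = 5f; // assumed gap size; could be externalized too

    final float x;
    final float y;
    final boolean repeat;

    Position(float x, float y, boolean repeat) {
        this.x = x;
        this.y = y;
        this.repeat = repeat;
    }

    // Returns a new position shifted one gap to the right.
    Position right() {
        return new Position(x + GAP, y, repeat);
    }
}

// Loads "NAME=x,y[,repeat]" entries, e.g. TOP_LEFT=23,0,repeat (hypothetical loader).
final class SpotConfig {
    private final Map<Spot, Position> positions = new EnumMap<>(Spot.class);

    static SpotConfig load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        SpotConfig config = new SpotConfig();
        for (String name : props.stringPropertyNames()) {
            String[] parts = props.getProperty(name).split(",");
            config.positions.put(Spot.valueOf(name), new Position(
                    Float.parseFloat(parts[0].trim()),
                    Float.parseFloat(parts[1].trim()),
                    parts.length > 2 && "repeat".equals(parts[2].trim())));
        }
        return config;
    }

    Position of(Spot spot) {
        return positions.get(spot);
    }
}
```

Keeping Position immutable means the configured spots never move; right() only produces a derived coordinate, which matches the stated goal of a predictable visual output.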
Re: [iText-questions] Guidance Requested - Generating multipage output with header/footer and pg 1 layout
Hello Daniel,

Please note that I am relatively new to iText. I was recently faced with questions similar to yours, and this is my attempted solution ... perhaps it will give you some ideas.

On Apr 27, 2010, at 9:58 PM, Daniel Cane wrote:

> My first approach was attempting to create the page from scratch using chunks, paragraphs, etc. This seems to give me a ton of control, but I find myself tinkering (a ton) to get the layout to work.

I define templates as simply a list of coordinate points, or hot spots, or whatever you want to call them. These points are uniquely keyed by labels, e.g. TOP_LEFT, and they also have a recurrence flag, i.e. repeat for all pages. So they look like Spot(x,y,TOP_LEFT,repeat). These coordinate tuples are defined manually, once, so that the visual result is known in advance and not computed on the fly by some algorithm where you could lose predictability of the visual output. That's all :)

I then use the Builder design pattern, which is one of the few interfaces exposed to client code, with methods similar to:

    IBuilder#addPdfPart(Spot,IPdfPart)
    IBuilder#addTable(Spot,ITable)
    IBuilder#build()

My Layout Manager does not shift content on the x-axis, only on the y-axis, and it breaks onto new pages automatically. It also checks (or will check) for overlapping.

This solution avoids having to deal with the combinatorial explosion of PDF templates to maintain ... you keep only a metadata list of points as the template.

HTH,
Best regards,
Giovanni
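The y-axis-only behaviour described in this message might look something like the sketch below. It is a toy under loud assumptions: VerticalLayout and Placement are hypothetical names, there is a single column, and y is measured downward from the top of the page (PDF coordinates normally start at the bottom-left, so a real implementation would convert).

```java
// Hypothetical layout helper: content is only ever moved along the y-axis.
final class VerticalLayout {

    // Where a part ended up: page number plus top-left coordinate.
    static final class Placement {
        final int page;
        final float x;
        final float y;

        Placement(int page, float x, float y) {
            this.page = page;
            this.x = x;
            this.y = y;
        }
    }

    private final float pageHeight;
    private int page = 1;
    private float nextFreeY = 0f; // first y not yet occupied on the current page

    VerticalLayout(float pageHeight) {
        this.pageHeight = pageHeight;
    }

    // Places a part of the given height; x is never changed.
    Placement place(float x, float requestedY, float height) {
        // Shift down if the requested y would overlap what is already placed.
        float y = Math.max(requestedY, nextFreeY);
        // Break onto a new page when the part no longer fits.
        if (y + height > pageHeight) {
            page++;
            y = 0f;
        }
        nextFreeY = y + height;
        return new Placement(page, x, y);
    }
}
```

Because x is passed through untouched, the horizontal positions stay exactly as configured in the spot list; only overflow and overlap are resolved automatically.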
Re: [iText-questions] performance follow up
Hello,

On Apr 23, 2010, at 10:50 PM, trumpetinc wrote:

> Don't know if it'll make any difference, but the way you are reading the file is horribly inefficient. If the code you wrote is part of your test times, you might want to re-try, but using this instead (I'm just tossing this together - there might be typos):
>
>     ByteArrayOutputStream baos = new ByteArrayOutputStream();
>     byte[] buf = new byte[8092];
>     int n;
>     while ((n = is.read(buf)) >= 0) {
>         baos.write(buf, 0, n);
>     }
>     return baos.toByteArray();

I tried your suggestion above and it made no significant difference compared to doing the loading from iText. The fastest I could get my use case to work with this pre-loading concept was by loading the whole file in one shot using the code below.

Applying the cumulative patch plus preloading the whole PDF using the code below, my original test-case now performs 7.74% faster than before, roughly 22% away from the competitor now ...

By the way, the average response time numbers I was getting:

- average response time of 77ms: original unchanged test-case on the office multi-processor, multi-core workstation
- average response time of 15ms: original unchanged test-case at home on my MBP

I attribute the huge difference between those two similar experiments mainly to having an SSD drive in my MBP ... the top hot spots reported by the profiler are related one way or another to IO, so it would be no wonder that with an SSD drive the response time improves by a factor of 5x. There are other differences though, e.g. OS, JVM version.

Best regards,
Giovanni

    private static byte[] file2ByteArray(String filePath) throws Exception {
        InputStream input = null;
        try {
            File file = new File(filePath);
            input = new BufferedInputStream(new FileInputStream(filePath));
            byte[] buff = new byte[(int) file.length()];
            // single read() call; assumes the whole file arrives in one read
            input.read(buff);
            return buff;
        } finally {
            if (input != null) {
                input.close();
            }
        }
    }
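One caveat on the one-shot loader in this message: InputStream.read(byte[]) is allowed to return fewer bytes than requested, so a single call is not guaranteed to fill the buffer. A defensive variant could loop until the buffer is full, as in the sketch below. The class and method names (ReadFully, readFully) are hypothetical and this is not the code that was benchmarked.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ReadFully {

    // Reads exactly 'length' bytes, looping because a single read() may
    // return fewer bytes than requested.
    static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] buf = new byte[length];
        int off = 0;
        while (off < length) {
            int n = in.read(buf, off, length - off);
            if (n < 0) {
                throw new EOFException("Stream ended after " + off + " of " + length + " bytes");
            }
            off += n;
        }
        return buf;
    }
}
```

It would be called with the file length, for example readFully(input, (int) file.length()), in place of the single input.read(buff).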
Re: [iText-questions] performance follow up
On Apr 24, 2010, at 8:59 PM, trumpetinc wrote:

> If the file is being entirely pre-loaded, then I doubt that IO blocking is a significant contributing factor to your test.

After I switched to pre-loading the entire file at once, the benchmarks do look better, meaning there is some bottleneck in the way iText handles the loading of PDF files. Besides, changing to a different storage, i.e. from non-SSD in the office to SSD in my laptop, shows a performance improvement by a factor of 5x. Of course there could be other reasons, but I would be willing to bet that this 5x is by a high margin due to the fast SSD. If there is one thing SSDs are really good at, it is random access, and iText does a lot of it.

Benchmarking the alternative on my laptop shows:

    alternative mean RT: 18ms
    itext mean RT:       14ms

So on my laptop iText is faster than the alternative ... why? I think because of random access. If iText is doing a lot of random access, that could slow it down on a non-SSD drive like the one I have in the office. I have to benchmark iText again in the office to see how it performs with the new load-the-entire-file strategy. Because of these variations I will set up the experiment on the actual hardware where it will be deployed.

> I think that the best clue here may be the difference between performance with form flattening and without form flattening. Just to confirm, am I right in saying that iText outperforms the competitor by a significant amount in the non-flattening scenario? If that's the case, then it seems like we should see significant differences in the profiling results between the flattening and non-flattening scenarios in iText. Would you be willing to post the profiling results for both cases so we can see which code paths are consuming the most runtime in each?

I posted this yesterday, see http://old.nabble.com/more-on-performance-td28346917.html

- FOOTER 4x shows the hot spot profiler results for the loading and flattening case
- HEADER 4x shows the hot spot profiler results for loading only

> BTW - which profiler are you using? Are you able to expand each of the hotspot code paths and see the actual call path that is causing the bottleneck? I use jvvm, and the results of expanding the hotspot call trees can be quite illuminating.

I am using JProfiler. I can expand the hot spots; it shows the full call trees leading to each hot spot.

> What I really would like is to get ahold of your two benchmark tests (with and without flattening) so I can run it on my system - do you have anything you can package up and share?

I will prepare it for you ...

Best regards,
Giovanni
Re: [iText-questions] performance follow up
Hello Mike,

On Apr 23, 2010, at 12:55 AM, Mike Marchywka wrote:

> Mark Twain gets to the front so quickly. Again, I'm not suggesting you did anything wrong or bad, I haven't actually checked numbers or given the specific test a lot of thought- 9 data points is usually not all that conclusive in any case and I guess that's my point.

There are 10 means and each mean comes from 1K data points, so there are 10K data points for each version tested, not just 9.

Unlike other tests of significance, the t-test doesn't need a large number of observations. In fact, the case of few observations (e.g. 10 means) is one of its main use cases.

Indeed, one would need to check the assumptions of independence and normality. Looking at the response times, though, they look reasonably normally distributed ... I owe you the scatter and QQ plots. I really would not expect a Gamma distribution here, but I might be wrong :)

Best regards,
Giovanni
Re: [iText-questions] performance follow up
Hello Paulo,

On Apr 22, 2010, at 11:43 PM, Paulo Soares wrote:

> FYI I already use a table to map the char to the result for the delimiter testing and the speed improvement was zero in relation to plain comparisons.
>
> Paulo

You are right ... changing to a table makes no difference. I checked this with the profiler and the results stay the same.

Best regards,
Giovanni
Re: [iText-questions] performance follow up
On Apr 22, 2010, at 11:18 PM, trumpetinc wrote:

> I like your approach! A simple if (ch > 32) return false; at the very top would give the most bang for the least effort (if you do go the bitmask route, be sure to include unit tests!).

Doing this change saves approximately two seconds out of the full workload, so isWhitespace now shows 8s instead of 10s and stays at 1%.

The numbers below include two extra changes: the one from trumpetinc above, and migrating all StringBuffer references to StringBuilder. The top hot spots are now:

    Method                          Share   Time   Invocations
    PRTokeniser.nextToken            8%     77s     19'268'000
    RandomAccessFileOrArray.read     6%     53s    149'047'680
    MappedRandomAccessFile.read      3%     26s     61'065'680
    PdfReader.removeUnusedCode       1%     15s           6000
    PdfEncodings.convertToBytes      1%     15s      5'296'207
    PRTokeniser.nextValidToken       1%     12s      9'862'000
    PdfReader.readPRObject           1%     10s      5'974'000
    ByteBuffer.append(char)          1%     10s     19'379'382
    PRTokeniser.backOnePosition      1%     10s     17'574'000
    PRTokeniser.isWhitespace         1%      8s     35'622'000

A bit further down there is ByteBuffer.append_i, which often needs to reallocate and do an array copy, hence the expensive ByteBuffer.append(char) above ... I am playing right now with bigger initial sizes, e.g. 512 instead of 127 ...

Best regards,
Giovanni
Re: [iText-questions] performance follow up
Hello trumpetinc,

On Apr 23, 2010, at 7:29 PM, trumpetinc wrote:

> Giovanni - if your source PDFs are small enough, you might want to try this, just to get a feel for the impact that IO blocking is having on your results (read entire PDF into byte[] and use PdfReader(byte[]))

Trying it right now ...

> The StringBuffer could definitely be replaced with a StringBuilder, and it could be re-used instead of re-allocating for each call to nextToken()

This is what I applied yesterday with the patch I posted. It includes both changes in PRTokeniser: StringBuilder plus reusing the same instances ... the improvement is somewhere around 6.2% for my test case.

I want to try the one you suggest above ... and then I will post the new numbers plus the cumulative patch I have ...

Best regards,
Giovanni
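The "StringBuilder plus reuse" change mentioned in this message can be illustrated with a toy tokenizer. This is not the PRTokeniser patch itself, which is not shown in the thread; ReusingTokenizer is a hypothetical class that only demonstrates resetting one builder with setLength(0) instead of allocating a new buffer per call.

```java
// Toy tokenizer: splits on single spaces and reuses one StringBuilder.
final class ReusingTokenizer {
    private final String input;
    // Allocated once and reused by every nextToken() call.
    private final StringBuilder buffer = new StringBuilder(64);
    private int pos = 0;

    ReusingTokenizer(String input) {
        this.input = input;
    }

    // Returns the next space-separated token, or null when the input is exhausted.
    String nextToken() {
        buffer.setLength(0); // reset the content, keep the allocated capacity
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++; // skip separators
        }
        while (pos < input.length() && input.charAt(pos) != ' ') {
            buffer.append(input.charAt(pos++));
        }
        return buffer.length() == 0 ? null : buffer.toString();
    }
}
```

StringBuilder is unsynchronized, unlike StringBuffer, so an instance reused this way must not be shared between threads.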
Re: [iText-questions] MISTAKE! performance follow up
I am sooo sorry, the performance is worse with the change for pre-loading the PDFs in the test-case :(( The problem was that I ran the benchmarks with a small mistake in my test case ...

The HEADER tests how to load flattened PDF part templates ...
The FOOTER tests how to load PDF part templates containing fields that need to be populated.

The mistake was to leave the HEADER path hard-coded ... so it would load only the flattened PDF template and never the footer (see below) [sigh]

In any case, it is good to know that loading flattened PDF parts is cheaper.

I mistakenly ran the last benchmark like this:

    private static byte[] file2ByteArray(String filePath) throws Exception {
        InputStream input = null;
        ByteArrayOutputStream output = null;
        try {
            // the mistake: always reads HEADER_PATH and ignores filePath
            input = new BufferedInputStream(new FileInputStream(HEADER_PATH));
            output = new ByteArrayOutputStream();
            int data = input.read();
            while (data != -1) {
                output.write(data);
                data = input.read();
            }
            return output.toByteArray();
        } finally {
            if (input != null) {
                input.close();
            }
            if (output != null) {
                output.close();
            }
        }
    }
Re: [iText-questions] AWESOME! performance follow up
I am sooo sorry, the performance is worse with the change for pre-loading the PDFs in the test-case :(( The problem was that I ran the benchmarks with a small mistake in my test case ...

Loading the HEADER demonstrates how to load flattened pre-formatted PDF part templates ...
Loading the FOOTER demonstrates how to load PDF part templates containing fields that need to be populated.

The mistake was to leave the HEADER path hard-coded ... so it would load only the flattened PDF template and never the footer (see below) [sigh]

In any case, it is good to know that loading flattened PDF parts is cheaper.

I mistakenly ran the last benchmark like this:

    private static byte[] file2ByteArray(String filePath) throws Exception {
        InputStream input = null;
        ByteArrayOutputStream output = null;
        try {
            // the mistake: always reads HEADER_PATH and ignores filePath
            input = new BufferedInputStream(new FileInputStream(HEADER_PATH));
            output = new ByteArrayOutputStream();
            int data = input.read();
            while (data != -1) {
                output.write(data);
                data = input.read();
            }
            return output.toByteArray();
        } finally {
            if (input != null) {
                input.close();
            }
            if (output != null) {
                output.close();
            }
        }
    }
Re: [iText-questions] performance follow up
Hello trumpetinc,

On Apr 23, 2010, at 10:50 PM, trumpetinc wrote:

> Don't know if it'll make any difference, but the way you are reading the file is horribly inefficient. If the code you wrote is part of your test times, you might want to re-try, but using this instead (I'm just tossing this together - there might be typos):

No, pre-loading the PDF template with the IO code I submitted was not part of the performance tests before. I added it quick and dirty just to try it out and saw the massive performance improvement. I should have been skeptical, but the Latin spirit took over :)

>     ByteArrayOutputStream baos = new ByteArrayOutputStream();
>     byte[] buf = new byte[8092];
>     int n;
>     while ((n = is.read(buf)) >= 0) {
>         baos.write(buf, 0, n);
>     }
>     return baos.toByteArray();

Thank you, I will try this one later ...

> From your results, are you seeing a big difference between iText and the competitor when you aren't flattening fields vs you are flattening fields? Your profiling results aren't indicating bottlenecks in that area of the code. If iText is much faster than the competitor in the non-flattening scenario, but slower than the competitor in the flattening scenario, I'm having a hard time reconciling the data presented so far.

HEADER is a PDF file with no fields.
FOOTER is a PDF file with fields (it needs to be populated and flattened, i.e. stamper.setFormFlattening(true)).

I prepared exactly equivalent code for iText and for the alternative. However, I did not measure the times for the two templates HEADER and FOOTER separately. So I cannot tell whether iText is faster than the alternative at loading the flat PDF, or at loading the PDF with fields.

Just now I discovered that iText performs much faster if the loaded PDF form does not have fields. So I modified my test-case to disable the HEADER and to run and profile only the FOOTER (the expensive one, with fields) four times, so that the top bottlenecks for this case show up more clearly.

Best regards,
Giovanni
Re: [iText-questions] performance follow up
Hello,

On Apr 22, 2010, at 6:49 PM, 1T3XT info wrote:

> Paulo Soares wrote:
>> Thank you. I'll include your changes.
>
> Due to the lack of time, I wasn't able to follow the discussion, but I've just read the latest mails, and I also want to thank you for the performance tests and improvements.

No problem! I am happy to help ... it is a win-win :)

The established commercial PDF solution I was comparing iText against is fast, yes, and with a lot of imagination and some compromises you can somehow achieve the use case you need. But its design shortcomings and bugs are countless, plus there are huge risks because of its closed code. This is of course my personal opinion, and not the bank's policy :)

I might propose some more patches. It would be great to get the performance numbers at least as good as the commercial solution ... 23.8% response time difference still to go. Would this list be the right place for proposing new patches, or is there a dedicated list for development?

Best regards,
Giovanni
Re: [iText-questions] performance follow up
Hello Mike,

On Apr 22, 2010, at 12:22 PM, Mike Marchywka wrote:

> This of course is where you consult Mark Twain. LOL. iText is or isn't better than before (for some particular use case) irrespective of the data you currently have but the question is does the data allow you to reject the conclusion that they have the same execution times with some confidence level?

Good, this is exactly what I meant :)

> Finding ways to explain or attribute the noise into some kind of model of course would be a reasonable thing to consider if you had a few more test cases with some relevant parameters (number of fonts you will need or something).

The performance comparison is based on the representative test case, exactly as the business wants it. As far as I know we need only two fonts: light and bold. So the number of fonts is not a parameter.

> the parameters you know about it- obviously for the cases you have only one decision makes sense and off hand based on what you said about nature of patch I don't know of any case where generating gratuitous garbage is a good strategy LOL.

I know ... but hypothetically my patch could well have fixed the gratuitous garbage generation while the use case remained slow. My point is to make sure, and prove, that the patch delivers the promised performance gain.

>> The paired observation of the means are:
>
> At this stage it is usually helpful to look at the data, not just start dumping it into equations you found in a book.

This book is the official reference for the course in Advanced System Performance Analysis I am taking for my graduate CS Master program at a top-10 technology university worldwide ... so no, it is not just equations I found in a book :) If you need to compare the performance of two systems and you have paired observations, this is the recipe you want to use.

> I'm not slamming you at all, just that its helpful to have a check on your analysis even if you

No worries, I am actually very happy with your feedback. I would actually like to thank you for your insights.

> Also it sounds like the alt package is still faster by a clinically significant amount- an amount relevant to someone.

Now only 23.8% to go. We only need to make 4 more fixes like the last one and the gap will be gone :)

The profiler shows there are still several bottlenecks at the top which could also be easy fixes. For example, PRTokeniser.isWhitespace is a simple boolean condition that just happens to be called a gazillion times, e.g. 35'622'000 times for my test workload ... if instead of doing it like:

    public static final boolean isWhitespace(int ch) {
        return (ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32);
    }

we used a bitwise binary operator with the appropriate mask(s), there could be some good performance gain ...

Best regards,
Giovanni
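The bitmask idea floated in this message could be sketched as follows. PdfWhitespace and MASK are hypothetical names, and no performance claim is attached: elsewhere in the thread a lookup table was reported to make no measurable difference. The point of the sketch is only that a mask-based version can be checked against the original comparison chain.

```java
final class PdfWhitespace {

    // Bit n is set when character code n is in the original list:
    // 0, 9, 10, 12, 13, 32. A long is needed because bit 32 is used.
    private static final long MASK =
            (1L << 0) | (1L << 9) | (1L << 10) | (1L << 12) | (1L << 13) | (1L << 32);

    // Mask-based variant. The range check comes first because Java takes
    // shift counts on a long modulo 64.
    static boolean isWhitespace(int ch) {
        return ch >= 0 && ch <= 32 && ((MASK >>> ch) & 1L) != 0;
    }

    // The comparison chain quoted in the thread, kept as a reference.
    static boolean isWhitespaceReference(int ch) {
        return (ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32);
    }
}
```

A unit test that compares both methods over a range of inputs, including negative values such as the -1 end-of-stream marker, is a cheap way to guard a change like this.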
Re: [iText-questions] performance follow up
Hello,

On Apr 22, 2010, at 10:59 PM, Giovanni Azua wrote:

> PRTokeniser.isWhitespace is a simple boolean condition that just happens to be called a gazillion times, e.g. 35'622'000 times for my test workload ... if instead of doing it like:
>
>     public static final boolean isWhitespace(int ch) {
>         return (ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32);
>     }
>
> we used a bitwise binary operator with the appropriate mask(s), there could be some good performance gain ...

The function already exists in http://java.sun.com/javase/6/docs/api/java/lang/Character.html#isWhitespace%28char%29

I checked and it already uses bitwise binary operators with the right masks ... we would only need to inline it to avoid the function call costs.

Best regards,
Giovanni
Re: [iText-questions] performance follow up
Hello,

On Apr 22, 2010, at 11:30 PM, trumpetinc wrote:

> The semantics are different (the JSE call includes more characters in its definition of whitespace than the PDF spec). Not saying that it can't be easily done, but throwing an if statement at it and seeing what impact it has on performance is pretty easy also.

I will try your suggestion too ...

> What was the overall time %age spent in this call in your tests?

Total 10 seconds, which accounts for 1% ... 1% is not much, but as the Swiss like to say: whoever does not care about the cents does not deserve the francs, or something like that :)

Best regards,
Giovanni
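The semantic difference raised in this message is easy to check directly. The sketch below lists the character codes on which java.lang.Character.isWhitespace(int) and the isWhitespace condition quoted in the thread disagree. WhitespaceSemantics and its method names are hypothetical. Two disagreements are certain from the Java documentation: code 0 (NUL) is in the thread's list but is not Java whitespace, and code 11 (vertical tab) is Java whitespace but is not in the list.

```java
import java.util.ArrayList;
import java.util.List;

final class WhitespaceSemantics {

    // The condition quoted in the thread for PRTokeniser.isWhitespace.
    static boolean isPdfWhitespace(int ch) {
        return (ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32);
    }

    // Collects every code in [0, maxCodeInclusive] where the two definitions differ.
    static List<Integer> disagreements(int maxCodeInclusive) {
        List<Integer> result = new ArrayList<>();
        for (int ch = 0; ch <= maxCodeInclusive; ch++) {
            if (isPdfWhitespace(ch) != Character.isWhitespace(ch)) {
                result.add(ch);
            }
        }
        return result;
    }
}
```

Printing disagreements(255) shows the full set for single-byte values, which is the range a byte-oriented tokenizer would see. It also shows why swapping in the JDK method is not a drop-in replacement.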
[iText-questions] eclipse project
Hello,

I just checked out trunk to try out some changes related to the performance question thread. Is there an easy way to get an Eclipse project ... I already tried:

    cd src/ant
    mvn eclipse:eclipse -DdownloadSources=true

with no luck ... it does not generate the expected Eclipse project for me. I mean, it does, but not pointing to the sources nor fetching the dependencies ...

TIA,
Best regards,
Giovanni
Re: [iText-questions] eclipse project
Sorry! Never mind! I managed with the plain old wizard :) I am too spoiled by Maven ...

On Apr 21, 2010, at 11:18 PM, Giovanni Azua wrote:

> Hello,
>
> I just checked out trunk to try out some changes related to the performance question thread. Is there an easy way to get an Eclipse project ... I already tried:
>
>     cd src/ant
>     mvn eclipse:eclipse -DdownloadSources=true
>
> with no luck ... it does not generate the expected Eclipse project for me. I mean, it does, but not pointing to the sources nor fetching the dependencies ...
>
> TIA,
> Best regards,
> Giovanni
[iText-questions] performance follow up
Hello,

Good news ... after applying the attached patch to trunk and doing yet another performance experiment using the previously posted workload, these are the results:

    BEFORE (trunk)           AFTER (trunk + patch)
    Mean      Variance       Mean      Variance
    15.125     5.514         14.404     5.928
    15.440     3.474         14.487    16.781
    15.258     9.736         14.132     1.618
    15.621    23.869         14.314     3.174
    15.449     4.817         14.663     7.522
    15.500     2.662         14.542    15.086
    15.221     8.431         14.283     6.924
    15.319     3.419         14.399     2.064
    15.142     1.626         14.205     1.609
    15.457     3.972         14.471     2.761

The mean values look surprisingly much better than in the office. I'm running here Snow Leopard with JVM 1.6.0_19.

Is iText with the patch better than before?

The paired observations of the means are:

    {(15.125, 14.404), (15.440, 14.487), (15.258, 14.132), (15.621, 14.314), (15.449, 14.663),
     (15.500, 14.542), (15.221, 14.283), (15.319, 14.399), (15.142, 14.205), (15.457, 14.471)}

The performance differences constitute a sample of 10 observations:

    {0.721, 0.953, 1.126, 1.307, 0.786, 0.958, 0.938, 0.92, 0.937, 0.986}

For this sample:

    Sample mean               = 0.9632
    Sample variance           = 0.02651
    Sample standard deviation = 0.16282

    Confidence interval for the mean = 0.9632 +/- t*sqrt(0.02651/10) = 0.9632 +/- t*0.0514

The 0.95 quantile of a t-variate with df=N-1=10-1=9 is 1.833113

    => 95% confidence interval = 0.9632 +/- 1.833113*0.0514
                               = [0.9632-0.09422, 0.9632+0.09422]
                               = [0.86898, 1.05742]

Since the confidence interval does NOT include zero, we can conclude that the performance improvement is significant (patch is better than no patch) and will be approximately (0.9632/15.3532)*100% = 6.2%.

I also ran the workload connected to the profiler and the number of StringBuffer instances decreased to 846'988.

The Letter PDF looks good, i.e. the patch didn't seem to break anything, but you will have to run the unit tests on it.

Best regards,
Giovanni

PS: There are still some StringBuffer instances around to fix ...

Attachment: PRTokeniser.patch (binary data)
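The paired-observation arithmetic in this message can be reproduced with a few lines of code. The sketch below uses hypothetical names (PairedComparison and its methods) and takes the t quantile as a parameter, because the standard library has no t distribution. One caveat worth noting: 1.833113 is the 0.95 quantile for 9 degrees of freedom, which strictly corresponds to a two-sided 90% interval. The two-sided 95% interval uses the 0.975 quantile (about 2.262) and is wider, but for these data it still excludes zero.

```java
final class PairedComparison {

    // Element-wise before[i] - after[i]; positive values mean "after" is faster.
    static double[] differences(double[] before, double[] after) {
        if (before.length != after.length) {
            throw new IllegalArgumentException("Paired samples must have equal length");
        }
        double[] d = new double[before.length];
        for (int i = 0; i < d.length; i++) {
            d[i] = before[i] - after[i];
        }
        return d;
    }

    static double mean(double[] xs) {
        double sum = 0.0;
        for (double x : xs) {
            sum += x;
        }
        return sum / xs.length;
    }

    // Unbiased sample variance (divides by n - 1).
    static double sampleVariance(double[] xs) {
        double m = mean(xs);
        double sumSq = 0.0;
        for (double x : xs) {
            sumSq += (x - m) * (x - m);
        }
        return sumSq / (xs.length - 1);
    }

    // Returns {low, high} = mean +/- tQuantile * sqrt(variance / n).
    static double[] confidenceInterval(double[] diffs, double tQuantile) {
        double m = mean(diffs);
        double half = tQuantile * Math.sqrt(sampleVariance(diffs) / diffs.length);
        return new double[] { m - half, m + half };
    }
}
```

Feeding in the ten BEFORE and AFTER means from the message and the quantile 1.833113 gives an interval close to the one quoted; small differences in the last digits come from the rounding of 0.0514 in the hand calculation.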