Re: [iText-questions] performance follow up

2010-04-24 Thread Giovanni Azua
Hello,

On Apr 23, 2010, at 10:50 PM, trumpetinc wrote:
 Don't know if it'll make any difference, but the way you are reading the file
 is horribly inefficient.  If the code you wrote is part of your test times,
 you might want to re-try, but using this instead (I'm just tossing this
 together - there might be type-os):
 
 ByteArrayOutputStream baos = new ByteArrayOutputStream();
 byte[] buf = new byte[8092];
 int n;
 while ((n = is.read(buf)) >= 0) {
   baos.write(buf, 0, n);
 }
 return baos.toByteArray();
 
I tried your suggestion above, and it made no significant difference compared to 
letting iText do the loading. The fastest I could get my use case to work using 
this pre-loading concept was by loading the whole file in one shot using the 
code below.

Applying the cumulative patch plus preloading the whole PDF using the code 
below, my original test-case now performs 7.74% faster than before, roughly 22% 
away from competitor now ...  

btw the average response time numbers I was getting:

- average response time of 77ms original unchanged test-case from the office 
multi-processor-multi-core workstation 
- average response time of 15ms original unchanged test-case from home using my 
MBP

I attribute the huge difference between those two similar experiments mainly to 
having an SSD drive in my MBP ... the top hot spots reported by the profiler 
are related one way or another to IO, so it would be no wonder that with an SSD 
drive the response time improves by a factor of 5x. There are other differences 
though, e.g. OS and JVM version.  

Best regards,
Giovanni

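// requires: import java.io.*;
private static byte[] file2ByteArray(String filePath) throws Exception {
  InputStream input = null;
  try {
    File file = new File(filePath);
    input = new BufferedInputStream(new FileInputStream(filePath));

    byte[] buff = new byte[(int) file.length()];
    // A single read() may return fewer bytes than requested,
    // so keep reading until the array is full.
    int off = 0;
    while (off < buff.length) {
      int n = input.read(buff, off, buff.length - off);
      if (n < 0) {
        throw new EOFException("unexpected end of file: " + filePath);
      }
      off += n;
    }

    return buff;
  }
  finally {
    if (input != null) {
      input.close();
    }
  }
}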


--
___
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions

Buy the iText book: http://www.itextpdf.com/book/
Check the site with examples before you ask questions: 
http://www.1t3xt.info/examples/
You can also search the keywords list: http://1t3xt.info/tutorials/keywords/

[iText-questions] iText and Substance together

2010-04-24 Thread enwired

I am using both iText and the Substance look-and-feel together in a
commercial application.  When using iText to output a PDF of a screenshot of
the entire application, it *almost* works.  All GUI elements print fairly
well with the exception of cells in a JTable.

For cells in a JTable, if the table cell font is not bold, the text will be
placed into the PDF document but will be invisible in the PDF (either
because the color is the same as the background, or it is transparent, or
something of that sort.)  If the table cell font is bold, the text will be
placed into the PDF as a bitmap image, but as a weird outline.

A very short program which can reproduce this bug, as well as more
information, is posted on the substance discussion board here:
https://substance.dev.java.net/servlets/ProjectForumMessageView?forumID=1484&messageID=35746

While I doubt anyone on either board will have an answer, I live in hope!

Thanks.

This problem is identical with old (2.1.7) and new (5.0.2) 
versions of iText and with old (5.3) or new (6.0) versions 
of Substance.

-- 
View this message in context: 
http://old.nabble.com/iText-and-Substance-together-tp28346191p28346191.html
Sent from the iText - General mailing list archive at Nabble.com.




Re: [iText-questions] performance follow up

2010-04-24 Thread Mike Marchywka











 From: brave...@gmail.com
 Date: Sat, 24 Apr 2010 13:05:26 +0200
 To: itext-questions@lists.sourceforge.net
 Subject: Re: [iText-questions] performance follow up



 Hello,

 On Apr 23, 2010, at 10:50 PM, trumpetinc wrote:
 Don't know if it'll make any difference, but the way you are reading the file
 is horribly inefficient. If the code you wrote is part of your test times,
 you might want to re-try, but using this instead (I'm just tossing this
 together - there might be type-os):

 ByteArrayOutputStream baos = new ByteArrayOutputStream();
 byte[] buf = new byte[8092];
 int n;
 while ((n = is.read(buf)) >= 0) {
 baos.write(buf, 0, n);
 }
 return baos.toByteArray();

 I tried your suggestion above, and it made no significant difference compared 
 to letting iText do the loading. The fastest I could get my use case to work 
 using this pre-loading concept was by loading the whole file in one shot 
 using the code below.

If, as indicated below, you are generally IO limited, don't throw
the code out yet. If you must copy data, you want to use array-based
methods as often as possible, but the first preference
is to avoid copies, unless of course you are strategically
preloading or something. 
 
I often just turn everything into a byte array but
obviously this doesn't scale too well unless you are content to let
VM do your swapping for you. Ideally you would just load what you
need in a just-in-time fashion to avoid tying up idle RAM. 
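
One way to sketch the "load what you need just in time" idea is a read-only memory mapping, where the OS pages data in only when it is touched. This is only an illustration under assumptions: the class and method names (MappedRead, map, slice) are made up here, and nothing in this thread says iText reads files this way.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper, not part of iText.
final class MappedRead {

    /** Maps the whole file read-only; pages are loaded by the OS on first access. */
    static MappedByteBuffer map(String path) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(path, "r");
        try {
            FileChannel ch = raf.getChannel();
            // The mapping remains valid after the file is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        } finally {
            raf.close();
        }
    }

    /** Copies len bytes starting at offset, touching only the pages that cover them. */
    static byte[] slice(MappedByteBuffer buf, int offset, int len) {
        byte[] out = new byte[len];
        ByteBuffer view = buf.duplicate(); // own position, shared content
        view.position(offset);
        view.get(out, 0, len);
        return out;
    }
}
```

Whether this beats a plain byte array depends on file size and access pattern, so it would need to be measured on the target machine.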
 

 Applying the cumulative patch plus preloading the whole PDF using the code 
 below, my original test-case now performs 7.74% faster than before, roughly 
 22% away from competitor now ...

 btw the average response time numbers I was getting:

 - average response time of 77ms original unchanged test-case from the office 
 multi-processor-multi-core workstation
 - average response time of 15ms original unchanged test-case from home using 
 my MBP

 I attribute the huge difference between those two similar experiments mainly 
 to having an SSD drive in my MBP ... the top hot spots reported by the 
 profiler are related one way or another to IO, so it would be no wonder that with 
 an SSD drive the response time improves by a factor of 5x. There are other 
 differences though, e.g. OS and JVM version.

 
Multi-proc and disk cache can cause some confusion. I wouldn't ignore
task manager for some initial investigations: if the CPU drops and the disk
light comes on, you are likely to be disk limited. With IO it is easy
to get nickel-and-dimed to death, as everyone who relays the data
can be low on the profile chart but it all adds up. Wall-clock times are least
susceptible to manipulation and may be best for A-B comparisons 
if you have control over other stuff running on the machine (cash flow versus 
pro-forma earnings LOL). If you can subclass
the random access file thing, you may be able to first collect statistics
and then write something that can see into the future a few milliseconds.
All the generic caches work on past results, things like MRU, except maybe the 
prefetch, which assumes you will continue to do sequential memory accesses. If you
are in a position to make forward-looking statements that have a material 
impact on your performance (ROFL), you may be able to 
do much better.
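
The "collect statistics first" step could be sketched roughly as below. This assumes a subclass of the plain java.io.RandomAccessFile; iText's own random access class is not shown in this thread, so the class name and counters here are hypothetical.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical statistics collector; counts seeks and reads made through it.
final class CountingRandomAccessFile extends RandomAccessFile {
    private long seeks;
    private long backwardSeeks;
    private long readCalls;
    private long bytesRead;

    CountingRandomAccessFile(String path) throws IOException {
        super(path, "r");
    }

    @Override
    public void seek(long pos) throws IOException {
        seeks++;
        if (pos < getFilePointer()) {
            backwardSeeks++; // a jump back is a hint that access is not sequential
        }
        super.seek(pos);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        readCalls++;
        if (b >= 0) {
            bytesRead++;
        }
        return b;
    }

    @Override
    public int read(byte[] b) throws IOException {
        // Route through the three-argument overload so the call is counted once.
        return read(b, 0, b.length);
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        readCalls++;
        if (n > 0) {
            bytesRead += n;
        }
        return n;
    }

    long getSeeks() { return seeks; }
    long getBackwardSeeks() { return backwardSeeks; }
    long getReadCalls() { return readCalls; }
    long getBytesRead() { return bytesRead; }
}
```

Many one-byte reads or many backward seeks in these counters would point at the access pattern rather than the raw disk speed.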
 
 



[iText-questions] Extracting Comments

2010-04-24 Thread Ali Vajahat
Hi itext:

I want to parse the comments from a PDF file using iText. Any ideas?

Regards,
Vajahat

Re: [iText-questions] iText 5.0.1 embedded fonts and smartcopy

2010-04-24 Thread Jason Berk

Does anybody know why this happens?  It seems wrong to see the same font embedded 
multiple times yet end up with a smaller document.  Short of Document Props > 
Fonts scrolling forever, is there any harm in letting it embed the subset?  My 
main goal was to reduce file size after concat'ing several (~30,000) PDFs.

Jason


-Original Message-
From: Jason Berk [mailto:jb...@purdueefcu.com]
Sent: Fri 4/23/2010 5:14 PM
To: Post all your questions about iText here
Subject: [iText-questions] iText 5.0.1 embedded fonts and smartcopy
 
I have three fonts which each contain 1 glyph.

 

I created 100 identical pdfs that uses this font and then used smartcopy
to merge all 100 pages.

 

The resulting PDF is 184KB and when I look at the document properties,
it shows the font 100 times (presumably because it was an embedded
subset).

 

I added myFont.setSubset(false); and reran the test.

 

Now when I view the properties of the merged pdf, I only see my font
once (as expected), yet the size of my merged PDF grew to 327KB! (not
expected)

 

As I understood it, SmartCopy didn't reuse fonts that were subsets.

 

public class Fonts {

  public static final Font VISA;
  public static final Font SCORECARD;
  public static final Font MICR;

  static {
    BaseFont _visa = null;
    BaseFont _scorecard = null;
    BaseFont _micr = null;
    try {
      _visa = BaseFont.createFont("/fonts/CREDITCARD.ttf", BaseFont.WINANSI, BaseFont.EMBEDDED);
      _visa.setSubset(false); // INCREASES FILE SIZE?!?!
      _scorecard = BaseFont.createFont("/fonts/SCORECARD.ttf", BaseFont.WINANSI, BaseFont.EMBEDDED);
      _scorecard.setSubset(false); // INCREASES FILE SIZE?!?!
      _micr = BaseFont.createFont("/fonts/OCRAEXT.ttf", BaseFont.WINANSI, BaseFont.EMBEDDED);
      _micr.setSubset(false); // INCREASES FILE SIZE?!?!
    } catch (Exception e) {
      e.printStackTrace();
      System.exit(1);
    }
    VISA = new Font(_visa, 12);
    SCORECARD = new Font(_scorecard, 12);
    MICR = new Font(_micr, 12);
  }
}

 

private void generateStatements() {
  try {
    log.info("begin generating statements");
    Document d = new Document();
    PdfSmartCopy copy = new PdfSmartCopy(d, new FileOutputStream("C:/temp/aMerged.pdf"));
    d.open();
    for (int i = 1; i <= 100; i++) {

      Document document = new Document();
      PdfWriter.getInstance(document, new FileOutputStream("C:/temp/test" + i + ".pdf"));
      document.open();

      document.add(new Paragraph("LARGE FONTS", Fonts.NORMAL));
      document.add(new Paragraph("testing our font class", Fonts.LARGE_NORMAL));
      document.add(new Paragraph("testing our font class", Fonts.LARGE_BOLD));
      document.add(new Paragraph("testing our font class", Fonts.LARGE_UNDERLINE));
      document.add(new Paragraph("testing our font class", Fonts.LARGE_ITALIC));

      document.add(new Paragraph("\n\nNORMAL FONTS", Fonts.NORMAL));
      document.add(new Paragraph("testing our font class", Fonts.NORMAL));
      document.add(new Paragraph("testing our font class", Fonts.BOLD));
      document.add(new Paragraph("testing our font class", Fonts.UNDERLINE));
      document.add(new Paragraph("testing our font class", Fonts.ITALIC));

      document.add(new Paragraph("\n\nSMALL FONTS", Fonts.NORMAL));
      document.add(new Paragraph("testing our font class", Fonts.SMALL_NORMAL));
      document.add(new Paragraph("testing our font class", Fonts.SMALL_BOLD));
      document.add(new Paragraph("testing our font class", Fonts.SMALL_UNDERLINE));
      document.add(new Paragraph("testing our font class", Fonts.SMALL_ITALIC));

      document.add(new Paragraph("\n\nCOLORED FONTS", Fonts.NORMAL));
      document.add(new Paragraph("testing our font class", Fonts.PEFCU_RED_NORMAL));

      document.add(new Paragraph("\n\nWHITE FONTS", Fonts.NORMAL));
      Chunk chunk = new Chunk("testing our font class", Fonts.WHITE_NORMAL);
      chunk.setBackground(Colors.BLACK);
      document.add(new Paragraph(chunk));

      Chunk chunk2 = new Chunk("testing our font class", Fonts.WHITE_BOLD);
      chunk2.setBackground(Colors.BLACK);
      document.add(new Paragraph(chunk2));

      document.add(new 

Re: [iText-questions] performance follow up

2010-04-24 Thread trumpetinc

If the file is being entirely pre-loaded, then I doubt that IO blocking is a
significant contributing factor to your test.

I think that the best clue here may be the difference between performance
with form flattening and without form flattening.  Just to confirm, am I
right in saying that iText outperforms the competitor by a significant
amount in the non-flattening scenario?  If that's the case, then it seems
like we should see significant differences in the profiling results between
the flattening and non-flattening scenarios in iText.

Would you be willing to post the profiling results for both cases so we can
see which code paths are consuming the most runtime in each?

Another possibility if the profiling results show similar hotspots is that
the form flattening algorithms in iText are using the hotspot areas a lot
more than in the non-flattening case.  There may be a bunch of redundant
reads or something in the flattening case.

Let's take a look at the profiling results and see if we can draw any
conclusions about where to go next.

BTW - which profiler are you using?  Are you able to expand each of the
hotspot code paths and see the actual call path that is causing the
bottleneck?  I use jvvm, and the results of expanding the hotspot call trees
can be quite illuminating.

What I really would like is to get ahold of your two benchmark tests (with
and without flattening) so I can run it on my system - do you have anything
you can package up and share?

- K



-- 
View this message in context: 
http://old.nabble.com/performance-follow-up-tp28322800p28352147.html
Sent from the iText - General mailing list archive at Nabble.com.




Re: [iText-questions] performance follow up

2010-04-24 Thread Mike Marchywka


Isn't there something in PDF about linearization? (The term
comes up as a suggestion on Google, LOL.) How
can you compare the two resulting PDFs in terms of 
dynamic attributes or arbitrary ordering of some items? Given the issues with IO 
and access patterns, this could matter. In fact, you could
even imagine that if you could reorder some things
you would get a win-win for creation and future rendering time.
 
What is the extent of the freedom here? It sounds like
any hints you would generate for a reader could be used
during document manipulation in iText. 







 Date: Sat, 24 Apr 2010 11:59:14 -0700
 From: forum_...@trumpetinc.com
 To: itext-questions@lists.sourceforge.net
 Subject: Re: [iText-questions] performance follow up


 If the file is being entirely pre-loaded, then I doubt that IO blocking is a
 significant contributing factor to your test.

 I think that the best clue here may be the difference between performance
 with form flattening and without form flattening. Just to confirm, am I
 right in saying that iText outperforms the competitor by a significant
 amount in the non-flattening scenario? If that's the case, then it seems
 like we should see significant differences in the profiling results between
 the flattening and non-flattening scenarios in iText.

 Would you be willing to post the profiling results for both cases so we can
 see which code paths are consuming the most runtime in each?

 Another possibility if the profiling results show similar hotspots is that
 the form flattening algorithms in iText are using the hotspot areas a lot
 more than in the non-flattening case. There may be a bunch of redundant
 reads or something in the flattening case.

 Let's take a look at the profiling results and see if we can draw any
 conclusions about where to go next.

 BTW - which profiler are you using? Are you able to expand each of the
 hotspot code paths and see the actual call path that is causing the
 bottleneck? I use jvvm, and the results of expanding the hotspot call trees
 can be quite illuminating.

 What I really would like is to get ahold of your two benchmark tests (with
 and without flattening) so I can run it on my system - do you have anything
 you can package up and share?

 - K


 

Re: [iText-questions] performance follow up

2010-04-24 Thread Giovanni Azua

On Apr 24, 2010, at 8:59 PM, trumpetinc wrote:

 If the file is being entirely pre-loaded, then I doubt that IO blocking is a
 significant contributing factor to your test.
 
After I switched to pre-loading, taking the entire file at once, the 
benchmarks do look better, which suggests there is some bottleneck in the way 
iText handles the loading of PDF files. Besides, changing to a different storage, 
i.e. from non-SSD in the office to SSD in my laptop, shows a performance 
improvement by a factor of 5x. Of course there could be other reasons, but I 
would be willing to bet that this 5x speedup is by a high margin due to the fast 
SSD. If there is something SSDs are really good at, it is random access, and 
iText is doing a lot of that. 

Benchmarking the alternative in my laptop shows:
alternative mean RT: 18ms
itext mean RT: 14ms 

So on my laptop iText is faster than the alternative ... why? I think because 
of random access. If iText is doing a lot of random access, that could slow it 
down on a non-SSD drive like the one I have in the office. I have to benchmark 
iText again in the office to see how it performs with the new 
load-the-entire-file strategy.

Because of these variations I will setup the experiment in the actual hardware 
where it will be deployed.
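
A rough way to check the random-access suspicion on a given machine is to time the same block reads in sequential and in shuffled order. The sketch below is only an assumption about how such a test could look; the class name, block size and seed are made up, and it does not reproduce iText's actual access pattern.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Random;

// Hypothetical micro-benchmark helper: wall-clock time for a list of block reads.
final class AccessPatternTimer {

    /** Seeks to each offset, reads one block, and returns the elapsed nanoseconds. */
    static long timeReads(String path, long[] offsets, int blockSize) throws IOException {
        byte[] block = new byte[blockSize];
        RandomAccessFile raf = new RandomAccessFile(path, "r");
        try {
            long start = System.nanoTime();
            for (long off : offsets) {
                raf.seek(off);
                raf.read(block, 0, blockSize); // short reads are acceptable for timing
            }
            return System.nanoTime() - start;
        } finally {
            raf.close();
        }
    }

    /** Offsets of every full block, in file order. */
    static long[] sequentialOffsets(long fileLength, int blockSize) {
        int count = (int) (fileLength / blockSize);
        long[] offsets = new long[count];
        for (int i = 0; i < count; i++) {
            offsets[i] = (long) i * blockSize;
        }
        return offsets;
    }

    /** The same offsets in a shuffled order; a fixed seed keeps runs comparable. */
    static long[] shuffledOffsets(long fileLength, int blockSize, long seed) {
        long[] offsets = sequentialOffsets(fileLength, blockSize);
        Random rnd = new Random(seed);
        for (int i = offsets.length - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            long t = offsets[i];
            offsets[i] = offsets[j];
            offsets[j] = t;
        }
        return offsets;
    }
}
```

The OS disk cache will hide most of the difference once the file has been read, so the numbers only mean something on a cold cache or with a file larger than RAM.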

 I think that the best clue here may be the difference between performance
 with form flattening and without form flattening.  Just to confirm, am I
 right in saying that iText outperforms the competitor by a significant
 amount in the non-flattening scenario?  If that's the case, then it seems
 like we should see significant differences in the profiling results between
 the flattening and non-flattening scenarios in iText.
 
 Would you be willing to post the profiling results for both cases so we can
 see which code paths are consuming the most runtime in each?
 
I posted this yesterday, see 
http://old.nabble.com/more-on-performance-td28346917.html

- FOOTER 4x shows the Hot spot profiler results in the loading and flattening 
case

- HEADER 4x shows the Hot spot profiler results for the loading only

 BTW - which profiler are you using?  Are you able to expand each of the
 hotspot code paths and see the actual call path that is causing the
 bottleneck?  I use jvvm, and the results of expanding the hotspot call trees
 can be quite illuminating.
 
I am using JProfiler. I can expand the Hotspots, it shows the full call trees 
leading to the Hot spot.

 What I really would like is to get ahold of your two benchmark tests (with
 and without flattening) so I can run it on my system - do you have anything
 you can package up and share?
 
I will prepare it for you ...  

Best regards,
Giovanni