Hi,
Yes, I know this is not a paid support forum, but this seems to be very
weird behaviour that perhaps needs to be looked into. Why would the exact
same code behave differently when writing to a blank page (with its
x, y, x1, y1 coordinates) versus writing to a placeholder on the blank page
(with its x, y, x1, y1 coordinates)?
From: itext-questions-requ...@lists.sourceforge.net
To: itext-questions@lists.sourceforge.net,
Date: 2013-09-06 16:05
Subject: iText-questions Digest, Vol 88, Issue 6
Today's Topics:
1. XMLWorker - difference between creating blank pdf and filling
out an existing (with a field). (Daniel Lehtihet)
----------------------------------------------------------------------
Message: 1
Date: Fri, 6 Sep 2013 15:02:47 +0200
From: Daniel Lehtihet <daniel.lehti...@folksam.se>
Subject: [iText-questions] XMLWorker - difference between creating
blank pdf and filling out an existing (with a field).
To: itext-questions@lists.sourceforge.net
Message-ID:
<of828a9458.5dd14d7e-onc1257bde.00467582-c1257bde.0047a...@intern.folksam.se>
Content-Type: text/plain; charset="iso-8859-1"
Hi,
I have a question about how different the output can be when
transforming XHTML (using XMLWorker) via either:
a) a new blank PDF which one creates and fills out
and
b) an existing PDF, using a field as a placeholder (well, actually
outputting the result within the field limits).
When using the "blank" route, the HTML displays just fine. When using the
"existing PDF" route, some XHTML looks very strange (overlapping text).
Let me show you an example of what I mean. I have a class that has two
methods. One produces a PDF named "outputGen1" (here you will see the
headline text overlap). The other produces a PDF named "outputGen2" (and
here it looks just fine).
Code for "Gen":
package se.folksam.test;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import org.w3c.tidy.Tidy;
import com.itextpdf.text.Chunk;
import com.itextpdf.text.Document;
import com.itextpdf.text.Element;
import com.itextpdf.text.FontFactory;
import com.itextpdf.text.pdf.AcroFields;
import com.itextpdf.text.pdf.AcroFields.FieldPosition;
import com.itextpdf.text.pdf.ColumnText;
import com.itextpdf.text.pdf.PdfContentByte;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.Pipeline;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;
public class Gen {
    public static void main(String[] args) {
        Gen g = new Gen();
        g.gen1(); // existing PDF, output placed inside the field rectangle
        g.gen2(); // new blank PDF
    }
    public void gen1() {
        try {
            FontFactory.registerDirectories();
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            PdfReader reader = new PdfReader("c:/temp/Huge_Text_Field.pdf");
            PdfStamper stp = new PdfStamper(reader, baos);
            AcroFields af = stp.getAcroFields();
            String html = readFile("c:/temp/markup.html");
            // Use JTidy to force html to xhtml
            Tidy tidy = new Tidy();
            tidy.setMakeClean(true);
            tidy.setXHTML(true);
            tidy.setBreakBeforeBR(false);
            tidy.setShowWarnings(false);
            ByteArrayOutputStream os = new ByteArrayOutputStream();
            InputStream is = new ByteArrayInputStream(html.getBytes("ISO-8859-1"));
            tidy.parse(is, os);
            String fieldValue = os.toString();
            StringReader sr = new StringReader(fieldValue);
            // Collect the elements produced by XML Worker instead of writing them directly
            ArrayList<Element> array = new ArrayList<Element>();
            MyElementHandler ehandler = new MyElementHandler(array);
            XMLWorkerHelper wx = XMLWorkerHelper.getInstance();
            wx.parseXHtml(ehandler, sr);
            array = ehandler.getArrayList();
            // the body field
            List<FieldPosition> posArr = af.getFieldPositions("Text");
            FieldPosition bodyPosition = posArr.get(0);
            PdfContentByte cb = stp.getOverContent(bodyPosition.page);
            ColumnText ct = new ColumnText(cb);
            // Column rectangle = field rectangle: (left, top) to (right, bottom)
            ct.setSimpleColumn(bodyPosition.position.getLeft(),
                    bodyPosition.position.getTop(),
                    bodyPosition.position.getRight(),
                    bodyPosition.position.getBottom());
            float curLead = ct.getLeading();
            // Can possibly be changed to a smaller value for less effect on line
            // spacing (reduce it more and the lines run together more...)
            ct.setLeading(curLead - 0.5f);
            Element el = null;
            int currPageNbr[] = new int[1];
            currPageNbr[0] = bodyPosition.page;
            String text = "";
            int myArraySize = array.size();
            // loop through the entire body text
            for (int idx = 0; idx < myArraySize; idx++) {
                el = array.get(idx);
                List<Chunk> chunks = el.getChunks();
                if (chunks.size() > 0) {
                    Chunk chunk = chunks.get(0); // get the others if needed
                    text = chunk.getContent().trim();
                } else {
                    text = ""; // This contains nothing; ignore it
                }
                ct.addElement(el);
                int res = ct.go();
            }
            ct.go();
            stp.close();
            OutputStream outputStream = new FileOutputStream("c:/temp/outputGen1.pdf");
            baos.writeTo(outputStream);
            outputStream.close(); // release the file handle
            baos.flush();
            baos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    public void gen2() {
        try {
            FontFactory.registerDirectories();
            Document document = new Document();
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            String html = readFile("c:/temp/markup.html");
            PdfWriter writer = PdfWriter.getInstance(document, baos);
            document.open();
            HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
            htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
            CSSResolver cssResolver =
                    XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
            // CSS resolver -> HTML tag processing -> write straight to the document
            Pipeline<?> pipeline =
                    new CssResolverPipeline(cssResolver,
                            new HtmlPipeline(htmlContext,
                                    new PdfWriterPipeline(document, writer)));
            XMLWorker worker = new XMLWorker(pipeline, true);
            XMLParser p = new XMLParser(worker);
            // Use JTidy to force html to xhtml
            Tidy tidy = new Tidy();
            tidy.setMakeClean(true);
            tidy.setXHTML(true);
            tidy.setBreakBeforeBR(false);
            tidy.setShowWarnings(false);
            ByteArrayOutputStream os = new ByteArrayOutputStream();
            InputStream is = new ByteArrayInputStream(html.getBytes("ISO-8859-1"));
            tidy.parse(is, os);
            String fieldValue = os.toString();
            p.parse(new StringReader(fieldValue));
            document.close();
            OutputStream outputStream = new FileOutputStream("c:/temp/outputGen2.pdf");
            baos.writeTo(outputStream);
            outputStream.close(); // release the file handle
            baos.flush();
            baos.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    public String readFile(String path) throws Exception {
        BufferedReader br = new BufferedReader(new FileReader(path));
        String everything = "";
        try {
            StringBuilder sb = new StringBuilder();
            String line = br.readLine();
            while (line != null) {
                sb.append(line);
                sb.append('\n');
                line = br.readLine();
            }
            everything = sb.toString();
        } finally {
            br.close();
        }
        return everything;
    }
}
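A side note on the encoding handling in Gen, offered as a minimal sketch and not as an explanation of the overlap: readFile goes through FileReader and the JTidy output is decoded with os.toString(), both of which use the platform default charset, while the bytes handed to JTidy are ISO-8859-1. Assuming markup.html is stored as ISO-8859-1, one charset can be used throughout. The class name CharsetRoundTrip and the helpers below are hypothetical, use only the JDK, and do not touch iText or JTidy.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class CharsetRoundTrip {
    // Assumption: the markup file is stored as ISO-8859-1,
    // matching the getBytes("ISO-8859-1") call in Gen.
    static final Charset MARKUP_CHARSET = StandardCharsets.ISO_8859_1;

    /** Reads a whole file into a String using an explicit charset. */
    static String readFile(Path path, Charset charset) throws IOException {
        return new String(Files.readAllBytes(path), charset);
    }

    /** Decodes the bytes collected in a stream using an explicit charset. */
    static String decode(ByteArrayOutputStream os, Charset charset) {
        return new String(os.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("markup", ".html");
        try {
            String original = "V\u00e4nliga h\u00e4lsningar";
            Files.write(tmp, original.getBytes(MARKUP_CHARSET));

            // Read, re-encode and decode with the same charset at every step.
            String html = readFile(tmp, MARKUP_CHARSET);
            byte[] bytes = html.getBytes(MARKUP_CHARSET);
            ByteArrayOutputStream os = new ByteArrayOutputStream();
            os.write(bytes, 0, bytes.length);

            System.out.println(original.equals(decode(os, MARKUP_CHARSET)));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

In Gen this would correspond to decoding with an explicit charset in place of os.toString(), and to reading the file with a charset-aware reader in place of FileReader.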
Here is the accompanying class MyElementHandler:
package se.folksam.test;
import java.util.ArrayList;
import java.util.List;
import com.itextpdf.text.Element;
import com.itextpdf.tool.xml.ElementHandler;
import com.itextpdf.tool.xml.Writable;
import com.itextpdf.tool.xml.pipeline.WritableElement;
public class MyElementHandler implements ElementHandler {
    // Elements collected from the XML Worker pipeline, in document order
    private final ArrayList<Element> array;

    public MyElementHandler(ArrayList<Element> arr) {
        this.array = arr;
    }

    public ArrayList<Element> getArrayList() {
        return this.array;
    }

    public void add(final Writable w) {
        if (w instanceof WritableElement) {
            List<Element> elements = ((WritableElement) w).elements();
            // collect in array
            array.addAll(elements);
        }
    }
}
And the actual HTML file that I use:
<br><strong>sdfsdf<br>s<br>df<br></strong> <ul> <li>sdf <li>s <li>dfs
<li></li> </ul> <ol> <li></li> </ol> <span style="FONT-SIZE: 24px"><span
style="COLOR: #737373">df<br>sd<br>f<br></span></span>s<br><br>Vänliga
hälsningar<br>Department XXX<br>Joe
Doe<br>some.em...@company.xxx<br>Phone: 555 - 123456
And the PDF file (Huge_Text_Field.pdf) is attached.
(Yes, it's pure nonsense, but it shows the problem quite well.)
My question is really: why does the output differ when one uses a blank
document versus a (large) field as the boundary, while doing the exact same
thing?
Kind regards
Daniel
Daniel Lehtihet
IT Architect
Architecture
Folksam
106 60 Stockholm
Visiting address: Bohusgatan 14
Phone: 08-7726041
Mobile: 0708-31 51 71
daniel.lehti...@folksam.se
http://www.folksam.se
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Huge_Text_Field.pdf
Type: application/octet-stream
Size: 6176 bytes
Desc: not available
------------------------------
_______________________________________________
iText-questions mailing list
iText-questions@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/itext-questions
iText(R) is a registered trademark of 1T3XT BVBA
End of iText-questions Digest, Vol 88, Issue 6
**********************************************
Many questions posted to this list can (and will) be answered with a reference
to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples:
http://itextpdf.com/themes/keywords.php