A lot of this can be done in Python. Since you need this functionality in 
Java, you can set up a Python backend that uses xhtml2pdf, wrap it in a 
RESTful API, and then consume that API from your Java code.
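
As a rough illustration of the "Python backend behind an API" idea, here is a 
minimal WSGI sketch. The /pdf path, the function names, and the injectable 
render argument are assumptions made for this example, not part of any 
existing service. The only xhtml2pdf call used is pisa.CreatePDF.

```python
import io


def render_with_xhtml2pdf(html):
    """Render an HTML string to PDF bytes with xhtml2pdf (third-party)."""
    from xhtml2pdf import pisa  # imported lazily so the app can be tested without it

    buf = io.BytesIO()
    status = pisa.CreatePDF(html, dest=buf)
    if status.err:
        raise ValueError("xhtml2pdf reported %d error(s)" % status.err)
    return buf.getvalue()


def make_app(render=render_with_xhtml2pdf):
    """Build a WSGI app: POST HTML to /pdf (hypothetical path), get a PDF back."""

    def app(environ, start_response):
        if (environ.get("REQUEST_METHOD") != "POST"
                or environ.get("PATH_INFO") != "/pdf"):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"POST HTML to /pdf\n"]

        # Read exactly the request body and treat it as UTF-8 HTML.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        html = environ["wsgi.input"].read(length).decode("utf-8")

        try:
            pdf = render(html)
        except Exception as exc:
            start_response("500 Internal Server Error",
                           [("Content-Type", "text/plain")])
            return [str(exc).encode("utf-8")]

        start_response("200 OK", [("Content-Type", "application/pdf"),
                                  ("Content-Length", str(len(pdf)))])
        return [pdf]

    return app
```

For a quick local test, the app can be served with the standard library's 
wsgiref.simple_server.make_server("", 8080, make_app()).serve_forever(). The 
Java side then only needs to POST the HTML and save the response body.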

For HTML/image to PDF I use the Python library xhtml2pdf 
(http://www.xhtml2pdf.com/), which is built on ReportLab, pyPdf, and 
html5lib, and I run it on GAE. I have been using it to generate very nice 
article PDFs with embedded images, and once I figured out how to get the 
page size correct I have found it to be a very good library.
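
A minimal sketch of setting the page size, assuming it is done with a CSS 
@page rule (which xhtml2pdf reads). The helper names and the default size 
and margin are made up for illustration, and this is not necessarily the 
setup described above.

```python
def with_page_size(body_html, size="a4 portrait", margin="2cm"):
    """Wrap an HTML fragment in a document that carries an @page rule."""
    css = "@page { size: %s; margin: %s; }" % (size, margin)
    return ("<html><head><style>%s</style></head>"
            "<body>%s</body></html>" % (css, body_html))


def write_pdf(html, path):
    """Write the rendered PDF to path; returns True if xhtml2pdf saw no errors."""
    from xhtml2pdf import pisa  # third-party; imported only when rendering

    with open(path, "wb") as out:
        status = pisa.CreatePDF(html, dest=out)
    return not status.err
```

For example, write_pdf(with_page_size("<h1>Article</h1>", size="letter"), 
"article.pdf") would render a US Letter page.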

For PDF to image I have been using ImageMagick on an EC2 instance, and for 
extracting text I have used Apache PDFBox (Java) and pdfminer (Python) on 
GAE. I have to convert CMYK PDFs into RGB JPGs with watermarks. ImageMagick 
makes this nice and easy, but of course it cannot be run on GAE, so I 
wrapped it in an API that I consume from GAE. I mostly use PDFBox to extract 
text for my search index and have no experience trying to get a nicely 
formatted text version from a PDF, but I know pdfminer will give you a 
formatted HTML version of a PDF.
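
The two steps above could look roughly like the sketch below. The function 
names, the density, offset, and quality values, and the choice of flags are 
assumptions for illustration and will likely need tuning. The command name 
is "convert" on ImageMagick 6 and "magick" on ImageMagick 7. The HTML export 
assumes the pdfminer.six fork and its high_level module.

```python
import subprocess
from io import StringIO


def build_convert_command(pdf_path, watermark_path, jpg_path, page=0, density=150):
    """Return an ImageMagick argument list; nothing is executed here."""
    return [
        "convert",                        # "magick" on ImageMagick 7
        "-density", str(density),         # rasterization resolution in DPI
        "-colorspace", "sRGB",            # request RGB rather than CMYK output
        "%s[%d]" % (pdf_path, page),      # a single page of the PDF
        watermark_path,                   # image to overlay
        "-gravity", "SouthEast",          # anchor the overlay bottom-right
        "-geometry", "+20+20",            # offset from that corner, in pixels
        "-composite",                     # merge the overlay onto the page
        "-quality", "90",                 # JPEG quality
        jpg_path,
    ]


def pdf_page_to_jpg(pdf_path, watermark_path, jpg_path):
    """Run ImageMagick; raises CalledProcessError if the command fails."""
    subprocess.check_call(build_convert_command(pdf_path, watermark_path, jpg_path))


def pdf_to_html(pdf_path):
    """Return an HTML rendering of the PDF using pdfminer.six."""
    from pdfminer.high_level import extract_text_to_fp
    from pdfminer.layout import LAParams

    out = StringIO()
    with open(pdf_path, "rb") as fin:
        extract_text_to_fp(fin, out, laparams=LAParams(),
                           output_type="html", codec=None)
    return out.getvalue()
```

Keeping the command builder separate from the subprocess call makes it easy 
to log or adjust the exact ImageMagick invocation before running it.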

- Bryce


On Tuesday, August 21, 2012 4:00:57 AM UTC-7, aswath wrote:
>
> Hello,
> We were deeply involved in using the Conversion API for HTML to 
> PDF conversion.  Suddenly, I got an email from Google about the plan to 
> decommission it from Nov 2012.
>
> Does anyone have suggestions for doing HTML to PDF conversion that is 
> compatible with Google App Engine for Java?
>
>
> Regards
> -Aswath
> www.AccountingGuru.in
>
