Re: PDF creation?
On Mon, Apr 22, 2013 at 3:28 AM, Peter Corlett wrote:
> My *favourite* approach, which is almost certainly not the consensus answer,
> is to generate a LaTeX document (e.g. using Template.pm) and then run that
> through xelatex to generate a PDF. This does however require you to learn
> how to drive LaTeX and how to trawl CTAN etc for useful packages.
>
> (FWIW, pretty much all of the useful LaTeX packages are already in Debian.)

That's what I'm doing too (for web requests, even), using LaTeX::Driver, which did need to be slightly tweaked to support longtable (now in the CPAN distro).

At the time, there didn't seem to be any other good solution for multipage tables; it's nice to hear the HTML to PDF process has improved.
Re: PDF creation?
On Mon, 22 Apr 2013, Roger Bell_West wrote:
> On Mon, Apr 22, 2013 at 11:45:43AM +0100, Mike Whitaker wrote:
>> On a similar subject, what PDF (or even text, assuming I can find something
>> to extract the text on a page by page basis) indexing solutions are there
>> out there in Perl?
>
> pdftotext and then throw the text at a generic indexing package. I
> keep meaning to do something with Plucene.

Lucy is possibly a better choice if you don't want to just use Elasticsearch, since Lucy is actively developed, unlike Plucene.

https://metacpan.org/release/Lucy

--
bob walker
everything should be purple and bendy
http://randomness.org.uk
Re: PDF creation?
On 22 April 2013 11:51, Jérôme Étévé wrote:
> If you want full support for unicode strings and a good control of layout,
> I found that the simplest solution is to use xelatex.
>
> I didn't find PDF::API2 fits in a multilingual environment.

What problem did you have on this score then?

I've been using PDF::Report on top of PDF::API2 and have not had any problems with German, Spanish, etc.

We have an internal library that handles layout for (admittedly fairly boring) business documents and reports pretty well, using a simplistic, naive set of styling and positioning attributes.

After investing some time working on it and on the styling attributes for the parts of the page, we're getting better results than the proprietary off-the-shelf Java product it replaces.

A.

--
Aaron J Trevena, BSc Hons
http://www.aarontrevena.co.uk
LAMP System Integration, Development and Consulting
Re: PDF creation?
With an optional job queue, and an expensive OCR package to deal with scanned documents.

On 22/04/2013, at 8:57 PM, Roger Bell_West wrote:
> On Mon, Apr 22, 2013 at 11:45:43AM +0100, Mike Whitaker wrote:
>> On a similar subject, what PDF (or even text, assuming I can find something
>> to extract the text on a page by page basis) indexing solutions are there
>> out there in Perl?
>
> pdftotext and then throw the text at a generic indexing package. I
> keep meaning to do something with Plucene.
Re: PDF creation?
On Mon, Apr 22, 2013 at 11:45:43AM +0100, Mike Whitaker wrote:
> On a similar subject, what PDF (or even text, assuming I can find something to
> extract the text on a page by page basis) indexing solutions are there out
> there in Perl?

pdftotext and then throw the text at a generic indexing package. I keep meaning to do something with Plucene.
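The "pdftotext, then index" suggestion above can be sketched roughly as follows. This is only an illustration, written in Python rather than Perl, and it assumes the `pdftotext` command-line tool is installed. It relies on pdftotext ending each page with a form-feed character, which is what makes page-by-page extraction possible. The function names (`extract_pages`, `split_pages`, `build_index`) are hypothetical, and the in-memory word-to-pages map is a toy stand-in for a real indexing package such as Plucene or Lucy, not an example of their APIs.

```python
import re
import subprocess
from collections import defaultdict
from typing import Dict, List, Set


def split_pages(text: str) -> List[str]:
    """Split pdftotext output into pages on form-feed characters."""
    pages = text.split("\f")
    # pdftotext ends every page with a form feed, so drop the empty tail.
    if pages and pages[-1].strip() == "":
        pages.pop()
    return pages


def extract_pages(pdf_path: str) -> List[str]:
    """Run pdftotext (assumed to be on PATH) and return one string per page."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],  # "-" sends text to stdout
        check=True,
        capture_output=True,
    )
    return split_pages(result.stdout.decode("utf-8", errors="replace"))


def build_index(pages: List[str]) -> Dict[str, Set[int]]:
    """Toy inverted index: lower-cased word -> set of 1-based page numbers."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for number, page in enumerate(pages, start=1):
        for word in re.findall(r"\w+", page.lower()):
            index[word].add(number)
    return index
```

Calling `build_index(extract_pages("report.pdf"))` (with a hypothetical file name) would give a lookup from each word to the pages it appears on; a real deployment would hand the per-page text to a proper search library instead.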
Re: PDF creation?
If you want full support for unicode strings and a good control of layout, I found that the simplest solution is to use xelatex.

I didn't find PDF::API2 fits in a multilingual environment.

Also, http://www.pdflib.com/ is ok (very good layout capabilities, but it's a commercial product), although last time I was using it, unicode support was quite poor.

J

On 22 April 2013 11:42, Kieren Diment wrote:
> Similarly one can use pandoc (markdown to pdf and many other formats
> including pod and TeX) in the same way. http://johnmacfarlane.net/pandoc
> I really like pandoc, although it is not bug free.
>
> On 22/04/2013, at 8:28 PM, Peter Corlett wrote:
>
> > On Sun, Apr 21, 2013 at 07:43:11AM -0400, Mark Fowler wrote:
> >> In a few weeks I'm going to want to be creating PDFs from Perl, something I
> >> haven't done in a few years. What's the recommended approach these days?
> >
> > My *favourite* approach, which is almost certainly not the consensus answer,
> > is to generate a LaTeX document (e.g. using Template.pm) and then run that
> > through xelatex to generate a PDF. This does however require you to learn
> > how to drive LaTeX and how to trawl CTAN etc for useful packages.
> >
> > (FWIW, pretty much all of the useful LaTeX packages are already in Debian.)
> >
> >> I know I'm going to want to create the document from scratch, not fill in a
> >> template, and I'm probably going to want multi-line text and basic drawing
> >> (a horizontal line or two)
> >
> > The "template" in this case would be the LaTeX preamble that pulls in and
> > configures all of the packages you use in your document. You get multi-line
> > text, tables, page reflowing and all sorts of other goodies for free.

--
Jerome Eteve
+44(0)7738864546
http://www.eteve.net/
Re: PDF creation?
On a similar subject, what PDF (or even text, assuming I can find something to extract the text on a page by page basis) indexing solutions are there out there in Perl?

On 22 Apr 2013, at 11:42, Kieren Diment wrote:
> Similarly one can use pandoc (markdown to pdf and many other formats
> including pod and TeX) in the same way. http://johnmacfarlane.net/pandoc
> I really like pandoc, although it is not bug free.
>
> On 22/04/2013, at 8:28 PM, Peter Corlett wrote:
>
>> On Sun, Apr 21, 2013 at 07:43:11AM -0400, Mark Fowler wrote:
>>
>> My *favourite* approach, which is almost certainly not the consensus answer,
>> is to generate a LaTeX document (e.g. using Template.pm) and then run that
>> through xelatex to generate a PDF. This does however require you to learn
>> how to drive LaTeX and how to trawl CTAN etc for useful packages.
>>
>> (FWIW, pretty much all of the useful LaTeX packages are already in Debian.)
>>
>> The "template" in this case would be the LaTeX preamble that pulls in and
>> configures all of the packages you use in your document. You get multi-line
>> text, tables, page reflowing and all sorts of other goodies for free.
Re: PDF creation?
Similarly one can use pandoc (markdown to pdf and many other formats including pod and TeX) in the same way. http://johnmacfarlane.net/pandoc

I really like pandoc, although it is not bug free.

On 22/04/2013, at 8:28 PM, Peter Corlett wrote:
> On Sun, Apr 21, 2013 at 07:43:11AM -0400, Mark Fowler wrote:
>> In a few weeks I'm going to want to be creating PDFs from Perl, something I
>> haven't done in a few years. What's the recommended approach these days?
>
> My *favourite* approach, which is almost certainly not the consensus answer,
> is to generate a LaTeX document (e.g. using Template.pm) and then run that
> through xelatex to generate a PDF. This does however require you to learn
> how to drive LaTeX and how to trawl CTAN etc for useful packages.
>
> (FWIW, pretty much all of the useful LaTeX packages are already in Debian.)
>
>> I know I'm going to want to create the document from scratch, not fill in a
>> template, and I'm probably going to want multi-line text and basic drawing (a
>> horizontal line or two)
>
> The "template" in this case would be the LaTeX preamble that pulls in and
> configures all of the packages you use in your document. You get multi-line
> text, tables, page reflowing and all sorts of other goodies for free.
Re: PDF creation?
On Sun, Apr 21, 2013 at 07:43:11AM -0400, Mark Fowler wrote:
> In a few weeks I'm going to want to be creating PDFs from Perl, something I
> haven't done in a few years. What's the recommended approach these days?

My *favourite* approach, which is almost certainly not the consensus answer, is to generate a LaTeX document (e.g. using Template.pm) and then run that through xelatex to generate a PDF. This does however require you to learn how to drive LaTeX and how to trawl CTAN etc for useful packages.

(FWIW, pretty much all of the useful LaTeX packages are already in Debian.)

> I know I'm going to want to create the document from scratch, not fill in a
> template, and I'm probably going to want multi-line text and basic drawing (a
> horizontal line or two)

The "template" in this case would be the LaTeX preamble that pulls in and configures all of the packages you use in your document. You get multi-line text, tables, page reflowing and all sorts of other goodies for free.
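As a rough illustration of the "fill in a LaTeX template, then run xelatex" workflow described above, here is a minimal sketch. It is written in Python rather than Perl purely for illustration, with the standard-library `string.Template` standing in for Template.pm. The document skeleton, the function names (`latex_escape`, `render_tex`, `build_pdf`) and the `report.tex` file name are all assumptions made for the example, not anything from the thread. It assumes `xelatex` and the `fontspec` package are installed.

```python
import shutil
import subprocess
from pathlib import Path
from string import Template
from typing import Optional

# Characters LaTeX treats specially, mapped to literal-safe replacements.
LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}

# Hypothetical skeleton: the preamble pulls in packages, the body is filled in.
DOCUMENT = Template(r"""\documentclass{article}
\usepackage{fontspec}  % xelatex-native font and Unicode handling
\begin{document}
\section*{$title}
$body

\noindent\rule{\linewidth}{0.4pt}  % a basic horizontal line
\end{document}
""")


def latex_escape(text: str) -> str:
    """Escape text character by character so LaTeX prints it literally."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


def render_tex(title: str, body: str) -> str:
    """Fill the skeleton with escaped, user-supplied text."""
    return DOCUMENT.substitute(title=latex_escape(title), body=latex_escape(body))


def build_pdf(tex_source: str, out_dir: Path) -> Optional[Path]:
    """Run xelatex on the source; return the PDF path, or None if xelatex is absent."""
    if shutil.which("xelatex") is None:
        return None
    tex_path = out_dir / "report.tex"
    tex_path.write_text(tex_source, encoding="utf-8")
    subprocess.run(
        ["xelatex", "-interaction=nonstopmode", "-halt-on-error",
         "-output-directory", str(out_dir), str(tex_path)],
        check=True,
        capture_output=True,
    )
    return out_dir / "report.pdf"
```

Something like `build_pdf(render_tex("Invoice", "Total: 50% off"), Path("/tmp/out"))` would then produce the PDF, given an existing output directory. Escaping the inserted text matters more than it looks: an unescaped `&` or `%` in user data is enough to break the LaTeX run.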
Re: PDF creation?
On 21 April 2013 20:06, Leo Lapworth wrote:
> On 21 April 2013 12:51, Roger Bell_West wrote:
>
>> On Sun, Apr 21, 2013 at 07:43:11AM -0400, Mark Fowler wrote:
>> > I know I'm going to want to create the document from scratch, not fill in a
>> > template, and I'm probably going to want multi-line text and basic drawing
>> > (a horizontal line or two)
>>
>> I tend to use PDF::API2: now unmaintained, but gets the job done.
>
> PDF::API2++ # if you want lots of control
>
> I've got some code at work we'd really like to open source - just takes
> someone extracting the work specific bits from it - which wraps it in moosey
> goodness with a bit of a layout framework added in.

Nice.

I've been using PDF::Report (which wraps PDF::API2 and lets you drop down to PDF::API2 directly if needed) for work projects. It was unmaintained, but I now have PAUSE co-maint on it, and have some fixes and updates for it on GitHub.

The code I've been working on at work isn't Moosey, but it is OO, uses styles and makes layout fairly easy. I don't know if it's likely to be open sourced, but I'd be interested in collaborating with Leo or anybody else with an eye to moving from an internal API to a more standard CPAN module.

Cheers,

A.

--
Aaron J Trevena, BSc Hons
http://www.aarontrevena.co.uk
LAMP System Integration, Development and Consulting