On Thu 10 Jan 2013 04:21:10 NZDT +1300, Keith McGavin wrote:

> tr removes singular dots which wc may pick up as words.
>
> pdftotext file.pdf - | tr -d '.' | wc -w
> pdftotext file.pdf - | tr -d '.' | wc -l

That's only the start of it. Many PDFs are constructed in such a way that
the resulting plain text contains loads of spaces within words, and words
hyphenated across a line break would be counted as two. A PDF viewer
application might be more accurate (if it has a word count option), but
that's not command line.

Volker

-- 
Volker Kuhlmann
http://volker.dnsalias.net/
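
As a rough sketch of the hyphenation point above: a word split at a line end can be rejoined before counting. The function name count_words is made up for illustration, and the awk step simply assumes that any line ending in "-" continues on the next line, which will not always be true. It does nothing about stray spaces inside words.

```shell
# Hypothetical helper: count words on stdin, rejoining words that were
# hyphenated across a line break (assumes a trailing "-" always means
# the word continues on the next line).
count_words() {
    awk '{ if (sub(/-$/, "")) printf "%s", $0; else print }' |
        tr -d '.' |
        wc -w
}

# Small self-contained demo; plain wc -w would report 5 here.
printf 'A hyphen-\nated word .\n' | count_words    # prints 3
```

With a real file this would be used as: pdftotext file.pdf - | count_words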
