On Mon, 10 Oct 2005, Bill Burns wrote:
> I'm looking to get the size (width, length) of a PDF file.

Hi Bill,

Just as a side note: you may want to look into using the 'pdfinfo' utility
that comes as part of the xpdf package:

    http://www.foolabs.com/xpdf/

For example:

#######################################################################
$ pdfinfo 05-lexparse.pdf
Producer:       Acrobat Distiller Command 3.0 for Solaris 2.3 and later (SPARC)
CreationDate:   Tue Jul  1 18:36:35 1913
Tagged:         no
Pages:          12
Encrypted:      no
Page size:      612 x 792 pts (letter)
File size:      191874 bytes
Optimized:      no
PDF version:    1.2
#######################################################################

> Every pdf file has a 'tag' (in the file) that looks similar to this
>
> Example #1
> MediaBox [0 0 612 792]
>
> or this
>
> Example #2
> MediaBox [ 0 0 612 792 ]
>
> I figured a regex might be a good way to get this data but the
> whitespace (or no whitespace) after the left bracket has me stumped.

I think you might want to look for the whitespace metacharacter '\s'.
Also, you can consider using '*' to qualify a previous pattern: it stands
for "zero or more of the pattern."  For example:

#####################################
>>> re.search("a*b", "aab")
<_sre.SRE_Match object at 0x403ae250>
>>> re.search("a*b", "ab")
<_sre.SRE_Match object at 0x403ae138>
>>> re.search("a*b", "b")
<_sre.SRE_Match object at 0x403ae250>
>>> re.search("a*b", "")
>>>
#####################################

In comparison:

#####################################
>>> re.search("a+b", "aab")
<_sre.SRE_Match object at 0x403ae138>
>>> re.search("a+b", "ab")
<_sre.SRE_Match object at 0x403ae250>
>>> re.search("a+b", "b")
>>>
#####################################

Good luck to you!
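
P.S. Putting the two hints together, here is a rough sketch of how '\s*'
might cope with both of your MediaBox examples. The names MEDIABOX_RE and
find_mediabox are made up for illustration. The sketch assumes the four
numbers are plain integers or decimals and that the MediaBox entry sits
uncompressed in the text you search; a PDF can also keep it inside a
compressed stream, where a regex over the raw file will not see it.

```python
import re

# Hypothetical pattern, for illustration only.
# \s* allows zero or more whitespace characters after '[' and before ']';
# \s+ requires at least one whitespace character between the numbers.
NUMBER = r"(-?\d+(?:\.\d+)?)"
MEDIABOX_RE = re.compile(
    r"MediaBox\s*\[\s*"
    + NUMBER + r"\s+" + NUMBER + r"\s+" + NUMBER + r"\s+" + NUMBER
    + r"\s*\]"
)

def find_mediabox(text):
    """Return (width, height) in points from the first MediaBox, or None."""
    match = MEDIABOX_RE.search(text)
    if match is None:
        return None
    x0, y0, x1, y1 = (float(group) for group in match.groups())
    return (x1 - x0, y1 - y0)

# Both spellings from the original question give the same result.
print(find_mediabox("MediaBox [0 0 612 792]"))
print(find_mediabox("MediaBox [ 0 0 612 792 ]"))
```

Only the first MediaBox found is reported, so a document whose pages differ
in size would need re.finditer() instead of re.search().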