Re: io module and pdf question

2013-06-26 Thread wxjmfauth
On Tuesday, 25 June 2013 06:18:44 UTC+2, jyou...@kc.rr.com wrote:
> Would like to get your opinion on this. Currently to get the metadata out of a pdf file, I loop through the guts of the file. I know it's not the greatest idea to do this, but I'm trying to avoid extra modules, etc.

Re: io module and pdf question

2013-06-25 Thread Dave Angel
On 06/25/2013 12:15 PM, jyoun...@kc.rr.com wrote:
> Thank you Rusi and Christian!

Something I don't think was mentioned was that reading a text file in Python 3, and specifying latin-1, will work simply because every possible 8-bit byte is a character in Latin-1. That doesn't mean that those
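
The point about Latin-1 can be checked directly. The following is only a minimal sketch, not code from the thread; the variable names are made up for illustration. It shows that all 256 byte values decode under latin-1 and round-trip unchanged, while a stricter codec such as UTF-8 can reject arbitrary bytes.

```python
# Latin-1 maps each byte value 0..255 to exactly one character,
# so decoding arbitrary binary data with it never raises an error.
all_bytes = bytes(range(256))          # every possible 8-bit value, once
text = all_bytes.decode('latin-1')     # always succeeds

# The mapping is one-to-one, so encoding restores the original bytes.
round_trip = text.encode('latin-1')

# UTF-8, by contrast, rejects byte sequences that are not valid UTF-8.
try:
    b'\xff\xfe'.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

Decoding without an error only means the bytes were mapped to characters; it says nothing about whether the data was actually Latin-1 text.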

Re: io module and pdf question

2013-06-25 Thread MRAB
On 25/06/2013 17:15, jyoun...@kc.rr.com wrote:
> Thank you Rusi and Christian!
>
> So it sounds like I should read the pdf data in as binary:
>
> import os
>
> pdfPath = '~/Desktop/test.pdf'
> colorlistData = ''
>
> with open(os.path.expanduser(pdfPath), 'rb') as f:
>     for i in f:

Re: io module and pdf question

2013-06-25 Thread rusi
I guess the string constant 'XYZ:colorlist' needs to be a byte-string -- use b prefix? Dunno for sure. Black hole for me -- unicode!
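
The b-prefix suggestion can be tried on a throwaway file. This is a hedged sketch rather than the poster's actual script: the helper name find_lines_containing and the sample file contents are invented for the demonstration. In Python 3, a file opened with 'rb' yields bytes, so the search marker has to be bytes as well; testing a str against bytes with `in` raises TypeError.

```python
import os
import tempfile


def find_lines_containing(path, marker):
    """Return the raw lines (bytes) of a binary file that contain marker."""
    matches = []
    with open(path, 'rb') as f:        # binary mode: each line is bytes
        for line in f:
            if marker in line:         # marker must be bytes too
                matches.append(line)
    return matches


# Build a small stand-in file; the contents are made up for this demo.
with tempfile.NamedTemporaryFile(delete=False, suffix='.pdf') as tmp:
    tmp.write(b'%PDF-1.4\n\xff\xfe binary junk\nXYZ:colorList=red,blue\n')
    demo_path = tmp.name

found = find_lines_containing(demo_path, b'XYZ:colorList')

# Mixing str and bytes fails in Python 3 instead of silently not matching.
try:
    'XYZ:colorList' in b'XYZ:colorList=red,blue\n'
    mixed_ok = True
except TypeError:
    mixed_ok = False

os.remove(demo_path)                   # clean up the temporary file
```

If the matched value is needed as text afterwards, it can be decoded explicitly once the encoding of that part of the file is known.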

RE: io module and pdf question

2013-06-25 Thread jyoung79
Thank you Rusi and Christian!

So it sounds like I should read the pdf data in as binary:

import os

pdfPath = '~/Desktop/test.pdf'
colorlistData = ''

with open(os.path.expanduser(pdfPath), 'rb') as f:
    for i in f:
        if 'XYZ:colorList' in i:
            colorlistDat

Re: io module and pdf question

2013-06-25 Thread Christian Gollwitzer
On 25.06.13 08:33, rusi wrote:
> On Tuesday, June 25, 2013 9:48:44 AM UTC+5:30, jyou...@kc.rr.com wrote:
>> 1. Is there another way to get metadata out of a pdf without having to install another module?
>> 2. Is it safe to assume pdf files should always be encoded as latin-1 (when trying to read it th

Re: io module and pdf question

2013-06-24 Thread rusi
On Tuesday, June 25, 2013 9:48:44 AM UTC+5:30, jyou...@kc.rr.com wrote:
> 1. Is there another way to get metadata out of a pdf without having to install another module?
> 2. Is it safe to assume pdf files should always be encoded as latin-1 (when trying to read it this way)? Is there a chanc

io module and pdf question

2013-06-24 Thread jyoung79
Would like to get your opinion on this. Currently to get the metadata out of a pdf file, I loop through the guts of the file. I know it's not the greatest idea to do this, but I'm trying to avoid extra modules, etc. Adobe JavaScript was used to insert the metadata, so the added data looks som