On Friday, February 9, 2018 at 1:08:27 AM UTC-6, dieter wrote:
> Stanley Denman <dallasdisabilityattor...@gmail.com> writes:
>
> > I am new to Python. I am trying to extract text from the bookmarks in a PDF
> > file that would provide the data for a Word template merge. I have gotten
> > down to a string of text pulled out of the list object that I got from
> > using the PyPDF2 module. I am stuck on how to get the data out of the string
> > that I need. I am calling it a string, but Python is recognizing it as a
> > dictionary object.
> >
> > Here is the string:
> >
> > {'/Title': '1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.:
> > 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
> > '/Type': '/FitB'}
> >
> > What I want is the following to end up as fields on my Word template merge:
> > MedSourceFirstName: "John"
> > MedSourceLastName: "Milani"
> > MedSourceLastTreatment: "05/28/2014"
> >
> > If I use keys() on the dictionary I get this:
> > ['/Title', '/Page', '/Type']
> > I was hoping "Src." and "Tmt. Dt." would be
> > treated as keys. Seems like the key/value pair of a dictionary would
> > translate nicely to fieldname and fielddata for a Word document merge.
> > Here is my code so far.
>
> A Python "dict" is a mapping of keys to values. Its "keys" method
> gives you the keys (as you have used above).
> The subscription syntax ("<some_dict>[<some_key>]"; e.g.
> "pdf_info['/Title']") allows you to access the value associated with
> "<some_key>".
>
> In your case, relevant information is coded inside the values themselves.
> You will need to extract this information yourself. Python's "re" module
> might be of help (see the "library reference" for details).

Thanks for your response. Nice to know I am at least on the right path. Sounds like I am going to have to dig into regex to get at the text I want.
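
For what it is worth, here is a minimal sketch of the "re" approach suggested above. It assumes every bookmark title follows the same "Src.: LAST, FIRST ... Tmt. Dt.: start - end" layout as the single example in this thread, which may not hold for other PDFs. The names TITLE_PATTERN and parse_title are made up for illustration, and the '/Page' entry is left out of the sample dictionary so the snippet runs without PyPDF2.

```python
import re

# Assumed layout (based only on the one title shown in this thread):
#   "... Src.: LAST, FIRST [MIDDLE] Tmt. Dt.: MM/DD/YYYY - MM/DD/YYYY (...)"
TITLE_PATTERN = re.compile(
    r"Src\.:\s*(?P<last>[^,]+),\s*(?P<first>\S+)"  # "MILANI, JOHN"
    r".*?Tmt\.\s+Dt\.:\s*"                         # skip middle initial, find the date label
    r"(?:\d{2}/\d{2}/\d{4}\s*-\s*)?"               # optional start date of the range
    r"(?P<last_treatment>\d{2}/\d{2}/\d{4})"       # last (or only) treatment date
)


def parse_title(title):
    """Return the merge fields found in a bookmark title, or None if it does not match."""
    match = TITLE_PATTERN.search(title)
    if match is None:
        return None
    return {
        # str.title() turns "JOHN" into "John"; names such as "McDONALD"
        # would need special handling.
        "MedSourceFirstName": match.group("first").title(),
        "MedSourceLastName": match.group("last").strip().title(),
        "MedSourceLastTreatment": match.group("last_treatment"),
    }


# Sample bookmark modeled on the one in the question ('/Page' omitted).
bookmark = {
    "/Title": "1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.: "
              "05/12/2014 - 05/28/2014 (9 pages)",
    "/Type": "/FitB",
}

fields = parse_title(bookmark["/Title"])
print(fields)
```

The result is an ordinary dictionary, so its key/value pairs can serve as the field names and field data for the Word merge. Titles that do not match the assumed layout give None rather than raising, so they can be reported and checked by hand.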