Re: Regex on a Dictionary
Hello,

This question also came up there:
https://python-forum.io/Thread-Working-with-Dict-Object

Greetings
Andre
--
https://mail.python.org/mailman/listinfo/python-list
Re: Regex on a Dictionary
On 13/02/18 18:08, Stanley Denman wrote:
> On Tuesday, February 13, 2018 at 9:41:14 AM UTC-6, Mark Lawrence wrote:
>> Was the string methods solution that I gave a week or so ago so bad that
>> you still think that you need a regex to solve this?
>
> My apologies, Mark. You took the time to give me the basis of a non-regex
> solution and I had not taken the time to fully review your answer. I did
> not understand it at first blush, but I think I do now.

Accepted :) IIRC you might need a small tweak or two, but certainly the foundations were there.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
Re: Regex on a Dictionary
On Tuesday, February 13, 2018 at 9:41:14 AM UTC-6, Mark Lawrence wrote:
> Was the string methods solution that I gave a week or so ago so bad that
> you still think that you need a regex to solve this?

My apologies, Mark. You took the time to give me the basis of a non-regex solution and I had not taken the time to fully review your answer. I did not understand it at first blush, but I think I do now.
Re: Regex on a Dictionary
On Tue, 13 Feb 2018 13:53:04 +, Mark Lawrence wrote:
> Was the string methods solution that I gave a week or so ago so bad that
> you still think that you need a regex to solve this?

Sometimes regexes are needed, but often Jamie Zawinski is right:

    Some people, when confronted with a problem, think "I know, I'll use
    regular expressions." Now they have two problems.

Using the nuclear-powered bulldozer of regular expressions to crack the peanut of a simple fixed-string matching problem is rarely a good idea.

--
Steve
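To illustrate the fixed-string point, here is a minimal sketch using only string methods. It is not the solution Mark posted earlier (that message is not part of this thread). It assumes every title contains the literal markers "Src.:" and "Tmt. Dt.:" in that order, as in the one sample title shown; the function name parse_title is hypothetical.

```python
def parse_title(title):
    """Return (name, last_date) from an outline title, or None if the
    expected markers are missing.

    Assumed layout (taken from the single sample title in this thread):
    '... Src.: <name> Tmt. Dt.: <first date> - <last date> (<n> pages)'
    """
    _, src_marker, rest = title.partition("Src.:")
    name, date_marker, dates = rest.partition("Tmt. Dt.:")
    if not src_marker or not date_marker:
        return None  # title does not follow the assumed layout
    # Drop the trailing "(9 pages)" part, then keep the text after the last hyphen.
    date_range = dates.split("(")[0]
    last_date = date_range.split("-")[-1].strip()
    return name.strip(), last_date


title = ("1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.: "
         "05/12/2014 - 05/28/2014 (9 pages)")
print(parse_title(title))
```

If real titles turn out to be less regular than the sample, the None return gives a place to log the entries that need a closer look.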
Re: Regex on a Dictionary
On Tue, 13 Feb 2018 05:11:20 -0800, Stanley Denman wrote:
> I am trying to perform a regex on a "string" of text that python
> isinstance is telling me is a dictionary.

Please believe Python when it tells you that something is a dictionary. Trust me, the interpreter knows.

It doesn't matter how much you want something to be a string, if it is not a string, it isn't a string.

--
Steve
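A quick way to see what the interpreter sees is to check the type before calling the regex. The sketch below is only an illustration: a plain dict stands in for the PyPDF2 outline entry, and the '/Page' value is replaced by None because the real IndirectObject is not available without a PDF.

```python
import re

# Plain-dict stand-in for the outline entry from the original post.
my_dict = {
    "/Title": "1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.: "
              "05/12/2014 - 05/28/2014 (9 pages)",
    "/Page": None,  # stand-in for IndirectObject(465, 0)
    "/Type": "/FitB",
}
pattern = re.compile(r"MILANI,")

print(type(my_dict).__name__)     # name of the container's type
print(isinstance(my_dict, str))   # is the container itself a string?

error = None
try:
    pattern.findall(my_dict)      # same call as in the original script
except TypeError as exc:
    error = exc
    print("TypeError:", exc)

# The value stored under '/Title' is a str, so the regex accepts it.
matches = pattern.findall(my_dict["/Title"])
print(matches)
```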
Re: Regex on a Dictionary
On 13/02/18 13:11, Stanley Denman wrote:
> I am trying to perform a regex on a "string" of text that python
> isinstance is telling me is a dictionary.
>
> Here is the "string" of code I am working with:

Please call it a dictionary, as in the subject line; quite clearly it is not a string in any way, shape or form.

> {'/Title': '1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.:
> 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
> '/Type': '/FitB'}
>
> MyRegex=re.compile(r"MILANI,")
> x=MyRegex.findall(MyDict)

Was the string methods solution that I gave a week or so ago so bad that you still think that you need a regex to solve this?

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
Re: Regex on a Dictionary
On Tue, 13 Feb 2018 13:42:08 +, Rhodri James wrote:
> I rather suspect you are going to find that the titles aren't in a very
> systematic format, though.

For what purpose are you trying to run this regex anyway? It is almost certainly the wrong approach for your task.

--
Larkinson's Law: All laws are basically false.
Re: Regex on a Dictionary
On 13/02/18 13:11, Stanley Denman wrote:
> I am trying to perform a regex on a "string" of text that python
> isinstance is telling me is a dictionary. When I run the code I get the
> following error:
>
> Traceback (most recent call last):
>   File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in <module>
>     x=MyRegex.findall(MyDict)
> TypeError: expected string or bytes-like object

As the error message says, re.findall() expects a string. A dictionary is in no sense a string, so passing it in whole like that won't work. If you know that the name will always show up in the title field, you can pass just the title:

    x = MyRegex.findall(MyDict['/Title'])

Otherwise you will have to loop through all the entries in the dictionary:

    for entry in MyDict.values():
        x = MyRegex.findall(entry)
        # ...and do something with x

I rather suspect you are going to find that the titles aren't in a very systematic format, though.

--
Rhodri James *-* Kynesim Ltd
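The two suggestions above can be sketched as follows. One caveat: in the sample entry the '/Page' value is an IndirectObject rather than a string, so a loop over all values would raise the same TypeError unless non-string values are skipped. The sketch uses a plain dict and a hypothetical FakeIndirectObject class in place of the PyPDF2 objects so that it runs without a PDF file.

```python
import re


class FakeIndirectObject:
    """Hypothetical stand-in for PyPDF2's IndirectObject (not a str)."""

    def __init__(self, idnum, generation):
        self.idnum = idnum
        self.generation = generation


my_dict = {
    "/Title": "1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.: "
              "05/12/2014 - 05/28/2014 (9 pages)",
    "/Page": FakeIndirectObject(465, 0),
    "/Type": "/FitB",
}
pattern = re.compile(r"MILANI,")

# Option 1: search only the field that is known to hold the name.
title_matches = pattern.findall(my_dict["/Title"])

# Option 2: search every value, skipping anything that is not a string.
all_matches = {}
for key, value in my_dict.items():
    if isinstance(value, str):
        all_matches[key] = pattern.findall(value)

print(title_matches)
print(all_matches)
```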
Re: Regex on a Dictionary
Hi Stanley,

On 2018-02-13 at 08:11, Stanley Denman wrote:
> x=MyRegex.findall(MyDict)

How about:

    x = [MyRegex.findall(item) for item in MyDict]

Etienne

--
Etienne Robillard
tkad...@yandex.com
https://www.isotopesoftware.ca/
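A note on the comprehension above: iterating over a dictionary directly yields its keys, so it would search '/Title', '/Page' and '/Type' rather than the text stored under them. The sketch below shows the difference with a plain-dict stand-in for the outline entry (the '/Page' value is replaced by None, since the real IndirectObject is not available here).

```python
import re

my_dict = {
    "/Title": "1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.: "
              "05/12/2014 - 05/28/2014 (9 pages)",
    "/Page": None,  # stand-in for IndirectObject(465, 0)
    "/Type": "/FitB",
}
pattern = re.compile(r"MILANI,")

# Iterating a dict yields its keys, so only the key names are searched.
over_keys = [pattern.findall(item) for item in my_dict]

# Searching the string values instead finds the name in the title.
over_values = [
    pattern.findall(value)
    for value in my_dict.values()
    if isinstance(value, str)  # skip non-string values such as '/Page'
]

print(over_keys)
print(over_values)
```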
Regex on a Dictionary
I am trying to perform a regex on a "string" of text that python isinstance is telling me is a dictionary. When I run the code I get the following error:

    {'/Title': '1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.:
    05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
    '/Type': '/FitB'}

    Traceback (most recent call last):
      File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in <module>
        x=MyRegex.findall(MyDict)
    TypeError: expected string or bytes-like object

Here is the "string" of code I am working with:

    {'/Title': '1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.:
    05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
    '/Type': '/FitB'}

I want to grab the name "MILANI, JOHN C" and the last date "-mm/dd/" as a pair, such that if I have N strings like the above I will end up with N pairs of values (name and date).

Here is my code:

    import PyPDF2,re
    pdfFileObj=open('x.pdf','rb')
    pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
    Result=pdfReader.getOutlines()
    MyDict=(Result[-1][0])
    print(MyDict)
    print(isinstance(MyDict,dict))
    MyRegex=re.compile(r"MILANI,")
    x=MyRegex.findall(MyDict)
    print(x)

Thanks in advance for any help.
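For the stated goal (one name/date pair per outline entry), one possible approach is a regex with two capture groups applied to each entry's '/Title'. This is only a sketch: it assumes every title follows the layout of the single sample above, plain dicts stand in for the PyPDF2 outline entries, the second and third titles are invented for illustration, and extract_pairs is a hypothetical helper.

```python
import re

TITLE_RE = re.compile(
    r"Src\.:\s*(?P<name>.+?)\s*"          # name follows the 'Src.:' marker
    r"Tmt\. Dt\.:.*?-\s*"                 # skip the first date and the hyphen
    r"(?P<last_date>\d{2}/\d{2}/\d{4})"   # last date, assumed to be mm/dd/yyyy
)


def extract_pairs(entries):
    """Return a list of (name, last_date) tuples, one per matching entry.

    Entries without a string '/Title', or whose title does not match the
    assumed layout, are skipped. Nested outline lists are not handled.
    """
    pairs = []
    for entry in entries:
        title = entry.get("/Title")
        if not isinstance(title, str):
            continue
        match = TITLE_RE.search(title)
        if match:
            pairs.append((match.group("name"), match.group("last_date")))
    return pairs


entries = [
    {"/Title": "1F: Progress Notes Src.: MILANI, JOHN C Tmt. Dt.: "
               "05/12/2014 - 05/28/2014 (9 pages)", "/Type": "/FitB"},
    # Invented entries, for illustration only.
    {"/Title": "2F: Office Notes Src.: DOE, JANE Tmt. Dt.: "
               "01/03/2015 - 02/10/2015 (4 pages)", "/Type": "/FitB"},
    {"/Title": "Table of Contents", "/Type": "/FitB"},
]
pairs = extract_pairs(entries)
print(pairs)
```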