Re: Regex on a Dictionary

2018-02-15 Thread Andre Müller
Hello,

This question also came up here:
https://python-forum.io/Thread-Working-with-Dict-Object

Greetings
Andre
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Regex on a Dictionary

2018-02-13 Thread Mark Lawrence

On 13/02/18 18:08, Stanley Denman wrote:

> On Tuesday, February 13, 2018 at 9:41:14 AM UTC-6, Mark Lawrence wrote:
>> On 13/02/18 13:11, Stanley Denman wrote:
>>> I am trying to perform a regex search on a "string" of text that Python's
>>> isinstance tells me is a dictionary.  When I run the code I get the
>>> following error:
>>>
>>> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 
>>> 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}
>>>
>>> Traceback (most recent call last):
>>>   File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in <module>
>>>     x=MyRegex.findall(MyDict)
>>> TypeError: expected string or bytes-like object
>>>
>>> Here is the "string" of code I am working with:
>>
>> Please call it a dictionary as in the subject line, quite clearly it is
>> not a string in any way, shape or form.
>>
>>> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 
>>> 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}
>>>
>>> I want to grab the name "MILANI, JOHN C" and the last date "-mm/dd/" as a
>>> pair, such that if I have N strings like the one above I will end up with
>>> N pairs of values (name and date).  Here is my code:
>>>
>>> import PyPDF2,re
>>>
>>> pdfFileObj=open('x.pdf','rb')
>>> pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
>>> Result=pdfReader.getOutlines()
>>> MyDict=(Result[-1][0])
>>> print(MyDict)
>>> print(isinstance(MyDict,dict))
>>> MyRegex=re.compile(r"MILANI,")
>>> x=MyRegex.findall(MyDict)
>>> print(x)
>>>
>>> Thanks in advance for any help.
>>
>> Was the string methods solution that I gave a week or so ago so bad that
>> you still think that you need a regex to solve this?
>>
>> --
>> My fellow Pythonistas, ask not what our language can do for you, ask
>> what you can do for our language.
>>
>> Mark Lawrence
>
> My apologies, Mark.  You took the time to give me the basis of a non-regex
> solution and I had not taken the time to fully review your answer.  I did
> not understand it at first blush, but I think I do now.



Accepted :)

IIRC you might need a small tweak or two but certainly the foundations 
were there.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Regex on a Dictionary

2018-02-13 Thread Stanley Denman
On Tuesday, February 13, 2018 at 9:41:14 AM UTC-6, Mark Lawrence wrote:
> On 13/02/18 13:11, Stanley Denman wrote:
> > I am trying to perform a regex search on a "string" of text that Python's 
> > isinstance tells me is a dictionary.  When I run the code I get the 
> > following error:
> > 
> > {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  
> > 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), 
> > '/Type': '/FitB'}
> > 
> > Traceback (most recent call last):
> >   File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in <module>
> >     x=MyRegex.findall(MyDict)
> > TypeError: expected string or bytes-like object
> > 
> > Here is the "string" of code I am working with:
> 
> Please call it a dictionary as in the subject line, quite clearly it is 
> not a string in any way, shape or form.
> 
> > 
> > {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  
> > 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), 
> > '/Type': '/FitB'}
> > 
> > I want to grab the name "MILANI, JOHN C" and the last date "-mm/dd/" as 
> > a pair, such that if I have N strings like the one above I will end up 
> > with N pairs of values (name and date).  Here is my code:
> >   
> > import PyPDF2,re
> > pdfFileObj=open('x.pdf','rb')
> > pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
> > Result=pdfReader.getOutlines()
> > MyDict=(Result[-1][0])
> > print(MyDict)
> > print(isinstance(MyDict,dict))
> > MyRegex=re.compile(r"MILANI,")
> > x=MyRegex.findall(MyDict)
> > print(x)
> > 
> > Thanks in advance for any help.
> > 
> 
> Was the string methods solution that I gave a week or so ago so bad that 
> you still think that you need a regex to solve this?
> 
> -- 
> My fellow Pythonistas, ask not what our language can do for you, ask
> what you can do for our language.
> 
> Mark Lawrence

My apologies, Mark.  You took the time to give me the basis of a non-regex 
solution and I had not taken the time to fully review your answer.  I did 
not understand it at first blush, but I think I do now.


Re: Regex on a Dictionary

2018-02-13 Thread Steven D'Aprano
On Tue, 13 Feb 2018 13:53:04 +, Mark Lawrence wrote:

> Was the string methods solution that I gave a week or so ago so bad that
> you still think that you need a regex to solve this?

Sometimes regexes are needed, but often Jamie Zawinski is right:

Some people, when confronted with a problem, think "I know, 
I'll use regular expressions." Now they have two problems.


Using the nuclear-powered bulldozer of regular expressions to crack the 
peanut of a simple fixed-string matching problem is rarely a good idea.
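
Mark's earlier string-methods solution is not quoted in this thread, so 
the following is not his code. It is only a minimal sketch of how the 
name and last date might be pulled out with str.partition() and 
str.rpartition(), assuming every title carries the "Src.:" and 
"Tmt. Dt.:" markers seen in the one sample. The function name 
split_title is made up for illustration.

```python
def split_title(title):
    """Return (name, last_date) using only str methods, or None.

    Assumes the layout of the single sample title in this thread:
    '... Src.:  <name> Tmt. Dt.:  <date> - <date> (N pages)'.
    """
    # Text between "Src.:" and "Tmt. Dt.:" is taken to be the name.
    _, found_src, rest = title.partition("Src.:")
    name, found_dt, dates = rest.partition("Tmt. Dt.:")
    if not (found_src and found_dt):
        return None
    # Drop the trailing "(9 pages)" part, then keep what follows the last "-".
    dates = dates.partition("(")[0]
    last_date = dates.rpartition("-")[2].strip()
    return name.strip(), last_date


title = ("1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  "
         "05/12/2014 - 05/28/2014 (9 pages)")
print(split_title(title))  # ('MILANI, JOHN C', '05/28/2014')
```

If a title has only one date and no "-", rpartition() leaves the whole 
date text in the last slot, so that single date is returned.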



-- 
Steve



Re: Regex on a Dictionary

2018-02-13 Thread Steven D'Aprano
On Tue, 13 Feb 2018 05:11:20 -0800, Stanley Denman wrote:

> I am trying to perform a regex search on a "string" of text that Python's
> isinstance tells me is a dictionary.

Please believe Python when it tells you that something is a dictionary. 
Trust me, the interpreter knows. It doesn't matter how much you want 
something to be a string, if it is not a string, it isn't a string.



-- 
Steve



Re: Regex on a Dictionary

2018-02-13 Thread Mark Lawrence

On 13/02/18 13:11, Stanley Denman wrote:

> I am trying to perform a regex search on a "string" of text that Python's
> isinstance tells me is a dictionary.  When I run the code I get the
> following error:
>
> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 
> 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}
>
> Traceback (most recent call last):
>   File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in <module>
>     x=MyRegex.findall(MyDict)
> TypeError: expected string or bytes-like object
>
> Here is the "string" of code I am working with:

Please call it a dictionary as in the subject line, quite clearly it is 
not a string in any way, shape or form.

> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 
> 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}
>
> I want to grab the name "MILANI, JOHN C" and the last date "-mm/dd/" as a
> pair, such that if I have N strings like the one above I will end up with
> N pairs of values (name and date).  Here is my code:
>
> import PyPDF2,re
>
> pdfFileObj=open('x.pdf','rb')
> pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
> Result=pdfReader.getOutlines()
> MyDict=(Result[-1][0])
> print(MyDict)
> print(isinstance(MyDict,dict))
> MyRegex=re.compile(r"MILANI,")
> x=MyRegex.findall(MyDict)
> print(x)
>
> Thanks in advance for any help.



Was the string methods solution that I gave a week or so ago so bad that 
you still think that you need a regex to solve this?


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



Re: Regex on a Dictionary

2018-02-13 Thread alister via Python-list
On Tue, 13 Feb 2018 13:42:08 +, Rhodri James wrote:

> On 13/02/18 13:11, Stanley Denman wrote:
>> I am trying to perform a regex search on a "string" of text that Python's
>> isinstance tells me is a dictionary.  When I run the code I get
>> the following error:
>> 
>> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.: 
>> 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
>> '/Type': '/FitB'}
>> 
>> Traceback (most recent call last):
>>   File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in <module>
>>     x=MyRegex.findall(MyDict)
>> TypeError: expected string or bytes-like object
>> 
>> Here is the "string" of code I am working with:
>> 
>> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.: 
>> 05/12/2014 - 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0),
>> '/Type': '/FitB'}
>> 
>> I want to grab the name "MILANI, JOHN C" and the last date
>> "-mm/dd/" as a pair, such that if I have N strings like
>> the one above I will end up with N pairs of values (name and date).  Here
>> is my code:
>> 
>> import PyPDF2,re
>> pdfFileObj=open('x.pdf','rb')
>> pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
>> Result=pdfReader.getOutlines()
>> MyDict=(Result[-1][0])
>> print(MyDict)
>> print(isinstance(MyDict,dict))
>> MyRegex=re.compile(r"MILANI,")
>> x=MyRegex.findall(MyDict)
>> print(x)
> 
> As the error message says, re.findall() expects a string.  A dictionary
> is in no sense a string, so passing it in whole like that won't work.
> If you know that the name will always show up in the title field, you
> can pass just the title:
> 
>     x = MyRegex.findall(MyDict['/Title'])
> 
> Otherwise you will have to loop through all the entries in the
> dictionary:
> 
>     for entry in MyDict.values():
>         x = MyRegex.findall(entry)
>         # ...and do something with x
> 
> I rather suspect you are going to find that the titles aren't in a very
> systematic format, though.

For what purpose are you trying to run this regex anyway?
It is almost certainly the wrong approach for your task.



-- 
Larkinson's Law:
All laws are basically false.


Re: Regex on a Dictionary

2018-02-13 Thread Rhodri James

On 13/02/18 13:11, Stanley Denman wrote:

> I am trying to perform a regex search on a "string" of text that Python's
> isinstance tells me is a dictionary.  When I run the code I get the
> following error:
>
> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 
> 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}
>
> Traceback (most recent call last):
>   File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in <module>
>     x=MyRegex.findall(MyDict)
> TypeError: expected string or bytes-like object
>
> Here is the "string" of code I am working with:
>
> {'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 
> 05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}
>
> I want to grab the name "MILANI, JOHN C" and the last date "-mm/dd/" as a
> pair, such that if I have N strings like the one above I will end up with
> N pairs of values (name and date).  Here is my code:
>
> import PyPDF2,re
>
> pdfFileObj=open('x.pdf','rb')
> pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
> Result=pdfReader.getOutlines()
> MyDict=(Result[-1][0])
> print(MyDict)
> print(isinstance(MyDict,dict))
> MyRegex=re.compile(r"MILANI,")
> x=MyRegex.findall(MyDict)
> print(x)


As the error message says, re.findall() expects a string.  A dictionary 
is in no sense a string, so passing it in whole like that won't work. 
If you know that the name will always show up in the title field, you 
can pass just the title:


  x = MyRegex.findall(MyDict['/Title'])

Otherwise you will have to loop through all the entries in the dictionary:

  for entry in MyDict.values():
      x = MyRegex.findall(entry)
      # ...and do something with x

I rather suspect you are going to find that the titles aren't in a very 
systematic format, though.
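
To make the "pass just the title" idea concrete, here is a rough sketch 
that pulls the (name, last date) pair out of the '/Title' entry. The 
pattern TITLE_RE and the helper name_and_last_date are hypothetical and 
are inferred from the single sample title in this thread, so they may 
well need adjusting if other titles are laid out differently. The sample 
dictionary leaves out the '/Page' entry so that the snippet runs without 
PyPDF2.

```python
import re

# Hypothetical pattern, based only on the one sample title in this thread:
# '... Src.:  <name> Tmt. Dt.:  <date> - <date> (N pages)'
TITLE_RE = re.compile(
    r"Src\.:\s*(?P<name>.+?)\s+Tmt\. Dt\.:.*?(?P<date>\d{2}/\d{2}/\d{4})\s*\("
)


def name_and_last_date(entry):
    """Return (name, last_date) for one outline entry, or None."""
    title = entry.get("/Title")
    if not isinstance(title, str):
        return None  # no title, or not text: nothing to search
    match = TITLE_RE.search(title)
    if match is None:
        return None  # title does not follow the assumed layout
    return match.group("name"), match.group("date")


sample = {
    "/Title": "1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  "
              "05/12/2014 - 05/28/2014 (9 pages)",
    "/Type": "/FitB",
}
print(name_and_last_date(sample))  # ('MILANI, JOHN C', '05/28/2014')
```

Returning None for titles that do not match keeps the caller in charge 
of what to do with unsystematic entries.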


--
Rhodri James *-* Kynesim Ltd


Re: Regex on a Dictionary

2018-02-13 Thread Etienne Robillard

Hi Stanley,


On 2018-02-13 at 08:11, Stanley Denman wrote:

> x=MyRegex.findall(MyDict)

How about:
x = [MyRegex.findall(item) for item in MyDict]
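
One caveat, shown as a small sketch: iterating over a dictionary yields 
its keys ('/Title', '/Page', '/Type'), so the comprehension above 
searches the keys and not the title text. Searching the values works, 
but the '/Page' value is not a string, and findall() raises TypeError 
for it, so the sketch below skips non-string values. The plain object() 
is only a stand-in for PyPDF2's IndirectObject(465, 0) so that the 
snippet runs without PyPDF2.

```python
import re

MyRegex = re.compile(r"MILANI,")
MyDict = {
    "/Title": "1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  "
              "05/12/2014 - 05/28/2014 (9 pages)",
    "/Page": object(),  # stand-in for IndirectObject(465, 0)
    "/Type": "/FitB",
}

# Iterating over the dict itself searches the keys, which never match here.
over_keys = [MyRegex.findall(item) for item in MyDict]

# Searching the values instead, skipping anything that is not a string.
over_values = [
    MyRegex.findall(value)
    for value in MyDict.values()
    if isinstance(value, str)
]

print(over_keys)    # [[], [], []]
print(over_values)  # [['MILANI,'], []]
```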

Etienne

--
Etienne Robillard
tkad...@yandex.com
https://www.isotopesoftware.ca/



Regex on a Dictionary

2018-02-13 Thread Stanley Denman
I am trying to perform a regex search on a "string" of text that Python's 
isinstance tells me is a dictionary.  When I run the code I get the following error:

{'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 
05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}

Traceback (most recent call last):
  File "C:\Users\stand\Desktop\PythonSublimeText.py", line 9, in <module>
    x=MyRegex.findall(MyDict)
TypeError: expected string or bytes-like object

Here is the "string" of code I am working with:

{'/Title': '1F:  Progress Notes  Src.:  MILANI, JOHN C Tmt. Dt.:  05/12/2014 - 
05/28/2014 (9 pages)', '/Page': IndirectObject(465, 0), '/Type': '/FitB'}

I want to grab the name "MILANI, JOHN C" and the last date "-mm/dd/" as a 
pair, such that if I have N strings like the one above I will end up 
with N pairs of values (name and date).  Here is my code:
 
import PyPDF2,re
pdfFileObj=open('x.pdf','rb')
pdfReader=PyPDF2.PdfFileReader(pdfFileObj)
Result=pdfReader.getOutlines()
MyDict=(Result[-1][0])
print(MyDict)
print(isinstance(MyDict,dict))
MyRegex=re.compile(r"MILANI,")
x=MyRegex.findall(MyDict)
print(x)

Thanks in advance for any help.