Re: Pod::Simple can treat binary as pod due to liberal/inconsistent regexp patterns

David E. Wheeler Wed, 07 Jan 2015 21:38:38 -0800

On Jan 7, 2015, at 11:30 AM, Karl Williamson <pub...@khwilliamson.com> wrote:


>> I asked David about the inconsistency and he asked that I bring it up here.
>> 
>> Shouldn't the more strict regexp be used in both places?
> 
> I think so.  Looking at the regexes though, I didn't know that directives 
> could be capitals, and I thought that digits had to always be the last 
> character (or characters ?) in a directive.  It seems to me that both regexes 
> should be tightened.

perlpodspec says:

>     Pod content is contained in Pod blocks. A Pod block starts with a line
>     that matches "m/\A=[a-zA-Z]/", and continues up to the next line that
>     matches "m/\A=cut/" or up to the end of the file if there is no
>     "m/\A=cut/" line.

I agree that’s too liberal. I suggest

    /\A=([a-zA-Z]+\d*)\b/
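
To make the difference concrete, here is a rough sketch that runs both patterns over a few sample lines. It uses Python's re module rather than Perl, purely for illustration. The ASCII-only flag, the helper name classify, and the sample lines are my assumptions, not anything taken from Pod::Simple.

```python
import re

# Pattern quoted from perlpodspec: "=" followed by any ASCII letter.
LIBERAL = re.compile(r"\A=[a-zA-Z]")

# Tighter pattern suggested above: letters, optional trailing digits,
# then a word boundary. re.ASCII is an assumption made for this sketch
# so that \d and \b only consider ASCII characters.
STRICT = re.compile(r"\A=([a-zA-Z]+\d*)\b", re.ASCII)


def classify(line):
    """Return (liberal_match, strict_match) for one line (hypothetical helper)."""
    return bool(LIBERAL.search(line)), bool(STRICT.search(line))


if __name__ == "__main__":
    samples = ["=head1 NAME", "=cut", "=h1e2 odd", "=Foo_bar", "=a\x00\x01"]
    for s in samples:
        print(repr(s), classify(s))
```

Under these assumptions the tighter pattern rejects lines such as "=h1e2 odd" and "=Foo_bar", which the liberal one accepts. A line such as "=a\x00\x01" still satisfies both, because \b matches between a letter and a NUL character.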

>> On the first pass the parser marks the line as pod (presumably matching
>> a directive)
>> but on the second pass the line doesn't match any patterns and it all
>> falls through as a paragraph.
>> 
>> This inconsistency allows binary data to be treated as a pod document.
>> Is there a recommended way to parse the pod out of a document that might
>> have binary data in it?
> 
> I don't know about this.

It seems to me that if the second match does not think it is Pod, then it 
should not be a paragraph (unless it was already in a pod section from a 
previous declaration). I suspect that if we tighten up the first regex as I 
suggest here, and sync the second with it, we should be okay. Thoughts?
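
A minimal sketch of that behavior, assuming a single shared pattern decides both whether a line opens a Pod block and whether it is a directive. It is written in Python for illustration only and is not how Pod::Simple is actually structured; the function name pod_lines and the list-of-lines input are hypothetical.

```python
import re

# One shared pattern for both decisions (assumed ASCII-only for this sketch).
DIRECTIVE = re.compile(r"\A=([a-zA-Z]+\d*)\b", re.ASCII)


def pod_lines(lines):
    """Keep only lines that belong to a Pod block (hypothetical helper)."""
    in_pod = False
    kept = []
    for line in lines:
        m = DIRECTIVE.search(line)
        if not in_pod:
            if m is None:
                # Not a directive and not inside Pod: ignore the line.
                continue
            in_pod = True
        # Inside a Pod block every line is kept, directive or paragraph.
        kept.append(line)
        if m is not None and m.group(1) == "cut":
            in_pod = False
    return kept
```

With this arrangement a line like "=Foo_bar" is skipped when it appears outside a Pod block, but is kept as an ordinary paragraph line when an earlier directive has already opened one.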

Best,

David
