[issue34155] email.utils.parseaddr mistakenly parse an email

2019-09-09 Thread STINNER Victor


STINNER Victor  added the comment:

I reopen the issue since Python 2.7 is still vulnerable.

--
resolution: fixed -> 
status: closed -> open

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-09-09 Thread Riccardo Schirone


Riccardo Schirone  added the comment:

CVE-2019-16056 has been assigned to this issue.
See https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-16056 .

--
nosy: +rschiron

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-09-06 Thread Larry Hastings


Larry Hastings  added the comment:

All PRs merged.  Thanks, everybody!

--
resolution:  -> fixed
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-09-06 Thread Larry Hastings


Larry Hastings  added the comment:


New changeset 063eba280a11d3c9a5dd9ee5abe4de640907951b by larryhastings 
(Abhilash Raj) in branch '3.5':
[3.5] bpo-34155: Dont parse domains containing @ (GH-13079) (#15317)
https://github.com/python/cpython/commit/063eba280a11d3c9a5dd9ee5abe4de640907951b


--
nosy: +larry

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-23 Thread Łukasz Langa

Łukasz Langa  added the comment:

Downgraded the severity since 3.6 - 3.9 are merged.

--
nosy: +lukasz.langa
priority: deferred blocker -> critical

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-19 Thread Abhilash Raj


Abhilash Raj  added the comment:

2.7 needs a separate PR since the code is very different and my familiarity 
with 2.7 version of email package is very limited. 

I am going to work on a separate patch later this week for 2.7.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-19 Thread STINNER Victor


STINNER Victor  added the comment:

> Created a backport PR for 3.5.

Thanks. I reviewed it (LGTM).

What about Python 2.7, it's also vulnerable, no?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-16 Thread Abhilash Raj


Abhilash Raj  added the comment:

Created a backport PR for 3.5.

--
stage: patch review -> resolved

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-16 Thread Abhilash Raj


Change by Abhilash Raj :


--
pull_requests: +15036
stage: resolved -> patch review
pull_request: https://github.com/python/cpython/pull/15317

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-16 Thread Kyle Stanley


Kyle Stanley  added the comment:

> This is already backported to 3.6. I am not sure about what gets backported 
> to 3.5 right now, I don't even see a 'Backport to 3.5' label on Github (which 
> made me think we are discouraged to backport to 3.5). I can work on a manual 
> backport if needed?

As far as I'm aware, backports to 3.5 have to be manually approved by those 
with repository management permissions, such the the organization owners 
(https://devguide.python.org/devcycle/#current-owners) and admins 
(https://devguide.python.org/devcycle/#current-administrators)

--
nosy: +aeros167

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-15 Thread Abhilash Raj


Abhilash Raj  added the comment:

@Victor: This is already backported to 3.6. I am not sure about what gets 
backported to 3.5 right now, I don't even see a 'Backport to 3.5' label on 
Github (which made me think we are discouraged to backport to 3.5). I can work 
on a manual backport if needed?

This patch most probably won't backport to 2.7 without re-writing it completely 
since the implementation in 2.7 is much different than what we have today.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-12 Thread STINNER Victor


STINNER Victor  added the comment:

This issue is a security issue so Python 2.7, 3.5, 3.6 should also be fixed, no?

--
status: closed -> open
versions: +Python 2.7, Python 3.5, Python 3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-12 Thread STINNER Victor


STINNER Victor  added the comment:

I change the issue type to security because of 
https://bugs.python.org/issue34155#msg340534: "Note that this bug was used in 
an actual security attack so it is serious".

--
type: behavior -> security

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-10 Thread Abhilash Raj


Abhilash Raj  added the comment:

Closing this since teh PRs are merged.

--
stage: patch review -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-09 Thread Ned Deily


Ned Deily  added the comment:


New changeset 13a19139b5e76175bc95294d54afc9425e4f36c9 by Ned Deily (Miss 
Islington (bot)) in branch '3.6':
bpo-34155: Dont parse domains containing @ (GH-13079) (GH-14826)
https://github.com/python/cpython/commit/13a19139b5e76175bc95294d54afc9425e4f36c9


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-09 Thread miss-islington


miss-islington  added the comment:


New changeset 217077440a6938a0b428f67cfef6e053c4f8673c by Miss Islington (bot) 
in branch '3.8':
bpo-34155: Dont parse domains containing @ (GH-13079)
https://github.com/python/cpython/commit/217077440a6938a0b428f67cfef6e053c4f8673c


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-08-09 Thread miss-islington


miss-islington  added the comment:


New changeset c48d606adcef395e59fd555496c42203b01dd3e8 by Miss Islington (bot) 
in branch '3.7':
bpo-34155: Dont parse domains containing @ (GH-13079)
https://github.com/python/cpython/commit/c48d606adcef395e59fd555496c42203b01dd3e8


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-07-17 Thread miss-islington


Change by miss-islington :


--
pull_requests: +14620
pull_request: https://github.com/python/cpython/pull/14826

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-07-17 Thread miss-islington


Change by miss-islington :


--
pull_requests: +14618
pull_request: https://github.com/python/cpython/pull/14824

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-07-17 Thread miss-islington


miss-islington  added the comment:


New changeset 8cb65d1381b027f0b09ee36bfed7f35bb4dec9a9 by Miss Islington (bot) 
(jpic) in branch 'master':
bpo-34155: Dont parse domains containing @ (GH-13079)
https://github.com/python/cpython/commit/8cb65d1381b027f0b09ee36bfed7f35bb4dec9a9


--
nosy: +miss-islington

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-07-17 Thread miss-islington


Change by miss-islington :


--
pull_requests: +14619
pull_request: https://github.com/python/cpython/pull/14825

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-07-03 Thread jpic


jpic  added the comment:

Thanks for the kind words Cyril, sorry that this patch doesn't address exactly 
the issue that you have described initially, but rather the security issue 
related to it.

The exception depending on the parsing issue is already supported by the new 
API, although it's just "Invalid Domain" for now. For user interfaces it would 
be nice to detail parse errors indeed. Again I wonder if this should be a 
separate issue.

Concerning the default behavior, @maxking will know but I would try to defend 
the "secure by default" paradigm if necessary, especially in the deprecated 
API. Meanwhile, I think it would create more value for Python to invest in 
feature development in the new API, that has a very nice private API but 
apparently lacks unit tests and documentation before becoming available to 
users.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-07-03 Thread Cyril Nicodème

Cyril Nicodème  added the comment:

This thread has been really interesting to follow, I'm glad to have opened it :)

I would agree with Barry here, it should follow the documentation.

BUT, I would suggest to add a "strict" parameter that would throw exceptions 
depending on the parsing issue (missing a @, having multiple @, etc).

That way, a basic usage would return the empty strings, letting the developer 
know the email is invalid, and advanced case would still be possible.

By default, I think having strict set to False would be logical, since it would 
follow the documentation.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-07-02 Thread Barry A. Warsaw


Barry A. Warsaw  added the comment:

I still think the only way to read the documentation for parseaddr('a@b@c') is 
to return ('', '') - a tuple of empty strings.

The documentations says:

"Returns a tuple of that information, unless the parse fails, in which case a 
2-tuple of ('', '') is returned."

Of course, it doesn't define exactly what a "failing parse" is, but I would 
claim that a non-RFC compliant address should fail to parse, at least for the 
parseaddr() interface.

I'm not concerned about inconsistencies between message_from_string() and 
parseaddr().  They are difference APIs.

I'll follow up on the PR, but does anybody disagree with that reasoning?

--
versions: +Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-06-03 Thread Abhilash Raj


Abhilash Raj  added the comment:

slight typo in the previous message:

s/fallback to `get_unstructured` /fallback to *something*/g

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-06-03 Thread Abhilash Raj


Abhilash Raj  added the comment:

I agree that we currently abandon parsing (raise `HeaderParseError`) when we 
encounter a unexpected token when parsing domain (expected token is 
dot-atom-text).

However, that mechanism is meant to signal the higher level parser that we 
should look for a different type of token. In case of domain, we don't fallback 
to anything.

I believe we should fallback to `get_unstructured` when we do encounter an 
invalid domain (either `foo.` or `foo@exaomple` or `f...@example.com`) and 
register defect. But, the `.domain` attribute on the address class should be 
None if the domain is invalid.

My proposed solution of `get_unstrucutured` is perhaps not a great idea either 
since we would end up parsing more than we should (maybe we should parse until 
`>`?) in case of AddrList or something.

I would love to know what David and Barry think about this one?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-06-03 Thread jpic


jpic  added the comment:

Thanks for your explanation, but in perspective with other invalid
domains, such as "foo." currently resulting in an empty string too:

>>> email.message_from_string('From: 
>>> a@foo.',policy=email.policy.default)['from'].addresses[0].domain
''

Do you think this should also change the value of domain to "foo." ?

Also yes with parseaddr it seems that domain is an empty string if it
didn't find a valid domain at all, which is pretty safe in case of
malicious injection attempt - if that's what we're trying to save
python programs from, a clarification of the objective of the patch
would be welcome.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-06-01 Thread Abhilash Raj


Abhilash Raj  added the comment:

I don't know if we can make the API consistent between parseaddr and the 
parsing header value since they are completely different even right now. Like 
you already noticed there is no way to register defects and instead parseaddr 
returns ('', '') to denote the failure to parse.

About parsing malicious domain, my line of thinking was along the lines of 
presenting whatever is there to user of the API, without 'hiding' that 
information. It would be harder to figure out the exception if the domain is 
missing. 

Even though the domain is parsed in the `domain` value, the value itself is 
clearly invalid. Any attempt to ever use that Address() will definitely cause 
an error (perhaps, there should be a sanity check in SMTP.send_message for 
that?).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-06-01 Thread jpic


jpic  added the comment:

The email API does error recovery without loading invalid domains into
the domain variable which could lead to dangerous situations, example
with "a@foo.":

>>> email.message_from_string('From: 
>>> a@foo.',policy=email.policy.default)['from'].addresses[0].domain
''

In perspective with the new patch proposed by maxking that lets an
invalid domain make it to the domain variable:

>>> email.message_from_string('From: 
>>> a@b...@c.com',policy=email.policy.default)['from'].addresses[0].domain
'b...@c.com'

For me maxking's suggestion opens the question of where to draw the
line between invalid domains should be loaded into the domain variable
and what invalid domains should not be loaded into the domain
variable.

Another smaller advantage of of Go's net/mail behaviour is that
results between parseaddr and email are consistently empty strings for
an invalid domain: parseaddr does not seem able to return a list of
defects.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-05-30 Thread Abhilash Raj


Abhilash Raj  added the comment:

How about we go a slightly different route than suggested by jpic and instead 
of returning a None value, we return the entire rest of the string as the 
domain? That would take care of the security issue since it won't be a valid 
domain anymore.


 msg = email.message_from_string(
'From: SomeAbhilashRaj ',
policy=email.policy.default)
 print(msg['From'].addresses)
 print(msg['From'].defects)

 (Address(display_name='SomeAbhilashRaj', username='abhilash', 
domain='malicious@important.com>'),)
 (InvalidHeaderDefect('invalid address in address-list'), 
InvalidHeaderDefect("missing trailing '>' on angle-addr"),  
InvalidHeaderDefect("unpected '@' in domain"), ObsoleteHeaderDefect("period in 
'phrase'"))


This lets us do postel-style error recovery while working in RFC 2822 style 
grammar. 

I wrote this patch to achieve this:


@@ -1573,6 +1574,11 @@ def get_domain(value):
 domain.append(DOT)
 token, value = get_atom(value[1:])
 domain.append(token)
+if value and value[0] == '@':
+domain.defects.append(errors.InvalidHeaderDefect(
+"unpected '@' in domain"))
+token = get_unstructured(value)
+domain.append(token)
 return domain, value

Does this makes sense?

--
nosy: +maxking

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-05-04 Thread jpic


jpic  added the comment:

At is allowed in the local part if quoted, the proposed patch acts within
get_domain() and getdomain() and does not affect local part parsing. This
still works:

>>> parseaddr('"fo@bar"@bar.com')
('', '"fo@bar"@bar.com')

>>> email.message_from_string('From: "a@b"@ex.com
',policy=email.policy.default)['from'].addresses
(Address(display_name='', username='a@b', domain='ex.com'),)

I'm not against raising an exception in parseaddr, but you should know that
currently nothing in parseaddr seems to raise an exception:

jpic@ci:~/cpython$ grep raise Lib/email/_parseaddr.py
jpic@ci:~/cpython$

For example:

>>> parseaddr('aoeu')
('', 'aoeu')
>>> parseaddr('a@')
('', 'a@')

None of the above calls raised an exception. That is the reason why I did
not introduce a new Exception in the getdomain() change: I thought it would
be more consistent with the rest of the API as such.

As for the new API, the patch does raise a parse error:

 # this detect that the next caracter right after the parsed domain is
another @
if value and value[0] == '@':
  raise errors.HeaderParseError('Multiple domains')

But that's in the lower level API that is planned for going public later on
(probably when it will have unit tests), it's just the higher level API
that the user faces that swallows it. As a user you can still know about
that parse problem using the defects attribute:

>>> email.message_from_string('From: a...@malicious.org@example.com',
policy=email.policy.default)['from'].defects[0]
InvalidHeaderDefect('invalid address in address-list')

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-05-03 Thread Windson Yang


Windson Yang  added the comment:

Frome the answer from Alnitak 
(https://stackoverflow.com/questions/12355858/how-many-symbol-can-be-in-an-email-address).
 Maybe we should raise an error when the address has multiple @ in it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-05-03 Thread jpic


jpic  added the comment:

The pull request has been updated to mimic net/mail's behavior rather than
trying to workaround user input.

Before:

>>> email.message_from_string('From: a...@malicious.org@important.com',
policy=email.policy.default)['from'].addresses
(Address(display_name='', username='a', domain='malicious.org'),)

>>> parseaddr('a...@malicious.org@important.com')
('', 'a...@malicious.org')

After:

>>> email.message_from_string('From: a...@malicious.org@important.com',
policy=email.policy.default)['from'].addresses
(Address(display_name='', username='', domain=''),)

>>> parseaddr('a...@malicious.org@important.com')
('', 'a@')

I like what I saw under the hood, please feel free to hack me for other
tasks in the email stdlib.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-05-03 Thread jpic


jpic  added the comment:

I haven't found this specific case in an RFC, but checked Go's net/mail
library behavior and it just considers it broken:

$ cat mail.go
package main
import "fmt"
import "net/mail"
func main() {
fmt.Println((&mail.AddressParser{}).Parse("a...@example.com"))
fmt.Println((&mail.AddressParser{}).Parse("a...@malicious.org@example.com
"))
}

$ go run mail.go
 
 mail: expected single address, got "@example.com"

That would fix the security issue but not the whole ticket.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-05-03 Thread Roundup Robot


Change by Roundup Robot :


--
keywords: +patch
pull_requests: +12994
stage:  -> patch review

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-05-02 Thread Barry A. Warsaw


Barry A. Warsaw  added the comment:

Well, at least we're not alone.  Here's a screen capture from Mail.app Version 
12.4 (3445.104.8).

--
Added file: https://bugs.python.org/file48295/Screen Shot 2019-05-02 at 
22.07.27.png

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-05-02 Thread Windson Yang


Windson Yang  added the comment:

I found the issue located in 
https://github.com/python/cpython/blob/master/Lib/email/_parseaddr.py#L277

elif self.field[self.pos] in '.@':
# email address is just an addrspec
# this isn't very efficient since we start over
self.pos = oldpos
self.commentlist = oldcl
addrspec = self.getaddrspec()
returnlist = [(SPACE.join(self.commentlist), addrspec)]

The parseaddr function runs a for in loop over the input string, when it meets 
'.@' it will do something. That is why when the input string is 
'f...@bar.com@example.com' will return ('', 'f...@bar.com'). One possible 
solution will be to check the string in the reverse order then we can always 
get the last '@' in the string.

--
nosy: +Windson Yang

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-05-02 Thread Ned Deily


Ned Deily  added the comment:

@barry, @r.david.murray, With the additional info about attacks in the wild, 
should we now consider this a security issue?  If so, someone needs to provide 
an actual PR.  (Raising the priority to "deferred blocker" pending evaluation.)

--
nosy: +ned.deily
priority: normal -> deferred blocker

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-04-29 Thread Jakub Wilk


Change by Jakub Wilk :


--
nosy:  -jwilk

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-04-29 Thread Dain Dwarf


Dain Dwarf  added the comment:

Hello, kind of new here.

I just wanted to note that the issue that lead to Tchap's security attack still 
exists in the non-deprecated message_from_string function:

email.message_from_string('From: a...@malicious.org@important.com', 
policy=email.policy.default)['from'].addresses

(Address(display_name='', username='a', domain='malicious.org'),)

So, deprecating parseaddr is not enough for security purpose, unless there is 
another ticket for the new email API.

--
nosy: +Dain Dwarf

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-04-26 Thread Jamesie Pic


Jamesie Pic  added the comment:

Given the situation, could raising a SecurityWarning and a DeprecationWarning 
fix this issue ?

--
nosy: +Jamesie Pic

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-04-23 Thread STINNER Victor


Change by STINNER Victor :


--
nosy: +vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-04-23 Thread STINNER Victor


Change by STINNER Victor :


--
nosy:  -vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-04-19 Thread Nicolas Évrard

Change by Nicolas Évrard :


--
nosy: +nicoe

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-04-19 Thread Karthikeyan Singaravelan


Karthikeyan Singaravelan  added the comment:

Relevant attack from matrix blog post.

https://matrix.org/blog/2019/04/18/security-update-sydent-1-0-2/

> sydent uses python's email.utils.parseaddr function to parse the input email 
> address before sending validation mail to it, but it turns out that if you 
> hand parseaddr an malformed email address of form a...@b.com@c.com, it 
> silently discards the @c.com prefix without error. The result of this is that 
> if one requested a validation token for 'a...@malicious.org@important.com', 
> the token would be sent to 'a...@malicious.org', but the address 
> 'a...@malicious.org@important.com' would be marked as validated. This release 
> fixes this behaviour by asserting that the parsed email address is the same 
> as the input email address.

I am marking this as a security issue.

--
keywords: +security_issue
nosy: +vstinner

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2019-04-19 Thread Stéphane Bortzmeyer

Stéphane Bortzmeyer  added the comment:

Note that this bug was used in an actual security attack so it is serious

https://medium.com/@fs0c131y/tchap-the-super-not-secure-app-of-the-french-government-84b31517d144
https://twitter.com/fs0c131y/status/1119143946687434753

--
nosy: +bortzmeyer

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-11-08 Thread Kal Sze


Kal Sze  added the comment:

Another failure case:

>>> from email.utils import parseaddr
>>> parseaddr('fo@o...@bar.com')
('', 'fo@o')

If I understand the RFC correctly, the correct results should be ('', '') 
because there are two '@' signs. The first '@' would need to be quoted for the 
address to be valid.

--
nosy: +Kal Sze2

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-11-06 Thread Karthikeyan Singaravelan


Karthikeyan Singaravelan  added the comment:

Ah sorry, I was typing so long and had an idle session that I didn't realize 
@r.david.murray added a comment with the explanation. Just to add I tried using 
Perl module (https://metacpan.org/release/Email-Address) that uses regex for 
parsing that returns me two addresses and the regex is also not much 
comprehensible.

use v5.14;
use Email::Address;

my $line = 'John Doe j...@example.com ';
my @addresses = Email::Address->parse($line);
say $addresses[0];
say $addresses[1];

say "Angle address regex";
say $Email::Address::angle_addr;


j...@example.com
ot...@example.net
Angle address regex
(?^:(?^:(?^:\s*\((?:\s*(?^:(?^:(?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*<(?^:(?^:(?^:(?^:(?^:\s*\((?:\s*(?^:(?^:(?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*(?^:[^\x00-\x1F\x7F()<>\[\]:;@\\,."\s]+(?:\.[^\x00-\x1F\x7F()<>\[\]:;@\\,."\s]+)*)(?^:(?^:\s*\((?:\s*(?^:(?^:(?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*)|(?^:(?^:(?^:\s*\((?:\s*(?^:(?^:(?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*"(?^:(?^:[^\\"])|(?^:\\(?^:[^\x0A\x0D])))*"(?^:(?^:\s*\((?:\s*(?^:(?^:(?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*))\@(?^:(?^:(?^:(?^:\s*\((?:\s*(?^:(?^:(?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*(?^:[^\x00-\x1F\x7F()<>\[\]:;@\\,."\s]+(?:\.[^\x00-\x1F\x7F()<>\[\]:;@\\,."\s]+)*)(?^:(?^:\s*\((?:\s*(?^:(?^:(?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*)|(?^:(?^:(?^:\s*\((?:\s*(?^:(?^:(?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*\[(?:\s*(?^:(?^:[^\[\]\\])|(?^:\\(?^:[^\x0A\x0D]*\s*\](?^:(?^:\s*\((?:\s*(?^:(?^:(
 
?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*)))>(?^:(?^:\s*\((?:\s*(?^:(?^:(?>[^()\\]+))|(?^:\\(?^:[^\x0A\x0D]))|))*\s*\)\s*)|\s+)*)


Thanks

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-11-06 Thread Mark Sapiro


Mark Sapiro  added the comment:

I agree that my example with an @ in the 'display name', although actually seen 
in the wild, is non-compliant, and that the behavior of parseaddr() in this 
case is not a bug.

Sorry for the noise.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-11-06 Thread Karthikeyan Singaravelan


Karthikeyan Singaravelan  added the comment:

Is this a case of realname having @ inside an unquoted string? As I can see 
from the RFC the acceptable characters of an atom other than alphabets and 
digits that comprises a phrase are ['!', '#', '$', '%', '&', "'", '*', '+', 
'-', '/', '=', '?', '^', '_', '`', '{', '|', '}', '~'] . So just curious if 
it's a case of @ inside unquoted string as name?

>>> for char in accepted:
... print(parseaddr(f'John Doe jdoe{char}example.com '))
...
('John Doe jdoe!example.com', 'ot...@example.net')
('John Doe jdoe#example.com', 'ot...@example.net')
('John Doe jdoe$example.com', 'ot...@example.net')
('John Doe jdoe%example.com', 'ot...@example.net')
('John Doe jdoe&example.com', 'ot...@example.net')
("John Doe jdoe'example.com", 'ot...@example.net')
('John Doe jdoe*example.com', 'ot...@example.net')
('John Doe jdoe+example.com', 'ot...@example.net')
('John Doe jdoe-example.com', 'ot...@example.net')
('John Doe jdoe/example.com', 'ot...@example.net')
('John Doe jdoe=example.com', 'ot...@example.net')
('John Doe jdoe?example.com', 'ot...@example.net')
('John Doe jdoe^example.com', 'ot...@example.net')
('John Doe jdoe_example.com', 'ot...@example.net')
('John Doe jdoe`example.com', 'ot...@example.net')
('John Doe jdoe{example.com', 'ot...@example.net')
('John Doe jdoe|example.com', 'ot...@example.net')
('John Doe jdoe}example.com', 'ot...@example.net')
('John Doe jdoe~example.com', 'ot...@example.net')

>>> parseaddr('"John Doe j...@example.com" ')
('John Doe j...@example.com', 'ot...@example.net')

>>> parseaddr('John Doe j...@example.com ')
('', 'John Doe j...@example.com')

--
nosy: +xtreak

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-11-06 Thread R. David Murray


R. David Murray  added the comment:

The formatting of that doctest paragraph got messed up.  Let me try again:

>>> m = message_from_string("From: John Doe j...@example.com 
\n\n", policy=default)
>>> m['From'].addresses
(Address(display_name='', username='John Doe jdoe', domain='example.com'),)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-11-06 Thread R. David Murray


R. David Murray  added the comment:

>>> m = message_from_string("From: John Doe j...@example.com 
>>> \n\n", policy=default)
>>> m['From'].addresses(Address(display_name='', username='John Doe jdoe', 
domain='example.com'),)

The new policies have more error recovery for non-RFC compliant addresses than 
decode_header, but the two agree in this case.  What is happening here is that 
(1) an unquoted/unencoded '@' is not allowed in a display name (2) if the 
address is not '<>' quoted, then everything before the @ is the username and 
(3) in the absence of a comma after the end of the fqdn (which is not allowed 
to contain blanks) any additional tokens are discarded.

One could argue that we could treat the blank after the FQDN as a "missing 
comma", and there would be some merit to that argument.  You could also argue 
that a "<>" quoted string would trump the occurrence of the @ earlier in the 
token list.  However, the RFC822 grammar is designed to be parsed character by 
character, so that would not be a typical way for an RFC822 parser to try to do 
postel-style error recovery.

So, I don't think there is a bug here, but I'd be curious what other email 
address parsing libraries do, and that could influence whether extensions to 
the "make a guess when the string doesn't conform to the RFC" code would be 
acceptable.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-11-06 Thread Mark Sapiro


Mark Sapiro  added the comment:

The issue is illustrated much more simply as follows:

email.utils.parseaddr('John Doe j...@example.com ')

returns

('', 'John Doe j...@example.com')

whereas it should return

('John Doe j...@example.com', 'ot...@example.net')

I'll look at developing a patch.

--
nosy: +msapiro

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-07-19 Thread Jakub Wilk


Jakub Wilk  added the comment:

You should not use decode_header() on the whole From header, because that loses
information. You should parse the header first, then decode the parts that
could be RFC2047-encoded.

Quoting :

> NOTE: Decoding and display of encoded-words occurs *after* a
> structured field body is parsed into tokens.  It is therefore
> possible to hide 'special' characters in encoded-words which, when
> displayed, will be indistinguishable from 'special' characters in the
> surrounding text.  For this and other reasons, it is NOT generally
> possible to translate a message header containing 'encoded-word's to
> an unencoded form which can be parsed by an RFC 822 mail reader.

So I don't see a bug in parseaddr() here, except that the API is a bit of a
footgun.

--
nosy: +jwilk

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-07-19 Thread R. David Murray


R. David Murray  added the comment:

Ah, maybe it doesn't handle it completely correctly; that decode looks 
different now that I look at it in detail.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-07-19 Thread R. David Murray


R. David Murray  added the comment:

Oops, I left out a step in that cut and paste.  For completeness:

>>> x = x[3:]

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-07-19 Thread R. David Murray

R. David Murray  added the comment:

That does appear to be a bug.  Note that the new email API handles it correctly:

>>> x = """
... > From: =?utf-8?Q?z...@redacted.com.cn=E3=82=86=E2=86=91=E3=82=86?=
...  =?utf-8?Q?=E3=82=83=E3=82=85=E3=81=87=E3=81=BA=E3=81=BD=E3=81=BC"\=E3?=
...  =?utf-8?Q?=81=A9=E3=81=A5=E3=81=A2l=E3=81=A0=E3=81=B0=E3=81=A8=E3=81?=
...  =?utf-8?Q?=8FKL=E3=81=84=E3=82=8C=E3=82=8B=E3=82=86>KL=E3=82=89JF?=
...  
... """
>>> from email import message_from_string
>>> from email.policy import default
>>> m = message_from_string(x+'\n\ntest', policy=default)
>>> m['from']
'"z...@redacted.com.cnゆ↑ゆ ゃゅぇぺぽぼ\\"� ��づぢlだばと� �KLいれるゆ>KLらJF" 
'
>>> m['from'].addresses[0].addr_spec
'm...@redacted2.com'
>>> m['from'].addresses[0].display_name
'z...@redacted.com.cnゆ↑ゆ ゃゅぇぺぽぼ"\\\udce3 \udc81\udca9づぢlだばと\udce3\udc81 
\udc8fKLいれるゆ>KLらJF'

I'm not particularly interested myself in fixing parseaddr to handle this case 
correctly, since it is the legacy API, but if someone else wants to I'll review 
the patch.

--
versions: +Python 3.7, Python 3.8 -Python 3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue34155] email.utils.parseaddr mistakenly parse an email

2018-07-19 Thread Cyril Nicodème

New submission from Cyril Nicodème :

Hi!

I'm trying to parse some emails, and I discovered that email.utils.parseaddr 
wrongly parse an email.

Here's the corresponding header:

> From: =?utf-8?Q?z...@redacted.com.cn=E3=82=86=E2=86=91=E3=82=86?=
 =?utf-8?Q?=E3=82=83=E3=82=85=E3=81=87=E3=81=BA=E3=81=BD=E3=81=BC"\=E3?=
 =?utf-8?Q?=81=A9=E3=81=A5=E3=81=A2l=E3=81=A0=E3=81=B0=E3=81=A8=E3=81?=
 =?utf-8?Q?=8FKL=E3=81=84=E3=82=8C=E3=82=8B=E3=82=86>KL=E3=82=89JF?=
 

Once this has been parsed via `decode_header`, we obtain this value:

> From: z...@redacted.com.cnゆ↑ゆゃゅぇぺぽぼ"\どづぢlだばとくKLいれるゆ>KLらJF 

(I agree, not really a nice looking From email ...)

Then, when this value is given to parseaddr, here's the result:

> ('', 'z...@redacted.com.cnゆ↑ゆゃゅぇぺぽぼ')

But it should be:

> ('z...@redacted.com.cnゆ↑ゆゃゅぇぺぽぼ"\どづぢlだばとくKLいれるゆ>KLらJF', 'm...@redacted2.com')

(Note that the email in the "name" part is not the same as the email in the 
"email" part!)

--
components: email
messages: 321956
nosy: Cyril Nicodème, barry, r.david.murray
priority: normal
severity: normal
status: open
title: email.utils.parseaddr mistakenly parse an email
type: behavior
versions: Python 3.6

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com