[Python-ideas] Re: Pickle security improvements

Chris Angelico Wed, 15 Jul 2020 05:17:49 -0700

On Wed, Jul 15, 2020 at 9:37 PM Steven D'Aprano <st...@pearwood.info> wrote:
>
> On Wed, Jul 15, 2020 at 11:24:17AM +1000, Chris Angelico wrote:
> > So if you're distributing your code, then maybe you don't use pickle.
>
> Sure. What do I use to serialise my complex data structure? I guess I
> could write out the repr and then call eval on it, that should be
> fine... *wink*


Maybe don't HAVE an arbitrarily complex data structure for
serialization. Maybe have a way to turn the in-memory representation
into a much simpler structure, serialize that, and then load from your
saved form.

It'll make your code a lot easier to reason about and refactor, since
you're no longer intrinsically binding your code to your save format.
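
A minimal sketch of that idea, for illustration only. The Inventory
class, its fields, and the "version" key are all hypothetical names
made up for this example; the point is just that the save format is a
plain dict of strings and ints, independent of the class layout.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Inventory:
    """Hypothetical in-memory object; not from any real application."""
    owner: str
    counts: dict = field(default_factory=dict)

    def to_plain(self):
        # Reduce the object to JSON-friendly primitives only.
        return {"version": 1, "owner": self.owner, "counts": dict(self.counts)}

    @classmethod
    def from_plain(cls, data):
        # Validate the saved form explicitly instead of trusting it.
        if not isinstance(data, dict) or data.get("version") != 1:
            raise ValueError("unsupported save format")
        counts = {str(k): int(v) for k, v in data["counts"].items()}
        return cls(owner=str(data["owner"]), counts=counts)


def save(inventory):
    return json.dumps(inventory.to_plain())


def load(text):
    return Inventory.from_plain(json.loads(text))
```

Renaming an attribute or moving the class to another module then only
touches to_plain() and from_plain(), not the files already on disk.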

> I'm not a pickle expert, but I don't think that's quite right. pickle
> has to be able to execute arbitrary code in order to be able to
> de-serialise arbitrary pickles, but that doesn't mean it has to
> de-serialise arbitrary pickles if you aren't expecting arbitrary
> pickles.
>
> Random beat it to me by suggesting a white-list, but I was thinking the
> same way. The pickle protocol has to be able to deal with arbitrary
> instances, but very few apps using pickle need to, or want to, accept
> arbitrary instances. If my app serialised Widgets and Gadgets, then it
> ought to be an error to attempt to deserialise anything else.
>
> Then all I need do is ensure that the Widget and Gadget classes are
> secure, not the entire Python universe :-)

If that's what you want, then have a way to serialize Widgets and
Gadgets, and *not* a way to serialize arbitrary objects. That, to me,
sounds more like "enhanced JSON" than "magically safe pickle".
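
One way that could look, as a rough sketch rather than a proposal: JSON
with a type tag, where the decoder knows about exactly two classes. The
Widget and Gadget definitions, their fields, and the "type" key are
assumptions invented for this example.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Widget:
    name: str
    size: int


@dataclass
class Gadget:
    label: str
    widgets: list = field(default_factory=list)


def encode(obj):
    # Only the two known classes can be written out.
    if isinstance(obj, Widget):
        return {"type": "Widget", "name": obj.name, "size": obj.size}
    if isinstance(obj, Gadget):
        return {"type": "Gadget", "label": obj.label,
                "widgets": [encode(w) for w in obj.widgets]}
    raise TypeError(f"cannot serialize {type(obj).__name__}")


def decode(data):
    # The tag selects a branch; it is never used to look up a name.
    if not isinstance(data, dict):
        raise ValueError("expected a tagged object")
    kind = data.get("type")
    if kind == "Widget":
        return Widget(name=str(data["name"]), size=int(data["size"]))
    if kind == "Gadget":
        return Gadget(label=str(data["label"]),
                      widgets=[decode(w) for w in data["widgets"]])
    raise ValueError(f"unexpected type tag: {kind!r}")


def dumps(obj):
    return json.dumps(encode(obj))


def loads(text):
    return decode(json.loads(text))
```

Anything that is not a Widget or a Gadget is an error on both the
writing and the reading side, and no code path imports or calls
something named by the input.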

> Security is always about tradeoffs, and we shouldn't let the idea of
> some unattainable perfectly secure pickle get in the way of improving
> the safety of pickle.

Nor should we let the idea of a secure pickle get in the way of
improving the functionality of safer options.

> > If someone claims they've created a way to allow untrusted users to
> > insert code into your Python programs and have it execute, but they've
> > made it safe, would you oppose its inclusion in the stdlib?
>
> But that's not really what we're asking for. We're asking for a way to
> *avoid* executing arbitrary code, while still allowing *trusted* objects
> to be depickled.

Except that you are. It's equivalent to trying to create a safe
version of eval() instead of building a simple arithmetic expression
parser. You're starting from danger and trying to patch until it's
safe, instead of starting from safety and adding functionality until
it's usable.
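
To make the comparison concrete, here is a small sketch of the
"start from safety" side. It leans on the ast module to do the parsing
instead of a hand-written parser, which is a shortcut taken for this
example; the evaluate() name and the choice of operators are
assumptions. Nothing is ever passed to eval().

```python
import ast
import operator

# The complete list of what the evaluator understands.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate(expression):
    """Evaluate +, -, *, / on int and float literals, nothing else."""
    tree = ast.parse(expression, mode="eval")
    return _walk(tree.body)


def _walk(node):
    # Numeric literals only; bool, str, etc. fall through to the error.
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_walk(node.left), _walk(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_walk(node.operand))
    # Names, calls, attribute access and everything else are rejected.
    raise ValueError(f"unsupported expression element: {type(node).__name__}")
```

Adding a feature means adding an entry to a table. Exponentiation is
left out on purpose, since something like 9**9**9 would be a cheap way
to tie up the process.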

Remember: If you have insufficient functionality, you'll know about
it; if you are insufficiently secure, you won't know till it's too
late.

> > You want "pickle but magically able to know what's safe and what's
> > not"?
>
> Of course not. But maybe I want to be able to tell pickle what I think
> is safe, and have everything else fail.
>

That's fair, but are you actually guaranteeing that it will never read
arbitrary attributes from objects? Can pickle grab a module or
function, pick up a dunder from it, and go to town? Are you able to
give a total 100% guarantee that it cannot? If not, how do you know
that it's safe?
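
For reference, the "tell pickle what is safe" idea is normally written
by overriding Unpickler.find_class(), along the lines of the
"Restricting Globals" section of the pickle documentation. A minimal
sketch, where the ALLOWED set and the restricted_loads() name are
assumptions of this example. It limits which globals can be looked up
by name; it does not by itself answer the questions above.

```python
import io
import pickle

# Illustrative allow-list of (module, name) pairs.
ALLOWED = {("builtins", "range"), ("builtins", "complex")}


class AllowListUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global by name.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def restricted_loads(data):
    """Like pickle.loads(), but refusing globals outside ALLOWED."""
    return AllowListUnpickler(io.BytesIO(data)).load()
```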

Edwin has given further information on the inherently unsafe nature of
pickle. It should be used for trusted pickles, NOT as a basis for some
magical "safe" parser.
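
A harmless demonstration of why: an object's __reduce__() can name any
importable callable plus arguments, and the default unpickler will call
it on load. The Innocent class is made up for this example, and sorted()
stands in for something far less friendly.

```python
import pickle


class Innocent:
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call sorted([3, 1, 2])".
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Innocent())

# Loading does not produce an Innocent instance; it calls sorted().
result = pickle.loads(payload)
print(result)  # [1, 2, 3]
```

The bytes in payload never mention Innocent at all. Whoever controls
the pickle chooses the callable, not whoever wrote the loading code.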

ChrisA
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZDOJSZJOPKVY2BSQJZY6XVHLQ7ZLIQBY/
Code of Conduct: http://python.org/psf/codeofconduct/
