
Re: Make a unique filesystem path, without creating the file

2016-02-28 Thread Cameron Simpson


On 29Feb2016 00:47, Alan Bawden wrote:
> Cameron Simpson writes:
>> On 22Feb2016 12:34, Alan Bawden wrote:
>
> I have deleted the part of discussion where it seems that we must simply
> agree to disagree.  You think mktemp() is _way_ more dangerous than I
> do.

I certainly think the habit of using it is. And thus we're off into the realm 
of risk assessment I suppose, where one's value sets greatly affect the 
outcome. But there are concrete arguments to be made about risks.

To your other question...

> [...]
> In fact, mkstemp() also performs that same generate-and-open loop, and of
> course it is careful to use os.O_EXCL along with os.O_CREAT when it
> opens the file.  So let me re-state my argument using mkstemp() instead:
>
> If the code I wrote in my original message is "unsafe" because some
> _other_ process might be using mktemp() badly and stumble over the same
> path, then the current implementation of tempfile.mkstemp() is also
> "unsafe" for exactly the same reason: some other process badly using
> mktemp() to create its own file might accidentally grab the same file.
>
> In other words, if that other process does:
>
>   path = mktemp()
>   tmpfp = open(path, "w")
>
> Then yes indeed, it might accidentally grab my fifo when I used my
> original code for making a temporary fifo.  But it might _also_ succeed
> in grabbing any temporary files I make using tempfile.mkstemp()!  So if
> you think what I wrote is "unsafe", it seems that you must conclude that
> the standard tempfile.mkstemp() is exactly as "unsafe".
>
> So is that what you think?


Yes and no?

You're quite right that a task using mkstemp is not safe against a task 
misusing mktemp.


_However_:

In a space where everyone uses mktemp, everyone is unsafe from collision.

In a space where everyone uses mkstemp, everyone is safe from collision.

So provided everyone "upgrades", safety is reliable without any added burden in 
program complexity.
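
For reference, a minimal sketch of the mkstemp pattern being recommended here. The prefix, suffix and file contents are arbitrary values chosen for illustration, not anything from the thread:

```python
import os
import tempfile

# mkstemp() creates the file itself (using O_CREAT | O_EXCL) and returns an
# already-open descriptor, so there is no gap between choosing the name and
# creating the file.
fd, path = tempfile.mkstemp(prefix="demo-", suffix=".tmp")
try:
    # Wrap the descriptor rather than reopening the path by name.
    with os.fdopen(fd, "w") as fp:
        fp.write("scratch data\n")
    with open(path) as fp:
        contents = fp.read()
finally:
    # mkstemp() never deletes the file for you.
    os.unlink(path)
```

The caller is responsible for removing the file afterwards, which is the only extra burden compared with mktemp.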


Of course, that sidesteps the scenario where someone is using mktemp to obtain 
a pathname for a non-file, but I am of the opinion that in almost all such 
cases the programmer is better off using mkdtemp and making their non-file 
inside the temporary directory. Again, provided everyone "upgrades" to such a 
practice, safety is arranged.
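
A rough sketch of that approach for the fifo case, assuming a POSIX system (os.mkfifo is not available on Windows). The directory prefix and the name "pipe" are made up for the example:

```python
import os
import stat
import tempfile

# mkdtemp() creates a fresh directory accessible only to the creating user,
# so names chosen inside it cannot collide with other users' temp files.
tmpdir = tempfile.mkdtemp(prefix="fifo-demo-")
fifo_path = os.path.join(tmpdir, "pipe")  # any fixed name will do in here
try:
    os.mkfifo(fifo_path, 0o600)  # POSIX only
    is_fifo = stat.S_ISFIFO(os.stat(fifo_path).st_mode)
finally:
    # Clean up both the fifo and the private directory.
    if os.path.exists(fifo_path):
        os.unlink(fifo_path)
    os.rmdir(tmpdir)
```

Because the directory is private, the name of the object inside it no longer needs to be unpredictable.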


Because of this, I think that _any_ use of mktemp invites risk of collision, 
and needs to be justified with a robust argument establishing that the problem 
cannot be solved with mkstemp or mkdtemp.


Your example was not such a case. Ben's is, in that (a) he needs a "valid" name 
and (b) he isn't going to make an actual filesystem object using the name 
obtained. As it happens it looks like the uuid generation functions from the 
stdlib may meet his needs, addressing his desire to do it simply with the 
stdlib instead of making his own wheel.
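
Something along these lines is what I have in mind for the uuid route. The helper name unique_path and its parameters are hypothetical, invented for this sketch, and it is only appropriate when the name is never used to create a file in a shared directory:

```python
import os
import tempfile
import uuid

def unique_path(directory=None, prefix="", suffix=""):
    """Return a path that is very unlikely to exist; nothing is created."""
    if directory is None:
        directory = tempfile.gettempdir()
    # uuid4() is random, so the 32 hex digits are effectively unique.
    return os.path.join(directory, prefix + uuid.uuid4().hex + suffix)

name = unique_path(prefix="test-", suffix=".dat")
```

If the path is later created on a shared filesystem, the same race that affects mktemp applies to this too.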


So I remain against mktemp without an outstandingly special use case.

Cheers,
Cameron Simpson 
--
https://mail.python.org/mailman/listinfo/python-list

Re: Make a unique filesystem path, without creating the file

2016-02-28 Thread Alan Bawden

Cameron Simpson  writes:
> On 22Feb2016 12:34, Alan Bawden  wrote:

I have deleted the part of discussion where it seems that we must simply
agree to disagree.  You think mktemp() is _way_ more dangerous than I
do.

>>> In fact your use case isn't safe, because _another_ task using mktemp
>>> in conflict as a plain old temporary file may grab your fifo.
>>
>>But here in very last sentence I really must disagree.  If the code I
>>wrote above is "unsafe" because some _other_ process might be using
>>mktemp() badly and stumble over the same path, then the current
>>implementation of tempfile.mkdtemp() is also "unsafe" for exactly the
>>same reason: some other process using mktemp() badly to create its own
>>directory might accidentally grab the same directory.
>
> When the other task goes mkdir with the generated name it will fail, so no.

Quite right.  I sabotaged my own argument by picking mkdtemp() instead
of mkstemp().  I was trying to shorten my text by taking advantage of
the fact that I had _already_ mentioned that mkdtemp() performs exactly
the same generate-and-open loop as the code I had written.  I
apologize for the confusion.

In fact, mkstemp() also performs that same generate-and-open loop, and of
course it is careful to use os.O_EXCL along with os.O_CREAT when it
opens the file.  So let me re-state my argument using mkstemp() instead:

If the code I wrote in my original message is "unsafe" because some
_other_ process might be using mktemp() badly and stumble over the same
path, then the current implementation of tempfile.mkstemp() is also
"unsafe" for exactly the same reason: some other process badly using
mktemp() to create its own file might accidentally grab the same file.

In other words, if that other process does:

  path = mktemp()
  tmpfp = open(path, "w")

Then yes indeed, it might accidentally grab my fifo when I used my
original code for making a temporary fifo.  But it might _also_ succeed
in grabbing any temporary files I make using tempfile.mkstemp()!  So if
you think what I wrote is "unsafe", it seems that you must conclude that
the standard tempfile.mkstemp() is exactly as "unsafe".  So is that what
you think?
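
To make the scenario concrete, here is a small sketch. It simulates the "other process" in the same script by simply reusing the path, which stands in for an unlucky name collision:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()  # our carefully created temporary file
os.write(fd, b"mine")

# A careless second party that happened to generate the same name:
with open(path, "w") as other:  # plain open() succeeds and truncates
    other.write("theirs")

# The same attempt made with O_CREAT | O_EXCL is refused instead.
try:
    os.close(os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600))
    exclusive_open_failed = False
except FileExistsError:
    exclusive_open_failed = True

os.close(fd)
with open(path) as fp:
    survivor = fp.read()  # what is left in "our" file
os.unlink(path)
```

The O_EXCL flag protects whoever uses it from clobbering an existing file; it does nothing to stop a later opener that omits it.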

-- 
Alan Bawden

Re: Make a unique filesystem path, without creating the file

2016-02-25 Thread Random832

On Tue, Feb 23, 2016, at 03:22, Paul Rubin wrote:
> Thanks.  It would be nice if those were gatewayed to usenet like this
> group is.  I can't bring myself to subscribe to mailing lists.

Have you tried gmane?

Re: Make a unique filesystem path, without creating the file

2016-02-25 Thread Jon Ribbens

On 2016-02-25, Steven D'Aprano  wrote:
> The links already provided go through the evidence. For example, they 
> explain that /dev/random and /dev/urandom both use the exact same CSPRNG. If 
> you don't believe that, you can actually read the source to Linux, FreeBSD, 
> OpenBSD and NetBSD. (But not OS X, sorry.)

Actually yes OS X:

http://www.opensource.apple.com/source/xnu/xnu-3248.20.55/bsd/dev/random/
http://www.opensource.apple.com/source/xnu/xnu-3248.20.55/osfmk/prng/

Re: Make a unique filesystem path, without creating the file

2016-02-25 Thread Steven D'Aprano

On Thursday 25 February 2016 17:54, Marko Rauhamaa wrote:

> Steven D'Aprano :
> 
>> On Wednesday 24 February 2016 18:20, Marko Rauhamaa wrote:
>>> Steven D'Aprano :
>>>> And that is where you repeat something which is rank superstition.
>>> 
>>> Can you find info to back that up.
>>
>> The links already provided go through the evidence. For example, they
>> explain that /dev/random and /dev/urandom both use the exact same
>> CSPRNG.
> 
> A non-issue. The question is, after the initial entropy is collected and
> used to seed the CSPRNG, is any further entropy needed for any
> cryptographic purposes?

The short answer: "yes".

The long answer: "probably not, but it can't hurt".

The longer answer: "probably not, and it usually won't hurt, but it could".

If, somehow, an attacker manages to work out the state of your CSPRNG, 
including the entropy pool, then they can predict what values you get until 
they no longer know the state of the CSPRNG. The idea is that if, somehow, 
somebody knows the current state of the CSPRNG (including the entropy pool), 
but can't influence what future values go into the entropy pool, then they 
will only be able to predict the output values for a short time.

But it's hard to think of any actual attack where somebody can see what's in 
the entropy pool but can't influence the values going into it. It seems to 
me that this is an unrealistic attack:

"Assume that you're kidnapped by somebody with no arms or legs..."

The conventional wisdom is that adding poor sources of entropy into the pool 
will never hurt, but that is actually wrong. If an attacker knows what is in 
the entropy pool, and can craft the values going in, they can force the 
CSPRNG to return more predictable values. So sometimes adding more entropy 
can hurt. And it usually won't help. It *might* help if your system is 
compromised, but if so, it's not really clear how the attacker has 
compromised your current entropy pool but not the future ones.

> Are there any nagging fears that weaknesses
> could be found in the deterministic sequence?

Of course there are. Nobody really knows what capabilities the NSA have, but 
they almost surely aren't *that* advanced. CSPRNGs are subject to much the 
same sort of issues as other crypto, such as hash functions:

http://valerieaurora.org/hash.html

and encryption algorithms. (The main real difference between a hash function 
and encryption algorithm is that hashes don't have to be reversible.)

Expect the current crop of CSPRNGs (Yarrow, AES, whatever Linux uses) to be 
replaced long before there is a proven attack on them.

> /dev/random is supposed to be hardened against such concerns by stirring
> the pot constantly (if rather slowly).

As is /dev/urandom.

> Here's what Linus Torvalds said on the matter years back:
> 
>    > No, it says /dev/random is primarily useful for generating large
>    > (>>160 bit) keys.
> 
>    Which is exactly what something like sshd would want to use for
>    generating keys for the machine, right? That is _the_ primary reason
>    to use /dev/random.
> 
>    Yet apparently our /dev/random has been too conservative to be
>    actually useful, because (as you point out somewhere else) even sshd
>    uses /dev/urandom for the host key generation by default.
> 
>    That is really sad. That is the _one_ application that is common and
>    that should really have a reason to maybe care about /dev/random vs
>    urandom. And that application uses urandom. To me that says that
>    /dev/random has turned out to be less than useful in real life.
> 
>    Is there anything that actually uses /dev/random at all (except for
>    clueless programs that really don't need to)?

Most other Unixes have decided that /dev/random is unnecessary, and urandom 
is the right thing to do. SSH uses urandom by default, but allows the 
paranoid/clueless to use /dev/random if they insist. 

-- 
Steve


Re: Make a unique filesystem path, without creating the file

2016-02-24 Thread Marko Rauhamaa

Steven D'Aprano :

> On Wednesday 24 February 2016 18:20, Marko Rauhamaa wrote:
>> Steven D'Aprano :
>>> And that is where you repeat something which is rank superstition.
>> 
>> Can you find info to back that up. 
>
> The links already provided go through the evidence. For example, they 
> explain that /dev/random and /dev/urandom both use the exact same
> CSPRNG.

A non-issue. The question is, after the initial entropy is collected and
used to seed the CSPRNG, is any further entropy needed for any
cryptographic purposes? Are there any nagging fears that weaknesses
could be found in the deterministic sequence?

/dev/random is supposed to be hardened against such concerns by stirring
the pot constantly (if rather slowly).

Here's what Linus Torvalds said on the matter years back:

   > No, it says /dev/random is primarily useful for generating large
   > (>>160 bit) keys.

   Which is exactly what something like sshd would want to use for
   generating keys for the machine, right? That is _the_ primary reason
   to use /dev/random.

   Yet apparently our /dev/random has been too conservative to be
   actually useful, because (as you point out somewhere else) even sshd
   uses /dev/urandom for the host key generation by default.

   That is really sad. That is the _one_ application that is common and
   that should really have a reason to maybe care about /dev/random vs
   urandom. And that application uses urandom. To me that says that
   /dev/random has turned out to be less than useful in real life.

   Is there anything that actually uses /dev/random at all (except for
   clueless programs that really don't need to)?

   <http://article.gmane.org/gmane.linux.kernel/47437>

> If you don't trust the CSPRNG, then you shouldn't trust it whether it comes 
> from /dev/random or /dev/urandom. If you do trust it, then why would you 
> want it to block? Blocking doesn't make it more random.

It might not make it more secure cryptographically, but the point is
that it should make it more genuinely random.

> That's not how it works. It just makes you vulnerable to a Denial Of
> Service attack.

Understood. You should not use /dev/random for any reactive purposes
(like nonces or session encryption keys).

> There's that myth about urandom being "less random" than random again,
> but even this guy admits that the difference is "extremely hard"
> (actually: impossible) to measure, and that CSPRNG's "work". Which is
> precisely why OpenBSD uses arc4random for their /dev/random and
> /dev/urandom, and presumably why he wants to bring it to Linux.

That's for the cryptographic experts to judge. CSPRNG's aren't always as
CS as one would think:

   In December 2013, a Reuters news article alleged that in 2004, before
   NIST standardized Dual_EC_DRBG, NSA paid RSA Security $10 million in
   a secret deal to use Dual_EC_DRBG as the default in the RSA BSAFE
   cryptography library, which resulted in RSA Security becoming the
   most important distributor of the insecure algorithm.

   <https://en.wikipedia.org/wiki/Dual_EC_DRBG>

> The bottom line is, nobody can distinguish the output of urandom and
> random (apart from the blocking behaviour). Nobody has demonstrated
> any way to distinguish the output of either random or urandom from
> "actual randomness". There are theoretical attacks on urandom that
> random might be immune to, but if so, I haven't heard what they are.

What I'm looking for is a cryptography mailing list (or equivalent)
giving their stamp of approval. As can be seen above, NIST ain't it.

It seems, though, that cryptography researchers are not ready to declare
any scheme void of vulnerabilities. At best they can mention that there
are no *known* vulnerabilities.

> What evidence do they give that /dev/urandom is weak? If it is weak,
> why are they using it as the default?

It's a big mess, but not a mess I would disentangle. Once the crypto
libraries, utilities, facilities and the OS come to a consensus, I can
hope they've done their homework. As it stands, the STRONG vs VERY
STRONG dichotomy seems to be alive all over the place.

Marko

Re: Make a unique filesystem path, without creating the file

2016-02-24 Thread Steven D'Aprano

On Wednesday 24 February 2016 18:20, Marko Rauhamaa wrote:

> Steven D'Aprano :
> 
>> On Tue, 23 Feb 2016 05:54 pm, Marko Rauhamaa wrote:
>>> However, when you are generating signing or encryption keys, you
>>> should use /dev/random.
>>
>> And that is where you repeat something which is rank superstition.
> 
> Can you find info to back that up. 

The links already provided go through the evidence. For example, they 
explain that /dev/random and /dev/urandom both use the exact same CSPRNG. If 
you don't believe that, you can actually read the source to Linux, FreeBSD, 
OpenBSD and NetBSD. (But not OS X, sorry.)

Put aside the known bug in Linux, where urandom will provide predictable 
values in the period following a fresh install before the OS has collected 
enough entropy. We all agree that's a problem, and that the right solution 
is to block. The question is, outside of that narrow set of circumstances, 
when is it appropriate to block?

As I mentioned, most Unixes don't block. urandom and random behave exactly 
the same way in three of the most popular Unixes (FreeBSD, OpenBSD, OS X):

https://en.wikipedia.org/wiki//dev/random

so let's consider just those that do block (Linux, and NetBSD). They both 
use the same CSPRNG. Do you dispute that? Then read the source. For one to 
be "better" than the other, there would need to be a detectable difference 
between the two. Nobody has ever found one, and nor will they, because 
they're both coming from the same CSPRNG (AES in the case of NetBSD, I'm not 
sure what in the case of Linux).

If you don't trust the CSPRNG, then you shouldn't trust it whether it comes 
from /dev/random or /dev/urandom. If you do trust it, then why would you 
want it to block? Blocking doesn't make it more random. That's not how it 
works. It just makes you vulnerable to a Denial Of Service attack.

There really doesn't seem to be any valid reason for random blocking. It's 
like warnings and timers on fans in South Korea to prevent fan death:

http://www.snopes.com/medical/freakish/fandeath.asp

(My favourite explanation is that the blades of the fan chop the oxygen 
molecules in two.)

I'm not surprised that there is so much misinformation about random/urandom. 
Here's a blog post by somebody wanting to port arc4 to Linux, so he clearly 
knows a few things about crypto. I can't judge whether arc4 is better or 
worse than what Linux already uses, but look at this quote:

http://insanecoding.blogspot.com.au/2014/05/a-good-idea-with-bad-usage-devurandom.html

Quote:

"Linux is well known for inventing and supplying two default files,
/dev/random and /dev/urandom (unlimited random). The former is 
pretty much raw entropy, while the latter is the output of a CSPRNG
function like OpenBSD's arc4random family. The former can be seen 
as more random, and the latter as less random, but the differences
are extremely hard to measure, which is why CSPRNGs work in the 
first place. Since the former is only entropy, it is limited as to
how much it can output, and one needing a lot of random data can be
stuck waiting a while for it to fill up the random buffer. Since the
latter is a CSPRNG, it can keep outputting data indefinitely, 
without any significant waiting periods."

There's that myth about urandom being "less random" than random again, but 
even this guy admits that the difference is "extremely hard" (actually: 
impossible) to measure, and that CSPRNG's "work". Which is precisely why 
OpenBSD uses arc4random for their /dev/random and /dev/urandom, and 
presumably why he wants to bring it to Linux.

This author is *completely wrong* to say that /dev/random is "pretty much 
raw entropy". If it were, it would be biased, and easily manipulated by an 
attacker. Entropy is collected from (among other things) network traffic, 
which would allow an attacker to control at least one source of entropy and 
hence (in theory) make it easier to predict the output of /dev/random.

But fortunately it is not true. Linux's random system works like this:

- entropy is collected from various sources and fed into a pool;

- entropy from that pool is fed through a CSPRNG into two separate pools, 
  one each for /dev/random and /dev/urandom;

- when you read from /dev/random or urandom, they both collect entropy
  from their own pool, and again pass it through a CSPRNG;

- /dev/random has a throttle (it blocks if you take out too much);

- /dev/urandom doesn't have a throttle.

https://events.linuxfoundation.org/images/stories/pdf/lceu2012_anvin.pdf

Somebody criticized the author for spreading this misapprehension that 
/dev/random is "raw entropy" and here is his response:

I tried giving an explanation which should be simple for a 
layman to follow of what goes on. I wouldn't take it as precise 
fact, especially when there's a washing machine involved in the
explanation ;)

Or, in other words, "When I said the moon was made of green cheese, I was

Re: Make a unique filesystem path, without creating the file

2016-02-24 Thread Steven D'Aprano

On Tue, 23 Feb 2016 07:22 pm, Paul Rubin wrote:

> Mark Lawrence  writes:
>> https://mail.python.org/pipermail/python-ideas/2015-September/036333.html
>> then http://www.gossamer-threads.com/lists/python/dev/1223780
> 
> Thanks.  It would be nice if those were gatewayed to usenet like this
> group is.  I can't bring myself to subscribe to mailing lists.
> 
>>> There are a few other choices in the PEP whose benefit is unclear to me,
>>> but they aren't harmful, and I guess the decisions have already been
>>> made.
>> The PEP status is draft so is subject to change.
> 
> Well they might be changeable but it sounds like there's a level of
> consensus by now, that wouldn't be helped by more bikeshedding over
> relatively minor stuff.  I might write up some further comments and post
> them here

If you're going to do so, please do so in the next few days (or write to me
off list to ask for an extension) because I intend to ask Guido for a
ruling early next week.


-- 
Steven


Re: Make a unique filesystem path, without creating the file

2016-02-23 Thread Marko Rauhamaa

Steven D'Aprano :

> On Tue, 23 Feb 2016 05:54 pm, Marko Rauhamaa wrote:
>> However, when you are generating signing or encryption keys, you
>> should use /dev/random.
>
> And that is where you repeat something which is rank superstition.

Can you find info to back that up. All I've seen so far is forceful
claims that's superstition ("These are not the droids you're looking
for"). Even the ssh-keygen man page has:

The reseeding of the OpenSSL random generator is usually done from
/dev/urandom. If the SSH_USE_STRONG_RNG environment variable is
set to value other than 0 the OpenSSL random generator is reseeded
from /dev/random.

Marko

Re: Make a unique filesystem path, without creating the file

2016-02-23 Thread Steven D'Aprano

On Tue, 23 Feb 2016 05:54 pm, Marko Rauhamaa wrote:

> Steven D'Aprano :
> 
>> On Tue, 23 Feb 2016 06:32 am, Marko Rauhamaa wrote:
>>> Under Linux, /dev/random is the way to go when strong security is
>>> needed. Note that /dev/random is a scarce resource on ordinary
>>> systems.
>>
>> That's actually incorrect, but you're not the only one to have been
>> misled by the man pages.
>>
>> http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/
> 
> Still, mostly hypnotic repetitions.

Repetition for the sake of emphasis, because there are so many misled and
confused people on the internet who misunderstand the difference between
urandom and random and consequently give bad advice. I believe that the
Linux man page for urandom is to blame, although I don't know why it hasn't
been fixed.

Possibly because it is *technically* correct, in the sense of "if you are
concerned by the risk of being hit by a meteorite, wearing a stainless
steel cooking pot on your head will give you some protection from meteorite
strikes to the head". Everything in it is technically correct, but
misleading.

> However, it admits:
> 
>But /dev/random also tries to keep track of how much entropy remains
>in its kernel pool, and will occasionally go on strike if it decides
>not enough remains.
> 
> That's the whole point. 

Exactly, but you've missed the point. That is precisely why the blocking
random is HARMFUL and should not be used. There is one, and only one,
scenario when your CSPRNG should block: before the system has enough
entropy to securely seed the CSPRNG and it is at risk of returning
predictable numbers.

But after that point has passed, there is no test you can perform to
distinguish the outputs of /dev/random and /dev/urandom (apart from the
blocking behaviour itself). If I give you a million numbers, there is no
way you can tell whether I used random or urandom.

The important thing here is that there is no difference in "quality"
(whatever that means!) between the random numbers generated by urandom and
those generated by random. They are equally unpredictable. They pass the
same randomness tests. Neither is "better" or "worse" than the other,
because they are both generated by the same CSPRNG or HRNG.

Here is a summary of the random/urandom distinction on various Unixes:

Linux: 
random blocks, urandom never blocks, both use the same CSPRNG 
based on SHA-1 hashes, both will use a HRNG if available

FreeBSD:
urandom is a link to random, which never blocks; uses 256-bit 
Yarrow CSPRNG, will use a HRNG if available

OpenBSD:
both never block; both use a variant of the RC4 CSPRNG 
(misleadingly renamed ARC4 due to licencing issues), in
newer versions use the ChaCha20 CSPRNG

OS X:
both never block and use 160-bit Yarrow

NetBSD:
random blocks, urandom never blocks, both use the same AES-128
CSPRNG

The NetBSD man pages are quite scathing:

"The entropy accounting described here is not grounded in any
cryptography theory.  It is done because it was always done, 
and because it gives people a warm fuzzy feeling about 
information theory.

...

History is littered with examples of broken entropy sources and
failed system engineering for random number generators.  Nobody 
has ever reported distinguishing AES ciphertext from uniform 
random without side channels, nor reported computing SHA-1 
preimages faster than brute force.  The folklore information-
theoretic defence against computationally unbounded attackers 
replaces system engineering that successfully defends against 
realistic threat models by imaginary theory that defends only 
against fantasy threat models."

To be clear, the "folklore information-theoretic defence" they are referring
to is /dev/random's blocking behaviour.

http://netbsd.gw.com/cgi-bin/man-cgi?rnd+4+NetBSD-current

The blocking behaviour of /dev/random (on Linux) doesn't solve any real
problems, but it *creates* new problems. /dev/random can block for minutes
or even hours, especially straight after booting a freshly installed OS.
This can be considered a Denial Of Service attack, and even if it isn't, it
encourages developers to "fix" the problem by using their own home-brewed
random numbers, weakening the security of the system.

There's even a minority viewpoint that constantly adding new entropy to the
CSPRNG is useless. Apart from collecting sufficient entropy for the initial
seed, you should never add new entropy to the CSPRNG. Your CSPRNG is either
cryptographically strong, or it isn't. If it is, then it is already
unpredictable and adding more entropy is a waste of time. If it isn't, then
adding more entropy isn't going to help you.

Adding entropy is just one more component that can contain bugs (see the
NetBSD comment about "broken entropy sources") or even allow an attack on
the CSPRNG:

http://blog.cr.yp.to/20140205-entropy.html

There's one good argument for

Re: Make a unique filesystem path, without creating the file

2016-02-23 Thread Marko Rauhamaa

Paul Rubin :

> Marko Rauhamaa  writes:
>> It is also correct that /dev/urandom depletes the entropy pool as
>> effectively as /dev/random. 
>
> I think I see what's confusing you: the above is a misconception that is
> probably held by lots of people.  Entropy is not water and from a
> cryptographic standpoint there is essentially no such thing as
> "depleting" an entropy pool.  There is either enough entropy (say 256
> bits or more) in the PRNG or else there isn't.  If there's not enough,
> urandom can misbehave by giving you bad output because it doesn't block
> until more is gathered.  If there is enough, /dev/random misbehaves by
> blocking under this bogus concept of "depletion".

You are making my point. /dev/random is correct to block until
top-quality random numbers can be supplied. That's not misbehaving.

> So once /dev/random unblocks, it should never again block, the behavior
> of getrandom.

What you are saying is that /dev/random has no reason to exist (and the
GRND_RANDOM flag to getrandom() is redundant).

I'm no cryptographer and can't judge that. However, as long as the
distinction is maintained, I have to abide by the documented
characteristics.

> No really, all you've done is repeat bad advice. The people cited in
> that article are very knowledgeable and the stuff they say makes good
> mathematical sense. The stuff you say makes no sense and you haven't
> given any convincing reason for anyone to listen to you.

Thing is, neither you nor I nor the cited articles have provided anything
more than insistence on a position, my position being to rely on the
documented API.

So we have

 * /dev/urandom vs /dev/random

 * getrandom(0) vs getrandom(GRND_RANDOM)

 * GCRY_STRONG_RANDOM ("Use this level for session keys and similar
   purposes") vs GCRY_VERY_STRONG_RANDOM ("Use this level for long term
   key material") (in libgcrypt)

You don't need to convince me that that distinction is silly. You need
to convince the crypto facility providers.
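
For what it's worth, this is roughly how the getrandom() side of that list looks from Python. It is a sketch that assumes Python 3.6 or later, where os.getrandom() and the os.GRND_RANDOM flag are exposed on Linux; the helper name seeded_random_bytes is invented for the example:

```python
import os

def seeded_random_bytes(n):
    """Return n bytes from the kernel CSPRNG.

    With flags=0, os.getrandom() blocks only until the kernel pool has
    been initialised once. Passing os.GRND_RANDOM instead would select
    the /dev/random behaviour discussed in this thread.
    """
    if hasattr(os, "getrandom"):  # Linux 3.17+ kernels
        # Small requests like this one are returned in full.
        return os.getrandom(n)
    return os.urandom(n)  # portable fallback

key = seeded_random_bytes(32)
```

On other platforms the fallback means the blocking-until-seeded guarantee depends on what os.urandom() does there.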

Marko

Re: Make a unique filesystem path, without creating the file

2016-02-23 Thread Grant Edwards

On 2016-02-23, Mark Lawrence  wrote:
> On 23/02/2016 08:22, Paul Rubin wrote:
>> Mark Lawrence  writes:
>>> https://mail.python.org/pipermail/python-ideas/2015-September/036333.html
>>> then http://www.gossamer-threads.com/lists/python/dev/1223780
>>
>> Thanks.  It would be nice if those were gatewayed to usenet like
>> this group is. I can't bring myself to subscribe to mailing lists.
>
> Piece of cake using even a semi-decent email client (I use Thunderbird 
> on Windows) via gmane.

And gmane is even better using a decent news (NNTP) client.  I prefer
slrn, but that may be a bit old-school for many.  Technically, gmane's
news server is not "Usenet", but the UI is the same.

Gmane's internal search facility is a bit lame, but searching gmane
with Google works fairly well.

> It provides access to hundreds of Python mailing lists, blogs and
> even updates to the Activestate recipes :)

-- 
Grant Edwards               grant.b.edwards        Yow! World War Three can
                                  at               be averted by adherence
                              gmail.com            to a strictly enforced
                                                   dress code!

Re: Make a unique filesystem path, without creating the file

2016-02-23 Thread Mark Lawrence


On 23/02/2016 08:22, Paul Rubin wrote:
> Mark Lawrence writes:
>> https://mail.python.org/pipermail/python-ideas/2015-September/036333.html
>> then http://www.gossamer-threads.com/lists/python/dev/1223780
>
> Thanks.  It would be nice if those were gatewayed to usenet like this
> group is.  I can't bring myself to subscribe to mailing lists.

Piece of cake using even a semi-decent email client (I use Thunderbird 
on Windows) via gmane.  It provides access to hundreds of Python mailing 
lists, blogs and even updates to the Activestate recipes :)

>>> There are a few other choices in the PEP whose benefit is unclear to me,
>>> but they aren't harmful, and I guess the decisions have already been
>>> made.
>> The PEP status is draft so is subject to change.
>
> Well they might be changeable but it sounds like there's a level of
> consensus by now, that wouldn't be helped by more bikeshedding over
> relatively minor stuff.  I might write up some further comments and post
> them here

You might as well, can't do any harm and somebody might pick up on 
something.


--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


Re: Make a unique filesystem path, without creating the file

2016-02-23 Thread Peter Otten

Paul Rubin wrote:

> Mark Lawrence  writes:
>> https://mail.python.org/pipermail/python-ideas/2015-September/036333.html
>> then http://www.gossamer-threads.com/lists/python/dev/1223780
> 
> Thanks.  It would be nice if those were gatewayed to usenet like this
> group is.  I can't bring myself to subscribe to mailing lists.

They are available via news.gmane.org as

gmane.comp.python.devel
gmane.comp.python.ideas



Re: Make a unique filesystem path, without creating the file

2016-02-23 Thread Paul Rubin

Mark Lawrence  writes:
> https://mail.python.org/pipermail/python-ideas/2015-September/036333.html
> then http://www.gossamer-threads.com/lists/python/dev/1223780

Thanks.  It would be nice if those were gatewayed to usenet like this
group is.  I can't bring myself to subscribe to mailing lists.

>> There are a few other choices in the PEP whose benefit is unclear to me,
>> but they aren't harmful, and I guess the decisions have already been
>> made.
> The PEP status is draft so is subject to change.

Well they might be changeable but it sounds like there's a level of
consensus by now, that wouldn't be helped by more bikeshedding over
relatively minor stuff.  I might write up some further comments and post
them here 

Re: Make a unique filesystem path, without creating the file

2016-02-23 Thread Mark Lawrence


On 23/02/2016 02:27, Paul Rubin wrote:
> Steven D'Aprano  writes:
>> https://www.python.org/dev/peps/pep-0506/
>
> I didn't know about this!  The discussion was all on mailing lists?

https://mail.python.org/pipermail/python-ideas/2015-September/036333.html then
http://www.gossamer-threads.com/lists/python/dev/1223780

> A few things I suggest changing:
>
>   1) the default system RNG for Linux should be getrandom(2) on kernels
>   that support it (3.17 and later).
>
>   2) Some effort should be directed at simulating getrandom's behaviour
>   on kernels that don't have it, using the /dev/random entropy estimator
>   and the /dev/urandom interface.  I.e. it should block if the system
>   hasn't seen enough entropy to get the CSPRNG started securely, and
>   never block after that.
>
>   3) The default token length should be long enough to not have to "change
>   in the future".  If the user wants a shorter token, they ask for that,
>   or can truncate a longer one that they receive from the default.
>
> There are a few other choices in the PEP whose benefit is unclear to me,
> but they aren't harmful, and I guess the decisions have already been
> made.

The PEP status is draft so is subject to change.

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence


Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Paul Rubin

Marko Rauhamaa  writes:
> It is also correct that /dev/urandom depletes the entropy pool as
> effectively as /dev/random. 

I think I see what's confusing you: the above is a misconception that is
probably held by lots of people.  Entropy is not water and from a
cryptographic standpoint there is essentially no such thing as
"depleting" an entropy pool.  There is either enough entropy (say 256
bits or more) in the PRNG or else there isn't.  If there's not enough,
urandom can misbehave by giving you bad output because it doesn't block
until more is gathered.  If there is enough, /dev/random misbehaves by
blocking under this bogus concept of "depletion".  If you have a seed
with 256 bits of entropy and you generate a gigabyte of random numbers
from it, you have not increased the predictability of the seed in any
significant way.

So once /dev/random unblocks, it should never again block, the behavior
of getrandom.  There used to be an article on David Wagner's web site
(cs.berkeley.edu/~daw) about the concept of "depleting" entropy by
iterated hashing, but I can't find it now.  That's unfortunate since it
might help cast light on the subject.

>> http://www.2uo.de/myths-about-urandom/
> Already addressed.

No really, all you've done is repeat bad advice.  The people cited in
that article are very knowledgeable and the stuff they say makes good
mathematical sense.  The stuff you say makes no sense and you haven't
given any convincing reason for anyone to listen to you.

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Marko Rauhamaa

Steven D'Aprano :

> On Tue, 23 Feb 2016 06:32 am, Marko Rauhamaa wrote:
>> Under Linux, /dev/random is the way to go when strong security is
>> needed. Note that /dev/random is a scarce resource on ordinary
>> systems.
>
> That's actually incorrect, but you're not the only one to have been
> misled by the man pages.
>
> http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/

Still, mostly hypnotic repetitions.

However, it admits:

   But /dev/random also tries to keep track of how much entropy remains
   in its kernel pool, and will occasionally go on strike if it decides
   not enough remains.

That's the whole point. /dev/random will rather block the program than
lower the quality of the random numbers below a threshold. /dev/urandom
has no such qualms.

   If you use /dev/random instead of urandom, your program will
   unpredictably (or, if you’re an attacker, very predictably) hang when
   Linux gets confused about how its own RNG works.

Yes, possibly indefinitely, too.

   Using /dev/random will make your programs less stable, but it won’t
   make them any more cryptographically safe.

It is correct that you shouldn't use /dev/random as a routine source of
bulk random numbers. It is also correct that /dev/urandom depletes the
entropy pool as effectively as /dev/random. However, when you are
generating signing or encryption keys, you should use /dev/random.

As stated in <https://lwn.net/Articles/606141/>:

   /dev/urandom should be used for essentially all random numbers
   required, but /dev/random is sometimes used for things like extremely
   sensitive, long-lived keys (e.g. GPG) or one-time pads.

> See also:
>
> http://www.2uo.de/myths-about-urandom/

Already addressed.

Marko

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Paul Rubin

Chris Angelico  writes:
> How much future are you expecting?

This is old but its methodology still seems ok:

  http://saluc.engr.uconn.edu/refs/keymgr/blaze95minimalkeylength.pdf

I also like this:

  http://cr.yp.to/talks/2015.10.05/slides-djb-20151005-a4.pdf

Quote (slide 37): 

  The crypto users' fantasy is boring crypto: crypto that simply works,
  solidly resists attacks, never needs any upgrades.

HN discussion: https://news.ycombinator.com/item?id=10345965

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Chris Angelico

On Tue, Feb 23, 2016 at 1:27 PM, Paul Rubin  wrote:
>   3) The default token length should be long enough to not have to "change
>   in the future".  If the user wants a shorter token, they ask for that,
>   or can truncate a longer one that they receive from the default.

How much future are you expecting?

ChrisA

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Paul Rubin

Steven D'Aprano  writes:
> https://www.python.org/dev/peps/pep-0506/

I didn't know about this!  The discussion was all on mailing lists?

A few things I suggest changing:

  1) the default system RNG for Linux should be getrandom(2) on kernels
  that support it (3.17 and later).

  2) Some effort should be directed at simulating getrandom's behaviour
  on kernels that don't have it, using the /dev/random entropy estimator
  and the /dev/urandom interface.  I.e. it should block if the system
  hasn't seen enough entropy to get the CSPRNG started securely, and
  never block after that.

  3) The default token length should be long enough to not have to "change
  in the future".  If the user wants a shorter token, they ask for that,
  or can truncate a longer one that they receive from the default.

There are a few other choices in the PEP whose benefit is unclear to me,
but they aren't harmful, and I guess the decisions have already been
made.

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Jon Ribbens

On 2016-02-23, Ben Finney  wrote:
> Oscar Benjamin  writes:
>> What does unpredictable mean in this context? Maybe I'm reading too
>> much into that...
>
> I think you may be, yes. The request in this thread requires making
> direct use of the “generate a new valid temporary filesystem path”
> functionality already implemented in ‘tempfile’.
>
> Implementations of that functionality outside of ‘tempfile’ are a fun
> exercise, but miss the point of this thread.

I think you have missed the point of your own thread.

You can't do what you wanted using tempfile, the only possible
answer is to choose a filename that is sufficiently random that
your hope that it is unique won't be proven futile. tempfile has
two main modes, mktemp which meets your requirements but should
never be used as it is insecure, and mkstemp which doesn't meet
your requirements because it fundamentally operates by actually
creating the file in question and relying on the filesystem to
guarantee uniqueness.
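
To make the distinction concrete, here is a minimal sketch, not taken from
anyone's post. The helper name random_path is hypothetical, and the caller
is assumed to supply the directory so that nothing has to be looked up on
disk. It contrasts a name built purely from random bytes with
tempfile.mkstemp(), which creates the file as part of guaranteeing
uniqueness.

```python
import binascii
import os
import tempfile

def random_path(directory, prefix="tmp"):
    """Return a path whose final component carries 128 random bits.

    Hypothetical helper: this is pure string work, so the filesystem is
    never consulted and uniqueness is only overwhelmingly likely.
    """
    name = binascii.hexlify(os.urandom(16)).decode("ascii")
    return os.path.join(directory, prefix + name)

# A candidate path: no file is created and no existence check is made.
candidate = random_path("/tmp")

# mkstemp, by contrast, really creates the file and returns an open fd.
fd, created = tempfile.mkstemp()
os.close(fd)
created_exists = os.path.exists(created)
os.remove(created)  # clean up the file mkstemp made
```

Whether the first form is acceptable depends on whether "very probably
unique" is good enough for the use at hand.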

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Chris Angelico

On Tue, Feb 23, 2016 at 11:44 AM, Jon Ribbens
 wrote:
> On 2016-02-23, Chris Angelico  wrote:
>> On Tue, Feb 23, 2016 at 11:26 AM, Jon Ribbens
>> wrote:
>>> On 2016-02-23, Chris Angelico  wrote:
>>>> On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens
>>>> wrote:
>>>>>> If you generate 2**128 + 1 such numbers, you are *guaranteed* to
>>>>>
>>>>> ... have expired due to the heat death of the universe.
>>>>
>>>> Maybe... but by the time you get to 2**64 of them, you have a 50%
>>>> chance of a collision. (That's either utterly intuitive or completely
>>>> counter-intuitive, depending on who you are.)
>>>
>>> Um, did you mean to say 2**127? Are you thinking of the
>>> birthday paradox or something, which doesn't apply here?
>>
>> By the time you generate 2**64 of them, you have a 50% chance that
>> some pair of them collides. Yes, the birthday paradox does apply here.
>
> Oh, I see, you're thinking of it differently. I was thinking of it as
> Alice is choosing a filename and Mallet is trying to guess it, in which
> case the birthday paradox doesn't apply. You're thinking of it as Alice
> is generating many random filenames and, even though she could avoid
> collisions with 100% certainty by remembering what she's already had,
> isn't doing so, and must avoid colliding with herself. I don't think
> your version has much relevance as an attack model.

Ah. Steven was talking about collisions; once you have 2**128+1 of
them, you're guaranteed a collision (pigeonhole principle). What
you're talking about gives certainty slightly sooner - specifically,
once you've tried 2**128 of them, you're guaranteed to have hit it :)

ChrisA

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Jon Ribbens

On 2016-02-23, Chris Angelico  wrote:
> On Tue, Feb 23, 2016 at 11:26 AM, Jon Ribbens
> wrote:
>> On 2016-02-23, Chris Angelico  wrote:
>>> On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens
>>> wrote:
>>>>> If you generate 2**128 + 1 such numbers, you are *guaranteed* to
>>>>
>>>> ... have expired due to the heat death of the universe.
>>>
>>> Maybe... but by the time you get to 2**64 of them, you have a 50%
>>> chance of a collision. (That's either utterly intuitive or completely
>>> counter-intuitive, depending on who you are.)
>>
>> Um, did you mean to say 2**127? Are you thinking of the
>> birthday paradox or something, which doesn't apply here?
>
> By the time you generate 2**64 of them, you have a 50% chance that
> some pair of them collides. Yes, the birthday paradox does apply here.

Oh, I see, you're thinking of it differently. I was thinking of it as
Alice is choosing a filename and Mallet is trying to guess it, in which
case the birthday paradox doesn't apply. You're thinking of it as Alice
is generating many random filenames and, even though she could avoid
collisions with 100% certainty by remembering what she's already had,
isn't doing so, and must avoid colliding with herself. I don't think
your version has much relevance as an attack model.

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Steven D'Aprano

On Tue, 23 Feb 2016 06:32 am, Marko Rauhamaa wrote:

> Jon Ribbens :
> 
>> Suppose you had code like this:
>>
>>   filename = binascii.hexlify(os.urandom(16)).decode("ascii")
>>
>> Do we really think that is insecure or that there are any practical
>> attacks against it? It would be basically the same as saying that
>> urandom() is broken, surely?
> 
> urandom() is not quite random and so should not be considered
> cryptographically airtight.
> 
> Under Linux, /dev/random is the way to go when strong security is
> needed. Note that /dev/random is a scarce resource on ordinary systems.

That's actually incorrect, but you're not the only one to have been misled
by the man pages.

http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/

On non-Linux Unixes, the difference between urandom and random is mostly, or
entirely, gone, in favour of urandom's non-blocking behaviour. And it's a
myth that the output of random is "more random" or "more pure" than
urandom's. In reality, on Linux both urandom and random use exactly the
same CSPRNG.

See also:

http://www.2uo.de/myths-about-urandom/

for a good explanation of how random and urandom actually work on Linux.

-- 
Steven


Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Chris Angelico

On Tue, Feb 23, 2016 at 11:26 AM, Jon Ribbens
 wrote:
> On 2016-02-23, Chris Angelico  wrote:
>> On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens
>> wrote:
>>>> If you generate 2**128 + 1 such numbers, you are *guaranteed* to
>>>
>>> ... have expired due to the heat death of the universe.
>>
>> Maybe... but by the time you get to 2**64 of them, you have a 50%
>> chance of a collision. (That's either utterly intuitive or completely
>> counter-intuitive, depending on who you are.)
>
> Um, did you mean to say 2**127? Are you thinking of the
> birthday paradox or something, which doesn't apply here?

By the time you generate 2**64 of them, you have a 50% chance that
some pair of them collides. Yes, the birthday paradox does apply here.
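
The two readings in this exchange can be put into numbers with a rough
sketch. The function names are hypothetical, and the pair-collision figure
uses the usual birthday approximation 1 - exp(-n(n-1)/2N) rather than an
exact count. Under that approximation, 2**64 draws from a 128-bit space
come out nearer 39% than 50%; even odds arrive at roughly 1.18 * 2**64
draws. A guesser aiming at one specific value is a different calculation
entirely.

```python
import math

def collision_probability(n, space):
    """Approximate chance that n uniform draws from `space` values
    contain at least one repeated pair (birthday approximation)."""
    return -math.expm1(-n * (n - 1) / (2 * space))

def guess_probability(k, space):
    """Chance that k distinct guesses include one specific secret value."""
    return k / space

space = 2**128
p_pair = collision_probability(2**64, space)               # some pair collides
p_half = collision_probability(int(1.18 * 2**64), space)   # just past even odds
p_guess = guess_probability(2**64, space)                  # hitting one target
```

So the birthday bound matters when one party generates many names and
wants none of them to clash, while the guessing bound is the relevant one
for an attacker trying to predict a single name.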

ChrisA

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Ben Finney

Oscar Benjamin  writes:

> What does unpredictable mean in this context? Maybe I'm reading too
> much into that...

I think you may be, yes. The request in this thread requires making
direct use of the “generate a new valid temporary filesystem path”
functionality already implemented in ‘tempfile’.

Implementations of that functionality outside of ‘tempfile’ are a fun
exercise, but miss the point of this thread.

-- 
 \   “But Marge, what if we chose the wrong religion? Each week we |
  `\  just make God madder and madder.” —Homer, _The Simpsons_ |
_o__)  |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Jon Ribbens

On 2016-02-23, Chris Angelico  wrote:
> On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens
> wrote:
>>> If you generate 2**128 + 1 such numbers, you are *guaranteed* to
>>
>> ... have expired due to the heat death of the universe.
>
> Maybe... but by the time you get to 2**64 of them, you have a 50%
> chance of a collision. (That's either utterly intuitive or completely
> counter-intuitive, depending on who you are.)

Um, did you mean to say 2**127? Are you thinking of the
birthday paradox or something, which doesn't apply here?

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Steven D'Aprano

On Tue, 23 Feb 2016 05:17 am, Jon Ribbens wrote:

> On 2016-02-22, Ethan Furman  wrote:
>> On 02/14/2016 04:08 PM, Ben Finney wrote:
>>> I am unconcerned with whether there is a real filesystem entry of that
>>> name; the goal entails having no filesystem activity for this. I want a
>>> valid unique filesystem path, without touching the filesystem.
>>
>> This is impossible.  If you don't touch the file system you have no way
>> to know if the path is unique.
> 
> Weell, I have a lot of sympathy for that point, but on the other
> hand the whole concept of UUIDs ("import uuid") is predicated on the
> opposite assumption.

You're referring to uuid4, presumably, as the other varieties of UUID use
non-secret information, such as the time, or a namespace, either of which
is potentially public knowledge. 

Only uuid4 is considered "globally unique", and that's not *certainly*
globally unique, only that the chance of an *accidental* collision is
below some threshold deemed "small enough that we don't care".

Deliberate collisions of public UUIDs are *trivial*. Pick a UUID you know is
already in use, and use it again.

There are a lot of assumptions involved in the "globally unique" claim, and
there are probably ways to contrive to generate the same UUIDs as someone
else. But to what benefit? UUIDs are not intended as security tokens, and
are not hardened against attack. Even uuid4 may not be suitable for
security, since it may use a cryptographically weak PRNG such as Mersenne
Twister.
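
As a small illustration of the difference between the UUID varieties, the
sketch below uses only the standard uuid module; the name "example.org" is
an arbitrary placeholder. It shows that a name-based UUID is reproducible
by anyone who knows the inputs, whereas two random UUIDs are merely
expected to differ.

```python
import uuid

# uuid5 is derived from a namespace and a name, so anyone who knows both
# can regenerate exactly the same value.
a = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")
b = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")

# uuid4 is built from random bits; two calls are expected to differ,
# but nothing checks or guarantees that.
c = uuid.uuid4()
d = uuid.uuid4()
```

Neither form says anything about how hard the value is for an adversary to
predict; that depends on where the random bits come from.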

-- 
Steven


Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Chris Angelico

On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens
 wrote:
> On 2016-02-22, Steven D'Aprano  wrote:
>> On Tue, 23 Feb 2016 05:48 am, Marko Rauhamaa wrote:
>>> Jon Ribbens :
>>>> I was under the impression that the point of UUIDs is that you can be
>>>> *so* confident that there won't be a collision that for all practical
>>>> purposes it's indistinguishable from being certain.
>>>
>>> Yes, if you generate a random 128-bit number, it will be unique --
>>
>> If you generate a second random 128 bit number, you have a chance of 1 in
>> 2**128 of a collision. All you can say is that it will be *very probably*
>> unique. (I might even allow "almost certainly" unique.)
>
> If you are not prepared to say that something with a
> 340282366920938463463374607431768211455 /
> 340282366920938463463374607431768211456 chance of being true
> is "certainly true" then I'm not sure how you would not
> be too scared to ever leave the house. Or not leave the house.
> I mean, you're probably going to be hit by 10^25 meteorites,
> which sounds painful.
>
>> If you generate 2**128 + 1 such numbers, you are *guaranteed* to
>
> ... have expired due to the heat death of the universe.

Maybe... but by the time you get to 2**64 of them, you have a 50%
chance of a collision. (That's either utterly intuitive or completely
counter-intuitive, depending on who you are.)

ChrisA

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Jon Ribbens

On 2016-02-22, Steven D'Aprano  wrote:
> On Tue, 23 Feb 2016 05:48 am, Marko Rauhamaa wrote:
>> Jon Ribbens :
>>> I was under the impression that the point of UUIDs is that you can be
>>> *so* confident that there won't be a collision that for all practical
>>> purposes it's indistinguishable from being certain.
>> 
>> Yes, if you generate a random 128-bit number, it will be unique --
>
> If you generate a second random 128 bit number, you have a chance of 1 in
> 2**128 of a collision. All you can say is that it will be *very probably*
> unique. (I might even allow "almost certainly" unique.)

If you are not prepared to say that something with a
340282366920938463463374607431768211455 /
340282366920938463463374607431768211456 chance of being true
is "certainly true" then I'm not sure how you would not
be too scared to ever leave the house. Or not leave the house.
I mean, you're probably going to be hit by 10^25 meteorites,
which sounds painful.

> If you generate 2**128 + 1 such numbers, you are *guaranteed* to

... have expired due to the heat death of the universe.

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Jon Ribbens

On 2016-02-23, Steven D'Aprano  wrote:
> On Tue, 23 Feb 2016 06:22 am, Jon Ribbens wrote:
>> Suppose you had code like this:
>> 
>> filename = binascii.hexlify(os.urandom(16)).decode("ascii")
>> 
>> Do we really think that is insecure or that there are any practical
>> attacks against it? It would be basically the same as saying that
>> urandom() is broken, surely?
>
> Correct. Any attack against urandom would be an attack on this. You would
> just have to trust that the kernel devs have made urandom as secure as
> possible, and pay no attention to what the man page says, as it's wrong.
>
> By the way, Python 3.6 will have (once Guido formally approves it) a new
> module, "secrets", for securely generating (pseudo)random tokens like this:
>
> import secrets
> filename = secrets.token_hex(16)

+1

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Steven D'Aprano

On Tue, 23 Feb 2016 06:22 am, Jon Ribbens wrote:

> Suppose you had code like this:
> 
> filename = binascii.hexlify(os.urandom(16)).decode("ascii")
> 
> Do we really think that is insecure or that there are any practical
> attacks against it? It would be basically the same as saying that
> urandom() is broken, surely?

Correct. Any attack against urandom would be an attack on this. You would
just have to trust that the kernel devs have made urandom as secure as
possible, and pay no attention to what the man page says, as it's wrong.

By the way, Python 3.6 will have (once Guido formally approves it) a new
module, "secrets", for securely generating (pseudo)random tokens like this:

import secrets
filename = secrets.token_hex(16)

https://www.python.org/dev/peps/pep-0506/
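
For code that has to run both with and without the proposed module, a
minimal sketch might look like the following. It assumes secrets.token_hex
behaves as PEP 506 describes; token_hex_fallback is a hypothetical name for
the os.urandom spelling quoted above.

```python
import binascii
import os

def token_hex_fallback(nbytes=16):
    """Hex string built from nbytes of os.urandom output."""
    return binascii.hexlify(os.urandom(nbytes)).decode("ascii")

try:
    # Available once the "secrets" module from PEP 506 is present.
    from secrets import token_hex
except ImportError:
    token_hex = token_hex_fallback

# 16 random bytes rendered as 32 hexadecimal characters.
filename = token_hex(16)
```

Either way the result is 128 bits drawn from the operating system's random
source, so the two spellings should be interchangeable for naming purposes.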

-- 
Steven


Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Oscar Benjamin

On 22 Feb 2016 22:50, "Ben Finney"  wrote:
>
> Ethan Furman  writes:
>
> > On 02/14/2016 04:08 PM, Ben Finney wrote:
> >
> > > I am unconcerned with whether there is a real filesystem entry of that
> > > name; the goal entails having no filesystem activity for this. I want a
> > > valid unique filesystem path, without touching the filesystem.
> >
> > This is impossible.  If you don't touch the file system you have no
> > way to know if the path is unique.
>
> That was unclear. Later in the same thread, I clarified that by “unique”
> I mean nothing about entries already on the filesystem. Instead it means
> “unpredictably different each time the function is called”.

What does unpredictable mean in this context? Maybe I'm reading too much
into that... What's wrong with the example I posted before?

--
Oscar

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Steven D'Aprano

On Tue, 23 Feb 2016 06:22 am, Paul Rubin wrote:

> Chris Angelico  writes:
>>> I was under the impression that the point of UUIDs is that you can be
>>> *so* confident that there won't be a collision that for all practical
>>> purposes it's indistinguishable from being certain.
>> Maybe, if everyone's cooperating. I'm not sure how they fare in the
>> face of malice though.
> 
> There are different UUID algorithms, some of which have useful syntax
> but are easy to spoof.  Uuid4 is random and, implemented properly, should
> be hard to spoof.

I'm not sure what you mean by "spoof" in this context. Do you mean generate
collisions?

Do you mean "pretend to generate a UUID, but without actually doing so"?
That's how I interpret "spoof", but I don't quite understand why that would
be difficult. Here's one I just made now:

{00010203-0405-0607-0809-0a0b0c0d0e0f}

And another:

{836313e2-3b8a-53f2-9b90-0c9ade199e5d}

They weren't hard to spoof :-)

-- 
Steven


Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Steven D'Aprano

On Tue, 23 Feb 2016 05:48 am, Marko Rauhamaa wrote:

> Jon Ribbens :
> 
>> I was under the impression that the point of UUIDs is that you can be
>> *so* confident that there won't be a collision that for all practical
>> purposes it's indistinguishable from being certain.
> 
> Yes, if you generate a random 128-bit number, it will be unique --

If you generate a second random 128 bit number, you have a chance of 1 in
2**128 of a collision. All you can say is that it will be *very probably*
unique. (I might even allow "almost certainly" unique.)

If you generate 2**128 + 1 such numbers, you are *guaranteed* to have at
least one collision.

If I can arrange matters so that I am using the same seed as you, then I can
generate the same UUIDs as you.

If I know you are using the Mersenne Twister PRNG, and I can get hold of (by
memory) 128 consecutive UUIDs, I can reconstruct the seed you are using and
generate all future (and past) UUIDs the same as yours. (Well, when I
say "I can", I don't mean *me*, I mean some attacker who is smarter than
me, but not that much smarter.)
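
The shared-seed point is easy to demonstrate with a deliberately weak toy
generator. This is a sketch only: mt_uuid is a hypothetical helper that
draws UUID bits from Python's Mersenne Twister, and it is not a claim about
how the uuid module itself obtains its bits.

```python
import random
import uuid

def mt_uuid(rng):
    """Build a version-4-shaped UUID from a non-cryptographic PRNG."""
    return uuid.UUID(int=rng.getrandbits(128), version=4)

# Two parties who start from the same seed...
alice = random.Random(20160222)
mallet = random.Random(20160222)

# ...produce exactly the same sequence of "random" identifiers.
alice_ids = [mt_uuid(alice) for _ in range(3)]
mallet_ids = [mt_uuid(mallet) for _ in range(3)]
```

A cloned virtual machine that resumes with a copied generator state is the
same situation, arrived at by accident rather than by attack.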

> unless someone clones it.
> 
> Cloning will be a practical issue when you clone virtual machines, for
> example.

This is certainly a practical issue that people have to be aware of.

-- 
Steven


Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Paul Rubin

Marko Rauhamaa  writes:
>>>> http://www.2uo.de/myths-about-urandom/
>> I don't know what web pamphlet you mean,
> The only one linked above.

Oh, I wouldn't have called that a pamphlet.  I could quibble with the
writing style but the points in the article are basically correct.

> getrandom(2) is a good interface that distinguishes between the flag
> values
>    0                            =>  /dev/urandom
>    GRND_RANDOM                  =>  /dev/random
>    GRND_RANDOM | GRND_NONBLOCK  =>  /dev/random (O_NONBLOCK)
> However, although os.urandom() delegates to getrandom(), the
> documentation suggests it uses the flag value 0 (/dev/urandom).

Flag value 0 does the right thing and blocks if the entropy pool is not
yet initialized, and doesn't block after that.  That fixes the errors of
both urandom (fails to block before there's enough entropy) and random
(blocks even after there's enough entropy).  The getrandom doc is also
misleading about the workings of the entropy pools but that's ok.  The
actual algorithm is described here:

  http://www.pinkas.net/PAPERS/gpr06.pdf

It's pretty clumsy but discussions about replacing it have gotten bogged
down several times.  OTOH maybe I'm out of date on this.
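
From Python, the flag-0 behaviour could be reached with something like the
sketch below. It assumes the interpreter exposes the call as os.getrandom
(Linux only, Python 3.6 and later) and falls back to os.urandom elsewhere;
kernel_random_bytes is a hypothetical name.

```python
import os

def kernel_random_bytes(n):
    """Return n random bytes, preferring getrandom(2) with flags=0.

    With flags=0 the call waits only until the kernel pool has been
    initialized once, and does not block after that.
    """
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        return os.urandom(n)          # platform without os.getrandom
    chunks = []
    remaining = n
    try:
        while remaining > 0:
            # getrandom may return fewer bytes than requested.
            chunk = getrandom(remaining, 0)
            chunks.append(chunk)
            remaining -= len(chunk)
    except OSError:
        return os.urandom(n)          # e.g. kernel lacks the system call
    return b"".join(chunks)

key = kernel_random_bytes(32)
```

The loop is there because a single call is allowed to return a short read
for large requests; for small sizes like this one it normally runs once.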

>> The random/urandom interface was poorly designed and misleadingly
>> documented.
> It could be better I suppose, but I never found it particularly bad. The
> nice thing about it is that it is readily usable in shell scripts.

DJB describes the problems:

https://groups.google.com/forum/#!msg/randomness-generation/4opmDHA6_3w/__TyKhbnNWsJ

Regarding shell scripts, it should be a simple matter to put a wrapper
around the system call.

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Cameron Simpson


On 22Feb2016 12:34, Alan Bawden  wrote:

Cameron Simpson  writes:


On 16Feb2016 19:24, Alan Bawden  wrote:

So in the FIFO case, I might write something like the following:

   def make_temp_fifo(mode=0o600):
       while True:
           path = tempfile.mktemp()
           try:
               os.mkfifo(path, mode=mode)
           except FileExistsError:
               pass
           else:
               return path

So is there something wrong with the above code?  Other than the fact
that the documentation says something scary about mktemp()?


Well, it has a few shortcomings.

It relies on mkfifo reliably failing if the name exists. It sounds like
mkfifo is reliable this way, but I can imagine analogous use cases without
such a convenient core action, and your code only avoids mktemp's security
issue _because_ mkfifo has that fortuitous aspect.


I don't understand your use of the word "fortuitous" here.  mkfifo is
defined to act that way according to POSIX.  I wrote the code that way
precisely because of that property.  I sometimes write code knowing that
adding two even numbers together results in an even answer.  I suppose
you might describe that as "fortuitous", but it's just things behaving
as they are defined to behave!


I mean here that your scheme isn't adaptable to a system call which will reuse 
an existing name. Of course, mkfifo, mkdir and open(.., O_EXCL) all have this 
nice feature.



Secondly, why is your example better than::
 os.mkfifo(os.path.join(mkdtemp(), 'myfifo'))


My way is not much better, but I think it is a little better because
your way I have to worry about deleting both the file and the directory
when I am done, and I have to get the permissions right on two
filesystem objects.  (If I can use a TemporaryDirectory() context
manager, the cleaning up part does get easier.)

And it also seems wasteful to me, given that the way mkdtemp() is
implemented is to generate a possible name, try creating it, and loop if
the mkdir() call fails.  (POSIX makes the same guarantee for mkdir() as
it does for mkfifo().)  Why not just let me do an equivalent loop
myself?
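
For comparison, the directory-based alternative with the context manager
mentioned above might look like this. It is a sketch for POSIX systems
only, since os.mkfifo is not available on Windows, and "myfifo" is just the
placeholder name used earlier in this exchange.

```python
import os
import stat
import tempfile

# TemporaryDirectory creates a private directory and removes it, with its
# contents, when the block exits.
with tempfile.TemporaryDirectory() as tmpdir:
    fifo_path = os.path.join(tmpdir, "myfifo")
    os.mkfifo(fifo_path, mode=0o600)
    is_fifo = stat.S_ISFIFO(os.stat(fifo_path).st_mode)
    # ... hand fifo_path to the reader and writer here ...

# After the block, both the fifo and its directory are gone.
still_there = os.path.exists(tmpdir)
```

This trades the hand-written retry loop for two filesystem objects, which
is exactly the trade-off being argued over here.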


Go ahead. But I think Ben's specifically trying to avoid writing his own loop.


On that basis, this example doesn't present a use case that can't be
addressed by mkstemp or mkdtemp.


Yes, if mktemp() were taken away from me, I could work around it.  I'm
just saying that in order to justify taking something like this away, it
has to be both below some threshold of utility and above some threshold
of dangerousness.  In the canonical case of gets() in C, not only is
fgets() almost a perfectly exact replacement for gets(), gets() is
insanely dangerous.  But the case of mktemp() doesn't seem to me to come
close to this combination of redundancy and danger.


You _do_ understand the security issue, yes? It sure looked like you did,
until here.


Well, it's always dangerous to say that you understand all the security
issues of anything.  In part that is why I wrote the code quoted above.
I am open to the possibility that there is a security problem here that
I haven't thought of.  But so far the only problem anybody has with it
is that you think there is something "fortuitous" about the way that it
works.


(As if that would be of any use in the
situation above!)  It looks like anxiety that some people might use
mktemp() in a stupid way has caused an over-reaction.


No, it is anxiety that mktemp's _normal_ use is inherently unsafe.


So are you saying that the way I used mktemp() above is _abnormal_?


In that you're not making a file. I mean "abnormal" in a statistical sense, and 
also in the "anticipated use case for mktemp's design". I'm not suggesting
you're wrong to use it like this.



[ Here I have removed some perfectly reasonable text describing the
  race condition in question -- yes I really do understand that. ]

This is neither weird nor even unlikely which is why mktemp is strongly
discouraged - naive (and standard) use is not safe.

That you have contrived a use case where you can _carefully_ use mktemp in
safety in no way makes mktemp recommendable.


OK, so you _do_ seem to be saying that I have used mktemp() in a
"contrived" and "non-standard" (and "non-naive"!) way.  I'm genuinely
surprised.  I thought I was just writing straightforward correct code and
demonstrating that this was a useful utility that it was not hard to use
safely.  You seem to think what I did is something that ordinary
programmers can not be expected to do.  Your judgement is definitely
different from mine!


No, I meant only that (a) mktemp is normally used for regular files and (b) 
that mkdtemp()/mkfifo() present equivalent results without hand making a 
pick-a-name loop. Of course any programmer should be able to read the mktemp()
spec and build from it.



And ultimately this does all boil down to making judgements.  It does
make sense to remove things from libraries that are safety hazards (like
gets() in C), I'm just trying to

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Ethan Furman


On 02/22/2016 02:25 PM, Cameron Simpson wrote:

On 22Feb2016 10:11, Ethan Furman  wrote:

On 02/14/2016 04:08 PM, Ben Finney wrote:


I am unconcerned with whether there is a real filesystem entry of that
name; the goal entails having no filesystem activity for this. I want a
valid unique filesystem path, without touching the filesystem.


This is impossible.  If you don't touch the file system you have no
way to know if the path is unique.


I think Ben wants to avoid filesystem modification (let us ignore atime
here). So one can read the filesystem to see what is current, but he
does not want to actually make any new filesystem entry.


Hmm -- well, he says "the goal entails having no filesystem activity for 
this", and seeing what already exists definitely requires file system 
activity . . .


--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Ben Finney

Ethan Furman  writes:

> On 02/14/2016 04:08 PM, Ben Finney wrote:
>
> > I am unconcerned with whether there is a real filesystem entry of that
> > name; the goal entails having no filesystem activity for this. I want a
> > valid unique filesystem path, without touching the filesystem.
>
> This is impossible.  If you don't touch the file system you have no
> way to know if the path is unique.

That was unclear. Later in the same thread, I clarified that by “unique”
I mean nothing about entries already on the filesystem. Instead it means
“unpredictably different each time the function is called”.
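
A minimal sketch of that reading of "unique", assuming Python 3.6 or later for the `secrets` module. The name `unpredictable_path` and its parameters are illustrative only and are not part of `tempfile`. Nothing here consults the filesystem, so it promises nothing about entries that already exist. The directory is passed in by the caller because `tempfile.gettempdir()` probes candidate directories the first time it is called, which counts as filesystem activity.

```python
import os
import secrets


def unpredictable_path(directory, prefix="tmp", nbytes=16):
    """Return a path under `directory` with a random final component.

    Pure string work: no stat, open, or mkdir call is made.
    """
    name = prefix + secrets.token_hex(nbytes)  # two hex digits per byte
    return os.path.join(directory, name)
```

Each call returns a different name with overwhelming probability, whether or not `directory` exists.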

-- 
 \  “It is difficult to get a man to understand something when his |
  `\   salary depends upon his not understanding it.” —Upton Sinclair, |
_o__) 1935 |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Cameron Simpson


On 22Feb2016 10:11, Ethan Furman  wrote:

On 02/14/2016 04:08 PM, Ben Finney wrote:


I am unconcerned with whether there is a real filesystem entry of that
name; the goal entails having no filesystem activity for this. I want a
valid unique filesystem path, without touching the filesystem.


This is impossible.  If you don't touch the file system you have no way to 
know if the path is unique.


I think Ben wants to avoid filesystem modification (let us ignore atime here).  
So one can read the filesystem to see what is current, but he does not want to 
actually make any new filesystem entry. 


Cheers,
Cameron Simpson 

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Marko Rauhamaa

Paul Rubin :

>>> http://www.2uo.de/myths-about-urandom/
>> Did you post the link because you agreed with the Web pamphlet?
>
> I don't know what web pamphlet you mean,

The only one linked above.

Cryptography is tricky business, indeed. I know enough about it not to
improvise too much. Infinitesimal weaknesses can make a difference
between feasible and unfeasible attacks.

> but the right thing to use now is getrandom(2).

getrandom(2) is a good interface that distinguishes between the flag
values

   0                            =>  /dev/urandom
   GRND_RANDOM                  =>  /dev/random
   GRND_RANDOM | GRND_NONBLOCK  =>  /dev/random (O_NONBLOCK)

However, although os.urandom() delegates to getrandom(), the
documentation suggests it uses the flag value 0 (/dev/urandom).
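
For reference, a small sketch of how that flag mapping looks from Python. It assumes Linux and Python 3.6 or later, where `os.getrandom()`, `os.GRND_RANDOM` and `os.GRND_NONBLOCK` are exposed; they were not in the standard library when this thread was written. The helper name `random_bytes` is hypothetical.

```python
import os


def random_bytes(n, blocking_pool=False):
    """Read up to n random bytes, preferring getrandom(2) where available."""
    if hasattr(os, "getrandom"):
        # flags=0 draws from the same source as /dev/urandom;
        # os.GRND_RANDOM selects the /dev/random source instead.
        flags = os.GRND_RANDOM if blocking_pool else 0
        # getrandom() may return fewer bytes than requested, so callers
        # that need exactly n bytes should check the length.
        return os.getrandom(n, flags)
    # Fallback for platforms without getrandom(2).
    return os.urandom(n)
```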

> The random/urandom interface was poorly designed and misleadingly
> documented.

It could be better I suppose, but I never found it particularly bad. The
nice thing about it is that it is readily usable in shell scripts.


Marko

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Paul Rubin

Marko Rauhamaa  writes:
>> http://www.2uo.de/myths-about-urandom/
> Did you post the link because you agreed with the Web pamphlet?

I don't know what web pamphlet you mean, but the right thing to use now
is getrandom(2).  The random/urandom interface was poorly designed and
misleadingly documented.

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Marko Rauhamaa

Random832 :

> On Mon, Feb 22, 2016, at 14:32, Marko Rauhamaa wrote:
>> urandom() is not quite random and so should not be considered
>> cryptographically airtight.
>> 
>> Under Linux, /dev/random is the way to go when strong security is
>> needed. Note that /dev/random is a scarce resource on ordinary
>> systems.
>
> http://www.2uo.de/myths-about-urandom/

Did you post the link because you agreed with the Web pamphlet?


Marko

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Random832

On Mon, Feb 22, 2016, at 14:32, Marko Rauhamaa wrote:
> urandom() is not quite random and so should not be considered
> cryptographically airtight.
> 
> Under Linux, /dev/random is the way to go when strong security is
> needed. Note that /dev/random is a scarce resource on ordinary systems.

http://www.2uo.de/myths-about-urandom/

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Chris Angelico

On Tue, Feb 23, 2016 at 6:22 AM, Jon Ribbens
 wrote:
>> Maybe, if everyone's cooperating. I'm not sure how they fare in the
>> face of malice though.
>
> Suppose you had code like this:
>
>   filename = binascii.hexlify(os.urandom(16)).decode("ascii")
>
> Do we really think that is insecure or that there are any practical
> attacks against it? It would be basically the same as saying that
> urandom() is broken, surely?

Sure, that would be safe. But UUIDs aren't necessarily based on "give
me sixteen bytes from urandom". They can involve
potentially-predictable information such as MAC addresses, current
time of day, and so on, which gives them significantly less
randomness. In that kind of usage, they're not intended to be
cryptographically secure.
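
The difference is visible from the `uuid` module itself. A short sketch: a version 1 UUID exposes the timestamp and node fields it was built from, while a version 4 UUID is built from random bytes. In CPython the node is the hardware address when one can be found and a random 48-bit number otherwise.

```python
import uuid

u1 = uuid.uuid1()  # built from a timestamp, a clock sequence and a node id
u4 = uuid.uuid4()  # built from random bytes

# The version 1 fields can be read straight back out of the value.
print(u1.version, hex(u1.node), u1.time)
print(u4.version)
```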

ChrisA

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Marko Rauhamaa

Jon Ribbens :

> Suppose you had code like this:
>
>   filename = binascii.hexlify(os.urandom(16)).decode("ascii")
>
> Do we really think that is insecure or that there are any practical
> attacks against it? It would be basically the same as saying that
> urandom() is broken, surely?

urandom() is not quite random and so should not be considered
cryptographically airtight.

Under Linux, /dev/random is the way to go when strong security is
needed. Note that /dev/random is a scarce resource on ordinary systems.


Marko

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Jon Ribbens

On 2016-02-22, Chris Angelico  wrote:
> On Tue, Feb 23, 2016 at 5:39 AM, Jon Ribbens
> wrote:
>> On 2016-02-22, Chris Angelico  wrote:
>>> On Tue, Feb 23, 2016 at 5:17 AM, Jon Ribbens
>>> wrote:
>>>> Weell, I have a lot of sympathy for that point, but on the other
>>>> hand the whole concept of UUIDs ("import uuid") is predicated on the
>>>> opposite assumption.
>>>
>>> Not quite opposite. Ethan is asserting that you cannot be *certain*
>>> without actually checking the FS; the point of UUIDs is that you can
>>> be fairly *confident* that there won't be a collision. There is a
>>> nonzero probability of accidental collisions, and if an attacker is
>>> deliberately trying to _force_ a collision, it's most definitely
>>> possible. So both views are correct.
>>
>> I was under the impression that the point of UUIDs is that you can be
>> *so* confident that there won't be a collision that for all practical
>> purposes it's indistinguishable from being certain.
>
> Maybe, if everyone's cooperating. I'm not sure how they fare in the
> face of malice though.

Suppose you had code like this:

  filename = binascii.hexlify(os.urandom(16)).decode("ascii")

Do we really think that is insecure or that there are any practical
attacks against it? It would be basically the same as saying that
urandom() is broken, surely?

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Paul Rubin

Chris Angelico  writes:
>> I was under the impression that the point of UUIDs is that you can be
>> *so* confident that there won't be a collision that for all practical
>> purposes it's indistinguishable from being certain.
> Maybe, if everyone's cooperating. I'm not sure how they fare in the
> face of malice though.

There are different UUID algorithms, some of which have useful syntax
but are easy to spoof.  Uuid4 is random and, if implemented properly,
should be hard to spoof.

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Chris Angelico

On Tue, Feb 23, 2016 at 5:39 AM, Jon Ribbens
 wrote:
> On 2016-02-22, Chris Angelico  wrote:
>> On Tue, Feb 23, 2016 at 5:17 AM, Jon Ribbens
>> wrote:
>>> Weell, I have a lot of sympathy for that point, but on the other
>>> hand the whole concept of UUIDs ("import uuid") is predicated on the
>>> opposite assumption.
>>
>> Not quite opposite. Ethan is asserting that you cannot be *certain*
>> without actually checking the FS; the point of UUIDs is that you can
>> be fairly *confident* that there won't be a collision. There is a
>> nonzero probability of accidental collisions, and if an attacker is
>> deliberately trying to _force_ a collision, it's most definitely
>> possible. So both views are correct.
>
> I was under the impression that the point of UUIDs is that you can be
> *so* confident that there won't be a collision that for all practical
> purposes it's indistinguishable from being certain.

Maybe, if everyone's cooperating. I'm not sure how they fare in the
face of malice though.

ChrisA

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Marko Rauhamaa

Jon Ribbens :

> I was under the impression that the point of UUIDs is that you can be
> *so* confident that there won't be a collision that for all practical
> purposes it's indistinguishable from being certain.

Yes, if you generate a random 128-bit number, it will be unique --
unless someone clones it.

Cloning will be a practical issue when you clone virtual machines, for
example.


Marko

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Jon Ribbens

On 2016-02-22, Chris Angelico  wrote:
> On Tue, Feb 23, 2016 at 5:17 AM, Jon Ribbens
> wrote:
>> Weell, I have a lot of sympathy for that point, but on the other
>> hand the whole concept of UUIDs ("import uuid") is predicated on the
>> opposite assumption.
>
> Not quite opposite. Ethan is asserting that you cannot be *certain*
> without actually checking the FS; the point of UUIDs is that you can
> be fairly *confident* that there won't be a collision. There is a
> nonzero probability of accidental collisions, and if an attacker is
> deliberately trying to _force_ a collision, it's most definitely
> possible. So both views are correct.

I was under the impression that the point of UUIDs is that you can be
*so* confident that there won't be a collision that for all practical
purposes it's indistinguishable from being certain.

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Jon Ribbens

On 2016-02-22, Ethan Furman  wrote:
> On 02/14/2016 04:08 PM, Ben Finney wrote:
>> I am unconcerned with whether there is a real filesystem entry of that
>> name; the goal entails having no filesystem activity for this. I want a
>> valid unique filesystem path, without touching the filesystem.
>
> This is impossible.  If you don't touch the file system you have no way 
> to know if the path is unique.

Weell, I have a lot of sympathy for that point, but on the other
hand the whole concept of UUIDs ("import uuid") is predicated on the
opposite assumption.

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Chris Angelico

On Tue, Feb 23, 2016 at 5:17 AM, Jon Ribbens
 wrote:
> On 2016-02-22, Ethan Furman  wrote:
>> On 02/14/2016 04:08 PM, Ben Finney wrote:
>>> I am unconcerned with whether there is a real filesystem entry of that
>>> name; the goal entails having no filesystem activity for this. I want a
>>> valid unique filesystem path, without touching the filesystem.
>>
>> This is impossible.  If you don't touch the file system you have no way
>> to know if the path is unique.
>
> Weell, I have a lot of sympathy for that point, but on the other
> hand the whole concept of UUIDs ("import uuid") is predicated on the
> opposite assumption.

Not quite opposite. Ethan is asserting that you cannot be *certain*
without actually checking the FS; the point of UUIDs is that you can
be fairly *confident* that there won't be a collision. There is a
nonzero probability of accidental collisions, and if an attacker is
deliberately trying to _force_ a collision, it's most definitely
possible. So both views are correct.
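
To put a number on "fairly confident", the usual birthday-bound approximation can be evaluated directly. This is a sketch: `collision_probability` is an illustrative helper, and the default of 122 bits is the count of random bits in a version 4 UUID (128 bits minus the fixed version and variant fields). It models accidental collisions only and says nothing about a deliberate attacker.

```python
import math


def collision_probability(n, bits=122):
    """Approximate chance that n random `bits`-bit values contain a duplicate."""
    # Birthday bound: p ~= 1 - exp(-n*(n-1) / 2**(bits+1)).
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))
```

For a billion random names the result is far below anything observable in practice, which is the sense in which both views above are correct.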

ChrisA

Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Ethan Furman


On 02/14/2016 04:08 PM, Ben Finney wrote:


I am unconcerned with whether there is a real filesystem entry of that
name; the goal entails having no filesystem activity for this. I want a
valid unique filesystem path, without touching the filesystem.


This is impossible.  If you don't touch the file system you have no way 
to know if the path is unique.


--
~Ethan~


Re: Make a unique filesystem path, without creating the file

2016-02-22 Thread Alan Bawden

Cameron Simpson  writes:

> On 16Feb2016 19:24, Alan Bawden  wrote:
>>So in the FIFO case, I might write something like the following:
>>
>>    def make_temp_fifo(mode=0o600):
>>        while True:
>>            path = tempfile.mktemp()
>>            try:
>>                os.mkfifo(path, mode=mode)
>>            except FileExistsError:
>>                pass
>>            else:
>>                return path
>>
>>So is there something wrong with the above code?  Other than the fact
>>that the documentation says something scary about mktemp()?
>
> Well, it has a few shortcomings.
>
> It relies on mkfifo reliably failing if the name exists. It sounds like
> mkfifo is reliable this way, but I can imagine analogous use cases without
> such a convenient core action, and your code only avoids mktemp's security
> issue _because_ mkfifo has that fortuitous aspect.

I don't understand your use of the word "fortuitous" here.  mkfifo is
defined to act that way according to POSIX.  I wrote the code that way
precisely because of that property.  I sometimes write code knowing that
adding two even numbers together results in an even answer.  I suppose
you might describe that as "fortuitous", but it's just things behaving
as they are defined to behave!

> Secondly, why is your example better than::
>
>  os.mkfifo(os.path.join(mkdtemp(), 'myfifo'))

My way is not much better, but I think it is a little better because
your way I have to worry about deleting both the file and the directory
when I am done, and I have to get the permissions right on two
filesystem objects.  (If I can use a TemporaryDirectory() context
manager, the cleaning up part does get easier.)
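
A sketch of that context-manager variant, assuming a POSIX system where `os.mkfifo()` is available. The name `temp_fifo` and its parameters are illustrative. The directory comes from `tempfile.TemporaryDirectory()`, which creates it readable and writable only by the owner, so the FIFO name inside it does not need to be unpredictable.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temp_fifo(name="myfifo", mode=0o600):
    """Yield the path of a FIFO inside a private temporary directory."""
    with tempfile.TemporaryDirectory() as dirpath:
        path = os.path.join(dirpath, name)
        os.mkfifo(path, mode=mode)
        yield path
    # Leaving the with-block removes the directory and the FIFO inside it.
```

Both filesystem objects are cleaned up in one step, at the cost of creating two of them.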

And it also seems wasteful to me, given that the way mkdtemp() is
implemented is to generate a possible name, try creating it, and loop if
the mkdir() call fails.  (POSIX makes the same guarantee for mkdir() as
it does for mkfifo().)  Why not just let me do an equivalent loop
myself?
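
For comparison, a sketch of such a loop written without `mktemp()`, generalised over the creation call. It assumes Python 3.6 or later for `secrets`; `create_unique` and its parameters are hypothetical names. Unlike the unbounded `while True` loop, it gives up after a fixed number of attempts, in the same spirit as the retry limit `tempfile.TMP_MAX` that the module applies to its own loops.

```python
import os
import secrets
import tempfile


def create_unique(create, directory=None, attempts=100):
    """Call create(path) on fresh random names until one does not exist.

    `create` must fail with FileExistsError when the name is taken,
    as os.mkdir() and os.mkfifo() do.
    """
    directory = directory or tempfile.gettempdir()
    for _ in range(attempts):
        path = os.path.join(directory, "tmp" + secrets.token_hex(8))
        try:
            create(path)
        except FileExistsError:
            continue  # name already taken, pick another
        return path
    raise FileExistsError("no unused name found in %r" % directory)
```

Usage would look like `create_unique(lambda p: os.mkfifo(p, 0o600))` for a FIFO, or `create_unique(lambda p: os.mkdir(p, 0o700))` for a directory.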

> On that basis, this example doesn't present a use case that can't be
> addressed by mkstemp or mkdtemp.

Yes, if mktemp() were taken away from me, I could work around it.  I'm
just saying that in order to justify taking something like this away, it
has to be both below some threshold of utility and above some threshold
of dangerousness.  In the canonical case of gets() in C, not only is
fgets() almost a perfectly exact replacement for gets(), gets() is
insanely dangerous.  But the case of mktemp() doesn't seem to me to come
close to this combination of redundancy and danger.

> You _do_ understand the security issue, yes? It sure looked like you did,
> until here.

Well, it's always dangerous to say that you understand all the security
issues of anything.  In part that is why I wrote the code quoted above.
I am open to the possibility that there is a security problem here that
I haven't thought of.  But so far the only problem anybody has with it
is that you think there is something "fortuitous" about the way that it
works.

>>(As if that would be of any use in the
>>situation above!)  It looks like anxiety that some people might use
>>mktemp() in a stupid way has caused an over-reaction.
>
> No, it is anxiety that mktemp's _normal_ use is inherently unsafe.

So are you saying that the way I used mktemp() above is _abnormal_?

> [ Here I have removed some perfectly reasonable text describing the
>   race condition in question -- yes I really do understand that. ]
>
> This is neither weird nor even unlikely, which is why mktemp is strongly
> discouraged - naive (and standard) use is not safe.
>
> That you have contrived a use case where you can _carefully_ use mktemp in
> safety in no way makes mktemp recommendable.

OK, so you _do_ seem to be saying that I have used mktemp() in a
"contrived" and "non-standard" (and "non-naive"!) way.  I'm genuinely
surprised.  I thought I was just writing straightforward correct code and
demonstrating that this was a useful utility that it was not hard to use
safely.  You seem to think what I did is something that ordinary
programmers can not be expected to do.  Your judgement is definitely
different from mine!

And ultimately this does all boil down to making judgements.  It does
make sense to remove things from libraries that are safety hazards (like
gets() in C), I'm just trying to argue that mktemp() isn't nearly
dangerous enough to deserve more than a warning in its documentation.
You don't agree.  Oh well...

Up until this point, you haven't said anything that I actually think is
flat out wrong, we just disagree about what tools it is reasonable to
take away from _all_ programmers just because _some_ programmers might
use them to make a mess.

> In fact your use case isn't safe, because _another_ task using mktemp
> in conflict as a plain old temporary file may grab your fifo.

But here, in this very last sentence, I really must disagree.  If the code I
wrote above is "unsafe" because some _other_ process might be using
mktemp() badly and stumble over

Re: Make a unique filesystem path, without creating the file

2016-02-21 Thread Cameron Simpson


On 16Feb2016 19:24, Alan Bawden  wrote:

Ben Finney  writes:

Cameron Simpson  writes:

I've been watching this for a few days, and am struggling to
understand your use case.


Yes, you're not alone. This surprises me, which is why I'm persisting.


Can you elaborate with a concrete example and its purpose which would
work with a mktemp-ish official function?


An example::


Let me present another example that might strike some as more
straightforward.

If I want to create a temporary file, I can call mkstemp().
If I want to create a temporary directory, I can call mkdtemp().

Suppose that instead of a file or a directory, I want a FIFO or a
socket.

A FIFO is created by passing a pathname to os.mkfifo().  A socket is
created by passing a pathname to an AF_UNIX socket's bind() method.  In
both cases, the pathname must not name anything yet (not even a symbolic
link), otherwise the call will fail.

So in the FIFO case, I might write something like the following:

   def make_temp_fifo(mode=0o600):
       while True:
           path = tempfile.mktemp()
           try:
               os.mkfifo(path, mode=mode)
           except FileExistsError:
               pass
           else:
               return path

mktemp() is convenient here, because I don't have to worry about whether
I should be using "/tmp" or "/var/tmp" or "c:\temp", or whether the
TMPDIR environment variable is set, or whether I have permission to
create entries in those directories.  It just gives me a pathname
without making me think about the rest of that stuff.


Yes, that is highly desirable.


Yes, I have to
defend against the possibility that somebody else creates something with
the same name first, but as you can see, I did that, and it wasn't
rocket science.

So is there something wrong with the above code?  Other than the fact
that the documentation says something scary about mktemp()?


Well, it has a few shortcomings.

It relies on mkfifo reliably failing if the name exists. It sounds like mkfifo
is reliable this way, but I can imagine analogous use cases without such a 
convenient core action, and your code only avoids mktemp's security issue 
_because_ mkfifo has that fortuitous aspect.



It looks to me like mktemp() provides some real utility, packaged up in
a way that is orthogonal to the type of file system entry I want to
create, the permissions I want to give to that entry, and the mode I
want to use to open it.  It looks like a useful, albeit low-level,
primitive that it is perfectly reasonable for the tempfile module to
supply.


Secondly, why is your example better than::

 os.mkfifo(os.path.join(mkdtemp(), 'myfifo'))

On that basis, this example doesn't present a use case that can't be addressed
by mkstemp or mkdtemp.


By contrast, Ben's example does look like it needs something like mktemp.


And yet the documentation condemns it as "deprecated", and tells me I
should use mkstemp() instead.


You _do_ understand the security issue, yes? It sure looked like you did, until
here. 


(As if that would be of any use in the
situation above!)  It looks like anxiety that some people might use
mktemp() in a stupid way has caused an over-reaction.


No, it is anxiety that mktemp's _normal_ use is inherently unsafe.


Let the
documentation warn about the problem and point to prepackaged solutions
in the common cases of making files and directories, but I see no good
reason to deprecate this useful utility.


I think it is like C's gets() function (albeit not as dangerous). It really 
shouldn't be used.


One of the things about mktemp() is its raciness, which is the core of the 
security issue. People look at the term "security issue" and think "Ah, it can 
be attacked." But the flipside is that it is simply unreliable. Its normal use 
was to make an ordinary temp file. Consider the case where two instances of the 
same task are running at the same time, doing that. They can easily, by 
accident, end up using the same scratch file!


This is by no means unlikely; any shell script running tasks in parallel can 
arrange it, any procmail script filing a message with a "copy" rule (which 
causes procmail simply to fork and proceed), etc.


This is neither weird nor even unlikely, which is why mktemp is strongly
discouraged - naive (and standard) use is not safe.


That you have contrived a use case where you can _carefully_ use mktemp in 
safety in no way makes mktemp recommendable. In fact your use case isn't safe, 
because _another_ task using mktemp in conflict as a plain old temporary file 
may grab your fifo.
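
The race described above can be shown side by side with the race-free alternative. A sketch only; `racy_scratch_file` and `create_exclusive` are illustrative names. The first function is the naive pattern: the name is chosen at one moment and the file opened at a later one. The second relies on `O_CREAT | O_EXCL`, the same flags mkstemp() uses, so that checking and creating happen in a single step.

```python
import os
import tempfile


def racy_scratch_file():
    """The pattern being warned against: name chosen now, file opened later."""
    path = tempfile.mktemp()
    # If another task created `path` in between, this silently reuses it.
    return open(path, "w"), path


def create_exclusive(path, mode=0o600):
    """Create `path` and return a file object, failing if it already exists."""
    # O_EXCL with O_CREAT makes "check and create" one atomic operation,
    # so two tasks racing for the same name cannot both succeed.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    return os.fdopen(fd, "w")
```

A second call to `create_exclusive()` with the same path raises FileExistsError instead of handing back the first caller's file.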


Cheers,
Cameron Simpson 

Re: Make a unique filesystem path, without creating the file

2016-02-17 Thread Oscar Benjamin

On 16 February 2016 at 19:40, Ben Finney  wrote:
> Oscar Benjamin  writes:
>
>> If you're going to patch open to return a fake file when asked to open
>> fake_file_path why do you care whether there is a real file of that
>> name?
>
> I don't, and have been saying explicitly many times in this thread that
> I do not care whether the file exists. Somehow that is still not clear?

Sorry Ben I misunderstood. I think I can see the source of confusion
which is in your first message:
"""
In some code (e.g. unit tests) I am calling ‘tempfile.mktemp’ to
generate a unique path for a filesystem entry that I *do not want* to
exist on the real filesystem.
"""
I read that as meaning that it was important that the file did not
exist. But you say that you don't care if the file actually exists in
the filesystem or not and just want a unique path.

What do you mean by unique here? The intention of mktemp is that the
path is unique so that there would not exist a file of that name and
if you opened it for writing you wouldn't be interfering with any
existing file. Do you just mean a function that returns a different
value each time it's called? How about this:

    count = 0
    def unique_path():
        global count
        count += 1
        return os.path.join(tempfile.gettempdir(), str(count))

--
Oscar

Re: Make a unique filesystem path, without creating the file

2016-02-16 Thread Alan Bawden

Ben Finney  writes:

> Cameron Simpson  writes:
>
>> I've been watching this for a few days, and am struggling to
>> understand your use case.
>
> Yes, you're not alone. This surprises me, which is why I'm persisting.
>
>> Can you elaborate with a concrete example and its purpose which would
>> work with a mktemp-ish official function?
>
> An example::

Let me present another example that might strike some as more
straightforward.

If I want to create a temporary file, I can call mkstemp().  
If I want to create a temporary directory, I can call mkdtemp().

Suppose that instead of a file or a directory, I want a FIFO or a
socket.

A FIFO is created by passing a pathname to os.mkfifo().  A socket is
created by passing a pathname to an AF_UNIX socket's bind() method.  In
both cases, the pathname must not name anything yet (not even a symbolic
link), otherwise the call will fail.

So in the FIFO case, I might write something like the following:

    def make_temp_fifo(mode=0o600):
        while True:
            path = tempfile.mktemp()
            try:
                os.mkfifo(path, mode=mode)
            except FileExistsError:
                pass
            else:
                return path

mktemp() is convenient here, because I don't have to worry about whether
I should be using "/tmp" or "/var/tmp" or "c:\temp", or whether the
TMPDIR environment variable is set, or whether I have permission to
create entries in those directories.  It just gives me a pathname
without making me think about the rest of that stuff.  Yes, I have to
defend against the possibility that somebody else creates something with
the same name first, but as you can see, I did that, and it wasn't
rocket science.

So is there something wrong with the above code?  Other than the fact
that the documentation says something scary about mktemp()?

It looks to me like mktemp() provides some real utility, packaged up in
a way that is orthogonal to the type of file system entry I want to
create, the permissions I want to give to that entry, and the mode I
want to use to open it.  It looks like a useful, albeit low-level,
primitive that it is perfectly reasonable for the tempfile module to
supply.
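
The same pattern carries over to the socket case mentioned earlier. A sketch under the assumption of a POSIX system with AF_UNIX support, where `bind()` on a pathname that already exists fails with EADDRINUSE; the function names are illustrative. It uses the deprecated `tempfile.mktemp()` exactly as the FIFO example does, and is safe only because `bind()` refuses an existing name.

```python
import errno
import os
import socket
import tempfile


def make_temp_unix_socket():
    """Bind an AF_UNIX socket to a fresh temporary pathname."""
    while True:
        path = tempfile.mktemp()
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.bind(path)  # creates the filesystem entry, or fails
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                raise  # a real error, not a name collision
        else:
            return sock, path


def close_and_unlink(sock, path):
    """Closing a bound socket leaves its pathname behind; remove both."""
    sock.close()
    os.unlink(path)
```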

And yet the documentation condemns it as "deprecated", and tells me I
should use mkstemp() instead.  (As if that would be of any use in the
situation above!)  It looks like anxiety that some people might use
mktemp() in a stupid way has caused an over-reaction.  Let the
documentation warn about the problem and point to prepackaged solutions
in the common cases of making files and directories, but I see no good
reason to deprecate this useful utility.

-- 
Alan Bawden

Re: Make a unique filesystem path, without creating the file

2016-02-16 Thread Ben Finney

Steven D'Aprano  writes:

> On Tue, 16 Feb 2016 04:56 pm, Ben Finney wrote:
>
> > names = tempfile._get_candidate_names()
>
> I'm not sure that calling a private function of the tempfile module is
> better than calling a deprecated function.

Agreed, which is why I'm seeking a public API that is not deprecated.

> So why not just pick a random bunch of characters?
>
> chars = list(string.ascii_letters)
> random.shuffle(chars)
> fake_file_path = ''.join(chars[:10])

This (an equivalent) is already implemented, internally to ‘tempfile’
and tested and maintained and more robust than me re-inventing the wheel.

> Yes, but the system doesn't try to enforce the filesystem's rules,
> does it?

The test case I'm writing should not be prone to failure if the system
happens to perform some arbitrary validation of filesystem paths.

‘tempfile’ already knows how to generate filesystem paths, I want to use
that and not have to get it right myself.

> and your system shouldn't care.

If it does, this test case should not fail.

> Since your test doesn't know what filesystem your code will be running
> on, you can't make any assumptions about what paths are valid or not
> valid.

That implies that ‘tempfile._get_candidate_names’ would generate paths
that would potentially be invalid. Is that what you intend to imply?

> > Almost. I want the filesystem paths to be valid because the system
> > under test expects them, it may perform its own validation,
>
> If the system tries to validate paths, it is broken.

This is “you don't want what you say you want”, and seeing the
justifications presented I don't agree.

-- 
 \ “I must say that I find television very educational. The minute |
  `\   somebody turns it on, I go to the library and read a book.” |
_o__)—Groucho Marx |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-16 Thread Ben Finney

Oscar Benjamin  writes:

> If you're going to patch open to return a fake file when asked to open
> fake_file_path why do you care whether there is a real file of that
> name?

I don't, and have been saying explicitly many times in this thread that
I do not care whether the file exists. Somehow that is still not clear?

-- 
 \   “Nothing exists except atoms and empty space; everything else |
  `\is opinion.” —Democritus, c. 460 BCE – 370 BCE |
_o__)  |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-16 Thread Steven D'Aprano

On Tue, 16 Feb 2016 04:56 pm, Ben Finney wrote:

> An example::
> 
>     import io
>     import os
>     import tempfile
>     names = tempfile._get_candidate_names()

I'm not sure that calling a private function of the tempfile module is
better than calling a deprecated function.

>     def test_frobnicates_configured_spungfile():
>         """ ‘foo’ should frobnicate the configured spungfile. """
>
>         fake_file_path = os.path.join(tempfile.gettempdir(), next(names))

At this point, you have a valid pathname, but no guarantee whether it refers
to a real file on the file system or not. That's the whole problem with
tempfile.makepath -- it can return a file name which is not in use, but by
the time it returns to you, you cannot guarantee that it still doesn't
exist.

Now, since this is a test which doesn't actually open that file, it doesn't
matter. There's no actual security vulnerability here. So your test doesn't
actually require that the file is unique, or that it doesn't actually
exist. (Which is good, because you can't guarantee that it doesn't exist.)

So why not just pick a random bunch of characters?

import random
import string

chars = list(string.ascii_letters)
random.shuffle(chars)
fake_file_path = ''.join(chars[:10])

> fake_file = io.BytesIO("Lorem ipsum, dolor sit amet".encode("utf-8"))
>
> patch_builtins_open(
> when_accessing_path=fake_file_path,
> provide_file=fake_file)

There's nothing apparent in this that requires that fake_file_path not
actually exist, which is good since (as I've pointed out before) you cannot
guarantee that it doesn't exist. One could just as easily, and just as
correctly, write:

patch_builtins_open(
when_accessing_path='/foo/bar/baz',
provide_file=fake_file)

and regardless of whether /foo/bar/baz actually exists or not, you are
guaranteed to get the fake file rather than the real file. So I question
whether you actually need this tempfile.makepath function at all.

*But* having questioned it, for the sake of the argument I'll assume you do
need it, and continue accordingly.

> system_under_test.config.spungfile_path = fake_file_path
> system_under_test.foo()
> assert_correctly_frobnicated(fake_file)
> 
> So the test case creates a fake file, makes a valid filesystem path to
> associate with it, then patches the ‘open’ function so that it will
> return the fake file when that specific path is requested.
> 
> Then the test case alters the system under test's configuration, giving
> it the generated filesystem path for an important file. The test case
> then calls the function about which the unit test is asserting
> behaviour, ‘system_under_test.foo’. When that call returns, the test
> case asserts some properties of the fake file to ensure the system under
> test actually accessed that file.

Personally, I think it would be simpler and easier to understand if, instead
of patching open, you allowed the test to read and write real files:

file_path = '/tmp/spam'
system_under_test.config.spungfile_path = file_path
system_under_test.foo()
assert_correctly_frobnicated(file_path)
os.unlink(file_path)

In practice, I'd want to only unlink the file if the test passes. If it
fails, I'd want to look at the file to see why it wasn't frobnicated.
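
A minimal sketch of that real-file approach, under stated assumptions:
`frobnicate` is a hypothetical stand-in for `system_under_test.foo`, and
the fixed '/tmp/spam' is swapped for tempfile.mkstemp() so the example
cannot clash with an existing file. The file is removed only when the
assertion passes, so a failing run leaves it behind for inspection.

```python
import os
import tempfile


def frobnicate(path):
    # Hypothetical stand-in for system_under_test.foo(): upper-cases the file.
    with open(path) as f:
        text = f.read()
    with open(path, "w") as f:
        f.write(text.upper())


def run_test():
    # mkstemp() creates the file and returns an open descriptor plus its path.
    fd, file_path = tempfile.mkstemp(prefix="spam-", suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.write("lorem ipsum")

    frobnicate(file_path)

    with open(file_path) as f:
        result = f.read()
    assert result == "LOREM IPSUM", "left %s for inspection" % file_path

    # Reached only when the assertion passed; a failure keeps the file.
    os.unlink(file_path)
    return file_path
```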

I think that a correctly-working filesystem is a perfectly reasonable
prerequisite for the test, just like a working CPU, memory, power supply,
operating system and Python interpreter. You don't have to guard against
every imaginable failure ("fixme: test may return invalid results if the
speed of light changes by more than 0.0001%"), and you might as well take
advantage of real files for debugging. But that's my opinion, and if you
have another, that's your personal choice.

> With a supported standard library API for this – ‘tempfile.makepath’ for
> example – the generation of the filesystem path would change from four
> separate function calls, one of which is a private API::
> 
> names = tempfile._get_candidate_names()
> fake_file_path = os.path.join(tempfile.gettempdir(), names.next())
> 
> to a simple public function call::
> 
> fake_file_path = tempfile.makepath()

Nobody doubts that your use of tempfile.mktemp is legitimate for your
use-case. But it is *not* legitimate for the tempfile module, and it is a
mistake that it was added in the first place, hence the deprecation.
Assuming that your test suite needs this function, your test library, or
test suite, should provide that function, not tempfile. I believe it is
unreasonable to expect the tempfile module to keep a function which is a
security risk in the context of "temp files" just because it is useful for
some completely unrelated use-cases.

After all, your use of this doesn't actually have anything to do with
temporary files. It is a mocked *permanent* file, not a real temporary one.

> This whole thread began because I expected s

Re: Make a unique filesystem path, without creating the file

2016-02-16 Thread Oscar Benjamin

On 16 Feb 2016 05:57, "Ben Finney"  wrote:
>
> Cameron Simpson  writes:
>
> > I've been watching this for a few days, and am struggling to
> > understand your use case.
>
> Yes, you're not alone. This surprises me, which is why I'm persisting.
>
> > Can you elaborate with a concrete example and its purpose which would
> > work with a mktemp-ish official function?
>
> An example::
>
> import io
> import tempfile
> names = tempfile._get_candidate_names()
>
> def test_frobnicates_configured_spungfile():
> """ ‘foo’ should frobnicate the configured spungfile. """
>
> fake_file_path = os.path.join(tempfile.gettempdir(), names.next())
> fake_file = io.BytesIO("Lorem ipsum, dolor sit amet".encode("utf-8"))
>
> patch_builtins_open(
> when_accessing_path=fake_file_path,
> provide_file=fake_file)
>
> system_under_test.config.spungfile_path = fake_file_path
> system_under_test.foo()
> assert_correctly_frobnicated(fake_file)

If you're going to patch open to return a fake file when asked to open
fake_file_path why do you care whether there is a real file of that name?

--
Oscar

Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Ben Finney

Cameron Simpson  writes:

> I've been watching this for a few days, and am struggling to
> understand your use case.

Yes, you're not alone. This surprises me, which is why I'm persisting.

> Can you elaborate with a concrete example and its purpose which would
> work with a mktemp-ish official function?

An example::

    import io
    import os
    import tempfile
    names = tempfile._get_candidate_names()

    def test_frobnicates_configured_spungfile():
        """ ‘foo’ should frobnicate the configured spungfile. """

        fake_file_path = os.path.join(tempfile.gettempdir(), next(names))
        fake_file = io.BytesIO("Lorem ipsum, dolor sit amet".encode("utf-8"))

        patch_builtins_open(
                when_accessing_path=fake_file_path,
                provide_file=fake_file)

        system_under_test.config.spungfile_path = fake_file_path
        system_under_test.foo()
        assert_correctly_frobnicated(fake_file)

So the test case creates a fake file, makes a valid filesystem path to
associate with it, then patches the ‘open’ function so that it will
return the fake file when that specific path is requested.

Then the test case alters the system under test's configuration, giving
it the generated filesystem path for an important file. The test case
then calls the function about which the unit test is asserting
behaviour, ‘system_under_test.foo’. When that call returns, the test
case asserts some properties of the fake file to ensure the system under
test actually accessed that file.
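
The helper ‘patch_builtins_open’ is not defined anywhere in this post. One
possible shape for it, as a rough sketch only, uses ‘unittest.mock’ from the
standard library; the helper body and the path used below are assumptions
made for illustration, not the code from the actual test suite.

```python
import builtins
import io
from unittest import mock


def patch_builtins_open(when_accessing_path, provide_file):
    """Hypothetical helper: patch builtins.open and return the started patcher."""
    real_open = builtins.open

    def fake_open(path, *args, **kwargs):
        # Hand back the fake file only for the one configured path.
        if path == when_accessing_path:
            return provide_file
        return real_open(path, *args, **kwargs)

    patcher = mock.patch("builtins.open", side_effect=fake_open)
    patcher.start()
    return patcher


# Usage: the path never needs to exist on the real filesystem.
fake_file = io.BytesIO(b"Lorem ipsum, dolor sit amet")
patcher = patch_builtins_open("/nonexistent/spungfile", fake_file)
try:
    opened = open("/nonexistent/spungfile")
finally:
    patcher.stop()  # always restore the real open()
```

Returning the patcher lets the test case stop the patch during teardown, so
other tests see the real ‘open’ again.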

With a supported standard library API for this – ‘tempfile.makepath’ for
example – the generation of the filesystem path would change from four
separate function calls, one of which is a private API::

    names = tempfile._get_candidate_names()
    fake_file_path = os.path.join(tempfile.gettempdir(), next(names))

to a simple public function call::

    fake_file_path = tempfile.makepath()

This whole thread began because I expected such an API would exist.

> I don't see how it is useful to have a notion of a filepath at all
> in this case, and therefore I don't see why you would want a
> mktemp-like function available.

Because the system under test expects to be dealing with a filesystem,
including normal restrictions on filesystem paths.

The filesystem path needs to be valid because the test case isn't making
assertions about what the system does with invalid paths. A test case
should be very narrow in what it asserts so that the failure's cause is
as obvious as possible.

The filesystem path needs to be unpredictable to make sure we're not
using some hard-coded value; the test case asserts that the system under
test will access whatever file is named in the configuration.

The file object needs to be fake because the test case should not be
prone to irrelevant failures when the real filesystem isn't behaving as
expected; this test case makes assertions only about what
‘system_under_test.foo’ does internally, not what the filesystem does.

The system library functionality should be providing this because it's
*already implemented there* and well tested and maintained. It should be
in a public non-deprecated API because merely generating filesystem
paths is not a security risk.

> But.. then why a filesystem path at all in that case?

Because the system under test is expecting valid filesystem paths, and I
have no good reason to violate that constraint.

> Why use a filesystem as a reference at all?

An actual running filesystem is irrelevant to this inquiry.

I'm only wanting to use functionality, with the constraints I enumerated
earlier (already implemented in the standard library), to generate
filesystem paths.

> The only modes I can imagine for such a thing (a generated but unused
> filename) are:
>
>  checking that the name is syntactically valid, for whatever constraints
> you may have (but if you're calling an opaque mktemp-like function, is
> this feasible or remediable?)

Almost. I want the filesystem paths to be valid because the system under
test expects them, it may perform its own validation, and I have no good
reason to complicate the unit test by possibly supplying an invalid path
when that's not relevant to the test case.

>  generating test paths without using a real filesystem as a reference,
> but then you can't even use mktemp

I hadn't realised the filesystem was accessed by ‘tempfile.mktemp’, and
I apologise for the complication that entails.

I would prefer to access some standard public documented non-deprecated
function that internally uses ‘tempfile._get_candidate_names’ and
returns a new path each time.

> I think "the standard library clearly has this useful functionality
> implemented, but simultaneously warns strongly against its use" pretty
> much precludes this.

I hope to get that addressed with <https://bugs.python.org/issue26362>.

-- 
 \   “Timid men prefer the calm of despotism to the boisterous sea |
  `\of liberty.” —Thomas Jefferson |

Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Ben Finney

"Mario R. Osorio"  writes:

> I would create a RAM disk
> (http://www.cyberciti.biz/faq/howto-create-linux-ram-disk-filesystem/),
> generate all the path/files I want with any, or my own algorithm, run
> the tests, unmount it, destroy it, be happy ... Whats wrong with
> that??

It is addressing the problem at a different level. I am not asking about
writing a wrapper around the test suite, I am asking about an API to
generate filesystem paths.

Your solution is a fine response to a different question.

-- 
 \“Consider the daffodil. And while you're doing that, I'll be |
  `\  over here, looking through your stuff.” —Jack Handey |
_o__)  |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Mario R. Osorio

I would create a RAM disk 
(http://www.cyberciti.biz/faq/howto-create-linux-ram-disk-filesystem/), 
generate all the paths/files I want with any algorithm, or my own, run the 
tests, unmount it, destroy it, be happy ... What's wrong with that? AFAIK, RAM 
disks do not get logged, and even if they do, any "insecure" file created would 
also be gone.



On Sunday, February 14, 2016 at 4:46:42 PM UTC-5, Ben Finney wrote:
> Howdy all,
> 
> How should a program generate a unique filesystem path and *not* create
> the filesystem entry?
> 
> The 'tempfile.mktemp' function is strongly deprecated, and rightly so
> <https://docs.python.org/3/library/tempfile.html#tempfile.mktemp>
> because it leaves the program vulnerable to insecure file creation.
> 
> In some code (e.g. unit tests) I am calling 'tempfile.mktemp' to
> generate a unique path for a filesystem entry that I *do not want* to
> exist on the real filesystem. In this case the filesystem security
> concerns are irrelevant because there is no file.
> 
> The deprecation of that function is a concern still, because I don't
> want code that makes every conscientious reader need to decide whether
> the code is a problem. Instead the code should avoid rightly-deprecated
> APIs.
> 
> It is also prone to that API function disappearing at some point in the
> future, because it is explicitly and strongly deprecated.
> 
> So I agree with the deprecation, but the library doesn't appear to
> provide a replacement.
> 
> What standard library function should I be using to generate
> 'tempfile.mktemp'-like unique paths, and *not* ever create a real file
> by that path?
> 
> -- 
>  \"If you have the facts on your side, pound the facts. If you |
>   `\ have the law on your side, pound the law. If you have neither |
> _o__)   on your side, pound the table." --anonymous |
> Ben Finney

Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Rick Johnson

On Sunday, February 14, 2016 at 10:55:11 PM UTC-6, Steven D'Aprano wrote:
> If you want to guarantee that these faux pathnames can't
> leak out of your test suite and touch the file system,
> prepend an ASCII NUL to them. That will make it an illegal
> path on all file systems that I'm aware of.

Hmm, the unfounded fears in this thread are beginning to
remind me of a famous Black Sabbath song.

  Finished with "py tempfile",
  'cause it,
  couldn't help to,
  ease my mind.

  People think i'm insane,
  because,
  i want "faux paths",
  all the time.

  All day long i think of ways, 
  but nothing seems to,
  satisfy.

  Think i'll lose my mind,
  if i don't,
  find a py-module to,
  pacify.

  CAN YOU HELP ME?

  MAKE "FAUX PATHS" TODAY,

  OH YEAH...


Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Nobody

On Mon, 15 Feb 2016 15:28:27 +1100, Ben Finney wrote:

> The behaviour is already implemented in the standard library. What I'm
> looking for is a way to use it (not re-implement it) that is public API
> and isn't scolded by the library documentation.

So, basically you want (essentially) the exact behaviour of
tempfile.mktemp(), except without any mention of the (genuine) risks that
such a function presents?

I suspect that you'll have to settle for either a) using that function and
simply documenting the reasons why it isn't an issue in this particular
case, or b) re-implementing it (so that you can choose to avoid mentioning
the issue in its documentation).
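
For option b), a re-implementation can be very small. The following is only a
sketch: the name `make_fake_path`, the alphabet, the name length and the
default directory string are all assumptions chosen for illustration, not
the internals of the tempfile module. It builds the path purely by string
manipulation and never consults the filesystem, so it makes no promise
that the name is unused.

```python
import os
import random

# Assumed alphabet, chosen for illustration only.
_CHARS = "abcdefghijklmnopqrstuvwxyz0123456789_"


def make_fake_path(prefix="tmp", suffix="", dir="/tmp", length=8):
    """Return a random, valid-looking path without touching the filesystem."""
    name = "".join(random.choice(_CHARS) for _ in range(length))
    # os.path.join() only manipulates strings; nothing is created or checked.
    return os.path.join(dir, prefix + name + suffix)


example = make_fake_path(suffix=".conf", dir="/nonexistent")
```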

At the outside, you *might* have a third option: c) persuade the
maintainers to tweak the documentation to further clarify that the risk
arises from creating a file with the returned name, not from simply
calling the function. But actually it's already fairly clear if you
actually read it.

If it's the bold-face "Warning:" and the red background that you don't
like, I wouldn't expect those to go away either for mktemp() or for any
other function with similar behaviour (i.e. something which someone
*might* try to use to actually create temporary files). The simple fact
that it might get used that way is enough to warrant a prominent warning.


Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Ben Finney

Roel Schroeven  writes:

> Use uuid.uuid1()?

That has potential. A little counter-intuitive, for use in documentation
about testing filesystem paths; but not frightening or dubious to the
conscientious reader.

I'll see whether that meets this use case, thank you.

The bug report (to make a supported ‘tempfile’ API for generating
filesystem paths only) remains, and fixing that would be the correct way
to address this IMO.

-- 
 \  “I used to be a proofreader for a skywriting company.” —Steven |
  `\Wright |
_o__)  |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Thomas 'PointedEars' Lahn

Gregory Ewing wrote:

> Ben Finney wrote:
>> One valid filesystem path each time it's accessed. That is, behaviour
>> equivalent to ‘tempfile.mktemp’.
>> 
>> My question is because the standard library clearly has this useful
>> functionality implemented, but simultaneously warns strongly against its
>> use.
> 
> But it *doesn't*,

Yes, it does.

> if your requirement is truly to not touch the filesystem at all, because
> tempfile.mktemp() *reads* the file system to make sure the name it's
> returning isn't in use.

But there is a race between the moment the filesystem is read and the 
moment the name is used: another user can create a file with that name in 
between.  Hence the deprecation in favor of tempfile.mkstemp(), which also 
*creates* the file, and the warning about the security hole if 
tempfile.mktemp() is used anyway.

You can use tempfile.mktemp() only as long as it is irrelevant if a file 
with that name already exists, or exists later but was not created by you.
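
To illustrate why creating the file closes that window, here is a small 
sketch (the helper name `create_exclusively` is made up for this example). 
Opening with O_CREAT and O_EXCL makes "check that nothing is there" and 
"create it" one atomic step, so a second creator is refused instead of 
silently sharing the file.

```python
import os
import tempfile


def create_exclusively(path):
    """Create path, failing if anything already exists there."""
    # O_CREAT | O_EXCL: the existence check and the creation are one step.
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    return os.open(path, flags, 0o600)


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "example")
    fd = create_exclusively(path)      # first creation succeeds
    os.close(fd)
    try:
        create_exclusively(path)       # a second attempt is refused
    except FileExistsError:
        collided = True
    else:
        collided = False
```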

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.

Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Cameron Simpson

On 15Feb2016 12:19, Ben Finney  wrote:

> Dan Sommers  writes:
>> On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote:
>> > I am unconcerned with whether there is a real filesystem entry of
>> > that name; the goal entails having no filesystem activity for this.
>> > I want a valid unique filesystem path, without touching the
>> > filesystem.
>>
>> That's an odd use case.
>
> It's very common to want filesystem paths divorced from accessing a
> filesystem entry.
>
> For example: test paths in a unit test. Filesystem access is orders of
> magnitude slower than accessing fake files in memory only, it is more
> complex and prone to irrelevant failures. So in such a test case
> filesystem access should be avoided as unnecessary.

But.. then why a filesystem path at all in that case? Why use a filesystem as a 
reference at all?

I've been watching this for a few days, and am struggling to understand your 
use case.

The only modes I can imagine for such a thing (a generated but unused filename) 
are:

 1. checking that the name is syntactically valid, for whatever constraints 
    you may have (but if you're calling an opaque mktemp-like function, is 
    this feasible or remediable?)

 2. checking that the name generated does in fact not correspond to an 
    existing file (which presumes that the target directory has no other 
    users, which also implies that you don't need mktemp - a simple 
    prefix+unused-ordinal will do)

 3. generating test paths using a real filesystem as a reference but not 
    making a test file - I'm having trouble imagining how this can be useful

 4. generating test paths without using a real filesystem as a reference, 
    but then you can't even use mktemp

I think I can contrive your test case scenario using #3:

 filepath = mktemp(dir=existing_dir_path)
 fp = InMemoryFileLikeClassWithBogusName(filepath)
 do I/O on fp ...

but I don't see how it is useful to have a notion of a filepath at all in this 
case, and therefore I don't see why you would want a mktemp-like function 
available. Can you elaborate with a concrete example and its purpose which 
would work with a mktemp-ish official function?

You say:

> One valid filesystem path each time it's accessed. That is, behaviour
> equivalent to ‘tempfile.mktemp’.
>
> My question is because the standard library clearly has this useful
> functionality implemented, but simultaneously warns strongly against its
> use.
>
> I'm looking for how to get at that functionality in a non-deprecated
> way, without re-implementing it myself.

I think "the standard library clearly has this useful functionality 
implemented, but simultaneously warns strongly against its use" pretty much 
precludes this.

I think you probably need to reimplement. However if your intent is never to 
use the path you can use something very simple (my personal habit is 
prefix+ordinal where that doesn't already exist - keep the last ordinal to 
arrange a distinct name next time).
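
A rough sketch of that prefix+ordinal habit, for the case where the path is 
never actually used; the function name, prefix and directory below are 
invented for the example. The generator remembers the last ordinal, so each 
call yields a distinct name without looking at the filesystem.

```python
import itertools
import os


def ordinal_paths(dirpath, prefix="testfile-"):
    """Yield dirpath/prefix0, dirpath/prefix1, ... using string operations only."""
    for n in itertools.count():
        yield os.path.join(dirpath, "%s%d" % (prefix, n))


paths = ordinal_paths("/nonexistent/dir")
first = next(paths)
second = next(paths)
```

If the names must also be unused on a real filesystem, each candidate would 
need an existence check, which brings back the race discussed elsewhere in 
this thread.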

Cheers,
Cameron Simpson 

Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Roel Schroeven


Ben Finney wrote on 2016-02-14 22:46:

> How should a program generate a unique filesystem path and *not* create
> the filesystem entry?
>
> ...
>
> What standard library function should I be using to generate
> ‘tempfile.mktemp’-like unique paths, and *not* ever create a real file
> by that path?


Use uuid.uuid1()?

--
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
  -- Isaac Asimov

Roel Schroeven


Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Grant Edwards

On 2016-02-15, Ben Finney  wrote:
> Dan Sommers  writes:
>
>> On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote:
>>
>> > I am unconcerned with whether there is a real filesystem entry of
>> > that name; the goal entails having no filesystem activity for this.
>> > I want a valid unique filesystem path, without touching the
>> > filesystem.
>>
>> That's an odd use case.
>
> It's very common to want filesystem paths divorced from accessing a
> filesystem entry.

If the filesystem paths are not associated with a filesystem, what do
you mean by "unique"?  You want to make sure that path 
which doesn't exist in some filesystem is different from all other
paths that don't exist in some filesystem?

> For example: test paths in a unit test. Filesystem access is orders
> of magnitude slower than accessing fake files in memory only,

How is "fake files in memory" not a filesystem?

-- 
Grant Edwards (grant.b.edwards at gmail.com)
Yow! The Korean War must have been fun.

Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Grant Edwards

On 2016-02-14, Ben Finney  wrote:
> Howdy all,
>
> How should a program generate a unique filesystem path and *not* create
> the filesystem entry?

Short answer: you can't because it's the filesystem entry operation
that is atomic and guarantees uniqueness.

> [..]

> What standard library function should I be using to generate
> ‘tempfile.mktemp’-like unique paths, and *not* ever create a real file
> by that path?

What's the point of creating a unique path if you don't want to create
the file?

-- 
Grant Edwards (grant.b.edwards at gmail.com)
Yow! I'm rated PG-34!!

Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Gregory Ewing


Ben Finney wrote:


> The existing behaviour of ‘tempfile.mktemp’ – actually of its internal
> class ‘tempfile._RandomNameSequence’ – is to generate unpredictable,
> unique, valid filesystem paths that are different each time.


But that's not documented behaviour, so even if mktemp()
weren't marked as deprecated, you'd still be relying on
undocumented and potentially changeable behaviour.


> What I'm looking for is a way to use it (not re-implement it) that is
> public API and isn't scolded by the library documentation.


Then you're looking for something that doesn't exist,
I'm sorry to say, and it's unlikely you'll persuade
anyone to make it exist.

If you want to leverage stdlib functionality for this,
I'd suggest something along the lines of:

  import os
  import uuid

  def fakefilename(dir, ext):
      return os.path.join(dir, str(uuid.uuid4())) + ext

--
Greg

Re: Make a unique filesystem path, without creating the file

2016-02-15 Thread Gregory Ewing


Ben Finney wrote:

> One valid filesystem path each time it's accessed. That is, behaviour
> equivalent to ‘tempfile.mktemp’.
>
> My question is because the standard library clearly has this useful
> functionality implemented, but simultaneously warns strongly against its
> use.


But it *doesn't*, if your requirement is truly to not touch
the filesystem at all, because tempfile.mktemp() *reads* the
file system to make sure the name it's returning isn't
in use.

What's more, because you're *not* creating the file, mktemp()
would be within its rights to return the same file name the
second time you call it.

If you want something that really doesn't go near the file
system and/or is guaranteed to produce multiple different
non-existing file names, you'll have to write it yourself.

--
Greg

Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Ben Finney

Steven D'Aprano  writes:

> If you can absolutely guarantee that this string will never actually
> be used on a real filesystem, then go right ahead and use it.

I'm giving advice in examples in documentation. It's not enough to have
some private usage that I know is good, I am looking for a standard API
that when the reader looks it up will not be laden with big scary
warnings.

Currently I can write about the public API ‘tempfile.mktemp’ in
documentation, but the conscientious reader will be correct to have
concerns when the examples I give are sternly deprecated in the standard
library documentation.

Or I can write about the private API ‘tempfile._RandomNameSequence’ in
the documentation, and the conscientious reader will be correct to have
concerns about use of an undocumented private-use API.

I'm looking for a way to give examples that use that standard library
functionality, with an API that is both public and not discouraged.

> > I'm looking for how to get at that functionality in a non-deprecated
> > way, without re-implementing it myself.
>
> You probably can't, not if you want to future-proof your code against
> the day when tempfile.mktemp is removed.

That's disappointing. It is already implemented and well-tested, it is
useful as is. Forking and duplicating it is poor practice if it can
simply be used in a standard place.

I have reported <https://bugs.python.org/issue26362> for this
request.

-- 
 \ “Nothing worth saying is inoffensive to everyone. Nothing worth |
  `\saying will fail to make you enemies. And nothing worth saying |
_o__)will not produce a confrontation.” —Johann Hari, 2011 |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Steven D'Aprano

On Monday 15 February 2016 12:19, Ben Finney wrote:

> One valid filesystem path each time it's accessed. That is, behaviour
> equivalent to ‘tempfile.mktemp’.
> 
> My question is because the standard library clearly has this useful
> functionality implemented, but simultaneously warns strongly against its
> use.

If you can absolutely guarantee that this string will never actually be used 
on a real filesystem, then go right ahead and use it. There's nothing wrong 
with (for instance) calling mktemp to generate *strings* that merely *look* 
like pathnames.

If you want to guarantee that these faux pathnames can't leak out of your 
test suite and touch the file system, prepend an ASCII NUL to them. That 
will make it an illegal path on all file systems that I'm aware of.
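
A small sketch of the NUL trick, assuming Python 3 (the helper name `poison` 
and the example path are made up). Python 3 itself refuses to pass a path 
with an embedded null byte to the operating system, so an accidental open() 
fails loudly instead of touching a real file.

```python
def poison(path):
    """Prefix a NUL so the string cannot name a real filesystem entry."""
    return "\0" + path


fake_path = poison("/tmp/never-created")
try:
    open(fake_path)
except ValueError:
    # Python 3 rejects paths containing an embedded null byte.
    rejected = True
else:
    rejected = False
```

The obvious cost is that any code under test which validates its paths will 
reject these strings too.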

> I'm looking for how to get at that functionality in a non-deprecated
> way, without re-implementing it myself.

You probably can't, not if you want to future-proof your code against the 
day when tempfile.mktemp is removed.

But you can simply fork that module, delete all the irrelevant bits, and 
make the mktemp function a private utility in your test suite.

-- 
Steve


Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Martin A. Brown


Good evening/morning Ben,

>> > I am unconcerned with whether there is a real filesystem entry of
>> > that name; the goal entails having no filesystem activity for this.
>> > I want a valid unique filesystem path, without touching the
>> > filesystem.
>>
>> Your phrasing is ambiguous.
>
>The existing behaviour of ‘tempfile.mktemp’ – actually of its 
>internal class ‘tempfile._RandomNameSequence’ – is to generate 
>unpredictable, unique, valid filesystem paths that are different 
>each time.
>
>That's the behaviour I want, in a public API that exposes what 
>‘tempfile’ already has implemented, documented in a way that 
>doesn't create a scare about security.

If your code is not actually touching the filesystem, then it will 
not be affected by the race condition identified in the 
tempfile.mktemp() warning anyway.  So, I'm unsure of your worry.

>> But if you explain in more detail why you want this filename, perhaps
>> we can come up with some ideas that will help.
>
>The behaviour is already implemented in the standard library. What 
>I'm looking for is a way to use it (not re-implement it) that is 
>public API and isn't scolded by the library documentation.

I might also suggest the (bound) method _create_tmp() on class 
mailbox.Maildir, which achieves roughly the same goals, but for a 
permanent file.

Of course, that particular method also touches the filesystem.  The 
Maildir naming approach is based on the assumptions* that time is 
monotonically increasing, that system nodes never share the same 
name and that you don't need more than 1 uniquely named file per 
directory per millisecond.

If so, then you can use the 9 or 10 lines of that method.
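
As a rough illustration of that naming idea, here is a sketch that combines 
the clock, the process id, a per-process counter and the host name. It is 
only loosely modelled on the Maildir convention: the function name and the 
exact format string are assumptions for this example, not a copy of what 
mailbox.Maildir does, and unlike that method it never touches the 
filesystem.

```python
import itertools
import os
import socket
import time

# Per-process counter, so two names made in the same instant still differ.
_counter = itertools.count()


def maildir_style_name(now=None, hostname=None):
    """Build a name from the clock, the process id, a counter and the host name."""
    if now is None:
        now = time.time()
    if hostname is None:
        hostname = socket.gethostname()
    seconds = int(now)
    micros = int(now % 1 * 1e6)
    return "%d.M%dP%dQ%d.%s" % (
        seconds, micros, os.getpid(), next(_counter), hostname)
```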

Good luck,

-Martin

  * I was tempted to joke about these two guarantees, but I think 
that undermines my basic message.  To wit, you can probably rely 
on this naming technique about as much as you can rely on your 
system clock.  I'll assume that you aren't naming all of your 
nodes 'franklin.p.gundersnip'.

-- 
Martin A. Brown
http://linux-ip.net/

Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Ben Finney

Steven D'Aprano  writes:

> On Monday 15 February 2016 11:08, Ben Finney wrote:
>
> > I am unconcerned with whether there is a real filesystem entry of
> > that name; the goal entails having no filesystem activity for this.
> > I want a valid unique filesystem path, without touching the
> > filesystem.
>
> Your phrasing is ambiguous.

The existing behaviour of ‘tempfile.mktemp’ – actually of its internal
class ‘tempfile._RandomNameSequence’ – is to generate unpredictable,
unique, valid filesystem paths that are different each time.

That's the behaviour I want, in a public API that exposes what
‘tempfile’ already has implemented, documented in a way that doesn't
create a scare about security.

> But if you explain in more detail why you want this filename, perhaps
> we can come up with some ideas that will help.

The behaviour is already implemented in the standard library. What I'm
looking for is a way to use it (not re-implement it) that is public API
and isn't scolded by the library documentation.

-- 
 \ “Try adding “as long as you don't breach the terms of service – |
  `\  according to our sole judgement” to the end of any cloud |
_o__)  computing pitch.” —Simon Phipps, 2010-12-11 |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Steven D'Aprano

On Monday 15 February 2016 11:08, Ben Finney wrote:

> I am unconcerned with whether there is a real filesystem entry of that
> name; the goal entails having no filesystem activity for this. I want a
> valid unique filesystem path, without touching the filesystem.

Your phrasing is ambiguous.

If you are unconcerned whether or not a file of that name exists, then just 
pick a name and use that:

unique_path = '/tmp/foo'

is guaranteed to be valid on POSIX systems and unique, and it may or may not 
exist.

If you actually do care that /tmp/foo *doesn't* exist, then you have a 
problem: whatever name you pick *now* may no longer "not exist" a 
millisecond later. In general there's no way to create a valid pathname 
which doesn't exist *now* and is guaranteed to continue to not exist unless 
you touch the file system.
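
The race described above, and the usual remedy, can be sketched as follows. This is only an illustration: the scratch directory from `tempfile.TemporaryDirectory` is there so the demonstration cleans up after itself, and the names `workdir` and `path` are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "foo")

    # Racy: this answer can be stale by the time the program acts on it.
    existed_at_check = os.path.exists(path)

    # Atomic: O_EXCL makes the open fail if the entry already exists,
    # so the check and the creation happen in a single step.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    os.close(fd)

    # A second exclusive create of the same name is refused.
    try:
        os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        second_create_refused = False
    except FileExistsError:
        second_create_refused = True

print(existed_at_check, second_create_refused)
```

Only the exclusive create gives a guarantee, and it does so precisely by touching the filesystem.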

But if you explain in more detail why you want this filename, perhaps we can 
come up with some ideas that will help.

-- 
Steve


Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Ben Finney

Dan Sommers  writes:

> On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote:
>
> > I am unconcerned with whether there is a real filesystem entry of
> > that name; the goal entails having no filesystem activity for this.
> > I want a valid unique filesystem path, without touching the
> > filesystem.
>
> That's an odd use case.

It's very common to want filesystem paths divorced from accessing a
filesystem entry.

For example: test paths in a unit test. Filesystem access is orders of
magnitude slower than accessing fake files in memory only, it is more
complex and prone to irrelevant failures. So in such a test case
filesystem access should be avoided as unnecessary.
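
A sketch of such a test case, under some assumptions: `read_first_line` is a hypothetical function under test, the `/nonexistent/` prefix is arbitrary, and the fake file is supplied by `unittest.mock.mock_open` so no real file is ever opened.

```python
import unittest
import uuid
from unittest import mock


def read_first_line(path):
    """Hypothetical code under test: return the first line of a file."""
    with open(path) as infile:
        return infile.readline().rstrip("\n")


class ReadFirstLineTest(unittest.TestCase):
    def test_reads_from_fake_path(self):
        # A unique path that never exists on disk.
        fake_path = "/nonexistent/" + uuid.uuid4().hex
        fake_open = mock.mock_open(read_data="hello\nworld\n")
        with mock.patch("builtins.open", fake_open):
            result = read_first_line(fake_path)
        self.assertEqual(result, "hello")
        fake_open.assert_called_once_with(fake_path)


# Run the single test case explicitly so the sketch works when pasted
# into a script or an interactive session.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReadFirstLineTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

The path only has to be a plausible string that differs between tests; the mock never consults the disk.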

> If it's really just one valid filesystem path (your original post said
> *paths*, plural), then how about __file__? or os.__file__?

One valid filesystem path each time it's accessed. That is, behaviour
equivalent to ‘tempfile.mktemp’.

My question is because the standard library clearly has this useful
functionality implemented, but simultaneously warns strongly against its
use.

I'm looking for how to get at that functionality in a non-deprecated
way, without re-implementing it myself.

-- 
 \  “The most common way people give up their power is by thinking |
  `\   they don't have any.” —Alice Walker |
_o__)  |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Dan Sommers

On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote:

> I am unconcerned with whether there is a real filesystem entry of that
> name; the goal entails having no filesystem activity for this. I want
> a valid unique filesystem path, without touching the filesystem.

That's an odd use case.

If it's really just one valid filesystem path (your original post said
*paths*, plural), then how about __file__? or os.__file__?

Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Ben Finney

Matt Wheeler  writes:

> On 14 Feb 2016 21:46, "Ben Finney"  wrote:
> > What standard library function should I be using to generate
> > ‘tempfile.mktemp’-like unique paths, and *not* ever create a real
> > file by that path?
>
> Could you use tempfile.TemporaryDirectory and then just use a
> consistent name within that directory?

That fails because it touches the filesystem. I want to avoid using a
real file or a real directory.

> It's guaranteed not to exist

I am unconcerned with whether there is a real filesystem entry of that
name; the goal entails having no filesystem activity for this. I want a
valid unique filesystem path, without touching the filesystem.

-- 
 \ “I believe our future depends powerfully on how well we |
  `\ understand this cosmos, in which we float like a mote of dust |
_o__) in the morning sky.” —Carl Sagan, _Cosmos_, 1980 |
Ben Finney


Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Matt Wheeler

On 14 Feb 2016 21:46, "Ben Finney"  wrote:
> What standard library function should I be using to generate
> ‘tempfile.mktemp’-like unique paths, and *not* ever create a real file
> by that path?

Could you use tempfile.TemporaryDirectory and then just use a consistent
name within that directory?

It's guaranteed not to exist because the directory was only just created
and only you can write to it. It has the added bonus of still being
reasonably secure, to appease people like Mr PointedEars.

(If you need multiple nonexistent paths in the same dir then perhaps use
tempfile.NamedTemporaryFile with your newly created temp dir and an
arbitrary suffix, and strip the suffix off to get the name you actually
use.)
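
A rough sketch of the first suggestion, assuming an arbitrary entry name `data` inside the new directory. Note that it does create a real directory, so it touches the filesystem.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as private_dir:
    # The directory is freshly created, so any name inside it is
    # unused until this program creates it.
    candidate = os.path.join(private_dir, "data")
    exists_before_use = os.path.exists(candidate)

# Leaving the block removes the directory and everything in it.
dir_removed = not os.path.exists(private_dir)
print(exists_before_use, dir_removed)
```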

--
Matt Wheeler
http://funkyh.at

Re: Make a unique filesystem path, without creating the file

2016-02-14 Thread Thomas 'PointedEars' Lahn

Ben Finney wrote:

> How should a program generate a unique filesystem path and *not* create
> the filesystem entry?

The Python documentation suggests that it should not.

> The ‘tempfile.mktemp’ function is strongly deprecated, and rightly so
> <https://docs.python.org/3/library/tempfile.html#tempfile.mktemp>
> because it leaves the program vulnerable to insecure file creation.
> 
> In some code (e.g. unit tests) I am calling ‘tempfile.mktemp’ to
> generate a unique path for a filesystem entry that I *do not want* to
> exist on the real filesystem. In this case the filesystem security
> concerns are irrelevant because there is no file.

I do not think that you have properly understood the problems with 
tempfile.mktemp().

> […]
> It is also prone to that API function disappearing at some point in the
> future, because it is explicitly and strongly deprecated.
> 
> So I agree with the deprecation, but the library doesn't appear to
> provide a replacement.

| mktemp() usage can be replaced easily with NamedTemporaryFile(), passing 
| it the delete=False parameter: [example]

> What standard library function should I be using to generate
> ‘tempfile.mktemp’-like unique paths, and *not* ever create a real file
> by that path?

I do not think it is possible to avoid creating a real file using the 
Python standard library (PSL); in fact, creating the file appears to be 
precisely what fixes the problems with tempfile.mktemp(), because then 
nobody else can create a file with the same name at the same time:

| tempfile.NamedTemporaryFile(mode='w+b', buffering=None, encoding=None, 
| newline=None, suffix=None, prefix=None, dir=None, delete=True)
| 
| This function operates exactly as TemporaryFile() does, except that the 
| file is guaranteed to have a visible name in the file system (on Unix, the 
| directory entry is not unlinked). […] If delete is true (the default), the 
| file is deleted as soon as it is closed. […]

It is of course possible to generate a filename that is not currently used, 
but I am not aware of a PSL feature that does this, and if there were such a 
feature there would be the same problems with it as with mktemp().
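
As a sketch of the documented replacement (not the example from the documentation itself), the following reserves a name by actually creating the file. The `.txt` suffix and the variable names are arbitrary choices for illustration.

```python
import os
import tempfile

# delete=False keeps the file after close, so its name stays reserved.
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as tmp:
    tmp.write("payload\n")
    reserved_path = tmp.name

try:
    with open(reserved_path) as infile:
        content = infile.read()
finally:
    # With delete=False the caller is responsible for cleanup.
    os.remove(reserved_path)
```

The name is safe from collision only because the file exists for as long as the name is in use.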

-- 
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.

Make a unique filesystem path, without creating the file

2016-02-14 Thread Ben Finney

Howdy all,

How should a program generate a unique filesystem path and *not* create
the filesystem entry?

The ‘tempfile.mktemp’ function is strongly deprecated, and rightly so
<https://docs.python.org/3/library/tempfile.html#tempfile.mktemp>
because it leaves the program vulnerable to insecure file creation.

In some code (e.g. unit tests) I am calling ‘tempfile.mktemp’ to
generate a unique path for a filesystem entry that I *do not want* to
exist on the real filesystem. In this case the filesystem security
concerns are irrelevant because there is no file.

The deprecation of that function is a concern still, because I don't
want code that makes every conscientious reader need to decide whether
the code is a problem. Instead the code should avoid rightly-deprecated
APIs.

It is also prone to that API function disappearing at some point in the
future, because it is explicitly and strongly deprecated.

So I agree with the deprecation, but the library doesn't appear to
provide a replacement.

What standard library function should I be using to generate
‘tempfile.mktemp’-like unique paths, and *not* ever create a real file
by that path?

-- 
 \“If you have the facts on your side, pound the facts. If you |
  `\ have the law on your side, pound the law. If you have neither |
_o__)   on your side, pound the table.” —anonymous |
Ben Finney

