Re: Make a unique filesystem path, without creating the file
On 29Feb2016 00:47, Alan Bawden wrote:
> Cameron Simpson writes:
>> On 22Feb2016 12:34, Alan Bawden wrote:
> I have deleted the part of discussion where it seems that we must simply
> agree to disagree. You think mktemp() is _way_ more dangerous than I do.

I certainly think the habit of using it is. And thus we're off into the realm of risk assessment I suppose, where one's value sets greatly affect the outcome. But there are concrete arguments to be made about risks. To your other question...

> [...]
> In fact, mkstemp() also performs that same generate-and-open loop, and of
> course it is careful to use os.O_EXCL along with os.O_CREAT when it opens
> the file. So let me re-state my argument using mkstemp() instead:
>
> If the code I wrote in my original message is "unsafe" because some
> _other_ process might be using mktemp() badly and stumble over the same
> path, then the current implementation of tempfile.mkstemp() is also
> "unsafe" for exactly the same reason: some other process badly using
> mktemp() to create its own file might accidentally grab the same file.
>
> In other words, if that other process does:
>
>     path = mktemp()
>     tmpfp = open(path, "w")
>
> Then yes indeed, it might accidentally grab my fifo when I used my
> original code for making a temporary fifo. But it might _also_ succeed in
> grabbing any temporary files I make using tempfile.mkstemp()! So if you
> think what I wrote is "unsafe", it seems that you must conclude that the
> standard tempfile.mkstemp() is exactly as "unsafe".
>
> So is that what you think?

Yes and no? You're quite right that a task using mkstemp is not safe against a task misusing mktemp. _However_:

In a space where everyone uses mktemp, everyone is unsafe from collision.

In a space where everyone uses mkstemp, everyone is safe from collision.

So provided everyone "upgrades", safety is reliable without any added burden in program complexity.

Of course, that sidesteps the scenario where someone is using mktemp to obtain a pathname for a non-file, but I am of the opinion that in almost all such cases the programmer is better off using mkdtemp and making their non-file inside the temporary directory. Again, provided everyone "upgrades" to such a practice, safety is arranged.

Because of this, I think that _any_ use of mktemp invites risk of collision, and needs to be justified with a robust argument establishing that the problem cannot be solved with mkstemp or mkdtemp.

Your example was not such a case. Ben's is, in that (a) he needs a "valid" name and (b) he isn't going to make an actual filesystem object using the name obtained. As it happens it looks like the uuid generation functions from the stdlib may meet his needs, addressing his desire to do it simply with the stdlib instead of making his own wheel.

So I remain against mktemp without an outstandingly special use case.

Cheers,
Cameron Simpson
-- https://mail.python.org/mailman/listinfo/python-list
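The two alternatives Cameron describes can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names `make_private_fifo` and `unique_name`, the `example-` prefix and the default file name are all made up here, and `os.mkfifo` is only available on Unix-like systems.

```python
import os
import tempfile
import uuid


def make_private_fifo(name="fifo"):
    """Create a FIFO inside a fresh private directory.

    Returns (dirpath, fifopath). Hypothetical helper for illustration.
    """
    # mkdtemp() creates the directory itself, readable and writable only
    # by the creating user, so nobody else can plant a file under the
    # name we are about to use inside it.
    dirpath = tempfile.mkdtemp(prefix="example-")
    fifopath = os.path.join(dirpath, name)
    os.mkfifo(fifopath, 0o600)  # Unix only
    return dirpath, fifopath


def unique_name(suffix=""):
    """Return a name that is unique for practical purposes.

    Nothing is created on the filesystem. Hypothetical helper.
    """
    return uuid.uuid4().hex + suffix
```

The caller is responsible for removing the FIFO and the directory afterwards. The second helper covers the case where only a plausible, unused-looking name is wanted and no filesystem object is ever made from it.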
Re: Make a unique filesystem path, without creating the file
Cameron Simpson writes:
> On 22Feb2016 12:34, Alan Bawden wrote:

I have deleted the part of discussion where it seems that we must simply agree to disagree. You think mktemp() is _way_ more dangerous than I do.

>>> In fact your use case isn't safe, because _another_ task using mktemp
>>> in conflict as a plain old temporary file may grab your fifo.
>>
>> But here in very last sentence I really must disagree. If the code I
>> wrote above is "unsafe" because some _other_ process might be using
>> mktemp() badly and stumble over the same path, then the current
>> implementation of tempfile.mkdtemp() is also "unsafe" for exactly the
>> same reason: some other process using mktemp() badly to create its own
>> directory might accidentally grab the same directory.
>
> When the other task goes mkdir with the generated name it will fail, so no.

Quite right. I sabotaged my own argument by picking mkdtemp() instead of mkstemp(). I was trying to shorten my text by taking advantage of the fact that I had _already_ mentioned that mkdtemp() performs exactly the same generate-and-open loop as the code I had written. I apologize for the confusion.

In fact, mkstemp() also performs that same generate-and-open loop, and of course it is careful to use os.O_EXCL along with os.O_CREAT when it opens the file. So let me re-state my argument using mkstemp() instead:

If the code I wrote in my original message is "unsafe" because some _other_ process might be using mktemp() badly and stumble over the same path, then the current implementation of tempfile.mkstemp() is also "unsafe" for exactly the same reason: some other process badly using mktemp() to create its own file might accidentally grab the same file.

In other words, if that other process does:

    path = mktemp()
    tmpfp = open(path, "w")

Then yes indeed, it might accidentally grab my fifo when I used my original code for making a temporary fifo. But it might _also_ succeed in grabbing any temporary files I make using tempfile.mkstemp()!

So if you think what I wrote is "unsafe", it seems that you must conclude that the standard tempfile.mkstemp() is exactly as "unsafe".

So is that what you think?

-- Alan Bawden
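For readers who have not looked at it, the generate-and-open loop Alan refers to looks roughly like the sketch below. This is an illustration of the idea only, not the actual tempfile source: the function name `create_exclusive`, the `tmp` prefix, the uuid-based name and the retry count are all assumptions made for the example.

```python
import os
import tempfile
import uuid


def create_exclusive(dirpath=None, attempts=100):
    """Return (fd, path) for a file that this call itself created.

    Hypothetical helper; a sketch of the generate-and-open technique.
    """
    dirpath = dirpath or tempfile.gettempdir()
    for _ in range(attempts):
        # Step 1: generate a candidate name (it may already be taken).
        path = os.path.join(dirpath, "tmp" + uuid.uuid4().hex[:8])
        try:
            # Step 2: O_CREAT | O_EXCL makes the open fail if the name
            # already exists, so success means we created the file.
            fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            continue  # somebody else owns that name; try another
        return fd, path
    raise FileExistsError("no unused temporary name found")
```

Note what the loop does and does not protect against. It guarantees that this process never opens a file somebody else created. It cannot stop a different process from later doing a plain `open(path, "w")` on the same path, which succeeds silently on an existing file; that is the scenario being argued about above.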
Re: Make a unique filesystem path, without creating the file
On Tue, Feb 23, 2016, at 03:22, Paul Rubin wrote: > Thanks. It would be nice if those were gatewayed to usenet like this > group is. I can't bring myself to subscribe to mailing lists. Have you tried gmane? -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-25, Steven D'Aprano wrote: > The links already provided go through the evidence. For example, they > explain that /dev/random and /dev/urandom both use the exact same CSPRNG. If > you don't believe that, you can actually read the source to Linux, FreeBSD, > OpenBSD and NetBSD. (But not OS X, sorry.) Actually yes OS X: http://www.opensource.apple.com/source/xnu/xnu-3248.20.55/bsd/dev/random/ http://www.opensource.apple.com/source/xnu/xnu-3248.20.55/osfmk/prng/ -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Thursday 25 February 2016 17:54, Marko Rauhamaa wrote: > Steven D'Aprano : > >> On Wednesday 24 February 2016 18:20, Marko Rauhamaa wrote: >>> Steven D'Aprano : And that is where you repeat something which is rank superstition. >>> >>> Can you find info to back that up. >> >> The links already provided go through the evidence. For example, they >> explain that /dev/random and /dev/urandom both use the exact same >> CSPRNG. > > A non-issue. The question is, after the initial entropy is collected and > used to seed the CSPRNG, is any further entropy needed for any > cryptographic purposes? The short answer: "yes". The long answer: "probably not, but it can't hurt". The longer answer: "probably not, and it usually won't hurt, but it could". If, somehow, an attacker manages to work out the state of your CSPRNG, including the entropy pool, then they can predict what values you get until they no longer know the state of the CSPRNG. The idea is that if, somehow, somebody knows the current state of the CSPRNG (including the entropy pool), but can't influence what future values go into the entropy pool, then they will only be able to predict the output values for a short time. But it's hard to think of any actual attack where somebody can see what's in the entropy pool but can't influence the values going into it. It seems to me that this is an unrealistic attack: "Assume that you're kidnapped by somebody with no arms or legs..." The conventional wisdom is that adding poor sources of entropy into the pool will never hurt, but that is actually wrong. If an attacker knows what is in the entropy pool, and can craft the values going in, they can force the CSPRNG to return more predictable values. So sometimes adding more entropy can hurt. And it usually won't help. It *might* help if your system is compromised, but if so, it's not really clear how the attacker has compromised your current entropy pool but not the future ones. 
> Are there any nagging fears that weaknesses > could be found in the deterministic sequence? Of course there are. Nobody really knows what capabilities the NSA have, but they almost surely aren't *that* advanced. CSPRNGs are subject to much the same sort of issues as other crypto, such as hash functions: http://valerieaurora.org/hash.html and encryption algorithms. (The main real difference between a hash function and encryption algorithm is that hashes don't have to be reversible.) Expect the current crop of CSPRNGs (Yarrow, AES, whatever Linux uses) to be replaced long before there is a proven attack on them. > /dev/random is supposed to be hardened against such concerns by stirring > the pot constantly (if rather slowly). As is /dev/urandom. > Here's what Linus Torvalds said on the matter years back: > >> No, it says /dev/random is primarily useful for generating large >> (>>160 bit) keys. > >Which is exactly what something like sshd would want to use for >generating keys for the machine, right? That is _the_ primary reason >to use /dev/random. > >Yet apparently our /dev/random has been too conservative to be >actually useful, because (as you point out somewhere else) even sshd >uses /dev/urandom for the host key generation by default. > >That is really sad. That is the _one_ application that is common and >that should really have a reason to maybe care about /dev/random vs >urandom. And that application uses urandom. To me that says that >/dev/random has turned out to be less than useful in real life. > >Is there anything that actually uses /dev/random at all (except for >clueless programs that really don't need to)? Most other Unixes have decided that /dev/random is unnecessary, and urandom is the right thing to do. SSH uses urandom by default, but allows the paranoid/clueless to use /dev/random if they insist. -- Steve -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Steven D'Aprano:

> On Wednesday 24 February 2016 18:20, Marko Rauhamaa wrote:
>> Steven D'Aprano:
>>> And that is where you repeat something which is rank superstition.
>>
>> Can you find info to back that up.
>
> The links already provided go through the evidence. For example, they
> explain that /dev/random and /dev/urandom both use the exact same
> CSPRNG.

A non-issue. The question is, after the initial entropy is collected and used to seed the CSPRNG, is any further entropy needed for any cryptographic purposes? Are there any nagging fears that weaknesses could be found in the deterministic sequence?

/dev/random is supposed to be hardened against such concerns by stirring the pot constantly (if rather slowly).

Here's what Linus Torvalds said on the matter years back:

   > No, it says /dev/random is primarily useful for generating large
   > (>>160 bit) keys.

   Which is exactly what something like sshd would want to use for generating keys for the machine, right? That is _the_ primary reason to use /dev/random.

   Yet apparently our /dev/random has been too conservative to be actually useful, because (as you point out somewhere else) even sshd uses /dev/urandom for the host key generation by default.

   That is really sad. That is the _one_ application that is common and that should really have a reason to maybe care about /dev/random vs urandom. And that application uses urandom. To me that says that /dev/random has turned out to be less than useful in real life.

   Is there anything that actually uses /dev/random at all (except for clueless programs that really don't need to)?

   <http://article.gmane.org/gmane.linux.kernel/47437>

> If you don't trust the CSPRNG, then you shouldn't trust it whether it comes
> from /dev/random or /dev/urandom. If you do trust it, then why would you
> want it to block? Blocking doesn't make it more random.

It might not make it more secure cryptographically, but the point is that it should make it more genuinely random.

> That's not how it works. It just makes you vulnerable to a Denial Of
> Service attack.

Understood. You should not use /dev/random for any reactive purposes (like nonces or session encryption keys).

> There's that myth about urandom being "less random" than random again,
> but even this guy admits that the difference is "extremely hard"
> (actually: impossible) to measure, and that CSPRNG's "work". Which is
> precisely why OpenBSD uses arc4random for their /dev/random and
> /dev/urandom, and presumably why he wants to bring it to Linux.

That's for the cryptographic experts to judge. CSPRNG's aren't always as CS as one would think:

   In December 2013, a Reuters news article alleged that in 2004, before NIST standardized Dual_EC_DRBG, NSA paid RSA Security $10 million in a secret deal to use Dual_EC_DRBG as the default in the RSA BSAFE cryptography library, which resulted in RSA Security becoming the most important distributor of the insecure algorithm.

   <https://en.wikipedia.org/wiki/Dual_EC_DRBG>

> The bottom line is, nobody can distinguish the output of urandom and
> random (apart from the blocking behaviour). Nobody has demonstrated
> any way to distinguish the output of either random or urandom from
> "actual randomness". There are theoretical attacks on urandom that
> random might be immune to, but if so, I haven't heard what they are.

What I'm looking for is a cryptography mailing list (or equivalent) giving their stamp of approval. As can be seen above, NIST ain't it.

It seems, though, that cryptography researchers are not ready to declare any scheme void of vulnerabilities. At best they can mention that there are no *known* vulnerabilities.

> What evidence do they give that /dev/urandom is weak? If it is weak,
> why are they using it as the default?

It's a big mess, but not a mess I would disentangle. Once the crypto libraries, utilities, facilities and the OS come to a consensus, I can hope they've done their homework. As it stands, the STRONG vs VERY STRONG dichotomy seems to be alive all over the place.

Marko
Re: Make a unique filesystem path, without creating the file
On Wednesday 24 February 2016 18:20, Marko Rauhamaa wrote: > Steven D'Aprano : > >> On Tue, 23 Feb 2016 05:54 pm, Marko Rauhamaa wrote: >>> However, when you are generating signing or encryption keys, you >>> should use /dev/random. >> >> And that is where you repeat something which is rank superstition. > > Can you find info to back that up. The links already provided go through the evidence. For example, they explain that /dev/random and /dev/urandom both use the exact same CSPRNG. If you don't believe that, you can actually read the source to Linux, FreeBSD, OpenBSD and NetBSD. (But not OS X, sorry.) Put aside the known bug in Linux, where urandom will provide predictable values in the period following a fresh install before the OS has collected enough entropy. We all agree that's a problem, and that the right solution is to block. The question is, outside of that narrow set of circumstances, when it is appropriate to block? As I mentioned, most Unixes don't block. urandom and random behave exactly the same way in three of the most popular Unixes (FreeBSD, OpenBSD, OS X): https://en.wikipedia.org/wiki//dev/random so let's consider just those that do block (Linux, and NetBSD). They both use the same CSPRNG. Do you dispute that? Then read the source. For one to be "better" than the other, there would need to be a detectable difference between the two. Nobody has ever found one, and nor will they, because they're both coming from the same CSPRNG (AES in the case of NetBSD, I'm not sure what in the case of Linux). If you don't trust the CSPRNG, then you shouldn't trust it whether it comes from /dev/random or /dev/urandom. If you do trust it, then why would you want it to block? Blocking doesn't make it more random. That's not how it works. It just makes you vulnerable to a Denial Of Service attack. There really doesn't seem to be any valid reason for random blocking. 
It's like warnings and timers on fans in South Korea to prevent fan death:

http://www.snopes.com/medical/freakish/fandeath.asp

(My favourite explanation is that the blades of the fan chop the oxygen molecules in two.)

I'm not surprised that there is so much misinformation about random/urandom. Here's a blog post by somebody wanting to port arc4 to Linux, so he clearly knows a few things about crypto. I can't judge whether arc4 is better or worse than what Linux already uses, but look at this quote:

http://insanecoding.blogspot.com.au/2014/05/a-good-idea-with-bad-usage-devurandom.html

Quote:

"Linux is well known for inventing and supplying two default files, /dev/random and /dev/urandom (unlimited random). The former is pretty much raw entropy, while the latter is the output of a CSPRNG function like OpenBSD's arc4random family. The former can be seen as more random, and the latter as less random, but the differences are extremely hard to measure, which is why CSPRNGs work in the first place. Since the former is only entropy, it is limited as to how much it can output, and one needing a lot of random data can be stuck waiting a while for it to fill up the random buffer. Since the latter is a CSPRNG, it can keep outputting data indefinitely, without any significant waiting periods."

There's that myth about urandom being "less random" than random again, but even this guy admits that the difference is "extremely hard" (actually: impossible) to measure, and that CSPRNG's "work". Which is precisely why OpenBSD uses arc4random for their /dev/random and /dev/urandom, and presumably why he wants to bring it to Linux.

This author is *completely wrong* to say that /dev/random is "pretty much raw entropy". If it were, it would be biased, and easily manipulated by an attacker.
Entropy is collected from (among other things) network traffic, which would allow an attacker to control at least one source of entropy and hence (in theory) make it easier to predict the output of /dev/random. But fortunately it is not true. Linux's random system works like this: - entropy is collected from various sources and fed into a pool; - entropy from that pool is fed through a CSPRNG into two separate pools, one each for /dev/random and /dev/urandom; - when you read from /dev/random or urandom, they both collect entropy from their own pool, and again pass it through a CSPRNG; - /dev/random has a throttle (it blocks if you take out too much); - /dev/urandom doesn't have a throttle. https://events.linuxfoundation.org/images/stories/pdf/lceu2012_anvin.pdf Somebody criticized the author for spreading this misapprehension that /dev/random is "raw entropy" and here is his response: I tried giving an explanation which should be simple for a layman to follow of what goes on. I wouldn't take it as precise fact, especially when there's a washing machine involved in the explanation ;) Or, in other words, "When I said the moon was made of green cheese, I was
Re: Make a unique filesystem path, without creating the file
On Tue, 23 Feb 2016 07:22 pm, Paul Rubin wrote: > Mark Lawrence writes: >> https://mail.python.org/pipermail/python-ideas/2015-September/036333.html >> then http://www.gossamer-threads.com/lists/python/dev/1223780 > > Thanks. It would be nice if those were gatewayed to usenet like this > group is. I can't bring myself to subscribe to mailing lists. > >>> There are a few other choices in the PEP whose benefit is unclear to me, >>> but they aren't harmful, and I guess the decisions have already been >>> made. >> The PEP status is draft so is subject to change. > > Well they might be changeable but it sounds like there's a level of > consensus by now, that wouldn't be helped by more bikeshedding over > relatively minor stuff. I might write up some further comments and post > them here If you're going to do so, please do so in the next few days (or write to me off list to ask for an extension) because I intend to ask Guido for a ruling early next week. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Steven D'Aprano:

> On Tue, 23 Feb 2016 05:54 pm, Marko Rauhamaa wrote:
>> However, when you are generating signing or encryption keys, you
>> should use /dev/random.
>
> And that is where you repeat something which is rank superstition.

Can you find info to back that up. All I've seen so far is forceful claims that's superstition ("These are not the droids you're looking for").

Even the ssh-keygen man page has:

   The reseeding of the OpenSSL random generator is usually done from /dev/urandom. If the SSH_USE_STRONG_RNG environment variable is set to value other than 0 the OpenSSL random generator is reseeded from /dev/random.

Marko
Re: Make a unique filesystem path, without creating the file
On Tue, 23 Feb 2016 05:54 pm, Marko Rauhamaa wrote:

> Steven D'Aprano:
>
>> On Tue, 23 Feb 2016 06:32 am, Marko Rauhamaa wrote:
>>> Under Linux, /dev/random is the way to go when strong security is
>>> needed. Note that /dev/random is a scarce resource on ordinary
>>> systems.
>>
>> That's actually incorrect, but you're not the only one to have been
>> misled by the man pages.
>>
>> http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/
>
> Still, mostly hypnotic repetitions.

Repetition for the sake of emphasis, because there are so many misled and confused people on the internet who misunderstand the difference between urandom and random and consequently give bad advice.

I believe that the Linux man page for urandom is to blame, although I don't know why it hasn't been fixed. Possibly because it is *technically* correct, in the sense of "if you are concerned by the risk of being hit by a meteorite, wearing a stainless steel cooking pot on your head will give you some protection from meteorite strikes to the head". Everything in it is technically correct, but misleading.

> However, it admits:
>
>    But /dev/random also tries to keep track of how much entropy remains
>    in its kernel pool, and will occasionally go on strike if it decides
>    not enough remains.
>
> That's the whole point.

Exactly, but you've missed the point. That is precisely why the blocking random is HARMFUL and should not be used.

There is one, and only one, scenario when your CSPRNG should block: before the system has enough entropy to securely seed the CSPRNG and it is at risk of returning predictable numbers.

But after that point has passed, there is no test you can perform to distinguish the outputs of /dev/random and /dev/urandom (apart from the blocking behaviour itself). If I give you a million numbers, there is no way you can tell whether I used random or urandom.

The important thing here is that there is no difference in "quality" (whatever that means!)
between the random numbers generated by urandom and those generated by random. They are equally unpredictable. They pass the same randomness tests. Neither is "better" or "worse" than the other, because they are both generated by the same CSPRNG or HRNG.

Here is a summary of the random/urandom distinction on various Unixes:

Linux: random blocks, urandom never blocks, both use the same CSPRNG based on SHA-1 hashes, both will use a HRNG if available

FreeBSD: urandom is a link to random, which never blocks; uses 256-bit Yarrow CSPRNG, will use a HRNG if available

OpenBSD: both never block; both use a variant of the RC4 CSPRNG (misleadingly renamed ARC4 due to licencing issues), in newer versions use the ChaCha20 CSPRNG

OS X: both never block and use 160-bit Yarrow

NetBSD: random blocks, urandom never blocks, both use the same AES-128 CSPRNG

The NetBSD man pages are quite scathing:

"The entropy accounting described here is not grounded in any cryptography theory. It is done because it was always done, and because it gives people a warm fuzzy feeling about information theory. ... History is littered with examples of broken entropy sources and failed system engineering for random number generators. Nobody has ever reported distinguishing AES ciphertext from uniform random without side channels, nor reported computing SHA-1 preimages faster than brute force. The folklore information-theoretic defence against computationally unbounded attackers replaces system engineering that successfully defends against realistic threat models by imaginary theory that defends only against fantasy threat models."

To be clear, the "folklore information-theoretic defence" they are referring to is /dev/random's blocking behaviour.

http://netbsd.gw.com/cgi-bin/man-cgi?rnd+4+NetBSD-current

The blocking behaviour of /dev/random (on Linux) doesn't solve any real problems, but it *creates* new problems.
/dev/random can block for minutes or even hours, especially straight after booting a freshly installed OS. This can be considered a Denial Of Service attack, and even if it isn't, it encourages developers to "fix" the problem by using their own home-brewed random numbers, weakening the security of the system.

There's even a minority viewpoint that constantly adding new entropy to the CSPRNG is useless. Apart from collecting sufficient entropy for the initial seed, you should never add new entropy to the CSPRNG. Your CSPRNG is either cryptographically strong, or it isn't. If it is, then it is already unpredictable and adding more entropy is a waste of time. If it isn't, then adding more entropy isn't going to help you. Adding entropy is just one more component that can contain bugs (see the NetBSD comment about "broken entropy sources") or even allow an attack on the CSPRNG:

http://blog.cr.yp.to/20140205-entropy.html

There's one good argument for
Re: Make a unique filesystem path, without creating the file
Paul Rubin : > Marko Rauhamaa writes: >> It is also correct that /dev/urandom depletes the entropy pool as >> effectively as /dev/random. > > I think see what's confusing you: the above is a misconception that is > probably held by lots of people. Entropy is not water and from a > cryptographic standpoint there is essentially no such thing as > "depleting" an entropy pool. There is either enough entropy (say 256 > bits or more) in the PRNG or else there isn't. If there's not enough, > urandom can misbehave by giving you bad output because it doesn't block > until more is gathered. If there is enough, /dev/random misbehaves by > blocking under this bogus concept of "depletion". You are making my point. /dev/random is correct to block until top-quality random numbers can be supplied. That's not misbehaving. > So once /dev/random unblocks, it should never again block, the behavior > of getrandom. What you are saying is that /dev/random has no reason to exist (and the GRND_RANDOM flag to getrandom() is redundant). I'm no cryptographer and can't judge that. However, as long as the distinction is maintained, I have to abide by the documented characteristics. > No really, all you've done is repeat bad advice. The people cited in > that article are very knowledgeable and the stuff they say makes good > mathematical sense. The stuff you say makes no sense and you haven't > given any convincing reason for anyone to listen to you. Thing is, neither you nor me nor the cited articles has provided any more info than insisting on a position, my position being relying on the documented API. So we have * /dev/urandom vs /dev/random * getrandom(0) vs getrandom(GRND_RANDOM) * GCRY_STRONG_RANDOM ("Use this level for session keys and similar purposes") vs GCRY_VERY_STRONG_RANDOM ("Use this level for long term key material") (in libgcrypt) You don't need to convince me that that distinction is silly. You need to convince the crypto facility providers. 
Marko
Re: Make a unique filesystem path, without creating the file
On 2016-02-23, Mark Lawrence wrote:
> On 23/02/2016 08:22, Paul Rubin wrote:
>> Mark Lawrence writes:
>>> https://mail.python.org/pipermail/python-ideas/2015-September/036333.html
>>> then http://www.gossamer-threads.com/lists/python/dev/1223780
>>
>> Thanks. It would be nice if those were gatewayed to usenet like
>> this group is. I can't bring myself to subscribe to mailing lists.
>
> Piece of cake using even a semi-decent email client (I use Thunderbird
> on Windows) via gmane.

And gmane is even better using a decent news (NNTP) client. I prefer slrn, but that may be a bit old-school for many. Technically, gmane's news server is not "Usenet", but the UI is the same.

Gmane's internal search facility is a bit lame, but searching gmane with Google works fairly well.

> It provides access to hundreds of Python mailing lists, blogs and
> even updates to the Activestate recipes :)

--
Grant Edwards, grant.b.edwards at gmail.com
Yow! World War Three can be averted by adherence to a strictly enforced dress code!
Re: Make a unique filesystem path, without creating the file
On 23/02/2016 08:22, Paul Rubin wrote: Mark Lawrence writes: https://mail.python.org/pipermail/python-ideas/2015-September/036333.html then http://www.gossamer-threads.com/lists/python/dev/1223780 Thanks. It would be nice if those were gatewayed to usenet like this group is. I can't bring myself to subscribe to mailing lists. Piece of cake using even a semi-decent email client (I use Thunderbird on Windows) via gmane. It provides access to hundreds of Python mailing lists, blogs and even updates to the Activestate recipes :) There are a few other choices in the PEP whose benefit is unclear to me, but they aren't harmful, and I guess the decisions have already been made. The PEP status is draft so is subject to change. Well they might be changeable but it sounds like there's a level of consensus by now, that wouldn't be helped by more bikeshedding over relatively minor stuff. I might write up some further comments and post them here You might as well, can't do any harm and somebody might pick up on something. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Paul Rubin wrote: > Mark Lawrence writes: >> https://mail.python.org/pipermail/python-ideas/2015-September/036333.html >> then http://www.gossamer-threads.com/lists/python/dev/1223780 > > Thanks. It would be nice if those were gatewayed to usenet like this > group is. I can't bring myself to subscribe to mailing lists. They are available via news.gmane.org as gmane.comp.python.devel gmane.comp.python.ideas -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Mark Lawrence writes: > https://mail.python.org/pipermail/python-ideas/2015-September/036333.html > then http://www.gossamer-threads.com/lists/python/dev/1223780 Thanks. It would be nice if those were gatewayed to usenet like this group is. I can't bring myself to subscribe to mailing lists. >> There are a few other choices in the PEP whose benefit is unclear to me, >> but they aren't harmful, and I guess the decisions have already been >> made. > The PEP status is draft so is subject to change. Well they might be changeable but it sounds like there's a level of consensus by now, that wouldn't be helped by more bikeshedding over relatively minor stuff. I might write up some further comments and post them here -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 23/02/2016 02:27, Paul Rubin wrote: Steven D'Aprano writes: https://www.python.org/dev/peps/pep-0506/ I didn't know about this! The discussion was all on mailing lists? https://mail.python.org/pipermail/python-ideas/2015-September/036333.html then http://www.gossamer-threads.com/lists/python/dev/1223780 A few things I suggest changing: 1) the default system RNG for Linux should be getrandom(2) on kernels that support it (3.17 and later). 2) Some effort should be directed at simulating getrandom's behaviour on kernels that don't have it, using the /dev/random entropy estimator and the /dev/urandom interface. I.e. it should block if the system hasn't seen enough entropy to get the CSPRNG started securely, and never block after that. 3) The default token length should be long enough to not have to "change in the future". If the user wants a shorter token, they ask for that, or can truncate a longer one that they receive from the default. There are a few other choices in the PEP whose benefit is unclear to me, but they aren't harmful, and I guess the decisions have already been made. The PEP status is draft so is subject to change. -- My fellow Pythonistas, ask not what our language can do for you, ask what you can do for our language. Mark Lawrence -- https://mail.python.org/mailman/listinfo/python-list
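Suggestions 1 and 3 in the quoted list can be sketched with calls that later landed in the standard library. This is an illustration rather than anything proposed in the thread: `os.getrandom()` only exists on Linux with Python 3.6 or later (hence the `hasattr` check), the helper names `system_random_bytes` and `default_token` are invented here, and the 32-byte default is an assumed value, not one taken from the PEP. With its default flags, getrandom blocks only until the kernel CSPRNG has been seeded and never afterwards, which is the behaviour asked for in suggestion 2.

```python
import os


def system_random_bytes(n):
    """Return n bytes from the kernel CSPRNG. Hypothetical helper."""
    if not hasattr(os, "getrandom"):
        # Not Linux, or an older Python: fall back to os.urandom().
        return os.urandom(n)
    chunks = []
    remaining = n
    while remaining > 0:
        # getrandom() may return fewer bytes than requested, so loop.
        chunk = os.getrandom(remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def default_token(nbytes=32):
    """Return a hex token; 32 bytes is an assumed, generous default."""
    return system_random_bytes(nbytes).hex()
```

A caller who wants something shorter can ask for fewer bytes or truncate the result, for example `default_token()[:16]`. PEP 506 was later accepted and became the `secrets` module in Python 3.6, which is the ready-made choice for this job.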
Re: Make a unique filesystem path, without creating the file
Marko Rauhamaa writes: > It is also correct that /dev/urandom depletes the entropy pool as > effectively as /dev/random. I think I see what's confusing you: the above is a misconception that is probably held by lots of people. Entropy is not water and from a cryptographic standpoint there is essentially no such thing as "depleting" an entropy pool. There is either enough entropy (say 256 bits or more) in the PRNG or else there isn't. If there's not enough, urandom can misbehave by giving you bad output because it doesn't block until more is gathered. If there is enough, /dev/random misbehaves by blocking under this bogus concept of "depletion". If you have a seed with 256 bits of entropy and you generate a gigabyte of random numbers from it, you have not increased the predictability of the seed in any significant way. So once /dev/random unblocks, it should never again block, which is the behavior of getrandom. There used to be an article on David Wagner's web site (cs.berkeley.edu/~daw) about the concept of "depleting" entropy by iterated hashing, but I can't find it now. That's unfortunate since it might help cast light on the subject. >> http://www.2uo.de/myths-about-urandom/ > Already addressed. No, really, all you've done is repeat bad advice. The people cited in that article are very knowledgeable and the stuff they say makes good mathematical sense. The stuff you say makes no sense and you haven't given any convincing reason for anyone to listen to you. -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Steven D'Aprano : > On Tue, 23 Feb 2016 06:32 am, Marko Rauhamaa wrote: >> Under Linux, /dev/random is the way to go when strong security is >> needed. Note that /dev/random is a scarce resource on ordinary >> systems. > > That's actually incorrect, but you're not the only one to have been > misled by the man pages. > > http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/ Still, mostly hypnotic repetitions. However, it admits: "But /dev/random also tries to keep track of how much entropy remains in its kernel pool, and will occasionally go on strike if it decides not enough remains." That's the whole point. /dev/random will rather block the program than lower the quality of the random numbers below a threshold. /dev/urandom has no such qualms. "If you use /dev/random instead of urandom, your program will unpredictably (or, if you’re an attacker, very predictably) hang when Linux gets confused about how its own RNG works." Yes, possibly indefinitely, too. "Using /dev/random will make your programs less stable, but it won’t make them any more cryptographically safe." It is correct that you shouldn't use /dev/random as a routine source of bulk random numbers. It is also correct that /dev/urandom depletes the entropy pool as effectively as /dev/random. However, when you are generating signing or encryption keys, you should use /dev/random. As stated in https://lwn.net/Articles/606141/: "/dev/urandom should be used for essentially all random numbers required, but /dev/random is sometimes used for things like extremely sensitive, long-lived keys (e.g. GPG) or one-time pads." > See also: > > http://www.2uo.de/myths-about-urandom/ Already addressed. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Chris Angelico writes: > How much future are you expecting? This is old but its methodology still seems ok: http://saluc.engr.uconn.edu/refs/keymgr/blaze95minimalkeylength.pdf I also like this: http://cr.yp.to/talks/2015.10.05/slides-djb-20151005-a4.pdf Quote (slide 37): The crypto users' fantasy is boring crypto: crypto that simply works, solidly resists attacks, never needs any upgrades. HN discussion: https://news.ycombinator.com/item?id=10345965 -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, Feb 23, 2016 at 1:27 PM, Paul Rubin wrote: > 3) The default token length should be long enough to not have to "change > in the future". If the user wants a shorter token, they ask for that, > or can truncate a longer one that they receive from the default. How much future are you expecting? ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Steven D'Aprano writes: > https://www.python.org/dev/peps/pep-0506/ I didn't know about this! The discussion was all on mailing lists? A few things I suggest changing: 1) the default system RNG for Linux should be getrandom(2) on kernels that support it (3.17 and later). 2) Some effort should be directed at simulating getrandom's behaviour on kernels that don't have it, using the /dev/random entropy estimator and the /dev/urandom interface. I.e. it should block if the system hasn't seen enough entropy to get the CSPRNG started securely, and never block after that. 3) The default token length should be long enough to not have to "change in the future". If the user wants a shorter token, they ask for that, or can truncate a longer one that they receive from the default. There are a few other choices in the PEP whose benefit is unclear to me, but they aren't harmful, and I guess the decisions have already been made. -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-23, Ben Finney wrote: > Oscar Benjamin writes: >> What does unpredictable mean in this context? Maybe I'm reading too >> much into that... > > I think you may be, yes. The request in this thread requires making > direct use of the “generate a new valid temporary filesystem path” > functionality already implemented in ‘tempfile’. > > Implementations of that functionality outside of ‘tempfile’ are a fun > exercise, but miss the point of this thread. I think you have missed the point of your own thread. You can't do what you wanted using tempfile; the only possible answer is to choose a filename that is sufficiently random that your hope that it is unique won't be proven futile. tempfile has two main modes: mktemp, which meets your requirements but should never be used as it is insecure, and mkstemp, which doesn't meet your requirements because it fundamentally operates by actually creating the file in question and relying on the filesystem to guarantee uniqueness. -- https://mail.python.org/mailman/listinfo/python-list
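To make the mktemp/mkstemp distinction above concrete, here is a minimal sketch (not taken from the thread) of what mkstemp() does: it creates the file itself and hands back an open descriptor together with the path. The prefix "demo-" is a made-up value for illustration only.

```python
import os
import tempfile

# mkstemp() creates the file exclusively and returns (fd, path),
# so the name is known to belong to this process.
fd, path = tempfile.mkstemp(prefix="demo-")
try:
    created = os.path.exists(path)   # the file is already on disk
    os.write(fd, b"hello")
finally:
    os.close(fd)
    os.remove(path)                  # cleanup is the caller's job

removed = not os.path.exists(path)
```

This is exactly the filesystem activity the original request wanted to avoid, which is why a sufficiently random name is the only option left when nothing may be created.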
Re: Make a unique filesystem path, without creating the file
On Tue, Feb 23, 2016 at 11:44 AM, Jon Ribbens wrote: > On 2016-02-23, Chris Angelico wrote: >> On Tue, Feb 23, 2016 at 11:26 AM, Jon Ribbens >> wrote: >>> On 2016-02-23, Chris Angelico wrote: On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens wrote: >> If you generate 2**128 + 1 such numbers, you are *guaranteed* to > > ... have expired due to the heat death of the universe. Maybe... but by the time you get to 2**64 of them, you have a 50% chance of a collision. (That's either utterly intuitive or completely counter-intuitive, depending on who you are.) >>> >>> Um, did you mean to say 2**127? Are you thinking of the >>> birthday paradox or something, which doesn't apply here? >> >> By the time you generate 2**64 of them, you have a 50% chance that >> some pair of them collides. Yes, the birthday paradox does apply here. > > Oh, I see, you're thinking of it differently. I was thinking of it as > Alice is choosing a filename and Mallet is trying to guess it, in which > case the birthday paradox doesn't apply. You're thinking of it as Alice > is generating many random filenames and, even though she could avoid > collisions with 100% certainty by remembering what she's already had, > isn't doing so, and must avoid colliding with herself. I don't think > your version has much relevance as an attack model. Ah. Steven was talking about collisions; once you have 2**128+1 of them, you're guaranteed a collision (pigeonhole principle). What you're talking about gives certainty slightly sooner - specifically, once you've tried 2**128 of them, you're guaranteed to have hit it :) ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-23, Chris Angelico wrote: > On Tue, Feb 23, 2016 at 11:26 AM, Jon Ribbens > wrote: >> On 2016-02-23, Chris Angelico wrote: >>> On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens >>> wrote: > If you generate 2**128 + 1 such numbers, you are *guaranteed* to ... have expired due to the heat death of the universe. >>> >>> Maybe... but by the time you get to 2**64 of them, you have a 50% >>> chance of a collision. (That's either utterly intuitive or completely >>> counter-intuitive, depending on who you are.) >> >> Um, did you mean to say 2**127? Are you thinking of the >> birthday paradox or something, which doesn't apply here? > > By the time you generate 2**64 of them, you have a 50% chance that > some pair of them collides. Yes, the birthday paradox does apply here. Oh, I see, you're thinking of it differently. I was thinking of it as Alice is choosing a filename and Mallet is trying to guess it, in which case the birthday paradox doesn't apply. You're thinking of it as Alice is generating many random filenames and, even though she could avoid collisions with 100% certainty by remembering what she's already had, isn't doing so, and must avoid colliding with herself. I don't think your version has much relevance as an attack model. -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, 23 Feb 2016 06:32 am, Marko Rauhamaa wrote: > Jon Ribbens : > >> Suppose you had code like this: >> >> filename = binascii.hexlify(os.urandom(16)).decode("ascii") >> >> Do we really think that is insecure or that there are any practical >> attacks against it? It would be basically the same as saying that >> urandom() is broken, surely? > > urandom() is not quite random and so should not be considered > cryptographically airtight. > > Under Linux, /dev/random is the way to go when strong security is > needed. Note that /dev/random is a scarce resource on ordinary systems. That's actually incorrect, but you're not the only one to have been misled by the man pages. http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/ On non-Linux Unixes, the difference between urandom and random is mostly, or entirely, gone, in favour of urandom's non-blocking behaviour. And it's a myth that the output of random is "more random" or "more pure" than urandom's. In reality, on Linux both urandom and random use exactly the same CSPRNG. See also: http://www.2uo.de/myths-about-urandom/ for a good explanation of how random and urandom actually work on Linux. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, Feb 23, 2016 at 11:26 AM, Jon Ribbens wrote: > On 2016-02-23, Chris Angelico wrote: >> On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens >> wrote: If you generate 2**128 + 1 such numbers, you are *guaranteed* to >>> >>> ... have expired due to the heat death of the universe. >> >> Maybe... but by the time you get to 2**64 of them, you have a 50% >> chance of a collision. (That's either utterly intuitive or completely >> counter-intuitive, depending on who you are.) > > Um, did you mean to say 2**127? Are you thinking of the > birthday paradox or something, which doesn't apply here? By the time you generate 2**64 of them, you have a 50% chance that some pair of them collides. Yes, the birthday paradox does apply here. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
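As a rough check on the figures above, the standard birthday approximation p ~= 1 - exp(-n(n-1)/(2N)) can be evaluated directly. The sketch below is an illustration, not something posted in the thread, and the helper name collision_probability is made up. Under this approximation, exactly 2**64 draws from a 128-bit space give a collision chance of about 39%, and the 50% mark is reached a little later, at roughly 1.18 * 2**64 draws, so "2**64" is the right order of magnitude rather than an exact threshold.

```python
import math

def collision_probability(n, bits=128):
    """Approximate chance that n uniform random `bits`-bit values contain a repeat."""
    space = 2 ** bits
    # Birthday approximation: p ~= 1 - exp(-n(n-1) / (2N)).
    return -math.expm1(-n * (n - 1) / (2 * space))

# Probability of at least one collision after 2**64 draws.
p_at_2_64 = collision_probability(2 ** 64)

# Number of draws at which the approximation reaches 50%.
n_for_half = math.sqrt(2 * math.log(2) * 2 ** 128)
ratio_to_2_64 = n_for_half / 2 ** 64
```

This concerns accidental self-collision only; it says nothing about an attacker guessing one particular name, which is the other model discussed in the thread.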
Re: Make a unique filesystem path, without creating the file
Oscar Benjamin writes: > What does unpredictable mean in this context? Maybe I'm reading too > much into that... I think you may be, yes. The request in this thread requires making direct use of the “generate a new valid temporary filesystem path” functionality already implemented in ‘tempfile’. Implementations of that functionality outside of ‘tempfile’ are a fun exercise, but miss the point of this thread. -- \ “But Marge, what if we chose the wrong religion? Each week we | `\ just make God madder and madder.” —Homer, _The Simpsons_ | _o__) | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-23, Chris Angelico wrote: > On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens > wrote: >>> If you generate 2**128 + 1 such numbers, you are *guaranteed* to >> >> ... have expired due to the heat death of the universe. > > Maybe... but by the time you get to 2**64 of them, you have a 50% > chance of a collision. (That's either utterly intuitive or completely > counter-intuitive, depending on who you are.) Um, did you mean to say 2**127? Are you thinking of the birthday paradox or something, which doesn't apply here? -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, 23 Feb 2016 05:17 am, Jon Ribbens wrote: > On 2016-02-22, Ethan Furman wrote: >> On 02/14/2016 04:08 PM, Ben Finney wrote: >>> I am unconcerned with whether there is a real filesystem entry of that >>> name; the goal entails having no filesystem activity for this. I want a >>> valid unique filesystem path, without touching the filesystem. >> >> This is impossible. If you don't touch the file system you have no way >> to know if the path is unique. > > Weell, I have a lot of sympathy for that point, but on the other > hand the whole concept of UUIDs ("import uuid") is predicated on the > opposite assumption. You're referring to uuid4, presumably, as the other varieties of UUID use non-secret information, such as the time, or a namespace, either of which is potentially public knowledge. Only uuid4 is considered "globally unique", and that's not *certainly* globally unique, only that the chance of an *accidental* collision is below some threshold deemed "small enough that we don't care". Deliberate collisions of public UUIDs are *trivial*. Pick a UUID you know is already in use, and use it again. There are a lot of assumptions involved in the "globally unique" claim, and there are probably ways to contrive to generate the same UUIDs as someone else. But to what benefit? UUIDs are not intended as security tokens, and are not hardened against attack. Even uuid4 may not be suitable for security, since it may use a cryptographically weak PRNG such as Mersenne Twister. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
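For reference, a minimal sketch of the uuid4 approach being debated, assuming the caller supplies the directory as a plain string. The function name unique_path and the default "/tmp" are illustrative assumptions, not anything proposed in the thread. Only string operations are involved, so nothing on disk is read or written. As far as I know, recent CPython versions build uuid4() from os.urandom(), but that is an implementation detail and is worth checking for the interpreter in use.

```python
import os
import uuid

def unique_path(directory="/tmp", suffix=""):
    """Return a path with a random UUID-based name; the filesystem is not touched."""
    name = uuid.uuid4().hex + suffix      # 32 hex digits, 122 of the 128 bits random
    return os.path.join(directory, name)  # pure string manipulation

a = unique_path(suffix=".tmp")
b = unique_path(suffix=".tmp")
```

The result is unique only in the probabilistic sense discussed above; nothing verifies that the name is unused.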
Re: Make a unique filesystem path, without creating the file
On Tue, Feb 23, 2016 at 11:08 AM, Jon Ribbens wrote: > On 2016-02-22, Steven D'Aprano wrote: >> On Tue, 23 Feb 2016 05:48 am, Marko Rauhamaa wrote: >>> Jon Ribbens : I was under the impression that the point of UUIDs is that you can be *so* confident that there won't be a collision that for all practical purposes it's indistinguishable from being certain. >>> >>> Yes, if you generate a random 128-bit number, it will be unique -- >> >> If you generate a second random 128 bit number, you have a chance of 1 in >> 2**128 of a collision. All you can say is that it will be *very probably* >> unique. (I might even allow "almost certainly" unique.) > > If you are not prepared to say that something with a > 340282366920938463463374607431768211455 / > 340282366920938463463374607431768211456 chance of being true > is not "certainly true" then I'm not sure how you would not > be too scared to ever leave the house. Or not leave the house. > I mean, you're probably going to be hit by 10^25 meteorites, > which sounds painful. > >> If you generate 2**128 + 1 such numbers, you are *guaranteed* to > > ... have expired due to the heat death of the universe. Maybe... but by the time you get to 2**64 of them, you have a 50% chance of a collision. (That's either utterly intuitive or completely counter-intuitive, depending on who you are.) ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-22, Steven D'Aprano wrote: > On Tue, 23 Feb 2016 05:48 am, Marko Rauhamaa wrote: >> Jon Ribbens : >>> I was under the impression that the point of UUIDs is that you can be >>> *so* confident that there won't be a collision that for all practical >>> purposes it's indistinguishable from being certain. >> >> Yes, if you generate a random 128-bit number, it will be unique -- > > If you generate a second random 128 bit number, you have a chance of 1 in > 2**128 of a collision. All you can say is that it will be *very probably* > unique. (I might even allow "almost certainly" unique.) If you are not prepared to say that something with a 340282366920938463463374607431768211455 / 340282366920938463463374607431768211456 chance of being true is not "certainly true" then I'm not sure how you would not be too scared to ever leave the house. Or not leave the house. I mean, you're probably going to be hit by 10^25 meteorites, which sounds painful. > If you generate 2**128 + 1 such numbers, you are *guaranteed* to ... have expired due to the heat death of the universe. -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-23, Steven D'Aprano wrote: > On Tue, 23 Feb 2016 06:22 am, Jon Ribbens wrote: >> Suppose you had code like this: >> >> filename = binascii.hexlify(os.urandom(16)).decode("ascii") >> >> Do we really think that is insecure or that there are any practical >> attacks against it? It would be basically the same as saying that >> urandom() is broken, surely? > > Correct. Any attack against urandom would be an attack on this. You would > just have to trust that the kernel devs have made urandom as secure as > possible, and pay no attention to what the man page says, as it's wrong. > > By the way, Python 3.6 will have (once Guido formally approves it) a new > module, "secrets", for securely generating (pseudo)random tokens like this: > > import secrets > filename = secrets.token_hex(16) +1 -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, 23 Feb 2016 06:22 am, Jon Ribbens wrote: > Suppose you had code like this: > > filename = binascii.hexlify(os.urandom(16)).decode("ascii") > > Do we really think that is insecure or that there are any practical > attacks against it? It would be basically the same as saying that > urandom() is broken, surely? Correct. Any attack against urandom would be an attack on this. You would just have to trust that the kernel devs have made urandom as secure as possible, and pay no attention to what the man page says, as it's wrong. By the way, Python 3.6 will have (once Guido formally approves it) a new module, "secrets", for securely generating (pseudo)random tokens like this:

    import secrets
    filename = secrets.token_hex(16)

https://www.python.org/dev/peps/pep-0506/ -- Steven -- https://mail.python.org/mailman/listinfo/python-list
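The two spellings mentioned above can be put side by side. This is a small sketch for comparison only; the variable names old_style and new_style are made up. The secrets module did ship in Python 3.6, so the second form needs that version or later.

```python
import binascii
import os
import secrets

# Pre-3.6 spelling from earlier in the thread: 16 random bytes as hex.
old_style = binascii.hexlify(os.urandom(16)).decode("ascii")

# PEP 506 spelling: same length and alphabet, from the same kind of source.
new_style = secrets.token_hex(16)
```

Both yield a 32-character lowercase hexadecimal string carrying 128 bits from the operating system's CSPRNG, and neither touches the filesystem.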
Re: Make a unique filesystem path, without creating the file
On 22 Feb 2016 22:50, "Ben Finney" wrote: > > Ethan Furman writes: > > > On 02/14/2016 04:08 PM, Ben Finney wrote: > > > > > I am unconcerned with whether there is a real filesystem entry of that > > > name; the goal entails having no filesystem activity for this. I want a > > > valid unique filesystem path, without touching the filesystem. > > > > This is impossible. If you don't touch the file system you have no > > way to know if the path is unique. > > That was unclear. Later in the same thread, I clarified that by “unique” > I mean nothing about entries already on the filesystem. Instead it means > “unpredictably different each time the function is called”. What does unpredictable mean in this context? Maybe I'm reading too much into that... What's wrong with the example I posted before? -- Oscar -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, 23 Feb 2016 06:22 am, Paul Rubin wrote: > Chris Angelico writes: >>> I was under the impression that the point of UUIDs is that you can be >>> *so* confident that there won't be a collision that for all practical >>> purposes it's indistinguishable from being certain. >> Maybe, if everyone's cooperating. I'm not sure how they fare in the >> face of malice though. > > There are different UUID algorithms, some of which have useful syntax > but are easy to spoof. Uuid4 is random and implemented properly, should > be hard to spoof. I'm not sure what you mean by "spoof" in this context. Do you mean generate collisions? Do you mean "pretend to generate a UUID, but without actually doing so"? That's how I interpret "spoof", but I don't quite understand why that would be difficult. Here's one I just made now: {00010203-0405-0607-0809-0a0b0c0d0e0f} And another: {836313e2-3b8a-53f2-9b90-0c9ade199e5d} They weren't hard to spoof :-) -- Steven -- https://mail.python.org/mailman/listinfo/python-list
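The two hand-made values above can be inspected with the standard uuid module, which shows what a consumer can and cannot tell from the bits alone. This is an illustrative sketch; the variable names are made up.

```python
import uuid

# uuid.UUID() accepts the braced form used in the message above.
first = uuid.UUID("{00010203-0405-0607-0809-0a0b0c0d0e0f}")
second = uuid.UUID("{836313e2-3b8a-53f2-9b90-0c9ade199e5d}")

first_variant = first.variant    # layout family encoded in the value
first_version = first.version    # None when the variant is not RFC 4122
second_variant = second.variant
second_version = second.version
```

The first value does not even carry the RFC 4122 variant bits, while the second is laid out as a version 5 (name-based) UUID. Either way, the parser accepts both: nothing in a UUID proves how, or by whom, it was generated.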
Re: Make a unique filesystem path, without creating the file
On Tue, 23 Feb 2016 05:48 am, Marko Rauhamaa wrote: > Jon Ribbens : > >> I was under the impression that the point of UUIDs is that you can be >> *so* confident that there won't be a collision that for all practical >> purposes it's indistinguishable from being certain. > > Yes, if you generate a random 128-bit number, it will be unique -- If you generate a second random 128 bit number, you have a chance of 1 in 2**128 of a collision. All you can say is that it will be *very probably* unique. (I might even allow "almost certainly" unique.) If you generate 2**128 + 1 such numbers, you are *guaranteed* to have at least one collision. If I can arrange matters so that I am using the same seed as you, then I can generate the same UUIDs as you. If I know you are using the Mersenne Twister PRNG, and I can get hold of (by memory) 128 consecutive UUIDs, I can reconstruct the seed you are using and generate all future (and past) UUIDs the same as yours. (Well, when I say "I can", I don't mean *me*, I mean some attacker who is smarter than me, but not that much smarter.) > unless someone clones it. > > Cloning will be a practical issue when you clone virtual machines, for > example. This is certainly a practical issue that people have to be aware of. -- Steven -- https://mail.python.org/mailman/listinfo/python-list
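The same-seed point above is easy to demonstrate with Python's random module, which is a Mersenne Twister and is not intended for security use. The sketch below is illustrative only: fake_uuid_stream and the seed values are invented, and it shows only the shared-seed case, not state reconstruction from observed output. (MT19937 holds 624 32-bit words of state, so by my count an attacker would need on the order of 156 full 128-bit outputs rather than 128, but the conclusion is unchanged.)

```python
import random

def fake_uuid_stream(seed, count):
    """Generate `count` 128-bit numbers from a seeded, non-cryptographic PRNG."""
    rng = random.Random(seed)   # Mersenne Twister, NOT a CSPRNG
    return [rng.getrandbits(128) for _ in range(count)]

mine = fake_uuid_stream(12345, 3)
yours = fake_uuid_stream(12345, 3)   # same seed, e.g. a cloned virtual machine
other = fake_uuid_stream(54321, 3)   # different seed
```

Two generators started from the same seed emit identical sequences, which is the cloning hazard described above in miniature.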
Re: Make a unique filesystem path, without creating the file
Marko Rauhamaa writes: http://www.2uo.de/myths-about-urandom/ >> I don't know what web pamphlet you mean, > The only one linked above. Oh, I wouldn't have called that a pamphlet. I could quibble with the writing style but the points in the article are basically correct.
> getrandom(2) is a good interface that distinguishes between the flag
> values
>
>     0                            => /dev/urandom
>     GRND_RANDOM                  => /dev/random
>     GRND_RANDOM | GRND_NONBLOCK  => /dev/random (O_NONBLOCK)
>
> However, although os.urandom() delegates to getrandom(), the
> documentation suggests it uses the flag value 0 (/dev/urandom).
Flag value 0 does the right thing and blocks if the entropy pool is not yet initialized, and doesn't block after that. That fixes the errors of both urandom (fails to block before there's enough entropy) and random (blocks even after there's enough entropy). The getrandom doc is also misleading about the workings of the entropy pools but that's ok. The actual algorithm is described here: http://www.pinkas.net/PAPERS/gpr06.pdf It's pretty clumsy but discussions about replacing it have gotten bogged down several times. OTOH maybe I'm out of date on this. >> The random/urandom interface was poorly designed and misleadingly >> documented. > It could be better I suppose, but I never found it particularly bad. The > nice thing about it is that it is readily usable in shell scripts. DJB describes the problems: https://groups.google.com/forum/#!msg/randomness-generation/4opmDHA6_3w/__TyKhbnNWsJ Regarding shell scripts, it should be a simple matter to put a wrapper around the system call. -- https://mail.python.org/mailman/listinfo/python-list
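A hedged sketch of the flag-0 behaviour described above, using os.getrandom(), which Python exposes on Linux from version 3.6. The wrapper name csprng_bytes is made up, and the fallback to os.urandom() is an assumption for platforms where os.getrandom is missing.

```python
import os

def csprng_bytes(n):
    """Return n bytes from the kernel CSPRNG (intended for small n)."""
    if hasattr(os, "getrandom"):
        # flags=0: wait only until the pool has been initialised once,
        # then never block again. Small requests are returned in full;
        # very large ones may come back short and would need a loop.
        return os.getrandom(n, 0)
    # Assumed fallback where getrandom(2) is not exposed.
    return os.urandom(n)

token = csprng_bytes(16)
```

Passing os.GRND_RANDOM instead would select the /dev/random pool from the table above, with the blocking behaviour that the thread argues against.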
Re: Make a unique filesystem path, without creating the file
On 22Feb2016 12:34, Alan Bawden wrote: Cameron Simpson writes: On 16Feb2016 19:24, Alan Bawden wrote: So in the FIFO case, I might write something like the following:

    import os
    import tempfile

    def make_temp_fifo(mode=0o600):
        while True:
            path = tempfile.mktemp()
            try:
                # mkfifo fails if the name already exists
                os.mkfifo(path, mode=mode)
            except FileExistsError:
                pass
            else:
                return path

So is there something wrong with the above code? Other than the fact that the documentation says something scary about mktemp()? Well, it has a few shortcomings. It relies on mkfifo reliably failing if the name exists. It sounds like mkfifo is reliable this way, but I can imagine analogous use cases without such a convenient core action, and your code only avoids mktemp's security issue _because_ mkfifo has that fortuitous aspect. I don't understand your use of the word "fortuitous" here. mkfifo is defined to act that way according to POSIX. I wrote the code that way precisely because of that property. I sometimes write code knowing that adding two even numbers together results in an even answer. I suppose you might describe that as "fortuitous", but it's just things behaving as they are defined to behave! I mean here that your scheme isn't adaptable to a system call which will reuse an existing name. Of course, mkfifo, mkdir and open(.., O_EXCL) all have this nice feature. Secondly, why is your example better than::

    os.mkfifo(os.path.join(mkdtemp(), 'myfifo'))

My way is not much better, but I think it is a little better because with your way I have to worry about deleting both the file and the directory when I am done, and I have to get the permissions right on two filesystem objects. (If I can use a TemporaryDirectory() context manager, the cleaning up part does get easier.) And it also seems wasteful to me, given that the way mkdtemp() is implemented is to generate a possible name, try creating it, and loop if the mkdir() call fails. (POSIX makes the same guarantee for mkdir() as it does for mkfifo().) Why not just let me do an equivalent loop myself?
Go ahead. But I think Ben's specifically trying to avoid writing his own loop. On that basis, this example doesn't present a use case that can't be addressed by mkstemp or mkdtemp. Yes, if mktemp() were taken away from me, I could work around it. I'm just saying that in order to justify taking something like this away, it has to be both below some threshold of utility and above some threshold of dangerousness. In the canonical case of gets() in C, not only is fgets() almost a perfectly exact replacement for gets(), gets() is insanely dangerous. But the case of mktemp() doesn't seem to me to come close to this combination of redundancy and danger. You _do_ understand the security issue, yes? It sure looked like you did, until here. Well, it's always dangerous to say that you understand all the security issues of anything. In part that is why I wrote the code quoted above. I am open to the possibility that there is a security problem here that I haven't thought of. But so far the only problem anybody has with it is that you think there is something "fortuitous" about the way that it works. (As if that would be of any use in the situation above!) It looks like anxiety that some people might use mktemp() in a stupid way has caused an over-reaction. No, it is anxiety that mktemp's _normal_ use is inherently unsafe. So are you saying that the way I used mktemp() above is _abnormal_? In that you're not making a file. I mean "abnormal" in a statistical sense, and also in the "anticipated use case for mktemp's design". I'm not suggesting you're wrong to use it like this. [ Here I have removed some perfectly reasonable text describing the race condition in question -- yes I really do understand that. ] This is neither weird nor even unlikely, which is why mktemp is strongly discouraged - naive (and standard) use is not safe. That you have contrived a use case where you can _carefully_ use mktemp in safety in no way makes mktemp recommendable.
OK, so you _do_ seem to be saying that I have used mktemp() in a "contrived" and "non-standard" (and "non-naive"!) way. I'm genuinely surprised. I thought I was just writing straightforward correct code and demonstrating that this was a useful utility that it was not hard to use safely. You seem to think what I did is something that ordinary programmers cannot be expected to do. Your judgement is definitely different from mine! No, I meant only that (a) mktemp is normally used for regular files and (b) that mkdtemp()/mkfifo() present equivalent results without hand making a pick-a-name loop. Of course any programmer should be able to read the mktemp() spec and build from it. And ultimately this does all boil down to making judgements. It does make sense to remove things from libraries that are safety hazards (like gets() in C), I'm just trying to
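For comparison with the hand-written loop, here is a sketch of the mkdtemp()-style alternative using the TemporaryDirectory() context manager that the message mentions. It is one possible arrangement rather than the code either poster wrote; the name "myfifo" comes from the thread, the other variable names are made up, and os.mkfifo() is only available on POSIX systems.

```python
import os
import stat
import tempfile

# TemporaryDirectory() creates a private directory (mode 0o700) and
# removes it, together with its contents, when the block exits.
with tempfile.TemporaryDirectory() as tmpdir:
    fifo_path = os.path.join(tmpdir, "myfifo")
    os.mkfifo(fifo_path, mode=0o600)   # POSIX only
    is_fifo = stat.S_ISFIFO(os.stat(fifo_path).st_mode)
    # ... open and use the FIFO here ...

cleaned_up = not os.path.exists(tmpdir)
```

The design trade-off is the one argued over above: two filesystem objects instead of one, in exchange for no retry loop and automatic cleanup.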
Re: Make a unique filesystem path, without creating the file
On 02/22/2016 02:25 PM, Cameron Simpson wrote: On 22Feb2016 10:11, Ethan Furman wrote: On 02/14/2016 04:08 PM, Ben Finney wrote: I am unconcerned with whether there is a real filesystem entry of that name; the goal entails having no filesystem activity for this. I want a valid unique filesystem path, without touching the filesystem. This is impossible. If you don't touch the file system you have no way to know if the path is unique. I think Ben wants to avoid filesystem modification (let us ignore atime here). So one can read the filesystem to see what is current, but he does not want to actually make any new filesystem entry. Hmm -- well, he says "the goal entails having no filesystem activity for this", and seeing what already exists definitely requires file system activity . . . -- ~Ethan~ -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Ethan Furman writes: > On 02/14/2016 04:08 PM, Ben Finney wrote: > > > I am unconcerned with whether there is a real filesystem entry of that > > name; the goal entails having no filesystem activity for this. I want a > > valid unique filesystem path, without touching the filesystem. > > This is impossible. If you don't touch the file system you have no > way to know if the path is unique. That was unclear. Later in the same thread, I clarified that by “unique” I mean nothing about entries already on the filesystem. Instead it means “unpredictably different each time the function is called”. -- \ “It is difficult to get a man to understand something when his | `\ salary depends upon his not understanding it.” —Upton Sinclair, | _o__) 1935 | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 22Feb2016 10:11, Ethan Furman wrote: On 02/14/2016 04:08 PM, Ben Finney wrote: I am unconcerned with whether there is a real filesystem entry of that name; the goal entails having no filesystem activity for this. I want a valid unique filesystem path, without touching the filesystem. This is impossible. If you don't touch the file system you have no way to know if the path is unique. I think Ben wants to avoid filesystem modification (let us ignore atime here). So one can read the filesystem to see what is current, but he does not want to actually make any new filesystem entry. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Paul Rubin : >>> http://www.2uo.de/myths-about-urandom/ >> Did you post the link because you agreed with the Web pamphlet? > > I don't know what web pamphlet you mean, The only one linked above. Cryptography is tricky business, indeed. I know enough about it not to improvise too much. Infinitesimal weaknesses can make a difference between feasible and unfeasible attacks. > but the right thing to use now is getrandom(2). getrandom(2) is a good interface that distinguishes between the flag values

    0                            => /dev/urandom
    GRND_RANDOM                  => /dev/random
    GRND_RANDOM | GRND_NONBLOCK  => /dev/random (O_NONBLOCK)

However, although os.urandom() delegates to getrandom(), the documentation suggests it uses the flag value 0 (/dev/urandom). > The random/urandom interface was poorly designed and misleadingly > documented. It could be better I suppose, but I never found it particularly bad. The nice thing about it is that it is readily usable in shell scripts. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Marko Rauhamaa writes: >> http://www.2uo.de/myths-about-urandom/ > Did you post the link because you agreed with the Web pamphlet? I don't know what web pamphlet you mean, but the right thing to use now is getrandom(2). The random/urandom interface was poorly designed and misleadingly documented. -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Random832 : > On Mon, Feb 22, 2016, at 14:32, Marko Rauhamaa wrote: >> urandom() is not quite random and so should not be considered >> cryptographically airtight. >> >> Under Linux, /dev/random is the way to go when strong security is >> needed. Note that /dev/random is a scarce resource on ordinary >> systems. > > http://www.2uo.de/myths-about-urandom/ Did you post the link because you agreed with the Web pamphlet? Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Mon, Feb 22, 2016, at 14:32, Marko Rauhamaa wrote: > urandom() is not quite random and so should not be considered > cryptographically airtight. > > Under Linux, /dev/random is the way to go when strong security is > needed. Note that /dev/random is a scarce resource on ordinary systems. http://www.2uo.de/myths-about-urandom/ -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, Feb 23, 2016 at 6:22 AM, Jon Ribbens wrote: >> Maybe, if everyone's cooperating. I'm not sure how they fare in the >> face of malice though. > > Suppose you had code like this: > > filename = binascii.hexlify(os.urandom(16)).decode("ascii") > > Do we really think that is insecure or that there are any practical > attacks against it? It would be basically the same as saying that > urandom() is broken, surely? Sure, that would be safe. But UUIDs aren't necessarily based on "give me sixteen bytes from urandom". They can involve potentially-predictable information such as MAC addresses, current time of day, and so on, which gives them significantly less randomness. In that kind of usage, they're not intended to be cryptographically secure. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
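To make the distinction between UUID flavours concrete, here is a small sketch using only the standard `uuid` module. The `describe` helper is hypothetical. It shows that a version 1 UUID exposes a node value and a timestamp, both potentially guessable, while a version 4 UUID carries only random bits.

```python
import uuid


def describe(u):
    """Summarise the fields that make a UUID more or less guessable."""
    info = {"version": u.version, "hex": u.hex}
    if u.version == 1:
        # Version 1 embeds a node id (often the MAC address) and a timestamp.
        info["node"] = u.node
        info["time"] = u.time
    return info


time_based = describe(uuid.uuid1())
random_based = describe(uuid.uuid4())
```

For an unpredictable name, `uuid.uuid4().hex` is the variant that matches the "sixteen bytes from urandom" approach discussed above.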
Re: Make a unique filesystem path, without creating the file
Jon Ribbens : > Suppose you had code like this: > > filename = binascii.hexlify(os.urandom(16)).decode("ascii") > > Do we really think that is insecure or that there are any practical > attacks against it? It would be basically the same as saying that > urandom() is broken, surely? urandom() is not quite random and so should not be considered cryptographically airtight. Under Linux, /dev/random is the way to go when strong security is needed. Note that /dev/random is a scarce resource on ordinary systems. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-22, Chris Angelico wrote: > On Tue, Feb 23, 2016 at 5:39 AM, Jon Ribbens > wrote: >> On 2016-02-22, Chris Angelico wrote: >>> On Tue, Feb 23, 2016 at 5:17 AM, Jon Ribbens >>> wrote: Weell, I have a lot of sympathy for that point, but on the other hand the whole concept of UUIDs ("import uuid") is predicated on the opposite assumption. >>> >>> Not quite opposite. Ethan is asserting that you cannot be *certain* >>> without actually checking the FS; the point of UUIDs is that you can >>> be fairly *confident* that there won't be a collision. There is a >>> nonzero probability of accidental collisions, and if an attacker is >>> deliberately trying to _force_ a collision, it's most definitely >>> possible. So both views are correct. >> >> I was under the impression that the point of UUIDs is that you can be >> *so* confident that there won't be a collision that for all practical >> purposes it's indistinguishable from being certain. > > Maybe, if everyone's cooperating. I'm not sure how they fare in the > face of malice though. Suppose you had code like this: filename = binascii.hexlify(os.urandom(16)).decode("ascii") Do we really think that is insecure or that there are any practical attacks against it? It would be basically the same as saying that urandom() is broken, surely? -- https://mail.python.org/mailman/listinfo/python-list
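The one-liner above can be wrapped into a path generator. This is only a sketch under stated assumptions: the function name `unpredictable_path`, the `prefix` parameter and the `directory` parameter are invented here, not part of any library. Note also that `tempfile.gettempdir()` may, as far as I know, probe candidate directories the first time it is called, so pass `directory` explicitly if no filesystem activity at all is wanted.

```python
import binascii
import os
import tempfile


def unpredictable_path(prefix="tmp", directory=None):
    """Build a path with 128 random bits in its final component.

    Nothing is created on disk; the result is only a string.
    """
    token = binascii.hexlify(os.urandom(16)).decode("ascii")
    if directory is None:
        # Assumption: the platform default temp directory is acceptable.
        directory = tempfile.gettempdir()
    return os.path.join(directory, prefix + token)
```

Usage: `unpredictable_path(directory="/tmp")` returns a different string on each call, with a 32-character hexadecimal suffix after the prefix.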
Re: Make a unique filesystem path, without creating the file
Chris Angelico writes:

>> I was under the impression that the point of UUIDs is that you can be
>> *so* confident that there won't be a collision that for all practical
>> purposes it's indistinguishable from being certain.
> Maybe, if everyone's cooperating. I'm not sure how they fare in the
> face of malice though.

There are different UUID algorithms, some of which have useful syntax but are easy to spoof. uuid4 is random and, if implemented properly, should be hard to spoof.
-- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, Feb 23, 2016 at 5:39 AM, Jon Ribbens wrote: > On 2016-02-22, Chris Angelico wrote: >> On Tue, Feb 23, 2016 at 5:17 AM, Jon Ribbens >> wrote: >>> Weell, I have a lot of sympathy for that point, but on the other >>> hand the whole concept of UUIDs ("import uuid") is predicated on the >>> opposite assumption. >> >> Not quite opposite. Ethan is asserting that you cannot be *certain* >> without actually checking the FS; the point of UUIDs is that you can >> be fairly *confident* that there won't be a collision. There is a >> nonzero probability of accidental collisions, and if an attacker is >> deliberately trying to _force_ a collision, it's most definitely >> possible. So both views are correct. > > I was under the impression that the point of UUIDs is that you can be > *so* confident that there won't be a collision that for all practical > purposes it's indistinguishable from being certain. Maybe, if everyone's cooperating. I'm not sure how they fare in the face of malice though. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Jon Ribbens : > I was under the impression that the point of UUIDs is that you can be > *so* confident that there won't be a collision that for all practical > purposes it's indistinguishable from being certain. Yes, if you generate a random 128-bit number, it will be unique -- unless someone clones it. Cloning will be a practical issue when you clone virtual machines, for example. Marko -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-22, Chris Angelico wrote: > On Tue, Feb 23, 2016 at 5:17 AM, Jon Ribbens > wrote: >> Weell, I have a lot of sympathy for that point, but on the other >> hand the whole concept of UUIDs ("import uuid") is predicated on the >> opposite assumption. > > Not quite opposite. Ethan is asserting that you cannot be *certain* > without actually checking the FS; the point of UUIDs is that you can > be fairly *confident* that there won't be a collision. There is a > nonzero probability of accidental collisions, and if an attacker is > deliberately trying to _force_ a collision, it's most definitely > possible. So both views are correct. I was under the impression that the point of UUIDs is that you can be *so* confident that there won't be a collision that for all practical purposes it's indistinguishable from being certain. -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-22, Ethan Furman wrote: > On 02/14/2016 04:08 PM, Ben Finney wrote: >> I am unconcerned with whether there is a real filesystem entry of that >> name; the goal entails having no filesystem activity for this. I want a >> valid unique filesystem path, without touching the filesystem. > > This is impossible. If you don't touch the file system you have no way > to know if the path is unique. Weell, I have a lot of sympathy for that point, but on the other hand the whole concept of UUIDs ("import uuid") is predicated on the opposite assumption. -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, Feb 23, 2016 at 5:17 AM, Jon Ribbens wrote: > On 2016-02-22, Ethan Furman wrote: >> On 02/14/2016 04:08 PM, Ben Finney wrote: >>> I am unconcerned with whether there is a real filesystem entry of that >>> name; the goal entails having no filesystem activity for this. I want a >>> valid unique filesystem path, without touching the filesystem. >> >> This is impossible. If you don't touch the file system you have no way >> to know if the path is unique. > > Weell, I have a lot of sympathy for that point, but on the other > hand the whole concept of UUIDs ("import uuid") is predicated on the > opposite assumption. Not quite opposite. Ethan is asserting that you cannot be *certain* without actually checking the FS; the point of UUIDs is that you can be fairly *confident* that there won't be a collision. There is a nonzero probability of accidental collisions, and if an attacker is deliberately trying to _force_ a collision, it's most definitely possible. So both views are correct. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 02/14/2016 04:08 PM, Ben Finney wrote: I am unconcerned with whether there is a real filesystem entry of that name; the goal entails having no filesystem activity for this. I want a valid unique filesystem path, without touching the filesystem. This is impossible. If you don't touch the file system you have no way to know if the path is unique. -- ~Ethan~ -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Cameron Simpson writes:

> On 16Feb2016 19:24, Alan Bawden wrote:
>>So in the FIFO case, I might write something like the following:
>>
>>def make_temp_fifo(mode=0o600):
>>    while True:
>>        path = tempfile.mktemp()
>>        try:
>>            os.mkfifo(path, mode=mode)
>>        except FileExistsError:
>>            pass
>>        else:
>>            return path
>>
>>So is there something wrong with the above code? Other than the fact
>>that the documentation says something scary about mktemp()?
>
> Well, it has a few shortcomings.
>
> It relies on mkfifo reliably failing if the name exists. It sounds like
> mkfifo is reliable this way, but I can imagine analogous use cases without
> such a convenient core action, and your code only avoids mktemp's security
> issue _because_ mkfifo has that fortuitous aspect.

I don't understand your use of the word "fortuitous" here. mkfifo is defined to act that way according to POSIX. I wrote the code that way precisely because of that property. I sometimes write code knowing that adding two even numbers together results in an even answer. I suppose you might describe that as "fortuitous", but it's just things behaving as they are defined to behave!

> Secondly, why is your example better than::
>
> os.mkfifo(os.path.join(mkdtemp(), 'myfifo'))

My way is not much better, but I think it is a little better because with your way I have to worry about deleting both the file and the directory when I am done, and I have to get the permissions right on two filesystem objects. (If I can use a TemporaryDirectory() context manager, the cleaning up part does get easier.)

And it also seems wasteful to me, given that the way mkdtemp() is implemented is to generate a possible name, try creating it, and loop if the mkdir() call fails. (POSIX makes the same guarantee for mkdir() as it does for mkfifo().) Why not just let me do an equivalent loop myself?

> On that basis, this example doesn't present a use case that can't be
> addressed by mkstemp or mkdtemp.

Yes, if mktemp() were taken away from me, I could work around it. I'm just saying that in order to justify taking something like this away, it has to be both below some threshold of utility and above some threshold of dangerousness. In the canonical case of gets() in C, not only is fgets() almost a perfectly exact replacement for gets(), gets() is insanely dangerous. But the case of mktemp() doesn't seem to me to come close to this combination of redundancy and danger.

> You _do_ understand the security issue, yes? It sure looked like you did,
> until here.

Well, it's always dangerous to say that you understand all the security issues of anything. In part that is why I wrote the code quoted above. I am open to the possibility that there is a security problem here that I haven't thought of. But so far the only problem anybody has with it is that you think there is something "fortuitous" about the way that it works.

>>(As if that would be of any use in the
>>situation above!) It looks like anxiety that some people might use
>>mktemp() in a stupid way has caused an over-reaction.
>
> No, it is anxiety that mktemp's _normal_ use is inherently unsafe.

So are you saying that the way I used mktemp() above is _abnormal_?

> [ Here I have removed some perfectly reasonable text describing the
> race condition in question -- yes I really do understand that. ]
>
> This is neither weird nor even unlikely, which is why mktemp is strongly
> discouraged - naive (and standard) use is not safe.
>
> That you have contrived a use case where you can _carefully_ use mktemp in
> safety in no way makes mktemp recommendable.

OK, so you _do_ seem to be saying that I have used mktemp() in a "contrived" and "non-standard" (and "non-naive"!) way. I'm genuinely surprised. I thought I was just writing straightforward correct code and demonstrating that this was a useful utility that it was not hard to use safely.
You seem to think what I did is something that ordinary programmers can not be expected to do. Your judgement is definitely different from mine! And ultimately this does all boil down to making judgements. It does make sense to remove things from libraries that are safety hazards (like gets() in C), I'm just trying to argue that mktemp() isn't nearly dangerous enough to deserve more than a warning in its documentation. You don't agree. Oh well... Up until this point, you haven't said anything that I actually think is flat out wrong, we just disagree about what tools it is reasonable to take away from _all_ programmers just because _some_ programmers might use them to make a mess. > In fact your use case isn't safe, because _another_ task using mktemp > in conflict as a plain old temporary file may grab your fifo. But here in very last sentence I really must disagree. If the code I wrote above is "unsafe" because some _other_ process might be using mktemp() badly and stumble over
Re: Make a unique filesystem path, without creating the file
On 16Feb2016 19:24, Alan Bawden wrote: Ben Finney writes: Cameron Simpson writes: I've been watching this for a few days, and am struggling to understand your use case. Yes, you're not alone. This surprises me, which is why I'm persisting. Can you elaborate with a concrete example and its purpose which would work with a mktemp-ish official function? An example:: Let me present another example that might strike some as more straightforward. If I want to create a temporary file, I can call mkstemp(). If I want to create a temporary directory, I can call mkdtemp(). Suppose that instead of a file or a directory, I want a FIFO or a socket. A FIFO is created by passing a pathname to os.mkfifo(). A socket is created by passing a pathname to an AF_UNIX socket's bind() method. In both cases, the pathname must not name anything yet (not even a symbolic link), otherwise the call will fail. So in the FIFO case, I might write something like the following: def make_temp_fifo(mode=0o600): while True: path = tempfile.mktemp() try: os.mkfifo(path, mode=mode) except FileExistsError: pass else: return path mktemp() is convenient here, because I don't have to worry about whether I should be using "/tmp" or "/var/tmp" or "c:\temp", or whether the TMPDIR environment variable is set, or whether I have permission to create entries in those directories. It just gives me a pathname without making me think about the rest of that stuff. Yes, that is highly desirable. Yes, I have to defend against the possibility that somebody else creates something with the same name first, but as you can see, I did that, and it wasn't rocket science. So is there something wrong with the above code? Other than the fact that the documentation says something scary about mktemp()? Well, it has a few shortcomings. It relies on mkfifo reliably failing if the name exists. 
It sounds like mkfifo is reliable this way, but I can imagine analogous use cases without such a convenient core action, and your code only avoids mktemp's security issue _because_ mkfifo has that fortuitous aspect.

> It looks to me like mktemp() provides some real utility, packaged up in a way that is orthogonal to the type of file system entry I want to create, the permissions I want to give to that entry, and the mode I want to use to open it. It looks like a useful, albeit low-level, primitive that it is perfectly reasonable for the tempfile module to supply.

Secondly, why is your example better than::

    os.mkfifo(os.path.join(mkdtemp(), 'myfifo'))

On that basis, this example doesn't present a use case that can't be addressed by mkstemp or mkdtemp.

By contrast, Ben's example does look like it needs something like mktemp.

> And yet the documentation condemns it as "deprecated", and tells me I should use mkstemp() instead.

You _do_ understand the security issue, yes? It sure looked like you did, until here.

> (As if that would be of any use in the situation above!) It looks like anxiety that some people might use mktemp() in a stupid way has caused an over-reaction.

No, it is anxiety that mktemp's _normal_ use is inherently unsafe.

> Let the documentation warn about the problem and point to prepackaged solutions in the common cases of making files and directories, but I see no good reason to deprecate this useful utility.

I think it is like C's gets() function (albeit not as dangerous). It really shouldn't be used.

One of the things about mktemp() is its raciness, which is the core of the security issue. People look at the term "security issue" and think "Ah, it can be attacked." But the flipside is that it is simply unreliable. Its normal use was to make an ordinary temp file. Consider the case where two instances of the same task are running at the same time, doing that. They can easily, by accident, end up using the same scratch file!

This is by no means unlikely; any shell script running tasks in parallel can arrange it, any procmail script filing a message with a "copy" rule (which causes procmail simply to fork and proceed), etc.

This is neither weird nor even unlikely, which is why mktemp is strongly discouraged - naive (and standard) use is not safe.

That you have contrived a use case where you can _carefully_ use mktemp in safety in no way makes mktemp recommendable.

In fact your use case isn't safe, because _another_ task using mktemp in conflict as a plain old temporary file may grab your fifo.

Cheers,
Cameron Simpson
-- https://mail.python.org/mailman/listinfo/python-list
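The mkdtemp-based alternative suggested in this message can be packaged so that cleanup is automatic. This is a minimal sketch, assuming a POSIX system (`os.mkfifo` is not available on Windows). The context manager name `temp_fifo` and its parameters are illustrative only.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temp_fifo(name="myfifo", mode=0o600):
    """Yield the path of a FIFO inside a freshly made private directory.

    TemporaryDirectory() uses mkdtemp() underneath, so the directory is
    created atomically and is readable only by the creating user; the
    FIFO name inside it therefore cannot collide with other tasks.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, name)
        os.mkfifo(path, mode=mode)
        yield path
    # Leaving the block removes both the FIFO and its directory.
```

Usage: `with temp_fifo() as path: ...` gives the body a FIFO path that exists only for the duration of the block.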
Re: Make a unique filesystem path, without creating the file
On 16 February 2016 at 19:40, Ben Finney wrote:
> Oscar Benjamin writes:
>
>> If you're going to patch open to return a fake file when asked to open
>> fake_file_path why do you care whether there is a real file of that
>> name?
>
> I don't, and have been saying explicitly many times in this thread that
> I do not care whether the file exists. Somehow that is still not clear?

Sorry Ben, I misunderstood. I think I can see the source of confusion, which is in your first message:

"""
In some code (e.g. unit tests) I am calling ‘tempfile.mktemp’ to generate a unique path for a filesystem entry that I *do not want* to exist on the real filesystem.
"""

I read that as meaning that it was important that the file did not exist. But you say that you don't care if the file actually exists in the filesystem or not and just want a unique path. What do you mean by unique here?

The intention of mktemp is that the path is unique so that there would not exist a file of that name and if you opened it for writing you wouldn't be interfering with any existing file. Do you just mean a function that returns a different value each time it's called? How about this:

    import os
    import tempfile

    count = 0

    def unique_path():
        global count
        count += 1
        return os.path.join(tempfile.gettempdir(), str(count))

--
Oscar
-- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Ben Finney writes: > Cameron Simpson writes: > >> I've been watching this for a few days, and am struggling to >> understand your use case. > > Yes, you're not alone. This surprises me, which is why I'm persisting. > >> Can you elaborate with a concrete example and its purpose which would >> work with a mktemp-ish official function? > > An example:: Let me present another example that might strike some as more straightforward. If I want to create a temporary file, I can call mkstemp(). If I want to create a temporary directory, I can call mkdtemp(). Suppose that instead of a file or a directory, I want a FIFO or a socket. A FIFO is created by passing a pathname to os.mkfifo(). A socket is created by passing a pathname to an AF_UNIX socket's bind() method. In both cases, the pathname must not name anything yet (not even a symbolic link), otherwise the call will fail. So in the FIFO case, I might write something like the following: def make_temp_fifo(mode=0o600): while True: path = tempfile.mktemp() try: os.mkfifo(path, mode=mode) except FileExistsError: pass else: return path mktemp() is convenient here, because I don't have to worry about whether I should be using "/tmp" or "/var/tmp" or "c:\temp", or whether the TMPDIR environment variable is set, or whether I have permission to create entries in those directories. It just gives me a pathname without making me think about the rest of that stuff. Yes, I have to defend against the possibility that somebody else creates something with the same name first, but as you can see, I did that, and it wasn't rocket science. So is there something wrong with the above code? Other than the fact that the documentation says something scary about mktemp()? It looks to me like mktemp() provides some real utility, packaged up in a way that is orthogonal to the type of file system entry I want to create, the permissions I want to give to that entry, and the mode I want use to open it. 
It looks like a useful, albeit low-level, primitive that it is perfectly reasonable for the tempfile module to supply. And yet the documentation condemns it as "deprecated", and tells me I should use mkstemp() instead. (As if that would be of any use in the situation above!) It looks like anxiety that some people might use mktemp() in a stupid way has caused an over-reaction. Let the documentation warn about the problem and point to prepackaged solutions in the common cases of making files and directories, but I see no good reason to deprecate this useful utility. -- Alan Bawden -- https://mail.python.org/mailman/listinfo/python-list
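For comparison, the same generate-and-create loop can be written without calling the deprecated mktemp() at all. This is a sketch, not the tempfile module's implementation. It assumes a POSIX system, and the `fifo-` prefix, the use of `secrets.token_hex`, and the bounded `attempts` parameter are all choices made here for illustration.

```python
import os
import secrets
import tempfile


def make_temp_fifo(mode=0o600, attempts=100):
    """Create a FIFO with an unpredictable name in the temp directory.

    os.mkfifo() fails with FileExistsError if the name is already taken,
    so a collision simply triggers another attempt with a new name.
    """
    directory = tempfile.gettempdir()
    for _ in range(attempts):
        path = os.path.join(directory, "fifo-" + secrets.token_hex(8))
        try:
            os.mkfifo(path, mode=mode)
        except FileExistsError:
            continue  # somebody else owns that name; try another
        return path
    raise FileExistsError("no unused FIFO name found")
```

The caller is responsible for removing the FIFO with `os.unlink()` when it is no longer needed. As raised elsewhere in the thread, the entry still lives in a shared directory, so another task that opens mktemp-style names without exclusive creation could in principle stumble over it.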
Re: Make a unique filesystem path, without creating the file
Steven D'Aprano writes: > On Tue, 16 Feb 2016 04:56 pm, Ben Finney wrote: > > > names = tempfile._get_candidate_names() > > I'm not sure that calling a private function of the tempfile module is > better than calling a deprecated function. Agreed, which is why I'm seeking a public API that is not deprecated. > So why not just pick a random bunch of characters? > > chars = list(string.ascii_letters) > random.shuffle(chars) > fake_file_path = ''.join(chars[:10]) This (an equivalent) is already implemented, internally to ‘tempfile’ and tested and maintained and more robust than me re-inventing the wheel. > Yes, but the system doesn't try to enforce the filesystem's rules, > does it? The test case I'm writing should not be prone to failure if the system happens to perform some arbitrary validation of filesystem paths. ‘tempfile’ already knows how to generate filesystem paths, I want to use that and not have to get it right myself. > and your system shouldn't care. If it does, this test case should not fail. > Since your test doesn't know what filesystem your code will be running > on, you can't make any assumptions about what paths are valid or not > valid. That implies that ‘tempfile._get_candidate_names’ would generate paths that would potentially be invalid. Is that what you intend to imply? > > Almost. I want the filesystem paths to be valid because the system > > under test expects them, it may perform its own validation, > > If the system tries to validate paths, it is broken. This is “you don't want what you say you want”, and seeing the justifications presented I don't agree. -- \ “I must say that I find television very educational. The minute | `\ somebody turns it on, I go to the library and read a book.” | _o__)—Groucho Marx | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Oscar Benjamin writes: > If you're going to patch open to return a fake file when asked to open > fake_file_path why do you care whether there is a real file of that > name? I don't, and have been saying explicitly many times in this thread that I do not care whether the file exists. Somehow that is still not clear? -- \ “Nothing exists except atoms and empty space; everything else | `\is opinion.” —Democritus, c. 460 BCE – 370 BCE | _o__) | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Tue, 16 Feb 2016 04:56 pm, Ben Finney wrote: > An example:: > > import io > import tempfile > names = tempfile._get_candidate_names() I'm not sure that calling a private function of the tempfile module is better than calling a deprecated function. > def test_frobnicates_configured_spungfile(): > """ ‘foo’ should frobnicate the configured spungfile. """ > > fake_file_path = os.path.join(tempfile.gettempdir(), names.next()) At this point, you have a valid pathname, but no guarantee whether it refers to a real file on the file system or not. That's the whole problem with tempfile.makepath -- it can return a file name which is not in use, but by the time it returns to you, you cannot guarantee that it still doesn't exist. Now, since this is a test which doesn't actually open that file, it doesn't matter. There's no actual security vulnerability here. So your test doesn't actually require that the file is unique, or that it doesn't actually exist. (Which is good, because you can't guarantee that it doesn't exist.) So why not just pick a random bunch of characters? chars = list(string.ascii_letters) random.shuffle(chars) fake_file_path = ''.join(chars[:10]) > fake_file = io.BytesIO("Lorem ipsum, dolor sit > amet".encode("utf-8")) > > patch_builtins_open( > when_accessing_path=fake_file_path, > provide_file=fake_file) There's nothing apparent in this that requires that fake_file_path not actually exist, which is good since (as I've pointed out before) you cannot guarantee that it doesn't exist. One could just as easily, and just as correctly, write: patch_builtins_open( when_accessing_path='/foo/bar/baz', provide_file=fake_file) and regardless of whether /foo/bar/baz actually exists or not, you are guaranteed to get the fake file rather than the real file. So I question whether you actually need this tempfile.makepath function at all. *But* having questioned it, for the sake of the argument I'll assume you do need it, and continue accordingly. 
>     system_under_test.config.spungfile_path = fake_file_path
>     system_under_test.foo()
>     assert_correctly_frobnicated(fake_file)
>
> So the test case creates a fake file, makes a valid filesystem path to
> associate with it, then patches the ‘open’ function so that it will
> return the fake file when that specific path is requested.
>
> Then the test case alters the system under test's configuration, giving
> it the generated filesystem path for an important file. The test case
> then calls the function about which the unit test is asserting
> behaviour, ‘system_under_test.foo’. When that call returns, the test
> case asserts some properties of the fake file to ensure the system under
> test actually accessed that file.

Personally, I think it would be simpler and easier to understand if, instead of patching open, you allowed the test to read and write real files:

    file_path = '/tmp/spam'
    system_under_test.config.spungfile_path = file_path
    system_under_test.foo()
    assert_correctly_frobnicated(file_path)
    os.unlink(file_path)

In practice, I'd want to only unlink the file if the test passes. If it fails, I'd want to look at the file to see why it wasn't frobnicated.

I think that a correctly-working filesystem is a perfectly reasonable prerequisite for the test, just like a working CPU, memory, power supply, operating system and Python interpreter. You don't have to guard against every imaginable failure ("fixme: test may return invalid results if the speed of light changes by more than 0.0001%"), and you might as well take advantage of real files for debugging. But that's my opinion, and if you have another, that's your personal choice.
> With a supported standard library API for this – ‘tempfile.makepath’ for > example – the generation of the filesystem path would change from four > separate function calls, one of which is a private API:: > > names = tempfile._get_candidate_names() > fake_file_path = os.path.join(tempfile.gettempdir(), names.next()) > > to a simple public function call:: > > fake_file_path = tempfile.makepath() Nobody doubts that your use of tempfile.makepath is legitimate for your use-case. But it is *not* legitimate for the tempfile module, and it is a mistake that it was added in the first place, hence the deprecation. Assuming that your test suite needs this function, your test library, or test suite, should provide that function, not tempfile. I believe it is unreasonable to expect the tempfile module to keep a function which is a security risk in the context of "temp files" just because it is useful for some completely unrelated use-cases. After all, your use of this doesn't actually have anything to do with temporary files. It is a mocked *permanent* file, not a real temporary one. > This whole thread began because I expected s
Re: Make a unique filesystem path, without creating the file
On 16 Feb 2016 05:57, "Ben Finney" wrote: > > Cameron Simpson writes: > > > I've been watching this for a few days, and am struggling to > > understand your use case. > > Yes, you're not alone. This surprises me, which is why I'm persisting. > > > Can you elaborate with a concrete example and its purpose which would > > work with a mktemp-ish official function? > > An example:: > > import io > import tempfile > names = tempfile._get_candidate_names() > > def test_frobnicates_configured_spungfile(): > """ ‘foo’ should frobnicate the configured spungfile. """ > > fake_file_path = os.path.join(tempfile.gettempdir(), names.next()) > fake_file = io.BytesIO("Lorem ipsum, dolor sit amet".encode("utf-8")) > > patch_builtins_open( > when_accessing_path=fake_file_path, > provide_file=fake_file) > > system_under_test.config.spungfile_path = fake_file_path > system_under_test.foo() > assert_correctly_frobnicated(fake_file) If you're going to patch open to return a fake file when asked to open fake_file_path why do you care whether there is a real file of that name? -- Oscar -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Cameron Simpson writes: > I've been watching this for a few days, and am struggling to > understand your use case. Yes, you're not alone. This surprises me, which is why I'm persisting. > Can you elaborate with a concrete example and its purpose which would > work with a mktemp-ish official function? An example:: import io import tempfile names = tempfile._get_candidate_names() def test_frobnicates_configured_spungfile(): """ ‘foo’ should frobnicate the configured spungfile. """ fake_file_path = os.path.join(tempfile.gettempdir(), names.next()) fake_file = io.BytesIO("Lorem ipsum, dolor sit amet".encode("utf-8")) patch_builtins_open( when_accessing_path=fake_file_path, provide_file=fake_file) system_under_test.config.spungfile_path = fake_file_path system_under_test.foo() assert_correctly_frobnicated(fake_file) So the test case creates a fake file, makes a valid filesystem path to associate with it, then patches the ‘open’ function so that it will return the fake file when that specific path is requested. Then the test case alters the system under test's configuration, giving it the generated filesystem path for an important file. The test case then calls the function about which the unit test is asserting behaviour, ‘system_under_test.foo’. When that call returns, the test case asserts some properties of the fake file to ensure the system under test actually accessed that file. With a supported standard library API for this – ‘tempfile.makepath’ for example – the generation of the filesystem path would change from four separate function calls, one of which is a private API:: names = tempfile._get_candidate_names() fake_file_path = os.path.join(tempfile.gettempdir(), names.next()) to a simple public function call:: fake_file_path = tempfile.makepath() This whole thread began because I expected such an API would exist. 
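The test pattern described above can be approximated with the standard library alone. The sketch below is an assumption-laden stand-in, not the code from the original post: `make_fake_path` and `read_config` are hypothetical names, `uuid.uuid4()` replaces the private `tempfile._get_candidate_names()`, the `/tmp` default is just an illustrative directory string, and `unittest.mock.patch` plays the role of `patch_builtins_open`.

```python
import io
import os
import uuid
from unittest import mock


def make_fake_path(directory="/tmp"):
    """Return an unpredictable path string without touching the filesystem."""
    return os.path.join(directory, "tmp" + uuid.uuid4().hex)


def read_config(path):
    """Stand-in for the system under test: reads the configured file."""
    with open(path, "rb") as f:
        return f.read()


def test_reads_configured_file():
    fake_path = make_fake_path()
    content = b"Lorem ipsum, dolor sit amet"
    fake_file = io.BytesIO(content)
    real_open = open  # kept so unrelated opens still reach the real function

    def fake_open(path, *args, **kwargs):
        # Only the generated path is redirected to the in-memory file.
        if path == fake_path:
            return fake_file
        return real_open(path, *args, **kwargs)

    with mock.patch("builtins.open", side_effect=fake_open) as patched:
        assert read_config(fake_path) == content
    patched.assert_called_once_with(fake_path, "rb")
```

Because the path is generated afresh on each run, a system under test that ignored its configuration and opened a hard-coded name would fail this test, which is the property the unpredictable path is meant to check.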
> I don't see how it is useful to have a notion of a filepath at all > in this case, and therefore I don't see why you would want a > mktemp-like function available. Because the system under test expects to be dealing with a filesystem, including normal restrictions on filesystem paths. The filesystem path needs to be valid because the test case isn't making assertions about what the system does with invalid paths. A test case should be very narrow in what it asserts so that the failure's cause is as obvious as possible. The filesystem path needs to be unpredictable to make sure we're not using some hard-coded value; the test case asserts that the system under test will access whatever file is named in the configuration. The file object needs to be fake because the test case should not be prone to irrelevant failures when the real filesystem isn't behaving as expected; this test case makes assertions only about what ‘system_under_test.foo’ does internally, not what the filesystem does. The system library functionality should be providing this because it's *already implemented there* and well tested and maintained. It should be in a public non-deprecated API because merely generating filesystem paths is not a security risk. > But.. then why a filesystem path at all in that case? Because the system under test is expecting valid filesystem paths, and I have no good reason to violate that constraint. > Why use a filesystem as a reference at all? An actual running filesystem is irrelevant to this inquiry. I'm only wanting to use functionality, with the constraints I enumerated earlier (already implemented in the standard library), to generate filesystem paths. > The only modes I can imagine for such a thing (a generated but unused > filename) are: > > checking that the name is syntactly valid, for whatever constrains > you may have (but if you're calling an opaque mktemp-like function, is > this feasible or remediable?) Almost. 
I want the filesystem paths to be valid because the system under test expects them, it may perform its own validation, and I have no good reason to complicate the unit test by possibly supplying an invalid path when that's not relevant to the test case. > generating test paths without using a real filesystem as a reference, > but then you can't even use mktemp I hadn't realised the filesystem was accessed by ‘tempfile.mktemp’, and I apologise for the complication that entails. I would prefer to access some standard public documented non-deprecated function that internally uses ‘tempfile._get_candidate_names’ and returns a new path each time. > I think "the standard library clearly has this useful functionality > implemented, but simultaneously warns strongly against its use" pretty > much precludes this. I hope to get that addressed with https://bugs.python.org/issue26362>. -- \ “Timid men prefer the calm of despotism to the boisterous sea | `\of liberty.” —Thomas Jefferson |
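The test pattern described in this message can be sketched with public standard library pieces only. This is a minimal illustration, not the poster's code: `make_fake_path` and `read_configured_file` are hypothetical names, the base directory is hard-coded as an assumption (so that not even `tempfile.gettempdir()` is consulted), and `unittest.mock.patch` stands in for the `patch_builtins_open` helper, whose real behaviour is not shown in the thread.

```python
import io
import os
import uuid
from unittest import mock


def make_fake_path(base_dir=os.path.join(os.sep, "tmp"), suffix=""):
    """Return a unique, valid-looking path without touching the filesystem.

    Hypothetical helper; it is not part of the standard library.
    """
    return os.path.join(base_dir, uuid.uuid4().hex + suffix)


def read_configured_file(path):
    """Stand-in for the system under test: reads whatever path it is given."""
    with open(path, "rb") as f:
        return f.read()


fake_path = make_fake_path(suffix=".conf")
fake_file = io.BytesIO(b"Lorem ipsum, dolor sit amet")

# Patch the built-in open() so that no real file is ever opened.
with mock.patch("builtins.open", return_value=fake_file) as fake_open:
    content = read_configured_file(fake_path)

# The system under test asked for exactly the generated path.
fake_open.assert_called_once_with(fake_path, "rb")
```

Because the name comes from `uuid.uuid4()`, it is unpredictable and different on every call, which appears to cover the "not a hard-coded value" requirement without any deprecated or private API.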
Re: Make a unique filesystem path, without creating the file
"Mario R. Osorio" writes: > I would create a RAM disk > (http://www.cyberciti.biz/faq/howto-create-linux-ram-disk-filesystem/), > generate all the path/files I want with any, or my own algorithm, run > the tests, unmount it, destroy it, be happy ... Whats wrong with > that?? It is addressing the problem at a different level. I am not asking about writing a wrapper around the test suite, I am asking about an API to generate filesystem paths. Your solution is a fine response to a different question. -- \“Consider the daffodil. And while you're doing that, I'll be | `\ over here, looking through your stuff.” —Jack Handey | _o__) | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
I would create a RAM disk (http://www.cyberciti.biz/faq/howto-create-linux-ram-disk-filesystem/), generate all the path/files I want with any, or my own algorithm, run the tests, unmount it, destroy it, be happy ... Whats wrong with that?? AFAIK, RAM disks do not get logged, and even if they do, any "insecure" file created would also be gone. On Sunday, February 14, 2016 at 4:46:42 PM UTC-5, Ben Finney wrote: > Howdy all, > > How should a program generate a unique filesystem path and *not* create > the filesystem entry? > > The 'tempfile.mktemp' function is strongly deprecated, and rightly so > https://docs.python.org/3/library/tempfile.html#tempfile.mktemp> > because it leaves the program vulnerable to insecure file creation. > > In some code (e.g. unit tests) I am calling 'tempfile.mktemp' to > generate a unique path for a filesystem entry that I *do not want* to > exist on the real filesystem. In this case the filesystem security > concerns are irrelevant because there is no file. > > The deprecation of that function is a concern still, because I don't > want code that makes every conscientious reader need to decide whether > the code is a problem. Instead the code should avoid rightly-deprecated > APIs. > > It is also prone to that API function disappearing at some point in the > future, because it is explicitly and strongly deprecated. > > So I agree with the deprecation, but the library doesn't appear to > provide a replacement. > > What standard library function should I be using to generate > 'tempfile.mktemp'-like unique paths, and *not* ever create a real file > by that path? > > -- > \"If you have the facts on your side, pound the facts. If you | > `\ have the law on your side, pound the law. If you have neither | > _o__) on your side, pound the table." --anonymous | > Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Sunday, February 14, 2016 at 10:55:11 PM UTC-6, Steven D'Aprano wrote: > If you want to guarantee that these faux pathnames can't > leak out of your test suite and touch the file system, > prepend an ASCII NUL to them. That will make it an illegal > path on all file systems that I'm aware of. Hmm, the unfounded fears in this thread are beginning to remind me of a famous Black Sabbath song. Finished with "py tempfile", 'cause it, couldn't help to, ease my mind. People think i'm insane, because, i want "faux paths", all the time. All day long i think of ways, but nothing seems to, satisfy. Think i'll loose my mind, if i don't, find a py-module to, pacify. CAN YOU HELP ME? MAKE "FAUX PATHS" TODY, OH YEAH... -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Mon, 15 Feb 2016 15:28:27 +1100, Ben Finney wrote: > The behaviour is already implemented in the standard library. What I'm > looking for is a way to use it (not re-implement it) that is public API > and isn't scolded by the library documentation. So, basically you want (essentially) the exact behaviour of tempfile.mktemp(), except without any mention of the (genuine) risks that such a function presents? I suspect that you'll have to settle for either a) using that function and simply documenting the reasons why it isn't an issue in this particular case, or b) re-implementing it (so that you can choose to avoid mentioning the issue in its documentation). At the outside, you *might* have a third option: c) persuade the maintainers to tweak the documentation to further clarify that the risk arises from creating a file with the returned name, not from simply calling the function. But actually it's already fairly clear if you actually read it. If it's the bold-face "Warning:" and the red background that you don't like, I wouldn't expect those to go away either for mktemp() or for any other function with similar behaviour (i.e. something which someone *might* try to use to actually create temporary files). The simple fact that it might get used that way is enough to warrant a prominent warning. -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Roel Schroeven writes: > Use uuid.uuid1()? That has potential. A little counter-intuitive, for use in documentation about testing filesystem paths; but not frightening or dubious to the conscientious reader. I'll see whether that meets this use case, thank you. The bug report (to make a supported ‘tempfile’ API for generating filesystem paths only) remains, and fixing that would be the correct way to address this IMO. -- \ “I used to be a proofreader for a skywriting company.” —Steven | `\Wright | _o__) | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Gregory Ewing wrote: > Ben Finney wrote: >> One valid filesystem path each time it's accessed. That is, behaviour >> equivalent to ‘tempfile.mktemp’. >> >> My question is because the standard library clearly has this useful >> functionality implemented, but simultaneously warns strongly against its >> use. > > But it *doesn't*, Yes, it does. > if your requirement is truly to not touch the filesystem at all, because > tempfile.mktemp() *reads* the file system to make sure the name it's > returning isn't in use. But there is a race condition occurring between the moment that the filesystem has been read and is being written to by another user. Hence the deprecation in favor of tempfile.mkstemp() which also *creates* the file instead, and the warning about the security hole if tempfile.mktemp() is used anyway. You can use tempfile.mktemp() only as long as it is irrelevant if a file with that name already exists, or exists later but was not created by you. -- PointedEars Twitter: @PointedEars2 Please do not cc me. / Bitte keine Kopien per E-Mail. -- https://mail.python.org/mailman/listinfo/python-list
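The race described here disappears when choosing the name and creating the file happen in one step. A small sketch of that property, assuming a writable default temporary directory (the `demo-` prefix is an arbitrary choice for illustration):

```python
import os
import tempfile

# mkstemp() picks a name and creates the file in a single exclusive step,
# so no other process can claim the name in between.
fd, path = tempfile.mkstemp(prefix="demo-")
try:
    try:
        # A second exclusive create of the same path must fail.
        os.close(os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True
finally:
    # Clean up the real file that mkstemp() created.
    os.close(fd)
    os.remove(path)
```

With `mktemp()` there is no such guarantee: the name is only checked, not reserved, so whoever creates the file first wins.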
Re: Make a unique filesystem path, without creating the file
On 15Feb2016 12:19, Ben Finney wrote:
>Dan Sommers writes:
>> On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote:
>> > I am unconcerned with whether there is a real filesystem entry of
>> > that name; the goal entails having no filesystem activity for this.
>> > I want a valid unique filesystem path, without touching the
>> > filesystem.
>>
>> That's an odd use case.
>
>It's very common to want filesystem paths divorced from accessing a
>filesystem entry.
>
>For example: test paths in a unit test. Filesystem access is orders of
>magnitude slower than accessing fake files in memory only, it is more
>complex and prone to irrelevant failures. So in such a test case
>filesystem access should be avoided as unnecessary.

But.. then why a filesystem path at all in that case? Why use a filesystem as a reference at all?

I've been watching this for a few days, and am struggling to understand your use case.

The only modes I can imagine for such a thing (a generated but unused filename) are:

1. checking that the name is syntactically valid, for whatever constraints you may have (but if you're calling an opaque mktemp-like function, is this feasible or remediable?)

2. checking that the name generated does in fact not correspond to an existing file (which presumes that the target directory has no other users, which also implies that you don't need mktemp - a simple prefix+unused-ordinal will do)

3. generating test paths using a real filesystem as a reference but not making a test file - I'm having trouble imagining how this can be useful

4. generating test paths without using a real filesystem as a reference, but then you can't even use mktemp

I think I can contrive your test case scenario using #3:

    filepath = mktemp(existing_dir_path)
    fp = InMemoryFileLikeClassWithBogusName(filepath)
    do I/O on fp ...

but I don't see how it is useful to have a notion of a filepath at all in this case, and therefore I don't see why you would want a mktemp-like function available.

Can you elaborate with a concrete example and its purpose which would work with a mktemp-ish official function?

You say:

>One valid filesystem path each time it's accessed. That is, behaviour
>equivalent to ‘tempfile.mktemp’.
>
>My question is because the standard library clearly has this useful
>functionality implemented, but simultaneously warns strongly against its
>use.
>
>I'm looking for how to get at that functionality in a non-deprecated
>way, without re-implementing it myself.

I think "the standard library clearly has this useful functionality implemented, but simultaneously warns strongly against its use" pretty much precludes this.

I think you probably need to reimplement. However, if your intent is never to use the path, you can use something very simple (my personal habit is prefix+ordinal where that doesn't already exist - keep the last ordinal to arrange a distinct name next time).

Cheers,
Cameron Simpson
-- https://mail.python.org/mailman/listinfo/python-list
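The prefix+ordinal habit mentioned at the end of this message might look like the sketch below. The function name `ordinal_paths` and the injectable `exists` check are assumptions made for illustration; passing a fake check keeps the whole thing off the real filesystem, while the default uses `os.path.exists`.

```python
import itertools
import os


def ordinal_paths(directory, prefix, exists=os.path.exists):
    """Yield prefix+ordinal paths under `directory` that `exists` reports unused.

    The generator remembers the last ordinal, so each name is distinct.
    """
    for n in itertools.count(1):
        candidate = os.path.join(directory, "{}{}".format(prefix, n))
        if not exists(candidate):
            yield candidate


# With a fake existence check, nothing touches the real filesystem.
taken = {os.path.join("/tmp", "spung1")}
paths = ordinal_paths("/tmp", "spung", exists=taken.__contains__)
first, second = next(paths), next(paths)
```

The names are unique within one generator but predictable, so this suits the "never actually used" case only.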
Re: Make a unique filesystem path, without creating the file
Ben Finney schreef op 2016-02-14 22:46: How should a program generate a unique filesystem path and *not* create the filesystem entry? > ... What standard library function should I be using to generate ‘tempfile.mktemp’-like unique paths, and *not* ever create a real file by that path? Use uuid.uuid1()? -- The saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom. -- Isaac Asimov Roel Schroeven -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-15, Ben Finney wrote:
> Dan Sommers writes:
>
>> On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote:
>>
>> > I am unconcerned with whether there is a real filesystem entry of
>> > that name; the goal entails having no filesystem activity for this.
>> > I want a valid unique filesystem path, without touching the
>> > filesystem.
>>
>> That's an odd use case.
>
> It's very common to want filesystem paths divorced from accessing a
> filesystem entry.

If the filesystem paths are not associated with a filesystem, what do you mean by "unique"? You want to make sure that path which doesn't exist in some filesystem is different from all other paths that don't exist in some filesystem?

> For example: test paths in a unit test. Filesystem access is orders
> of magnitude slower than accessing fake files in memory only,

How is "fake files in memory" not a filesystem?

--
Grant Edwards   grant.b.edwards at gmail.com   Yow! The Korean War must have been fun.
-- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 2016-02-14, Ben Finney wrote:
> Howdy all,
>
> How should a program generate a unique filesystem path and *not* create
> the filesystem entry?

Short answer: you can't, because it's the filesystem entry operation that is atomic and guarantees uniqueness.

> [..]
> What standard library function should I be using to generate
> ‘tempfile.mktemp’-like unique paths, and *not* ever create a real file
> by that path?

What's the point of creating a unique path if you don't want to create the file?

--
Grant Edwards   grant.b.edwards at gmail.com   Yow! I'm rated PG-34!!
-- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Ben Finney wrote:
> The existing behaviour of ‘tempfile.mktemp’ – actually of its
> internal class ‘tempfile._RandomNameSequence’ – is to generate
> unpredictable, unique, valid filesystem paths that are different
> each time.

But that's not documented behaviour, so even if mktemp() weren't marked as deprecated, you'd still be relying on undocumented and potentially changeable behaviour.

> What I'm looking for is a way to use it (not re-implement it) that is
> public API and isn't scolded by the library documentation.

Then you're looking for something that doesn't exist, I'm sorry to say, and it's unlikely you'll persuade anyone to make it exist.

If you want to leverage stdlib functionality for this, I'd suggest something along the lines of:

    import os
    import uuid

    def fakefilename(dir, ext):
        # Join the directory with a random UUID, then append the extension.
        return os.path.join(dir, str(uuid.uuid4())) + ext

--
Greg
-- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Ben Finney wrote:
> One valid filesystem path each time it's accessed. That is, behaviour
> equivalent to ‘tempfile.mktemp’.
>
> My question is because the standard library clearly has this useful
> functionality implemented, but simultaneously warns strongly against its
> use.

But it *doesn't*, if your requirement is truly to not touch the filesystem at all, because tempfile.mktemp() *reads* the file system to make sure the name it's returning isn't in use.

What's more, because you're *not* creating the file, mktemp() would be within its rights to return the same file name the second time you call it.

If you want something that really doesn't go near the file system and/or is guaranteed to produce multiple different non-existing file names, you'll have to write it yourself.

--
Greg
-- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Steven D'Aprano writes:

> If you can absolutely guarantee that this string will never actually
> be used on a real filesystem, then go right ahead and use it.

I'm giving advice in examples in documentation. It's not enough to have some private usage that I know is good; I am looking for a standard API that, when the reader looks it up, will not be laden with big scary warnings.

Currently I can write about the public API ‘tempfile.mktemp’ in documentation, but the conscientious reader will be correct to have concerns when the examples I give are sternly deprecated in the standard library documentation.

Or I can write about the private API ‘tempfile._RandomNameSequence’ in the documentation, and the conscientious reader will be correct to have concerns about use of an undocumented private-use API.

I'm looking for a way to give examples that use that standard library functionality, with an API that is both public and not discouraged.

> > I'm looking for how to get at that functionality in a non-deprecated
> > way, without re-implementing it myself.
>
> You probably can't, not if you want to future-proof your code against
> the day when tempfile.mktemp is removed.

That's disappointing. It is already implemented and well-tested, and it is useful as is. Forking and duplicating it is poor practice if it can simply be used in a standard place.

I have reported <https://bugs.python.org/issue26362> for this request.

--
 \      “Nothing worth saying is inoffensive to everyone. Nothing worth |
  `\    saying will fail to make you enemies. And nothing worth saying |
_o__)   will not produce a confrontation.” --Johann Hari, 2011 |
Ben Finney
-- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Monday 15 February 2016 12:19, Ben Finney wrote: > One valid filesystem path each time it's accessed. That is, behaviour > equivalent to ‘tempfile.mktemp’. > > My question is because the standard library clearly has this useful > functionality implemented, but simultaneously warns strongly against its > use. If you can absolutely guarantee that this string will never actually be used on a real filesystem, then go right ahead and use it. There's nothing wrong with (for instance) calling mktemp to generate *strings* that merely *look* like pathnames. If you want to guarantee that these faux pathnames can't leak out of your test suite and touch the file system, prepend an ASCII NUL to them. That will make it an illegal path on all file systems that I'm aware of. > I'm looking for how to get at that functionality in a non-deprecated > way, without re-implementing it myself. You probably can't, not if you want to future-proof your code against the day when tempfile.mktemp is removed. But you can simply fork that module, delete all the irrelevant bits, and make the mktemp function a private utility in your test suite. -- Steve -- https://mail.python.org/mailman/listinfo/python-list
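The NUL-prefix suggestion can be checked with a short sketch. It assumes Python 3, where the built-in `open()` raises `ValueError` for a path containing an embedded NUL; the helper name `open_fails_on_nul` and the example path are hypothetical.

```python
def open_fails_on_nul(path):
    """Return True if open() rejects `path` because of an embedded NUL."""
    try:
        open(path).close()
    except ValueError:
        # Python 3 refuses paths containing NUL with ValueError.
        return True
    except OSError:
        # Any other failure (for example, file not found) is not a NUL rejection.
        return False
    return False


# A faux pathname that cannot leak onto a real filesystem via open().
faux_path = "\0" + "/tmp/spungfile"
rejected = open_fails_on_nul(faux_path)
```

The trade-off is that such a string is no longer a valid path, so it does not suit a system under test that validates the paths it is given.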
Re: Make a unique filesystem path, without creating the file
Good evening/morning Ben, >> > I am unconcerned with whether there is a real filesystem entry of >> > that name; the goal entails having no filesystem activity for this. >> > I want a valid unique filesystem path, without touching the >> > filesystem. >> >> Your phrasing is ambiguous. > >The existing behaviour of ‘tempfile.mktemp’ – actually of its >internal class ‘tempfile._RandomNameSequence’ – is to generate >unpredictable, unique, valid filesystem paths that are different >each time. > >That's the behaviour I want, in a public API that exposes what >‘tempfile’ already has implemented, documented in a way that >doesn't create a scare about security. If your code is not actually touching the filesystem, then it will not be affected by the race condition identified in the tempfile.mktemp() warning anyway. So, I'm unsure of your worry. >> But if you explain in more detail why you want this filename, perhaps >> we can come up with some ideas that will help. > >The behaviour is already implemented in the standard library. What >I'm looking for is a way to use it (not re-implement it) that is >public API and isn't scolded by the library documentation. I might also suggest the (bound) method _create_tmp() on class mailbox.Maildir, which achieves roughly the same goals, but for a permanent file. Of course, that particular method also touches the filesystem. The Maildir naming approach is based on the assumptions* that time is monotonically increasing, that system nodes never share the same name and that you don't need more than 1 uniquely named file per directory per millisecond. If so, then you can use the 9 or 10 lines of that method. Good luck, -Martin * I was tempted to joke about these two guarantees, but I think that undermines my basic message. To wit, you can probably rely on this naming technique about as much as you can rely on your system clock. I'll assume that you aren't naming all of your nodes 'franklin.p.gundersnip'. -- Martin A. 
Brown http://linux-ip.net/ -- https://mail.python.org/mailman/listinfo/python-list
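A name built from the clock, the process and the host, in the spirit of the Maildir approach described above, could be sketched as follows. This is loosely modelled on that idea and is not a copy of `mailbox.Maildir._create_tmp`; the function name `maildir_style_name`, its parameters and the exact field layout are assumptions for illustration.

```python
import itertools
import os
import socket
import time

# Process-wide counter, so two names made in the same instant still differ.
_counter = itertools.count(1)


def maildir_style_name(now=None, host=None, pid=None):
    """Build a name from a timestamp, the process id, a counter and the host name."""
    now = time.time() if now is None else now
    host = socket.gethostname() if host is None else host
    pid = os.getpid() if pid is None else pid
    # Escape characters that are awkward in file names as octal codes.
    host = host.replace("/", r"\057").replace(":", r"\072")
    seconds = int(now)
    micros = int(now % 1 * 1e6)
    return "{}.M{}P{}Q{}.{}".format(seconds, micros, pid, next(_counter), host)
```

Unlike the Maildir method itself, this only builds a string and never touches the filesystem; its uniqueness rests on the same assumptions about clocks and host names listed in the message.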
Re: Make a unique filesystem path, without creating the file
Steven D'Aprano writes: > On Monday 15 February 2016 11:08, Ben Finney wrote: > > > I am unconcerned with whether there is a real filesystem entry of > > that name; the goal entails having no filesystem activity for this. > > I want a valid unique filesystem path, without touching the > > filesystem. > > Your phrasing is ambiguous. The existing behaviour of ‘tempfile.mktemp’ – actually of its internal class ‘tempfile._RandomNameSequence’ – is to generate unpredictable, unique, valid filesystem paths that are different each time. That's the behaviour I want, in a public API that exposes what ‘tempfile’ already has implemented, documented in a way that doesn't create a scare about security. > But if you explain in more detail why you want this filename, perhaps > we can come up with some ideas that will help. The behaviour is already implemented in the standard library. What I'm looking for is a way to use it (not re-implement it) that is public API and isn't scolded by the library documentation. -- \ “Try adding “as long as you don't breach the terms of service – | `\ according to our sole judgement” to the end of any cloud | _o__) computing pitch.” —Simon Phipps, 2010-12-11 | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Monday 15 February 2016 11:08, Ben Finney wrote:

> I am unconcerned with whether there is a real filesystem entry of that
> name; the goal entails having no filesystem activity for this. I want a
> valid unique filesystem path, without touching the filesystem.

Your phrasing is ambiguous.

If you are unconcerned whether or not a file of that name exists, then just pick a name and use that:

    unique_path = "/tmp/foo"

is guaranteed to be valid on POSIX systems and unique, and it may or may not exist.

If you actually do care that /tmp/foo *doesn't* exist, then you have a problem: whatever name you pick *now* may no longer "not exist" a millisecond later. In general there's no way to create a valid pathname which doesn't exist *now* and is guaranteed to continue to not exist unless you touch the file system.

But if you explain in more detail why you want this filename, perhaps we can come up with some ideas that will help.

--
Steve
-- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Dan Sommers writes: > On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote: > > > I am unconcerned with whether there is a real filesystem entry of > > that name; the goal entails having no filesystem activity for this. > > I want a valid unique filesystem path, without touching the > > filesystem. > > That's an odd use case. It's very common to want filesystem paths divorced from accessing a filesystem entry. For example: test paths in a unit test. Filesystem access is orders of magnitude slower than accessing fake files in memory only, it is more complex and prone to irrelevant failures. So in such a test case filesystem access should be avoided as unnecessary. > If it's really just one valid filesystem path (your original post said > *paths*, plural), then how about __file__? or os.__file__? One valid filesystem path each time it's accessed. That is, behaviour equivalent to ‘tempfile.mktemp’. My question is because the standard library clearly has this useful functionality implemented, but simultaneously warns strongly against its use. I'm looking for how to get at that functionality in a non-deprecated way, without re-implementing it myself. -- \ “The most common way people give up their power is by thinking | `\ they don't have any.” —Alice Walker | _o__) | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote: > I am unconcerned with whether there is a real filesystem entry of that > name; the goal entails having no filesystem activity for this. I want > a valid unique filesystem path, without touching the filesystem. That's an odd use case. If it's really just one valid filesystem path (your original post said *paths*, plural), then how about __file__? or os.__file__? -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
Matt Wheeler writes: > On 14 Feb 2016 21:46, "Ben Finney" wrote: > > What standard library function should I be using to generate > > ‘tempfile.mktemp’-like unique paths, and *not* ever create a real > > file by that path? > > Could you use tempfile.TemporaryDirectory and then just use a > consistent name within that directory. That fails because it touches the filesystem. I want to avoid using a real file or a real directory. > It's guaranteed not to exist I am unconcerned with whether there is a real filesystem entry of that name; the goal entails having no filesystem activity for this. I want a valid unique filesystem path, without touching the filesystem. -- \ “I believe our future depends powerfully on how well we | `\ understand this cosmos, in which we float like a mote of dust | _o__) in the morning sky.” —Carl Sagan, _Cosmos_, 1980 | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: Make a unique filesystem path, without creating the file
On 14 Feb 2016 21:46, "Ben Finney" wrote: > What standard library function should I be using to generate > ‘tempfile.mktemp’-like unique paths, and *not* ever create a real file > by that path? Could you use tempfile.TemporaryDirectory and then just use a consistent name within that directory. It's guaranteed not to exist because the directory was only just created and only you can write to it? Has the added bonus of still being reasonably secure, to appease people like Mr PointedEars. (If you need multiple nonexistent paths in the same dir then perhaps use tempfile.NamedTemporaryFile with your newly created temp dir and an arbitrary suffix, and strip the suffix off to get the name you actually use.) -- Matt Wheeler http://funkyh.at -- https://mail.python.org/mailman/listinfo/python-list
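The suggestion above can be sketched in a few lines. Note that it does create a real directory for the lifetime of the `with` block, so it trades "no filesystem activity" for a stronger guarantee that the name is unused; the file name `spungfile` is an arbitrary choice for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # The directory is freshly created, so this name cannot exist yet.
    unused_path = os.path.join(tmpdir, "spungfile")
    existed_inside = os.path.exists(unused_path)

# After the context exits, the directory itself has been removed.
dir_removed = not os.path.exists(tmpdir)
```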
Re: Make a unique filesystem path, without creating the file
Ben Finney wrote:

> How should a program generate a unique filesystem path and *not* create
> the filesystem entry?

The Python documentation suggests that it should not.

> The ‘tempfile.mktemp’ function is strongly deprecated, and rightly so
> <https://docs.python.org/3/library/tempfile.html#tempfile.mktemp>
> because it leaves the program vulnerable to insecure file creation.
>
> In some code (e.g. unit tests) I am calling ‘tempfile.mktemp’ to
> generate a unique path for a filesystem entry that I *do not want* to
> exist on the real filesystem. In this case the filesystem security
> concerns are irrelevant because there is no file.

I do not think that you have properly understood the problems with tempfile.mktemp().

> […]
> It is also prone to that API function disappearing at some point in the
> future, because it is explicitly and strongly deprecated.
>
> So I agree with the deprecation, but the library doesn't appear to
> provide a replacement.

| mktemp() usage can be replaced easily with NamedTemporaryFile(), passing
| it the delete=False parameter: [example]

> What standard library function should I be using to generate
> ‘tempfile.mktemp’-like unique paths, and *not* ever create a real file
> by that path?

I do not think it is possible to avoid the creation of a real file using the PSL; in fact, that a file is created appears to be precisely what fixes the problems with tempfile.mktemp() because then it cannot happen that someone else creates a file with the same name at the same time:

| tempfile.NamedTemporaryFile(mode='w+b', buffering=None, encoding=None,
| newline=None, suffix=None, prefix=None, dir=None, delete=True)
|
| This function operates exactly as TemporaryFile() does, except that the
| file is guaranteed to have a visible name in the file system (on Unix, the
| directory entry is not unlinked). […] If delete is true (the default), the
| file is deleted as soon as it is closed. […]

It is of course possible to generate a filename that is not currently used, but I am not aware of a PSL feature that does this, and if there were such a feature there would be the same problems with it as with mktemp().

--
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
-- https://mail.python.org/mailman/listinfo/python-list
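The replacement that the quoted documentation refers to might look like this in practice. It is a minimal sketch, not the documentation's own example (which is elided in the quote above), and it assumes a writable default temporary directory; the `.tmp` suffix is arbitrary.

```python
import os
import tempfile

# delete=False keeps the file on disk after close, so the name stays valid.
with tempfile.NamedTemporaryFile(delete=False, suffix=".tmp") as f:
    f.write(b"scratch data")
    path = f.name

existed_after_close = os.path.exists(path)

# With delete=False the caller is responsible for cleanup.
os.remove(path)
```

This gives a name that is safe to use precisely because the file was created, which is the opposite of what the original question asks for.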
Make a unique filesystem path, without creating the file
Howdy all,

How should a program generate a unique filesystem path and *not* create the filesystem entry?

The ‘tempfile.mktemp’ function is strongly deprecated, and rightly so <https://docs.python.org/3/library/tempfile.html#tempfile.mktemp> because it leaves the program vulnerable to insecure file creation.

In some code (e.g. unit tests) I am calling ‘tempfile.mktemp’ to generate a unique path for a filesystem entry that I *do not want* to exist on the real filesystem. In this case the filesystem security concerns are irrelevant because there is no file.

The deprecation of that function is a concern still, because I don't want code that makes every conscientious reader need to decide whether the code is a problem. Instead the code should avoid rightly-deprecated APIs.

It is also prone to that API function disappearing at some point in the future, because it is explicitly and strongly deprecated.

So I agree with the deprecation, but the library doesn't appear to provide a replacement.

What standard library function should I be using to generate ‘tempfile.mktemp’-like unique paths, and *not* ever create a real file by that path?

--
 \      “If you have the facts on your side, pound the facts. If you |
  `\    have the law on your side, pound the law. If you have neither |
_o__)   on your side, pound the table.” --anonymous |
Ben Finney
-- https://mail.python.org/mailman/listinfo/python-list