date:20181119

On Tue, Nov 20, 2018 at 10:51 AM Robert Girault  wrote:
> If you're just writing toy software, even the K&R PRNG works just fine.
> If you're writing a weather simulation, I suppose you need real
> random-like properties and still need your generator to be reproducible.
> If you're using random Quicksort, you do need unpredictability and
> reproducibility.  If you're writing a crypto application, then you need
> something way stronger.  We need all of them.  But mt19937 is now useful
> only in toy software.

I disagree. Yes, in a crypto-sensitive situation, you can't depend on
the Twister... but you shouldn't be relying on *any* PRNG for that.
There are plenty of situations where you need something unpredictable
but it doesn't have to be THAT safe. Your example of picking a random
pivot for quicksort is a perfect example. Let's suppose I am sorting
by that method... how are you going to get 624 consecutive outputs? If
you can provide a custom comparison function, you can DoS the sort
just by making that inefficient. If you can't, how are you going to
reconstruct the randomness? Is this REALLY a viable attack vector?

It's different if, say, you're operating a virtual casino, and letting
people watch the roulette wheel spins. (Though even then,
reconstructing the twister's state from a series of 1-in-38 results
isn't going to be trivial.) But it's overly paranoid to say that every
single PRNG needs to be cryptographically secure.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
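
Editorial sketch of the "624 outputs" claim discussed in this thread. It
assumes the observer sees 624 consecutive, complete 32-bit outputs (here
obtained with getrandbits(32)), which is exactly the access the message above
questions. The helper names untemper and clone_generator are made up for this
illustration, and the state layout passed to setstate() is a CPython detail.

```python
import random


def untemper(y):
    """Invert MT19937's output tempering for one 32-bit value."""
    y ^= y >> 18                          # undo: y ^= y >> 18
    y ^= (y << 15) & 0xEFC60000           # undo: y ^= (y << 15) & 0xEFC60000
    x = y
    for _ in range(4):                    # undo: y ^= (y << 7) & 0x9D2C5680
        x = y ^ ((x << 7) & 0x9D2C5680)   # each pass fixes 7 more low bits
    y = x
    for _ in range(2):                    # undo: y ^= y >> 11
        x = y ^ (x >> 11)                 # each pass fixes 11 more high bits
    return x & 0xFFFFFFFF


def clone_generator(outputs):
    """Return a random.Random that continues the stream after `outputs`."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    words = tuple(untemper(v) for v in outputs)
    clone = random.Random()
    # Index 624 makes the generator regenerate its state block on next use.
    clone.setstate((3, words + (624,), None))
    return clone


victim = random.Random()                  # seeded from OS entropy
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_generator(observed)

predicted = [clone.random() for _ in range(5)]
actual = [victim.random() for _ in range(5)]
print(predicted == actual)
```

Truncated outputs, such as 1-in-38 roulette results, do not reveal whole state
words, so this direct inversion does not apply to them.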

Re: on the prng behind random.random()

Dennis Lee Bieber  writes:

> On Mon, 19 Nov 2018 19:05:44 -0200, Robert Girault  declaimed
> the following:
>
>>I mean the fact that with 624 samples from the generator, you can
>>determine the rest of the sequence completely.
>
>   Being able to predict the sequence after a large sampling does not mean
> that the /distribution of values/ is not (pseudo-) random.

The problem with determining its sequence is that it might defeat its
purpose.  If you use mt19937 to select a pivot in random Quicksort for
example (where you plan to spend n lg n time in sorting), we can
frustrate your plans and force it into n^2 every time, an effective DoS
attack on your software.

>   After all, pretty much all random number generators will produce the
> same sequence if given the same starting seed... You are, in effect,
> treating your 624 samples as a very large seed...

I think I disagree with your take here.  With mt19937, given ANY seed, I
can eventually predict all the sequence without having to query the
oracle any further.

If you're just writing toy software, even the K&R PRNG works just fine.
If you're writing a weather simulation, I suppose you need real
random-like properties and still need your generator to be reproducible.
If you're using random Quicksort, you do need unpredictability and
reproducibility.  If you're writing a crypto application, then you need
something way stronger.  We need all of them.  But mt19937 is now useful
only in toy software.
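
Editorial sketch of the random-pivot Quicksort mentioned above. This is a
simple list-building version and not anyone's production code; the rng
parameter is an assumption added here so the source of pivots can be swapped.

```python
import random


def quicksort(items, rng=random):
    """Sort a list, choosing each pivot with rng.randrange()."""
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quicksort(less, rng) + equal + quicksort(greater, rng)


data = [5, 3, 8, 1, 9, 2, 8]
print(quicksort(data))                             # Mersenne Twister pivots
print(quicksort(data, rng=random.SystemRandom()))  # OS-entropy pivots
```

Passing random.SystemRandom() keeps pivot choices unpredictable to an observer,
at the cost of reproducibility, since that generator cannot be seeded.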

Re: Reading 'scientific' csv using Pandas?

2018-11-19 Thread MRAB


On 2018-11-19 20:44, Martin Schöön wrote:
> Too many files to go through them with an editor :-(

If only Python could read and write files... :-)
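
Editorial sketch of that hint, using only the standard library: rewrite the
decimal commas before any other tool reads the data. The function name
convert_decimal_commas is hypothetical, and the approach assumes no field
contains a comma that is not a decimal separator.

```python
import csv
import io


def convert_decimal_commas(text, delimiter="\t"):
    """Return delimited text with ',' replaced by '.' in every field."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        writer.writerow([field.replace(",", ".") for field in row])
    return out.getvalue()


sample = "col0\tcol1\n1,1\t0\n10,24e-05\t1\n9,492e-10\t2\n"
converted = convert_decimal_commas(sample)
print(converted)
```

For many files, the same function could be applied in a loop over
pathlib.Path(".").glob("*.csv"), writing each result to a new file.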

Re: Reading 'scientific' csv using Pandas?

2018-11-19 Thread MRAB


On 2018-11-19 21:32, Martin Schöön wrote:
> On 2018-11-19, Martin Schöön wrote:
>> On 2018-11-19, Peter Otten <__pete...@web.de> wrote:
>>>
>>> The engine="python" produces an exception over here:
>>>
>>> """
>>> ValueError: The 'decimal' option is not supported with the 'python' engine
>>> """
>>>
>>> Maybe you can try and omit that option?
>>
>> Bingo!
>> No, I don't remember why I added that engine thing. It was two days ago!
>>
>>> If that doesn't work you can specify a converter:
>>>
>>> pd.read_csv("file.csv", sep="\t", converters={0: lambda s: float(s.replace(",", "."))})
>>>            col1  col2
>>> 0  1.100000e+00     0
>>> 1  1.024000e-04     1
>>> 2  9.492000e-10     2
>>>
>>> [3 rows x 2 columns]
>>
> I spoke too early. Upon closer inspection I get the first column with
> decimal '.' and the rest with decimal ','. I have tried the converter
> thing to no avail :-(

You passed {0: lambda s: float(s.replace(",", "."))} as the converters
argument, which means that it applies only to column 0.


Re: on the prng behind random.random()

2018-11-19 Thread Ian Kelly

On Mon, Nov 19, 2018 at 2:12 PM Robert Girault  wrote:
>
> Chris Angelico  writes:
>
> > On Tue, Nov 20, 2018 at 7:31 AM Robert Girault  wrote:
> >> Nice.  So Python's random.random() does indeed use mt19937.  Since it's
> >> been broken for years, why isn't it replaced by something newer like
> >> ChaCha20?  Is it due to backward compatibility?  That would make sense.
> >
> > What exactly do you mean by "broken"?
>
> I mean the fact that with 624 samples from the generator, you can
> determine the rest of the sequence completely.
>
> Sorry about mentioning ChaCha20.  That was misleading.  I should've said
> something newer like MRG32k3a or xorshift*.

If you wish to propose replacing it, that topic is probably best
brought up at python-dev.

Re: Reading 'scientific' csv using Pandas?

On 2018-11-19, Martin Schöön wrote:
> On 2018-11-19, Peter Otten <__pete...@web.de> wrote:
>>
>> The engine="python" produces an exception over here:
>>
>> """
>> ValueError: The 'decimal' option is not supported with the 'python' engine
>> """
>>
>> Maybe you can try and omit that option?
>
> Bingo!
> No, I don't remember why I added that engine thing. It was two days ago!
>
>> If that doesn't work you can specify a converter:
>>
>> pd.read_csv("file.csv", sep="\t", converters={0: lambda s: float(s.replace(",", "."))})
>>            col1  col2
>> 0  1.100000e+00     0
>> 1  1.024000e-04     1
>> 2  9.492000e-10     2
>>
>> [3 rows x 2 columns]
>
I spoke too early. Upon closer inspection I get the first column with
decimal '.' and the rest with decimal ','. I have tried the converter
thing to no avail :-(

/Martin

Re: on the prng behind random.random()

Chris Angelico  writes:

> On Tue, Nov 20, 2018 at 7:31 AM Robert Girault  wrote:
>> Nice.  So Python's random.random() does indeed use mt19937.  Since it's
>> been broken for years, why isn't it replaced by something newer like
>> ChaCha20?  Is it due to backward compatibility?  That would make sense.
>
> What exactly do you mean by "broken"? 

I mean the fact that with 624 samples from the generator, you can
determine the rest of the sequence completely.

Sorry about mentioning ChaCha20.  That was misleading.  I should've said
something newer like MRG32k3a or xorshift*.

> If you're generating random numbers for any sort of security purpose,
> you probably should look at this:
>
> https://docs.python.org/3/library/secrets.html
>
> (New in 3.6, though, hence the "probably". If you need to support 3.5
> or older - including 2.7 - then you can't use that.)

Thanks for the reference!  

I'm not particularly interested in security at the moment, but I would
like an expert's confirmation that some of these algorithms aren't
replaced due to backward compatibility.  We could easily replace them,
but I think we shouldn't: some people still depend on these algorithms
for their experiments.

Are there other reasons?

Re: Reading 'scientific' csv using Pandas?

On 2018-11-19, Peter Otten <__pete...@web.de> wrote:
> Martin Schöön wrote:
>
>> My pandas is up to date.
>> 
>
> The engine="python" produces an exception over here:
>
> """
> ValueError: The 'decimal' option is not supported with the 'python' engine
> """
>
> Maybe you can try and omit that option?

Bingo!
No, I don't remember why I added that engine thing. It was two days ago!

> If that doesn't work you can specify a converter:
>
> pd.read_csv("file.csv", sep="\t", converters={0: lambda s: float(s.replace(",", "."))})
>            col1  col2
> 0  1.100000e+00     0
> 1  1.024000e-04     1
> 2  9.492000e-10     2
>
> [3 rows x 2 columns]

I'll save that one for later. One never knows...

/Martin

Re: Reading 'scientific' csv using Pandas?

Too many files to go through them with an editor :-(

/Martin

Re: Reading 'scientific' csv using Pandas?

On Tue, Nov 20, 2018 at 7:46 AM Martin Schöön  wrote:
> Thanks, I just tried this. The line locale.setlocale... throws an
> error:
>
> "locale.Error: unsupported locale setting"
>
> Trying other ideas instead of 'de' results in more of the same.
> '' results in no errors.

Haven't been reading in detail, but maybe "de_DE" will work better,
assuming you have that locale installed.

ChrisA
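
Editorial sketch of the locale suggestion above: try several spellings of the
German locale and report when none is installed. The candidate names and the
helper set_numeric_locale are assumptions for illustration; which names work
depends on the operating system.

```python
import locale


def set_numeric_locale(candidates):
    """Try each locale name in turn; return the first one that is accepted."""
    for name in candidates:
        try:
            locale.setlocale(locale.LC_NUMERIC, name)
            return name
        except locale.Error:
            continue  # "unsupported locale setting": try the next spelling
    return None


chosen = set_numeric_locale(["de_DE.UTF-8", "de_DE", "de"])
if chosen is None:
    print("no German locale available; on POSIX, `locale -a` lists the installed ones")
else:
    # With a decimal-comma locale active, atof() accepts '10,24e-05'.
    print(chosen, 2 * locale.atof("10,24e-05"))
```

An empty string, as tried in the quoted message, selects the user's default
locale, which explains why it raises no error but may still not parse decimal
commas.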

Re: Reading 'scientific' csv using Pandas?

On 2018-11-18, Stefan Ram wrote:
> Martin Schöön writes:
>>to read from such files. This works so so. 'Common floats' (3,1415 etc)
>>works just fine but 'scientific' stuff (1,6023e23) does not work.
>
>   main.py
>
> import sys
> import pandas
> import locale
> print( sys.version )
> print( pandas.__version__ )
> # Write a small tab-separated test file that uses decimal commas.
> with open( 'schoon20181118232102.csv', 'w' ) as file:
>     print( 'col0\tcol1', file=file, flush=True )
>     print( '1,1\t0', file=file, flush=True )
>     print( '10,24e-05\t1', file=file, flush=True )
>     print( '9,492e-10\t2', file=file, flush=True )
> EUData = pandas.read_csv(
>     'schoon20181118232102.csv', sep='\t', decimal=',', engine='python' )
> # Convert one value with the German locale, which uses a decimal comma.
> locale.setlocale( locale.LC_ALL, 'de' )
> print( 2 * locale.atof( EUData[ 'col0' ][ 1 ]))
>
>   transcript
>
> 3.7.0
> 0.23.4
> 0.0002048
>
Thanks, I just tried this. The line locale.setlocale... throws an
error:

"locale.Error: unsupported locale setting"

Trying other ideas instead of 'de' results in more of the same.
'' results in no errors.

The output I get is this:

3.4.2 (default, Oct  8 2014, 10:45:20) 
[GCC 4.9.1]
0.22.0
0.0002048

Scratching my head and speculating: I run this in a Virtualenv
I have created for Jupyter and pandas and whatever I feel I need
for this. Could locale be undefined or something that causes this?

/Martin

Re: on the prng behind random.random()

On Tue, Nov 20, 2018 at 7:31 AM Robert Girault  wrote:
> Nice.  So Python's random.random() does indeed use mt19937.  Since it's
> been broken for years, why isn't it replaced by something newer like
> ChaCha20?  Is it due to backward compatibility?  That would make sense.

What exactly do you mean by "broken"? If you're generating random
numbers for any sort of security purpose, you probably should look at
this:

https://docs.python.org/3/library/secrets.html

(New in 3.6, though, hence the "probably". If you need to support 3.5
or older - including 2.7 - then you can't use that.)

ChrisA
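
Editorial sketch of the secrets module referenced above (Python 3.6+). The
roulette and suit values are only illustrative inputs.

```python
import random
import secrets

token = secrets.token_hex(16)        # 16 random bytes as 32 hex characters
pocket = secrets.randbelow(38)       # integer in range(38), e.g. a roulette pocket
suit = secrets.choice(["hearts", "spades", "clubs", "diamonds"])

# On older Pythons without secrets, random.SystemRandom draws from the same
# OS entropy source (os.urandom).
legacy_pocket = random.SystemRandom().randrange(38)

print(token, pocket, suit, legacy_pocket)
```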

Re: Question about the definition of the value of an object

2018-11-19 Thread Iwo Herka

> Attempting to define value here would be at best a massive
> distraction from the concepts the documentation is trying
> to get across.

> There is one very simple definition of "value" which is entirely
> accurate, but probably not helpful, and that is: An object's
> value is whatever it is equal to.

> Generally, Python objects have their values defined by an abstract
> concept that is being represented.

That confirms my intuition. Thank you for the responses.

Sincerely,
Iwo Herka
On Mon, 19 Nov 2018 at 20:46, Chris Angelico wrote:
>
> On Tue, Nov 20, 2018 at 3:08 AM Iwo Herka  wrote:
> >
> > Hello everyone,
> >
> > I've been looking for something in the documentation
> > (https://docs.python.org/3.8/reference/datamodel.html) recently
> > and I've noticed something weird. Documentation states that every
> > object has a value, but doesn’t provide any definition
> > whatsoever of what the value is. Now, I'm sure that every reasonably
> > fluent Python programmer has an intuitive
> > understanding of the term, nonetheless, I would expect the
> > documentation defines it somehow (not necessarily
> > in a formal fashion), especially considering that "the value of an
> > object" is used to explain other concepts, such as
> > mutability:
> >
> > > The value of some objects can change. Objects whose value can change are
> > > said to be mutable; objects whose value is unchangeable once they are
> > > created are called immutable.
> >
> > So, why is documentation silent on this? One reason I can think of is
> > to avoid answering inconvenient questions.
>
> Sorta kinda, yeah. There is one very simple definition of "value"
> which is entirely accurate, but probably not helpful, and that is: An
> object's value is whatever it is equal to. That is to say, you can ask
> two basic questions about an object:
>
> x is y # identity: are x and y the SAME object?
> x == y # equality: do x and y have the same value?
>
> Value and equality are intrinsically linked, but unfortunately that
> doesn't really explain what either one actually IS. As Rhodri says,
> you can ask a philosopher about that, and will be stuck for weeks :)
>
> The concept of "immutable" vs "mutable" object, therefore, is that
> some objects may compare equal now and unequal later. Since the same
> object is able to change in "value" over time, it may become (un)equal
> to something while still being the same object. Thus mutable objects
> can't be used as dict keys, as their values could change, and dict
> lookups have to match based on equality. (Imagine putting two keys
> into a dict while they have different values, and then mutating one of
> them to have the same value. Now try to look up that new value using a
> third object. The Logic Police will arrest you before you can say
> "hashability"!)
>
> Generally, Python objects have their values defined by an abstract
> concept that is being represented. For instance, the integer 32 and
> the float 32.0 have the same value, even though they're different
> types; they both represent the abstract number equal to
> two-to-the-fifth. But right here in that sentence, you can see how
> hard it is to actually *define* that value. At best, all you can
> really say is that the value is equal to the value of int("32") or
> some other way of getting an object with that value.
>
> Yep, it's hard. But the cool thing is, it usually doesn't matter - you
> can say "x has the value 32" without worrying about representations,
> data types, etc.
>
> ChrisA

Re: on the prng behind random.random()

Peter Otten <__pete...@web.de> writes:

> Robert Girault wrote:
>
>> Looking at its source code, it seems the PRNG behind random.random() is
>> Mersenne Twister, but I'm not sure.  It also seems that random.random()
>> is using /dev/urandom.  Can someone help me to read that source code?
>> 
>> I'm talking about CPython, by the way.  I'm reading
>> 
>>   https://github.com/python/cpython/blob/master/Lib/random.py
>> 
>> The initial comment clearly says it's Mersenne Twister, but the only
>> random() function there seems to call _urandom(), which I suppose is an
>> interface to /dev/urandom.
>> 
>> What am I missing here?
>
> There's a class random.Random which is instantiated at the end of the file, 
> and random() is bound to the corresponding method:
>
> _inst = Random()
> ...
> random = _inst.random
>
> The Random class inherits from _random.Random [...]

Thanks.  I missed that.

> which is implemented in C and does most of the actual work. If you can
> read C:
>
> https://github.com/python/cpython/blob/master/Modules/_randommodule.c
>
> The most relevant part seems to be genrand_int32() which is wrapped by
> random_random() that actually implements the _random.Random.random() method.

Nice.  So Python's random.random() does indeed use mt19937.  Since it's
been broken for years, why isn't it replaced by something newer like
ChaCha20?  Is it due to backward compatibility?  That would make sense.

Do you know who broke mt19937 and when?  I'd love to read the reference.
Thank you!

Re: Question about the definition of the value of an object

On Tue, Nov 20, 2018 at 3:08 AM Iwo Herka  wrote:
>
> Hello everyone,
>
> I've been looking for something in the documentation
> (https://docs.python.org/3.8/reference/datamodel.html) recently
> and I've noticed something weird. Documentation states that every
> object has a value, but doesn’t provide any definition
> whatsoever of what the value is. Now, I'm sure that every reasonably
> fluent Python programmer has an intuitive
> understanding of the term, nonetheless, I would expect the
> documentation defines it somehow (not necessarily
> in a formal fashion), especially considering that "the value of an
> object" is used to explain other concepts, such as
> mutability:
>
> > The value of some objects can change. Objects whose value can change are
> > said to be mutable; objects whose value is unchangeable once they are
> > created are called immutable.
>
> So, why is documentation silent on this? One reason I can think of is
> to avoid answering inconvenient questions.

Sorta kinda, yeah. There is one very simple definition of "value"
which is entirely accurate, but probably not helpful, and that is: An
object's value is whatever it is equal to. That is to say, you can ask
two basic questions about an object:

x is y # identity: are x and y the SAME object?
x == y # equality: do x and y have the same value?

Value and equality are intrinsically linked, but unfortunately that
doesn't really explain what either one actually IS. As Rhodri says,
you can ask a philosopher about that, and will be stuck for weeks :)

The concept of "immutable" vs "mutable" object, therefore, is that
some objects may compare equal now and unequal later. Since the same
object is able to change in "value" over time, it may become (un)equal
to something while still being the same object. Thus mutable objects
can't be used as dict keys, as their values could change, and dict
lookups have to match based on equality. (Imagine putting two keys
into a dict while they have different values, and then mutating one of
them to have the same value. Now try to look up that new value using a
third object. The Logic Police will arrest you before you can say
"hashability"!)

Generally, Python objects have their values defined by an abstract
concept that is being represented. For instance, the integer 32 and
the float 32.0 have the same value, even though they're different
types; they both represent the abstract number equal to
two-to-the-fifth. But right here in that sentence, you can see how
hard it is to actually *define* that value. At best, all you can
really say is that the value is equal to the value of int("32") or
some other way of getting an object with that value.

Yep, it's hard. But the cool thing is, it usually doesn't matter - you
can say "x has the value 32" without worrying about representations,
data types, etc.

ChrisA
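
Editorial sketch of the identity/equality distinction described above, using
only built-in types. The variable names are illustrative.

```python
a = [1, 2, 3]
b = [1, 2, 3]
same_value = a == b        # True: equal values
same_object = a is b       # False: two distinct list objects

b.append(4)                # mutate b; it is still the same object
still_equal = a == b       # False: its value changed

# 32 and 32.0 are different types with the same value, so they are
# interchangeable as dict keys.
lookup = {32: "thirty-two"}
found = lookup[32.0]

# Mutable lists are unhashable and are rejected as dict keys.
try:
    {[1, 2]: "nope"}
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(same_value, same_object, still_equal, found, list_key_ok)
```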

Re: Extend NTFS with "version" of file and "version" of folder, also optionally GIT integration or something like it.

2018-11-19 Thread skybuck2000

Described also as:

(Versioning System Integration with Windows Explorer)

Anyway

Googling NTFS and GIT turned up this:

https://blogs.msdn.microsoft.com/devops/2017/02/03/announcing-gvfs-git-virtual-file-system/

The objective of this project seems to be a bit different. To handle very large
projects.

Which in itself is great. But for small projects like mine this is perhaps
somewhat overkill.

But the people working on this project do have some experience integrating GIT
with a file system and creating some virtual file system.

I highly recommend that Microsoft expand these kinds of projects massively, or
expand this project in a big way, and involve the Windows Explorer programmers
to get in on this action: expand Windows Explorer to also work with this file
system and versioning system, and perhaps provide some slight new features for
Windows Explorer to work in tandem with such a new file system.

I want something like this to be usable for small projects too.

Perhaps it's already usable, not sure... It would be nice if this software
could be made available for Windows 7, only Microsoft sort out the troubles
with Windows 10 updates ;)

I think this continuous Windows 10 updating approach might be a bit too much
for people to handle.

Perhaps it's better to have an older version and a new version, so people
aren't bothered by new version f*ck ups.

I sure don't have time to mess with Windows updates and troubles, except very
urgent security fixes.

Bye,
Skybuck.


Re: Question about the definition of the value of an object

2018-11-19 Thread Terry Reedy


On 11/19/2018 9:08 AM, Iwo Herka wrote:
> Hello everyone,
>
> I've been looking for something in the documentation
> (https://docs.python.org/3.8/reference/datamodel.html) recently
> and I've noticed something weird. Documentation states that every
> object has a value, but doesn’t provide any definition
> whatsoever of what the value is.


Python is a language for manipulating information stored in Python 
objects.  Abstractly, object values are the mostly implementation- and 
even language-independent information that we wish to manipulate.  Note 
that concrete types are somewhat implementation dependent and ids are 
implementation and session dependent.


Bools represent binary choices, not 'truth' per se.  We call the choices 
'True' and 'False' because that (with or without capitals) is the 
default binary choice in propositional logic.  For numbers, 0 versus not 
0 is often an important choice.  Ditto for 'empty' versus 'not empty' 
for collections.


--
Terry Jan Reedy


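
Editorial sketch of the binary choices described above: zero versus nonzero
for numbers, and empty versus non-empty for collections. The sample values are
arbitrary.

```python
samples = [0, 7, 0.0, -1.5, "", "text", [], [0], {}, {"k": None}]

# Map each sample's repr to the binary choice that bool() makes for it.
truthiness = {repr(v): bool(v) for v in samples}

for text, choice in truthiness.items():
    print(f"{text:>12} -> {choice}")
```

Note that [0] counts as true: the list is non-empty, even though its only
element is zero.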

Re: on the prng behind random.random()

2018-11-19 Thread Peter Otten

Robert Girault wrote:

> Looking at its source code, it seems the PRNG behind random.random() is
> Mersenne Twister, but I'm not sure.  It also seems that random.random()
> is using /dev/urandom.  Can someone help me to read that source code?
> 
> I'm talking about CPython, by the way.  I'm reading
> 
>   https://github.com/python/cpython/blob/master/Lib/random.py
> 
> The initial comment clearly says it's Mersenne Twister, but the only
> random() function there seems to call _urandom(), which I suppose is an
> interface to /dev/urandom.
> 
> What am I missing here?

There's a class random.Random which is instantiated at the end of the file, 
and random() is bound to the corresponding method:

_inst = Random()
...
random = _inst.random

The Random class inherits from _random.Random which is implemented in C and 
does most of the actual work. If you can read C:

https://github.com/python/cpython/blob/master/Modules/_randommodule.c

The most relevant part seems to be genrand_int32() which is wrapped by
random_random() that actually implements the _random.Random.random() method.

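
Editorial sketch that checks the structure described above from the
interpreter. It relies on CPython implementation details (the private _random
module and the __self__ attribute of the bound method), so treat it as an
illustration and not a portable API.

```python
import random
import _random  # CPython's C implementation of the Mersenne Twister core

# random.random is a method bound to the module's hidden Random instance.
inst = random.random.__self__
bound_to_random_instance = isinstance(inst, random.Random)
inherits_from_c_class = issubclass(random.Random, _random.Random)

# SystemRandom is the separate class whose random() reads OS entropy; its
# use of os.urandom is what appears as _urandom in Lib/random.py.
system_is_subclass = issubclass(random.SystemRandom, random.Random)

# Seeding the module-level generator makes random.random() reproducible.
random.seed(12345)
first = random.random()
random.seed(12345)
second = random.random()

print(bound_to_random_instance, inherits_from_c_class, system_is_subclass)
print(first == second)
```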

on the prng behind random.random()