Re: Early std.crypto

2011-11-27 Thread bcs

On 11/26/2011 04:19 PM, Brad Anderson wrote:


How about putting a disclaimer on the module warning the code hasn't
been through a rigorous security audit and point them at well
established C libraries if they need that sort of assurance.


What does that gain over implementing the first itteration in terms of 
well established C libraries and then replacing that with native 
implementations as the code goes been through a rigorous security audit?


Or how about do both as API compatible implementations? That would work 
for people who need the proven security and people who can't afford 
external dependencies as well as allow them to be swapped out for each 
other with minimal effort once the native code is proven.


Re: Early std.crypto

2011-11-27 Thread Jude Young
On Sun 27 Nov 2011 10:27:58 AM CST, bcs wrote:
 On 11/26/2011 04:19 PM, Brad Anderson wrote:

 How about putting a disclaimer on the module warning the code hasn't
 been through a rigorous security audit and point them at well
 established C libraries if they need that sort of assurance.

 What does that gain over implementing the first itteration in terms of
 well established C libraries and then replacing that with native
 implementations as the code goes been through a rigorous security audit?

 Or how about do both as API compatible implementations? That would
 work for people who need the proven security and people who can't
 afford external dependencies as well as allow them to be swapped out
 for each other with minimal effort once the native code is proven.


I do like this idea.
swap implementations by simply swapping import and linking?
nice.


Re: Early std.crypto

2011-11-27 Thread Brad Anderson
On Sun, Nov 27, 2011 at 9:27 AM, bcs b...@example.com wrote:

 On 11/26/2011 04:19 PM, Brad Anderson wrote:


 How about putting a disclaimer on the module warning the code hasn't
 been through a rigorous security audit and point them at well
 established C libraries if they need that sort of assurance.


 What does that gain over implementing the first itteration in terms of
 well established C libraries and then replacing that with native
 implementations as the code goes been through a rigorous security audit?

 Or how about do both as API compatible implementations? That would work
 for people who need the proven security and people who can't afford
 external dependencies as well as allow them to be swapped out for each
 other with minimal effort once the native code is proven.


That's even better but isn't the issue over bundling incompatibly licensed
libraries with phobos?  Nothing is stopping someone from writing bindings
for these libraries as some random library on D Source or Github already.
 An agreed upon API would be very nice in any case.


Re: Early std.crypto

2011-11-27 Thread Piotr Szturmaj

Jude Young wrote:

On Sun 27 Nov 2011 10:27:58 AM CST, bcs wrote:

On 11/26/2011 04:19 PM, Brad Anderson wrote:


How about putting a disclaimer on the module warning the code hasn't
been through a rigorous security audit and point them at well
established C libraries if they need that sort of assurance.


What does that gain over implementing the first itteration in terms of
well established C libraries and then replacing that with native
implementations as the code goes been through a rigorous security audit?

Or how about do both as API compatible implementations? That would
work for people who need the proven security and people who can't
afford external dependencies as well as allow them to be swapped out
for each other with minimal effort once the native code is proven.



I do like this idea.
swap implementations by simply swapping import and linking?
nice.


This was my goal... to write native implementation along with OpenSSL 
wrapper and add 'useOpenSSL' version identifier. Would that satisfy 
everyone?


Re: Early std.crypto

2011-11-27 Thread bcs

On 11/27/2011 12:14 PM, Brad Anderson wrote:


That's even better but isn't the issue over bundling incompatibly
licensed libraries with phobos?  Nothing is stopping someone from
writing bindings for these libraries as some random library on D Source
or Github already.


If we can't find something with a licence allowing bundling then we just 
include the D language bits (including bindings) and note that along 
with where to get the lib.



An agreed upon API would be very nice in any case.


That is necessary (or at least very desirable) in any case as it would 
allow swapping one cipher for another just as easily as it would allow 
swapping one implementation for another.


Re: Early std.crypto

2011-11-27 Thread bcs

On 11/27/2011 12:15 PM, Piotr Szturmaj wrote:

Jude Young wrote:

On Sun 27 Nov 2011 10:27:58 AM CST, bcs wrote:

On 11/26/2011 04:19 PM, Brad Anderson wrote:


How about putting a disclaimer on the module warning the code hasn't
been through a rigorous security audit and point them at well
established C libraries if they need that sort of assurance.


What does that gain over implementing the first itteration in terms of
well established C libraries and then replacing that with native
implementations as the code goes been through a rigorous security audit?

Or how about do both as API compatible implementations? That would
work for people who need the proven security and people who can't
afford external dependencies as well as allow them to be swapped out
for each other with minimal effort once the native code is proven.



I do like this idea.
swap implementations by simply swapping import and linking?
nice.


This was my goal... to write native implementation along with OpenSSL
wrapper and add 'useOpenSSL' version identifier. Would that satisfy
everyone?


Yes, though I'd prefer to see them distinct and non-mutually exclusive. 
For one things, someone may well consider the native implementation of 
one primitive vetted before they consider another to be. Both results 
could be had by the suitable application of aliases.


Re: Early std.crypto

2011-11-26 Thread Brad Anderson
On Fri, Nov 25, 2011 at 10:31 PM, bcs b...@example.com wrote:

 On 11/22/2011 08:16 AM, Piotr Szturmaj wrote:

 bcs wrote:


  How many people in the D community have the experience and know-how to
 review the security of an implementation? If there are less than 2 or 3
 people who can do that, can we afford to include native kernels? We
 can't have just one and if we have only two and one leaves for some
 reason the code becomes un-maintainable for lack of a reviewer. *I*
 wouldn't be comfortable with less than about 4-5.


 I know Regan Heath who wrote some crypto code for Tango. Also, I suspect
 that D _will_ gain more (crypto) contributors, especially after joining
 GCC.


 Wrote some crypto code is a rather weak recommendation. Depending on how
 you interpret it, that would recommend *me*. A better recommendation would
 be Mr Y gets paid by security company X to do crypto analysis or Mrs Z
 has published several well review papers on vulnerabilities in this kind of
 code.


  Minimum number of contributors/reviewers requirement in open-source
 project is at least unfortunate in my opinion. Nevertheless, I respect
 your thoughts. But imagine what could happen if Walter waited for
 contributors instead of starting D project on his own?

 Please realize that we do not implement every possible crypto algorithm
 at once. We need to start with something like hashing, then add
 encryption and other cryptographic primitives.


 I have no problem with that comment. My concern revolves around the point
 that the implementation of cryptographic primitives has security
 implications. I'm worried that we don't have the resources to demonstrate
 that our implementation is at least as good as the currently available
 implementation. I'd rather Phobos not include a given primitive than
 contain one of unknown quality. What I'd like to see is that the crypto
 package quickly contain interfaces for all the primitives we can find
 pre-vetted Boost licensed implementations for. At that point I would have
 no issue with as methodical and meticulous effort to divest ourselves of
 external dependencies as we can get access to the expertises needed to vet
 our own implementations (to the same level as the code they are replacing).

 Yes, I'm intentionally being paranoid here but this is security and
 paranoia is part of the job description darn-it.


How about putting a disclaimer on the module warning the code hasn't been
through a rigorous security audit and point them at well established C
libraries if they need that sort of assurance.


Re: Early std.crypto

2011-11-25 Thread bcs

On 11/22/2011 08:16 AM, Piotr Szturmaj wrote:

bcs wrote:



How many people in the D community have the experience and know-how to
review the security of an implementation? If there are less than 2 or 3
people who can do that, can we afford to include native kernels? We
can't have just one and if we have only two and one leaves for some
reason the code becomes un-maintainable for lack of a reviewer. *I*
wouldn't be comfortable with less than about 4-5.


I know Regan Heath who wrote some crypto code for Tango. Also, I suspect
that D _will_ gain more (crypto) contributors, especially after joining
GCC.


Wrote some crypto code is a rather weak recommendation. Depending on 
how you interpret it, that would recommend *me*. A better recommendation 
would be Mr Y gets paid by security company X to do crypto analysis or 
Mrs Z has published several well review papers on vulnerabilities in 
this kind of code.



Minimum number of contributors/reviewers requirement in open-source
project is at least unfortunate in my opinion. Nevertheless, I respect
your thoughts. But imagine what could happen if Walter waited for
contributors instead of starting D project on his own?

Please realize that we do not implement every possible crypto algorithm
at once. We need to start with something like hashing, then add
encryption and other cryptographic primitives.


I have no problem with that comment. My concern revolves around the 
point that the implementation of cryptographic primitives has security 
implications. I'm worried that we don't have the resources to 
demonstrate that our implementation is at least as good as the currently 
available implementation. I'd rather Phobos not include a given 
primitive than contain one of unknown quality. What I'd like to see is 
that the crypto package quickly contain interfaces for all the 
primitives we can find pre-vetted Boost licensed implementations for. At 
that point I would have no issue with as methodical and meticulous 
effort to divest ourselves of external dependencies as we can get access 
to the expertises needed to vet our own implementations (to the same 
level as the code they are replacing).


Yes, I'm intentionally being paranoid here but this is security and 
paranoia is part of the job description darn-it.


Re: Early std.crypto

2011-11-22 Thread Piotr Szturmaj

bcs wrote:

On 11/20/2011 08:10 AM, Piotr Szturmaj wrote:

bcs wrote:

On 11/04/2011 04:27 AM, Piotr Szturmaj wrote:

bcs wrote:

Are you re-implementing the function kernels your self or are you
using
an existing implementation? Given what I've heard about things like
side-channel attacks using processing times to recover keys, I'd
rather
not see Phobos use anything written by less than the best expert
available.


Until now, I implemented some hash functions. There are no branching
instructions in their transform() routines, so theoretically processing
time is independent of the function input.


From my very incomplete memory I found the source I was looking for (I
googled for aes interperative dance) here
http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html
Look for Foot-Shooting Prevention Agreement in one of the images
~20-25% of the way down.

tl;dr; It mentions cache-based, timing, and other side channel
attacks. Unless you can explain to me what those are, in painful
detail, I don't think we should trust you to avoid them. Get a good
vetted C implementation and wrap it with a nice D API and call it a day.

 

Sorry for late answer.


Np, but I still have a number of concerns:

What is the advantage to implementing the kernels of any of these
functions in D? Will the code be faster? Smaller? More secure? More
maintainable? (On the other hand, the value of doing the API code in D
goes with no debate.)


The first advantage is that Phobos will be independent of any crypto 
libraries. The second one is that there will be no licensing issues. All 
crypto code will be under Boost license like the rest of Phobos.



How many people in the D community have the experience and know-how to
review the security of an implementation? If there are less than 2 or 3
people who can do that, can we afford to include native kernels? We
can't have just one and if we have only two and one leaves for some
reason the code becomes un-maintainable for lack of a reviewer. *I*
wouldn't be comfortable with less than about 4-5.


I know Regan Heath who wrote some crypto code for Tango. Also, I suspect 
that D _will_ gain more (crypto) contributors, especially after joining GCC.


Minimum number of contributors/reviewers requirement in open-source 
project is at least unfortunate in my opinion. Nevertheless, I respect 
your thoughts. But imagine what could happen if Walter waited for 
contributors instead of starting D project on his own?


Please realize that we do not implement every possible crypto algorithm 
at once. We need to start with something like hashing, then add 
encryption and other cryptographic primitives.


Re: Early std.crypto

2011-11-22 Thread Regan Heath

On Tue, 22 Nov 2011 16:16:21 -, Piotr Szturmaj bncr...@jadamspam.pl
wrote:

bcs wrote:

On 11/20/2011 08:10 AM, Piotr Szturmaj wrote:

bcs wrote:

On 11/04/2011 04:27 AM, Piotr Szturmaj wrote:

bcs wrote:

Are you re-implementing the function kernels your self or are you
using
an existing implementation? Given what I've heard about things like
side-channel attacks using processing times to recover keys, I'd
rather
not see Phobos use anything written by less than the best expert
available.


Until now, I implemented some hash functions. There are no branching
instructions in their transform() routines, so theoretically  
processing

time is independent of the function input.


From my very incomplete memory I found the source I was looking for (I
googled for aes interperative dance) here
http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html
Look for Foot-Shooting Prevention Agreement in one of the images
~20-25% of the way down.

tl;dr; It mentions cache-based, timing, and other side channel
attacks. Unless you can explain to me what those are, in painful
detail, I don't think we should trust you to avoid them. Get a good
vetted C implementation and wrap it with a nice D API and call it a  
day.

 

Sorry for late answer.


Np, but I still have a number of concerns:

What is the advantage to implementing the kernels of any of these
functions in D? Will the code be faster? Smaller? More secure? More
maintainable? (On the other hand, the value of doing the API code in D
goes with no debate.)


The first advantage is that Phobos will be independent of any crypto  
libraries. The second one is that there will be no licensing issues. All  
crypto code will be under Boost license like the rest of Phobos.


Ultimately I think it comes down to the question of whether we want/expect  
to have native D implementations of things, or whether certain  
projects/libraries will always be too large to maintain in D.  The answer  
to that, comes down to whether we think D will gain a sufficient user base  
to include enough people able to produce and maintain those libraries, or  
.. whether D will gain sufficient importance that existing developers in  
those libraries produce D bindings themselves.


Under the assumption that D will gain sufficient traction it makes sense  
to start implementing what we can in D, now.  At the same time, in order  
to gain that traction D needs bindings to existing libraries.  The latter  
is probably a slightly higher priority at present, but it's not a reason  
not to develop native D implementations at the same time.


If we're lucky having initially incomplete native D implementations will  
actually encourage people with those skills to contribute to D, as there  
is a certain sort of satisfaction in doing so.



How many people in the D community have the experience and know-how to
review the security of an implementation? If there are less than 2 or 3
people who can do that, can we afford to include native kernels? We
can't have just one and if we have only two and one leaves for some
reason the code becomes un-maintainable for lack of a reviewer. *I*
wouldn't be comfortable with less than about 4-5.


I know Regan Heath who wrote some crypto code for Tango. Also, I suspect  
that D _will_ gain more (crypto) contributors, especially after joining  
GCC.


I wrote those as a crypto novice and haven't had any cause to advance
since then.  There are some obvious things to watch out for, for example
the hashing routines should not make copies of the input data, or if they
do they should 'scrub' that memory clean afterwards.  But, that's the
limit of my knowledge, there are bound to be more advanced problems and
solutions that I'm simply not aware of.

Regan

--
Using Opera's revolutionary email client: http://www.opera.com/mail/


Re: Early std.crypto

2011-11-21 Thread bcs

On 11/20/2011 08:10 AM, Piotr Szturmaj wrote:

bcs wrote:

On 11/04/2011 04:27 AM, Piotr Szturmaj wrote:

bcs wrote:

Are you re-implementing the function kernels your self or are you using
an existing implementation? Given what I've heard about things like
side-channel attacks using processing times to recover keys, I'd rather
not see Phobos use anything written by less than the best expert
available.


Until now, I implemented some hash functions. There are no branching
instructions in their transform() routines, so theoretically processing
time is independent of the function input.


From my very incomplete memory I found the source I was looking for (I
googled for aes interperative dance) here
http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html
Look for Foot-Shooting Prevention Agreement in one of the images
~20-25% of the way down.

tl;dr; It mentions cache-based, timing, and other side channel
attacks. Unless you can explain to me what those are, in painful
detail, I don't think we should trust you to avoid them. Get a good
vetted C implementation and wrap it with a nice D API and call it a day.



Sorry for late answer.


Np, but I still have a number of concerns:

What is the advantage to implementing the kernels of any of these 
functions in D? Will the code be faster? Smaller? More secure? More 
maintainable? (On the other hand, the value of doing the API code in D 
goes with no debate.)


How many people in the D community have the experience and know-how to 
review the security of an implementation? If there are less than 2 or 3 
people who can do that, can we afford to include native kernels? We 
can't have just one and if we have only two and one leaves for some 
reason the code becomes un-maintainable for lack of a reviewer. *I* 
wouldn't be comfortable with less than about 4-5.




I didn't implement AES yet, but it can be implemented without lookup
tables (countermeasure to cache-timing attacks).

Branch-free code without secret-data dependent memory accesses is free
from cache-timing vurnelabilities on most architectures. The remaining
are architectures where instruction execution times (cycles) may depend
on data (f.i. multiply by 0 may take 1 cycle, and 2 cycles when
multiplying by other numbers). I don't know any concrete examples but I
realize such cycle-varying instructions _may_ exist. Of course such
instructions must be avoided in secure code.

The most hard to avoid side channel attack is Power Analysis. On almost
all CPUs instruction power drain depends on the operands. Again
multiplying by 0 consume less power than multiplying by other numbers
and that may be distinguished on the oscilloscope. There are some
countermeasures including operand blinding, but not all crypto
algorithms support that.

I only implemented some hashing functions: MD5, SHA1/224/256/384/512 and
Tiger1/2. MD5 and SHA have code that is branch-free and does not use any
data dependent lookups so their execution time should be constant,
making timing attacks harmless.

Unfortunately Tiger function does lookups based on the source bytes and
is vurnelable to cache-timing attacks. This may be considered insecure
and removed (openssl does not support it either) as making it secure may
be not worth it.

The only problem that remain is the power analysis which require
physical access to the computer. We need to ask ourselfs to what degree
we must secure our code. I'm going to argue that no single
implementation is secure to power analysis on all architectures. I think
we must stick to cache/branch timing level and forget about power
analysis as its scope is beyond D's specification. We simply can't avoid
most power analysis attacks because most of the countermeasures belong
to the hardware instead of the software level.




Re: Early std.crypto

2011-11-20 Thread Piotr Szturmaj

bcs wrote:

On 11/04/2011 04:27 AM, Piotr Szturmaj wrote:

bcs wrote:

Are you re-implementing the function kernels your self or are you using
an existing implementation? Given what I've heard about things like
side-channel attacks using processing times to recover keys, I'd rather
not see Phobos use anything written by less than the best expert
available.


Until now, I implemented some hash functions. There are no branching
instructions in their transform() routines, so theoretically processing
time is independent of the function input.


 From my very incomplete memory I found the source I was looking for (I
googled for aes interperative dance) here
http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html
Look for Foot-Shooting Prevention Agreement in one of the images
~20-25% of the way down.

tl;dr; It mentions cache-based, timing, and other side channel
attacks. Unless you can explain to me what those are, in painful
detail, I don't think we should trust you to avoid them. Get a good
vetted C implementation and wrap it with a nice D API and call it a day.


Sorry for late answer.

I didn't implement AES yet, but it can be implemented without lookup 
tables (countermeasure to cache-timing attacks).


Branch-free code without secret-data dependent memory accesses is free 
from cache-timing vurnelabilities on most architectures. The remaining 
are architectures where instruction execution times (cycles) may depend 
on data (f.i. multiply by 0 may take 1 cycle, and 2 cycles when 
multiplying by other numbers). I don't know any concrete examples but I 
realize such cycle-varying instructions _may_ exist. Of course such 
instructions must be avoided in secure code.


The most hard to avoid side channel attack is Power Analysis. On almost 
all CPUs instruction power drain depends on the operands. Again 
multiplying by 0 consume less power than multiplying by other numbers 
and that may be distinguished on the oscilloscope. There are some 
countermeasures including operand blinding, but not all crypto 
algorithms support that.


I only implemented some hashing functions: MD5, SHA1/224/256/384/512 and 
Tiger1/2. MD5 and SHA have code that is branch-free and does not use any 
data dependent lookups so their execution time should be constant, 
making timing attacks harmless.


Unfortunately Tiger function does lookups based on the source bytes and 
is vurnelable to cache-timing attacks. This may be considered insecure 
and removed (openssl does not support it either) as making it secure may 
be not worth it.


The only problem that remain is the power analysis which require 
physical access to the computer. We need to ask ourselfs to what degree 
we must secure our code. I'm going to argue that no single 
implementation is secure to power analysis on all architectures. I think 
we must stick to cache/branch timing level and forget about power 
analysis as its scope is beyond D's specification. We simply can't avoid 
most power analysis attacks because most of the countermeasures belong 
to the hardware instead of the software level.


Re: Early std.crypto

2011-11-04 Thread Piotr Szturmaj

bcs wrote:

Are you re-implementing the function kernels your self or are you using
an existing implementation? Given what I've heard about things like
side-channel attacks using processing times to recover keys, I'd rather
not see Phobos use anything written by less than the best expert available.


Until now, I implemented some hash functions. There are no branching 
instructions in their transform() routines, so theoretically processing 
time is independent of the function input.


Re: Early std.crypto

2011-11-04 Thread bcs

On 11/04/2011 04:27 AM, Piotr Szturmaj wrote:

bcs wrote:

Are you re-implementing the function kernels your self or are you using
an existing implementation? Given what I've heard about things like
side-channel attacks using processing times to recover keys, I'd rather
not see Phobos use anything written by less than the best expert
available.


Until now, I implemented some hash functions. There are no branching
instructions in their transform() routines, so theoretically processing
time is independent of the function input.


From my very incomplete memory I found the source I was looking for (I 
googled for aes interperative dance) here 
http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html 
Look for Foot-Shooting Prevention Agreement in one of the images 
~20-25% of the way down.


tl;dr; It mentions cache-based, timing, and other side channel 
attacks. Unless you can explain to me what those are, in painful 
detail, I don't think we should trust you to avoid them. Get a good 
vetted C implementation and wrap it with a nice D API and call it a day.


Re: Early std.crypto

2011-11-04 Thread Walter Bright

On 11/4/2011 7:52 PM, bcs wrote:

tl;dr; It mentions cache-based, timing, and other side channel attacks. Unless
you can explain to me what those are, in painful detail, I don't think we should
trust you to avoid them. Get a good vetted C implementation and wrap it with a
nice D API and call it a day.



You've got a good point. While I'd like to see native D implementations, crypto 
security is such a big issue we'd probably be better off with your suggestion.


Re: Early std.crypto

2011-10-28 Thread bcs

On 10/24/2011 05:10 PM, Piotr Szturmaj wrote:

https://github.com/pszturmaj/phobos/tree/master/std/crypto

This is some early work on std.crypto proposal. Currently only MD5, HMAC
and all SHA family functions (excluding SHA0 which is very old, broken
and no longer in use). I plan to add other crypto primitives later.

I know about one SHA1 pull request optimized for SSSE3. I think native
code must be there to support other non x86 CPUs and SIMD optimization
may be added at any time later.

Any opinions are welcome. Especially if such design is good or bad, and
what needs to be changed.

Thanks :)


Are you re-implementing the function kernels your self or are you using 
an existing implementation? Given what I've heard about things like 
side-channel attacks using processing times to recover keys, I'd rather 
not see Phobos use anything written by less than the best expert available.


Re: Early std.crypto

2011-10-26 Thread Steve Teale
 
 An easy test is that if the interface takes a T[] as input, consider a
 range instead. Ditto for output. If an interface takes a File as input,
 it's a red flag that something is wrong.

I have a question about ranges that occurred to me when I was composing a 
MySQL result set into a random access range.

To do that I have to provide the save capability defined by

R r1;
R r2 = r1.save;

Would it harm the use of ranges as elements of operation sequences if the 
type of the entity that got saved was not the same as the original range 
provider type. Something like:

R r1;
S r2 = r1.save;
r1.restore(r2);

For entities with a good deal of state, that implement a range, storing 
the whole state, and more significantly, restoring it, may be non-
trivial, but saving and restoring the state of the range may be quite a 
lightweight affair.

I'm also puzzled by the semantics of the random access range.

If I restrict myself to indexing into the range, then things are fine.

auto x = r[0];
x = r[10];
x = r[0];

But if I slip in a few popFront's, then presumably x = r[0] will give me 
a different result.

This just makes me slightly uneasy.

Steve


Re: Early std.crypto

2011-10-26 Thread Jonathan M Davis
On Wednesday, October 26, 2011 06:28:52 Steve Teale wrote:
  An easy test is that if the interface takes a T[] as input, consider a
  range instead. Ditto for output. If an interface takes a File as input,
  it's a red flag that something is wrong.
 
 I have a question about ranges that occurred to me when I was composing a
 MySQL result set into a random access range.
 
 To do that I have to provide the save capability defined by
 
 R r1;
 R r2 = r1.save;
 
 Would it harm the use of ranges as elements of operation sequences if the
 type of the entity that got saved was not the same as the original range
 provider type. Something like:
 
 R r1;
 S r2 = r1.save;
 r1.restore(r2);
 
 For entities with a good deal of state, that implement a range, storing
 the whole state, and more significantly, restoring it, may be non-
 trivial, but saving and restoring the state of the range may be quite a
 lightweight affair.

It's likely to cause issues if the type returned by save differs from the 
original type. From your description, it sounds like the range is over 
something, and it's that something which you don't want to copy. If that's the 
case, then the range just doesn't contain that data. Rather, it's a view into 
the data, so saving it just save that view.

The fact that arrays are currently the most used example of ranges definitely 
makes them more confusing IMHO, since people often think of ranges as 
containers, whereas ranges aren't. And really, dynamic arrays in D _aren't_ 
containers. No dynamic array actually owns its elements. The runtime owns it, 
and dynamic arrays are just ranges over that data.

It's easier to understand when you think about a container, such as vector 
type (e.g. std.container.Array) or a linked list. The container holds the 
data. The container is _not_ a range. A range is a view of that data. So, 
popping elements off of the range has no effect on the original container - the 
same goes with many range-based functions. As long as the elements in the 
range aren't mutated or rearranged, the elements in the container are 
unaltered. So, calling save on a range just gives you a copy of that view of 
the container, allowing you to alter the range and still have that original 
view of the container.

 I'm also puzzled by the semantics of the random access range.
 
 If I restrict myself to indexing into the range, then things are fine.
 
 auto x = r[0];
 x = r[10];
 x = r[0];
 
 But if I slip in a few popFront's, then presumably x = r[0] will give me
 a different result.
 
 This just makes me slightly uneasy.

r[0] _is_ r.front, so of course popFront would change what r[0] is. You're 
consuming the range as you pop elements off of its front. So, if you do 
_anything_ which alters a range (rather than its elements), at least some of 
(if not all of) the indices in a random access range change. The reason that 
this is making you uneasy is likely at least partially because of the fact 
that you're probably thinking about arrays and how they're used rather than 
thinking more abstractly about ranges. Arrays are really a bad example of 
ranges, in spite of the fact that they are unfortunately, our most common 
example of ranges at this point.

- Jonathan M Davis


Re: Early std.crypto

2011-10-26 Thread Dmitry Olshansky

On 26.10.2011 10:28, Steve Teale wrote:


An easy test is that if the interface takes a T[] as input, consider a
range instead. Ditto for output. If an interface takes a File as input,
it's a red flag that something is wrong.


I have a question about ranges that occurred to me when I was composing a
MySQL result set into a random access range.

To do that I have to provide the save capability defined by

R r1;
R r2 = r1.save;

Would it harm the use of ranges as elements of operation sequences if the
type of the entity that got saved was not the same as the original range
provider type. Something like:

R r1;
S r2 = r1.save;
r1.restore(r2);


Yay! Range with 'restore' strikes again.
Seriously, last time something about range was discussed I noticed this 
very same thing - full copy is costly, there are ranges that can restore 
their state using some special tiny savepoint object.
Though currently this kind of thing is not supported nor expected, plus 
there are cases where copy needs to be made.




For entities with a good deal of state, that implement a range, storing
the whole state, and more significantly, restoring it, may be non-
trivial, but saving and restoring the state of the range may be quite a
lightweight affair.

I'm also puzzled by the semantics of the random access range.

If I restrict myself to indexing into the range, then things are fine.

auto x = r[0];
x = r[10];
x = r[0];

But if I slip in a few popFront's, then presumably x = r[0] will give me
a different result.

This just makes me slightly uneasy.

Steve



--
Dmitry Olshansky


Re: Early std.crypto

2011-10-26 Thread Steve Teale
On Tue, 25 Oct 2011 23:45:58 -0700, Jonathan M Davis wrote:

 On Wednesday, October 26, 2011 06:28:52 Steve Teale wrote:
  An easy test is that if the interface takes a T[] as input, consider
  a range instead. Ditto for output. If an interface takes a File as
  input, it's a red flag that something is wrong.
 
 I have a question about ranges that occurred to me when I was composing
 a MySQL result set into a random access range.
 
 To do that I have to provide the save capability defined by
 
 R r1;
 R r2 = r1.save;
 
 Would it harm the use of ranges as elements of operation sequences if
 the type of the entity that got saved was not the same as the original
 range provider type. Something like:
 
 R r1;
 S r2 = r1.save;
 r1.restore(r2);
 
 For entities with a good deal of state, that implement a range, storing
 the whole state, and more significantly, restoring it, may be non-
 trivial, but saving and restoring the state of the range may be quite a
 lightweight affair.
 
 It's likely to cause issues if the type returned by save differs from
 the original type. From your description, it sounds like the range is
 over something, and it's that something which you don't want to copy. If
 that's the case, then the range just doesn't contain that data. Rather,
 it's a view into the data, so saving it just save that view.

Well, yes Jonathan, that's just what I'm getting at. I want to save just 
those things that are pertinent to the range/view, not the whole panoply 
of the result set representation or whatever other object is providing 
the data that the range is a view of.

I could do it by saving an instance of the object representing the result 
set containing only the relevant data, but that would be misleading for 
the user, because it would not match the object it was cloned from in any 
respect other than representing a range.

Should not the requirement for the thing provided by save be only that it 
should provide the same view as that provided by the source range at the 
time of the save, and the same range 'interface'.

Is there an accepted term for the way ranges are defined, i.e. as an 
entity that satisfies some template evaluating roughly to a bool?

Steve

Steve


Re: Early std.crypto

2011-10-26 Thread Jonathan M Davis
On Wednesday, October 26, 2011 03:17 Steve Teale wrote:
 On Tue, 25 Oct 2011 23:45:58 -0700, Jonathan M Davis wrote:
  On Wednesday, October 26, 2011 06:28:52 Steve Teale wrote:
   An easy test is that if the interface takes a T[] as input, consider
   a range instead. Ditto for output. If an interface takes a File as
   input, it's a red flag that something is wrong.
  
  I have a question about ranges that occurred to me when I was composing
  a MySQL result set into a random access range.
  
  To do that I have to provide the save capability defined by
  
  R r1;
  R r2 = r1.save;
  
  Would it harm the use of ranges as elements of operation sequences if
  the type of the entity that got saved was not the same as the original
  range provider type. Something like:
  
  R r1;
  S r2 = r1.save;
  r1.restore(r2);
  
  For entities with a good deal of state, that implement a range, storing
  the whole state, and more significantly, restoring it, may be non-
  trivial, but saving and restoring the state of the range may be quite a
  lightweight affair.
  
  It's likely to cause issues if the type returned by save differs from
  the original type. From your description, it sounds like the range is
  over something, and it's that something which you don't want to copy. If
  that's the case, then the range just doesn't contain that data. Rather,
  it's a view into the data, so saving it just save that view.
 
 Well, yes Jonathan, that's just what I'm getting at. I want to save just
 those things that are pertinent to the range/view, not the whole panoply
 of the result set representation or whatever other object is providing
 the data that the range is a view of.
 
 I could do it by saving an instance of the object representing the result
 set containing only the relevant data, but that would be misleading for
 the user, because it would not match the object it was cloned from in any
 respect other than representing a range.
 
 Should not the requirement for the thing provided by save be only that it
 should provide the same view as that provided by the source range at the
 time of the save, and the same range 'interface'.
 
 Is there an accepted term for the way ranges are defined, i.e. as an
 entity that satisfies some template evaluating roughly to a bool?

Ranges are defined per templates in std.range. isForwardRange, isInputRange, 
isRandomAccessRange, etc. And it looks like isForwardRange specifically checks 
that the return type of save is the same type as the range itself. I was 
thinking that it didn't necessarily require that but that you'd have definite 
issues having save return a different type, because it's generally assumed 
that it's the same type and code will reflect that. However, it looks like 
it's actually required.

What you should probably do is have the result set be an object holding the 
data but not be a range itself and then overload opSlice to give you a range 
over that data (as would be done with a container). Then the range isn't in a 
position where it's trying to own the data, and it's clear to the programmer 
where the data is stored.

- Jonathan M Davis


Re: Early std.crypto

2011-10-26 Thread Steve Teale
On Wed, 26 Oct 2011 13:01:14 -0400, Jonathan M Davis wrote:

 On Wednesday, October 26, 2011 03:17 Steve Teale wrote:
 On Tue, 25 Oct 2011 23:45:58 -0700, Jonathan M Davis wrote:
  On Wednesday, October 26, 2011 06:28:52 Steve Teale wrote:
   An easy test is that if the interface takes a T[] as input,
   consider a range instead. Ditto for output. If an interface takes
   a File as input, it's a red flag that something is wrong.
  
  I have a question about ranges that occurred to me when I was
  composing a MySQL result set into a random access range.
  
  To do that I have to provide the save capability defined by
  
  R r1;
  R r2 = r1.save;
  
  Would it harm the use of ranges as elements of operation sequences
  if the type of the entity that got saved was not the same as the
  original range provider type. Something like:
  
  R r1;
  S r2 = r1.save;
  r1.restore(r2);
  
  For entities with a good deal of state, that implement a range,
  storing the whole state, and more significantly, restoring it, may
  be non- trivial, but saving and restoring the state of the range may
  be quite a lightweight affair.
  
  It's likely to cause issues if the type returned by save differs from
  the original type. From your description, it sounds like the range is
  over something, and it's that something which you don't want to copy.
  If that's the case, then the range just doesn't contain that data.
  Rather, it's a view into the data, so saving it just save that view.
 
 Well, yes Jonathan, that's just what I'm getting at. I want to save
 just those things that are pertinent to the range/view, not the whole
 panoply of the result set representation or whatever other object is
 providing the data that the range is a view of.
 
 I could do it by saving an instance of the object representing the
 result set containing only the relevant data, but that would be
 misleading for the user, because it would not match the object it was
 cloned from in any respect other than representing a range.
 
 Should not the requirement for the thing provided by save be only that
 it should provide the same view as that provided by the source range at
 the time of the save, and the same range 'interface'.
 
 Is there an accepted term for the way ranges are defined, i.e. as an
 entity that satisfies some template evaluating roughly to a bool?
 
 Ranges are defined per templates in std.range. isForwardRange,
 isInputRange, isRandomAccessRange, etc. And it looks like isForwardRange
 specifically checks that the return type of save is the same type as the
 range itself. I was thinking that it didn't necessarily require that but
 that you'd have definite issues having save return a different type,
 because it's generally assumed that it's the same type and code will
 reflect that. However, it looks like it's actually required.

But is it just required by std.range, or is it required at some 
theoretical level by algorithms that can work with ranges as inputs and 
outputs?

 What you should probably do is have the result set be an object holding
 the data but not be a range itself and then overload opSlice to give you
 a range over that data (as would be done with a container). Then the
 range isn't in a position where it's trying to own the data, and it's
 clear to the programmer where the data is stored.
 

But in my mind, just from a quick reading of the language definition, 
slices are closely tied to arrays, which as you have already noted, are 
not the best example of a range.

Are ranges just a library artefact, or are they supported by the 
language. They appear to be recognized by foreach, so if they are 
recognized by D, then we should presumably have operator functions 
opInputRange, opForwardRange, and so on, whatever those terms might mean.

If not, then facilities like std.algorithm should have a warning notice 
like cigarettes that says this facility may not be usable with POD.

Steve




Re: Early std.crypto

2011-10-26 Thread Jonathan M Davis
On Wednesday, October 26, 2011 19:00:02 Steve Teale wrote:
 On Wed, 26 Oct 2011 13:01:14 -0400, Jonathan M Davis wrote:
  Ranges are defined per templates in std.range. isForwardRange,
  isInputRange, isRandomAccessRange, etc. And it looks like isForwardRange
  specifically checks that the return type of save is the same type as the
  range itself. I was thinking that it didn't necessarily require that but
  that you'd have definite issues having save return a different type,
  because it's generally assumed that it's the same type and code will
  reflect that. However, it looks like it's actually required.
 
 But is it just required by std.range, or is it required at some
 theoretical level by algorithms that can work with ranges as inputs and
 outputs?

It complicates things to allow for the range returned by save not be the same 
type as the original range. It's probably possible to alter isForwardRange to 
allow save to return a different type, but I don't know what all of the side 
effects of such a decision would be. I'm certain however that it would break a 
fair bit of code.

  What you should probably do is have the result set be an object holding
  the data but not be a range itself and then overload opSlice to give you
  a range over that data (as would be done with a container). Then the
  range isn't in a position where it's trying to own the data, and it's
  clear to the programmer where the data is stored.
 
 But in my mind, just from a quick reading of the language definition,
 slices are closely tied to arrays, which as you have already noted, are
 not the best example of a range.
 
 Are ranges just a library artefact, or are they supported by the
 language. They appear to be recognized by foreach, so if they are
 recognized by D, then we should presumably have operator functions
 opInputRange, opForwardRange, and so on, whatever those terms might mean.

foreach supports the API of an input range: front, empty, and popFront. 
There's no need for overloaded operators of any kind. If you want to overload 
an operator for foreach, then use opApply. foreach is the _only_ place in the 
language that specfically supports ranges.

Arrays are a poor example of ranges because they don't have an obviously 
associated container. A dynamic array is a range over a block of memory that 
the runtime maintains. With a full-blown container such as Array or 
RedBlackTree, it's the container that holds the data, and a range for one of 
them is a range over that container. People tend to think of arrays as being 
containers rather than just slices, so it becomes confusing to people.

It's much easier to properly understand ranges if you think of them as being 
associated with a container like an iterator would be in C++. They don't 
normally own the elements that they're iterating over. When they do, you get 
situations such as input ranges where you can't save them (an input stream 
being a prime example).

 If not, then facilities like std.algorithm should have a warning notice
 like cigarettes that says this facility may not be usable with POD.

I'm afraid that I don't understand why a warning would be necessary. std.range 
lists the functions that each type of range must implement and it has 
templates for testing whether a particular type implements those functions for 
a given sort of range. I suspect that you're misunderstanding something 
fundamental about ranges.

If you haven't read this article yet - 
http://www.informit.com/articles/printerfriendly.aspx?p=1407357 - I suggest 
that you do. It doesn't entirely match what std.range does (at least as far as 
naming goes) but it does explain the core concepts fairly well and as far as I 
know is the best explanation on ranges that currently exists.

- Jonathan M Davis


Re: Early std.crypto

2011-10-25 Thread Piotr Szturmaj

Martin Nowak wrote:

I have to say though that I like the current struct based interface
much better.

struct Hash
{
// enhanced by some compile time traits
enum hashLength = 16;
enum blockLength = 0;


The reason why hash and block length are runtime variables is that some 
hash functions are parametrized with variables of great amplitude, for 
example CubeHash may have any number of rounds, and any size of block 
and hash output.



// three interface functions
void start();
void update(const(ubyte)[] data);
void finish(ref ubyte[hashLength] digest);
}


There, it is:

reset();
put();
finish();

The put() function makes hash implementation an OutputRange.


You wouldn't need the save, restore functions.


They're not needed. They only serve as speed optimization when hashing 
many messages which have the same beginning block. This is used in HMAC, 
which is:


HMAC(func, key, message) = func(key ^ opad, func(key ^ ipad, message));

when func supports saving the IV, the first parts are precomputed, when 
not HMAC resorts to full hashing. This optimization is also mentioned in 
HMAC spec.



Some unnecessary allocations could go away.
Most important instances would have less mutable state.


Could you specify which ones, please?


You could probably parameterize a Merkle Damgård base with free
functions for the transformation.


What would be the difference from current class parametrization?


A dynamic interface can be obtaines by templated instances similar to
what std.range does.


Could you elaborate? I don't know exactly what do you mean. Function 
templates?


Thanks a lot!


Re: Early std.crypto

2011-10-25 Thread Martin Nowak
On Tue, 25 Oct 2011 09:43:48 +0200, Piotr Szturmaj bncr...@jadamspam.pl  
wrote:



Martin Nowak wrote:

I have to say though that I like the current struct based interface
much better.

struct Hash
{
// enhanced by some compile time traits
enum hashLength = 16;
enum blockLength = 0;


The reason why hash and block length are runtime variables is that some  
hash functions are parametrized with variables of great amplitude, for  
example CubeHash may have any number of rounds, and any size of block  
and hash output.



// three interface functions
void start();
void update(const(ubyte)[] data);
void finish(ref ubyte[hashLength] digest);
}


There, it is:

reset();
put();
finish();

Reset does two different things depending on the internal state. Not so  
good.



The put() function makes hash implementation an OutputRange.


You wouldn't need the save, restore functions.


They're not needed. They only serve as speed optimization when hashing  
many messages which have the same beginning block. This is used in HMAC,  
which is:


HMAC(func, key, message) = func(key ^ opad, func(key ^ ipad, message));

when func supports saving the IV, the first parts are precomputed, when  
not HMAC resorts to full hashing. This optimization is also mentioned in  
HMAC spec.



If hash contexts were value type you could simply do.
auto saved = hash_ctx;
Or alternatively one could add a 'save()' function to an isSaveableHash(H)
concept.


Some unnecessary allocations could go away.
Most important instances would have less mutable state.


Could you specify which ones, please?

Basically every 'new' in std.hash.crypto.base but especially the ones in  
hash(T) and hashToHex(T).



You could probably parameterize a Merkle Damgård base with free
functions for the transformation.


What would be the difference from current class parametrization?

Just wanted to point out a specific alternative if code reuse is of a  
concern.
If not using classes you need a way to inject the transformation which  
could

be done like.
alias MerkleDamgard!(uint, 5, 80, 16, 20, sha1Transform) SHA1;


A dynamic interface can be obtaines by templated instances similar to
what std.range does.


Could you elaborate? I don't know exactly what do you mean. Function  
templates?



http://www.digitalmars.com/d/2.0/phobos/std_range.html#InputRange
DynamicAllocatorTemplate at  
https://github.com/dsimcha/TempAlloc/blob/master/std/allocators/allocator.d


These are to support cases where you either want a stable ABI or
have a template firewall for scalability issues (e.g. could be sensible  
for the HMAC implementation although not really necessary).



Thanks a lot!


Re: Early std.crypto

2011-10-25 Thread Brad Roberts
On 10/24/2011 5:10 PM, Piotr Szturmaj wrote:
 https://github.com/pszturmaj/phobos/tree/master/std/crypto
 
 This is some early work on std.crypto proposal. Currently only MD5, HMAC and 
 all SHA family functions (excluding SHA0
 which is very old, broken and no longer in use). I plan to add other crypto 
 primitives later.
 
 I know about one SHA1 pull request optimized for SSSE3. I think native code 
 must be there to support other non x86 CPUs
 and SIMD optimization may be added at any time later.
 
 Any opinions are welcome. Especially if such design is good or bad, and what 
 needs to be changed.
 
 Thanks :)

A key element to a lot of crypto code is speed.  I really don't think we want 
to re-invent all the optimizations on all
the platforms.  To that end, I really suggest that we stick to wrapping 
existing implementations, like openssl.  While I
hate the openssl apis, I do respect the continual effort that various companies 
invest in optimizing the code.

My 2 cents,
Brad



Re: Early std.crypto

2011-10-25 Thread Piotr Szturmaj

Brad Roberts wrote:

On 10/24/2011 5:10 PM, Piotr Szturmaj wrote:

https://github.com/pszturmaj/phobos/tree/master/std/crypto

This is some early work on std.crypto proposal. Currently only MD5, HMAC and 
all SHA family functions (excluding SHA0
which is very old, broken and no longer in use). I plan to add other crypto 
primitives later.

I know about one SHA1 pull request optimized for SSSE3. I think native code 
must be there to support other non x86 CPUs
and SIMD optimization may be added at any time later.

Any opinions are welcome. Especially if such design is good or bad, and what 
needs to be changed.

Thanks :)


A key element to a lot of crypto code is speed.  I really don't think we want 
to re-invent all the optimizations on all
the platforms.  To that end, I really suggest that we stick to wrapping 
existing implementations, like openssl.  While I
hate the openssl apis, I do respect the continual effort that various companies 
invest in optimizing the code.


You are of course right about speed but there are some reasons for 
having our own code _if_ we want std.crypto:


1. Phobos independence
2. Non D friendly API of openssl
3. No need to link with openssl to compute a simple hash.
4. Licensing

We also have some options for speed improvement, while retaining our API

1. Wrap openssl _on demand_, this is transparent to the user, API 
doesn't change
2. Many (if not all) openssl asm code may be obtained using CRYPTOGAMS 
license (BSD). But reading http://www.openssl.org/~appro/cryptogams/ 
suggests that author might be willing to license it under Boost.

3. Adapt Crypto++ asm code which is public domain (x86/64 only)

The 1st one should be easy, and user would have choice between openssl 
wrapped within std.crypto or direct access to etc.c.openssl.


Re: Early std.crypto

2011-10-25 Thread Walter Bright

On 10/24/2011 5:10 PM, Piotr Szturmaj wrote:

https://github.com/pszturmaj/phobos/tree/master/std/crypto

This is some early work on std.crypto proposal. Currently only MD5, HMAC and all
SHA family functions (excluding SHA0 which is very old, broken and no longer in
use). I plan to add other crypto primitives later.

I know about one SHA1 pull request optimized for SSSE3. I think native code must
be there to support other non x86 CPUs and SIMD optimization may be added at any
time later.

Any opinions are welcome. Especially if such design is good or bad, and what
needs to be changed.



Thanks for championing this.

The input to the functions should be a range, not an array (although an array is 
a range).


In general, for Phobos, all arbitrary input data should be in the form of 
ranges, and all arbitrary output data should present itself as a range. This 
facilitates the idea of:


range = algorithm = range

So, for example, I want to encrypt and then zip a file and send the output to a 
socket:


file = encrypt = compress = socket

All the components here will just snap together. With the existing design of 
crypto, I'd have to read the file into an array, then pass the array to encrypt, 
etc.


Think of it like the filter concept in Unix that has been so successful.



Re: Early std.crypto

2011-10-25 Thread Piotr Szturmaj

Martin Nowak wrote:

On Tue, 25 Oct 2011 09:43:48 +0200, Piotr Szturmaj
bncr...@jadamspam.pl wrote:


Martin Nowak wrote:

I have to say though that I like the current struct based interface
much better.

struct Hash
{
// enhanced by some compile time traits
enum hashLength = 16;
enum blockLength = 0;


The reason why hash and block length are runtime variables is that
some hash functions are parametrized with variables of great
amplitude, for example CubeHash may have any number of rounds, and any
size of block and hash output.


// three interface functions
void start();
void update(const(ubyte)[] data);
void finish(ref ubyte[hashLength] digest);
}


There, it is:

reset();
put();
finish();


Reset does two different things depending on the internal state. Not so
good.


I think that's negligible, but it may be unbranched easily.


The put() function makes hash implementation an OutputRange.


You wouldn't need the save, restore functions.


They're not needed. They only serve as speed optimization when hashing
many messages which have the same beginning block. This is used in
HMAC, which is:

HMAC(func, key, message) = func(key ^ opad, func(key ^ ipad, message));

when func supports saving the IV, the first parts are precomputed,
when not HMAC resorts to full hashing. This optimization is also
mentioned in HMAC spec.


If hash contexts were value type you could simply do.
auto saved = hash_ctx;
Or alternatively one could add a 'save()' function to an isSaveableHash(H)
concept.


Yes, I thought about that, but current way is faster because only 
initialization vector is saved, and api name emphasises it. General 
save() on SHA512 would need to copy IV and 80 ulongs (internal state).



Some unnecessary allocations could go away.
Most important instances would have less mutable state.


Could you specify which ones, please?


Basically every 'new' in std.hash.crypto.base but especially the ones in
hash(T) and hashToHex(T).


Yes, this will be fixed.


You could probably parameterize a Merkle Damgård base with free
functions for the transformation.


What would be the difference from current class parametrization?


Just wanted to point out a specific alternative if code reuse is of a
concern.
If not using classes you need a way to inject the transformation which
could
be done like.
alias MerkleDamgard!(uint, 5, 80, 16, 20, sha1Transform) SHA1;


Either way this function must be written. Either as free function or as 
class method. I don't think someone would use transformation function 
directly.



A dynamic interface can be obtaines by templated instances similar to
what std.range does.


Could you elaborate? I don't know exactly what do you mean. Function
templates?


http://www.digitalmars.com/d/2.0/phobos/std_range.html#InputRange
DynamicAllocatorTemplate at
https://github.com/dsimcha/TempAlloc/blob/master/std/allocators/allocator.d

These are to support cases where you either want a stable ABI or
have a template firewall for scalability issues (e.g. could be sensible
for the HMAC implementation although not really necessary).



I will look into it. Thanks!


Re: Early std.crypto

2011-10-25 Thread Piotr Szturmaj

Walter Bright wrote:

On 10/24/2011 5:10 PM, Piotr Szturmaj wrote:

https://github.com/pszturmaj/phobos/tree/master/std/crypto

This is some early work on std.crypto proposal. Currently only MD5,
HMAC and all
SHA family functions (excluding SHA0 which is very old, broken and no
longer in
use). I plan to add other crypto primitives later.

I know about one SHA1 pull request optimized for SSSE3. I think native
code must
be there to support other non x86 CPUs and SIMD optimization may be
added at any
time later.

Any opinions are welcome. Especially if such design is good or bad,
and what
needs to be changed.



Thanks for championing this.

The input to the functions should be a range, not an array (although an
array is a range).

In general, for Phobos, all arbitrary input data should be in the form
of ranges, and all arbitrary output data should present itself as a
range. This facilitates the idea of:

range = algorithm = range

So, for example, I want to encrypt and then zip a file and send the
output to a socket:

file = encrypt = compress = socket

All the components here will just snap together. With the existing
design of crypto, I'd have to read the file into an array, then pass the
array to encrypt, etc.

Think of it like the filter concept in Unix that has been so successful.



I share your opinion. I was thinking about such filter concept for 
std.crypto.cipher (TBD), but I will also try to convert current hash 
function code to ranges.


Thanks for pointing that out.


Re: Early std.crypto

2011-10-25 Thread Walter Bright

On 10/25/2011 3:40 PM, Piotr Szturmaj wrote:

I share your opinion. I was thinking about such filter concept for
std.crypto.cipher (TBD), but I will also try to convert current hash function
code to ranges.

Thanks for pointing that out.


Andrei and I have pretty much failed at articulating this vision for Phobos. We 
need to get our act together.




Re: Early std.crypto

2011-10-25 Thread Jonathan M Davis
On Tuesday, October 25, 2011 15:55 Walter Bright wrote:
 On 10/25/2011 3:40 PM, Piotr Szturmaj wrote:
  I share your opinion. I was thinking about such filter concept for
  std.crypto.cipher (TBD), but I will also try to convert current hash
  function code to ranges.
  
  Thanks for pointing that out.
 
 Andrei and I have pretty much failed at articulating this vision for
 Phobos. We need to get our act together.

Ranges in general are not something that is comunicated very well. They're not 
in TDPL, and none of the online documentation discusses them in detail. You 
pretty much only learn them from reading Phobos' documentation and using 
Phobos or from discussing it with people who know about them already. And 
there's no overall plan or design for Phobos articulated _anywhere_ that I'm 
aware of, so between those two facts, there's really nothing to get such a 
vision across to anyone other than word of mouth (though I suppose that it's 
more word of keyboard in many cases).

- Jonathan M Davis


Re: Early std.crypto

2011-10-25 Thread Brad Anderson
On Tue, Oct 25, 2011 at 5:12 PM, Jonathan M Davis jmdavisp...@gmx.comwrote:

 On Tuesday, October 25, 2011 15:55 Walter Bright wrote:
  On 10/25/2011 3:40 PM, Piotr Szturmaj wrote:
   I share your opinion. I was thinking about such filter concept for
   std.crypto.cipher (TBD), but I will also try to convert current hash
   function code to ranges.
  
   Thanks for pointing that out.
 
  Andrei and I have pretty much failed at articulating this vision for
  Phobos. We need to get our act together.

 Ranges in general are not something that is comunicated very well. They're
 not
 in TDPL, and none of the online documentation discusses them in detail. You
 pretty much only learn them from reading Phobos' documentation and using
 Phobos or from discussing it with people who know about them already. And
 there's no overall plan or design for Phobos articulated _anywhere_ that
 I'm
 aware of, so between those two facts, there's really nothing to get such a
 vision across to anyone other than word of mouth (though I suppose that
 it's
 more word of keyboard in many cases).

 - Jonathan M Davis


I learned about them from Andrei's boostcon talk.  Not exactly the first
place you'd look for information about D but the talk was very informative.

Regards,
Brad Anderson


Re: Early std.crypto

2011-10-25 Thread Jonathan M Davis
On Tuesday, October 25, 2011 16:16 Brad Anderson wrote:
 On Tue, Oct 25, 2011 at 5:12 PM, Jonathan M Davis 
jmdavisp...@gmx.comwrote:
  On Tuesday, October 25, 2011 15:55 Walter Bright wrote:
   On 10/25/2011 3:40 PM, Piotr Szturmaj wrote:
I share your opinion. I was thinking about such filter concept for
std.crypto.cipher (TBD), but I will also try to convert current hash
function code to ranges.

Thanks for pointing that out.
   
   Andrei and I have pretty much failed at articulating this vision for
   Phobos. We need to get our act together.
  
  Ranges in general are not something that is comunicated very well.
  They're not
  in TDPL, and none of the online documentation discusses them in detail.
  You pretty much only learn them from reading Phobos' documentation and
  using Phobos or from discussing it with people who know about them
  already. And there's no overall plan or design for Phobos articulated
  _anywhere_ that I'm
  aware of, so between those two facts, there's really nothing to get such
  a vision across to anyone other than word of mouth (though I suppose
  that it's
  more word of keyboard in many cases).
  
  - Jonathan M Davis
 
 I learned about them from Andrei's boostcon talk. Not exactly the first
 place you'd look for information about D but the talk was very informative.

That and an article that Andrei wrote on ranges a while back which was not D-
specific are the only online sources on ranges that I'm aware of, and to my 
knowledge, neither of them are referenced on D's site. Ranges are a powerful 
concept, but they're not well known, in part because they're relatively new - 
particularly when it comes to being used in a major library, let alone the 
standard library for a language. So, they definitely need some explaining.

I was working on article on ranges a while back, but tabled it due to bugs 
related to std.container.Array rendering my examples unworkable. I should 
probably get back to working on that and see if those issues still exist or 
whether I can better work around them in the article. But the fact that 
dynamic arrays are actually a very poor example of ranges in some regards 
(given the fact that people tend to think of them as containers even though 
they really aren't in D) complicates things in a way that I'd really like to 
be able to explain some of the concepts using a real container rather than 
arrays.

- Jonathan M Davis


Re: Early std.crypto

2011-10-25 Thread Andrej Mitrovic
If you want a bit of history:

http://www.digitalmars.com/d/archives/digitalmars/D/announce/RFC_on_range_design_for_D2_12922.html#N12922
http://www.digitalmars.com/d/archives/digitalmars/D/announce/Revised_RFC_on_range_design_for_D2_13211.html

http://web.archive.org/web/20090112000313/http://ssli.ee.washington.edu/~aalexand/d/tmp/std_range.html

That last link (although outdated) is probably more descriptive than
what we have now.


Re: Early std.crypto

2011-10-25 Thread Piotr Szturmaj

Walter Bright wrote:

On 10/25/2011 3:40 PM, Piotr Szturmaj wrote:

I share your opinion. I was thinking about such filter concept for
std.crypto.cipher (TBD), but I will also try to convert current hash
function
code to ranges.

Thanks for pointing that out.


Andrei and I have pretty much failed at articulating this vision for
Phobos. We need to get our act together.


Personally, I learned ranges from Phobos docs/src and Andrei's article 
about them. In the beginning I was surprised that I could not find any 
range howto, yet on this NG some range designs were considered 
obvious. But I finally (somewhat) understood and used them in my 
postgres client implementation. And I like them.


I wish there was some article about ranges and their uses. Some tips and 
tricks thing for users and guidelines for range writers.


But let's get back to std.crypto. I planned to add input ranges to 
function put, because it already forms an output range. I thought that 
std.range.put() would serve the need if UFCS will be supported. But, as 
I guess, the best method is to support input ranges directly.


Re: Early std.crypto

2011-10-25 Thread Walter Bright

On 10/25/2011 4:12 PM, Jonathan M Davis wrote:

Ranges in general are not something that is comunicated very well. They're not
in TDPL, and none of the online documentation discusses them in detail. You
pretty much only learn them from reading Phobos' documentation and using
Phobos or from discussing it with people who know about them already. And
there's no overall plan or design for Phobos articulated _anywhere_ that I'm
aware of, so between those two facts, there's really nothing to get such a
vision across to anyone other than word of mouth (though I suppose that it's
more word of keyboard in many cases).


Yup. Guilty as charged.


Re: Early std.crypto

2011-10-25 Thread Walter Bright

On 10/25/2011 5:22 PM, Piotr Szturmaj wrote:

But let's get back to std.crypto. I planned to add input ranges to function put,
because it already forms an output range. I thought that std.range.put() would
serve the need if UFCS will be supported. But, as I guess, the best method is to
support input ranges directly.


An easy test is that if the interface takes a T[] as input, consider a range 
instead. Ditto for output. If an interface takes a File as input, it's a red 
flag that something is wrong.


Early std.crypto

2011-10-24 Thread Piotr Szturmaj

https://github.com/pszturmaj/phobos/tree/master/std/crypto

This is some early work on std.crypto proposal. Currently only MD5, HMAC 
and all SHA family functions (excluding SHA0 which is very old, broken 
and no longer in use). I plan to add other crypto primitives later.


I know about one SHA1 pull request optimized for SSSE3. I think native 
code must be there to support other non x86 CPUs and SIMD optimization 
may be added at any time later.


Any opinions are welcome. Especially if such design is good or bad, and 
what needs to be changed.


Thanks :)


Re: Early std.crypto

2011-10-24 Thread Martin Nowak
On Tue, 25 Oct 2011 02:10:49 +0200, Piotr Szturmaj bncr...@jadamspam.pl  
wrote:



https://github.com/pszturmaj/phobos/tree/master/std/crypto

This is some early work on std.crypto proposal. Currently only MD5, HMAC  
and all SHA family functions (excluding SHA0 which is very old, broken  
and no longer in use). I plan to add other crypto primitives later.


I know about one SHA1 pull request optimized for SSSE3. I think native  
code must be there to support other non x86 CPUs and SIMD optimization  
may be added at any time later.


Any opinions are welcome. Especially if such design is good or bad, and  
what needs to be changed.


Thanks :)


Great to push this a little.


I have to say though that I like the current struct based interface
much better.

struct Hash
{
// enhanced by some compile time traits
enum hashLength  = 16;
enum blockLength =  0;

// three interface functions
void start();
void update(const(ubyte)[] data);
void finish(ref ubyte[hashLength] digest);
}

You wouldn't need the save, restore functions.
Some unnecessary allocations could go away.
Most important instances would have less mutable state.

You could probably parameterize a Merkle Damgård base with free
functions for the transformation.

A dynamic interface can be obtaines by templated instances similar to what  
std.range does.


std.crypto

2010-11-10 Thread Steve Teale
Has anyone got something that would be a nucleus for this?



Re: std.crypto

2010-11-10 Thread Daniel Gibson

Steve Teale schrieb:

Has anyone got something that would be a nucleus for this?



I've used http://www.dsource.org/projects/dcrypt with D1 (just simple HMAC-SHA1 
signing, so I can't tell how well the other stuff works).