Re: [Bacula-users] Possibility of parallelising encryption?

2010-04-08 Thread Craig Ringer
On 8/04/2010 8:03 PM, Phil Stracchino wrote:
> On 04/08/10 02:16, Craig Ringer wrote:
>> Bacula should probably work with Intel's hardware crypto out of the box.
>> If it doesn't, most likely all that'd be required would be to call:
>>
>>  ENGINE_load_builtin_engines();
>>  ENGINE_register_all_complete();

> Sounds like a good idea to me.  Want to write up a patch?

I will, at that. I even have some Via C3 machines to test it on, though 
they're thin clients so I'll need to take one out of service for a bit 
and chuck a disk in it.

--
Craig Ringer


--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Possibility of parallelising encryption?

2010-04-08 Thread Phil Stracchino
On 04/08/10 02:16, Craig Ringer wrote:
> Bacula should probably work with Intel's hardware crypto out of the box.
> If it doesn't, most likely all that'd be required would be to call:
> 
> ENGINE_load_builtin_engines();
> ENGINE_register_all_complete();
> 
> in init_crypto(), and:
> 
> ENGINE_cleanup();
> 
> when crypto is cleaned up. See "man 3 engine". In fact, as PadLock
> support comes pre-loaded and Intel crypto probably will too, it may not
> even be necessary to call ENGINE_load_builtin_engines() at all, only
> ENGINE_register_all_complete(). Asking openssl to load all engines will
> let it use other less common hardware crypto systems like some of the
> add-on hw crypto PCI cards, though, and it's cheap enough not to even be
> detectable in something as long-lived as bacula-fd.
> 
> OpenSSL is smart enough to pick a hardware engine if one exists, and
> fall back to software if there's no suitable engine for the task at
> hand. IIRC that's all I had to do to patch PadLock support into OpenSSH
> when I needed it for my thin clients at work.
> 
> It's a trivial change that would enable Bacula to use any builtin
> hardware crypto engine supported by OpenSSL. Worth making, so that by
> the time the new Intel hardware hits Bacula supports it?
> 
> --
> Craig Ringer


Sounds like a good idea to me.  Want to write up a patch?


-- 
  Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
  ala...@caerllewys.net   ala...@metrocast.net   p...@co.ordinate.org
 Renaissance Man, Unix ronin, Perl hacker, Free Stater
 It's not the years, it's the mileage.



Re: [Bacula-users] Possibility of parallelising encryption?

2010-04-07 Thread Craig Ringer
Phil Stracchino wrote:

> FWIW, memory says we tried using GnuTLS once.  It turns out that for
> this purpose it is (or was) horribly broken.  In fact, at the time, I
> seem to recall the mere presence of Red Hat's GnuTLS lib broke Bacula
> altogether.

Yeah. I've found it to be pretty broken in other ways - I'm STILL having
to rebuild the subversion client against openssl to get working support
for client certificates due to a gnutls bug, and I rebuild netatalk
against openssl to get dhx password hashes that aren't supported in gnutls.

It doesn't look like gnutls is any better than openssl in terms of
parallel encryption support anyway. The whole affair seems to be a
licensing kerfuffle rather than anything else.

>> It'd be lovely to be able to use the IPP libraries in Bacula (and many
>> other things) for parallel crypto and many other parallel tasks, as
>> they're excellent even without special hardware. Unfortunately they're
>> rather GPL-incompatible and are only "free" for non-commercial use.
> 
> It would indeed be very nice to be able to use that kind of hardware
> crypto support without having to jump through licensing hoops.

You don't have to jump through any licensing hoops to use the hardware
crypto. Intel have submitted patches that've been accepted into OpenSSL
1.0. For Via's C3/C7 PadLock hardware crypto, support has been around
for ages and is certainly in OpenSSL 0.9.8.

Run "openssl engine" for a list of built-in engines in your version of
OpenSSL. Dynamically loaded engines for rarer hardware crypto devices
and/or those that need special configuration are available in
/usr/lib/ssl/engines/.

You'd only have to jump through those licensing hoops to use Intel's
fast parallel *software* crypto engine for multi-core/multi-threaded
AES-CTR mode encryption.

Bacula should probably work with Intel's hardware crypto out of the box.
If it doesn't, most likely all that'd be required would be to call:

ENGINE_load_builtin_engines();
ENGINE_register_all_complete();

in init_crypto(), and:

ENGINE_cleanup();

when crypto is cleaned up. See "man 3 engine". In fact, as PadLock
support comes pre-loaded and Intel crypto probably will too, it may not
even be necessary to call ENGINE_load_builtin_engines() at all, only
ENGINE_register_all_complete(). Asking openssl to load all engines will
let it use other less common hardware crypto systems like some of the
add-on hw crypto PCI cards, though, and it's cheap enough not to even be
detectable in something as long-lived as bacula-fd.

OpenSSL is smart enough to pick a hardware engine if one exists, and
fall back to software if there's no suitable engine for the task at
hand. IIRC that's all I had to do to patch PadLock support into OpenSSH
when I needed it for my thin clients at work.

It's a trivial change that would enable Bacula to use any builtin
hardware crypto engine supported by OpenSSL. Worth making, so that
Bacula already supports the new Intel hardware by the time it ships?

--
Craig Ringer



Re: [Bacula-users] Possibility of parallelising encryption?

2010-04-07 Thread Richard Scobie
Craig Ringer wrote:

Snip interesting crypto discussion...

>> I'd use the hardware encryption (which presumably has no performance
>> impact), that is an option on this autochanger, except they want $2500
>> for it...
>
> Probably because it has a custom ASIC for the crypto algorithm in use to
> allow it to go fast enough.

Well, I suspect the ASIC is already onboard - the $2500 seems to get you 
a couple of USB keys to enable it (HP MSL series libraries).

> The trouble with this is that if your tape drive/changer dies, you
> generally need another one with the same hardware crypto to restore.
> This is a really, really ugly situation for disaster recovery.

As long as you obtain another MSL library and use one of your original 
USB keys, you're back in business, but I take your point.

Regards,

Richard



Re: [Bacula-users] Possibility of parallelising encryption?

2010-04-07 Thread Phil Stracchino
On 04/07/10 05:09, Craig Ringer wrote:
> Richard Scobie wrote:
>> Would it be possible to optimise this task by perhaps reading data in 
>> "chunks", which in turn can be encrypted by a core each, before being 
>> recombined and written out to tape?
> 
> Bacula uses OpenSSL for crypto support. It doesn't seem to support any
> other crypto libraries like NSS or GnuTLS.

FWIW, memory says we tried using GnuTLS once.  It turns out that for
this purpose it is (or was) horribly broken.  In fact, at the time, I
seem to recall the mere presence of Red Hat's GnuTLS lib broke Bacula
altogether.

> Some hardware, like the Via C7 series of CPUs, have built-in AES crypto
> hardware (PadLock) that on a single thread can do *insane* encryption
> rates. On the older C3 series CPUs I've had no problems saturating a
> 100MBit line with encrypted ssh data, despite the gutless 400MHz C3 CPU.
> 
> Intel has introduced similar instructions on their Xeon 5600 series:
>  
> http://software.intel.com/en-us/articles/boosting-openssl-aes-encryption-with-intel-ipp/
> 
> It'd be lovely to be able to use the IPP libraries in Bacula (and many
> other things) for parallel crypto and many other parallel tasks, as
> they're excellent even without special hardware. Unfortunately they're
> rather GPL-incompatible and are only "free" for non-commercial use.

It would indeed be very nice to be able to use that kind of hardware
crypto support without having to jump through licensing hoops.



-- 
  Phil Stracchino, CDK#2 DoD#299792458 ICBM: 43.5607, -71.355
  ala...@caerllewys.net   ala...@metrocast.net   p...@co.ordinate.org
 Renaissance Man, Unix ronin, Perl hacker, Free Stater
 It's not the years, it's the mileage.



Re: [Bacula-users] Possibility of parallelising encryption?

2010-04-07 Thread Craig Ringer
Richard Scobie wrote:
> I have a 2.8GHz Core i7 machine backing up incompressible data spooled 
> onto an 8 drive RAID5, to LTO-4 tape.
> 
> Our requirements now dictate that data encryption must be used on the 
> tapes and having configured this, it seems that one core is saturated 
> encrypting the data and the result is that tape write speed is now about 
> 50% slower than when encryption is not used.
> 
> Would it be possible to optimise this task by perhaps reading data in 
> "chunks", which in turn can be encrypted by a core each, before being 
> recombined and written out to tape?

Sorry to reply-to-self, but after a bit more reading it seems that the
algorithm/cypher used by Bacula, aes-256-cbc, can't really be encrypted
in parallel on multiple cores: each plaintext block is XORed with the
previous ciphertext block before it's encrypted, so block N can't be
started until block N-1 is finished.

aes-ctr mode avoids that issue, because each block is encrypted using
only the key and a per-block counter. It doesn't seem to be very widely
adopted yet, but is used in IPSec, appears in revisions to the TLS
standard, etc.

OpenSSL supports AES CTR mode at least for 128-bit, but only
single-threaded.
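
To make the difference concrete, here's a minimal sketch in C. It's an
illustration only: toy_encrypt_block() is a made-up, insecure stand-in
for AES, and none of these function names come from OpenSSL or Bacula.
All it shows is that the CBC loop carries a dependency from one block to
the next, while the CTR loop can start at any block index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit "block cipher": NOT secure. It stands in for AES purely to
 * show which data each mode needs before it can process a block. */
static uint64_t toy_encrypt_block(uint64_t key, uint64_t block)
{
    uint64_t x = block ^ key;
    x *= 0x9E3779B97F4A7C15ULL;   /* odd multiplier, invertible mod 2^64 */
    x ^= x >> 29;
    return x + key;
}

/* CBC: block i is mixed with ciphertext block i-1 before encryption,
 * so the loop has to run strictly in order. */
void toy_cbc_encrypt(uint64_t key, uint64_t iv,
                     const uint64_t *in, uint64_t *out, size_t nblocks)
{
    assert(in != NULL && out != NULL);
    uint64_t prev = iv;
    for (size_t i = 0; i < nblocks; i++) {
        prev = toy_encrypt_block(key, in[i] ^ prev);
        out[i] = prev;
    }
}

/* CTR: the keystream for a block depends only on the key, the nonce and
 * the block's index, so any range of blocks can be done on its own.
 * in/out point at the first block of the range being processed.
 * The same function decrypts, since XOR undoes itself. */
void toy_ctr_crypt(uint64_t key, uint64_t nonce, size_t first_block,
                   const uint64_t *in, uint64_t *out, size_t nblocks)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < nblocks; i++) {
        uint64_t keystream = toy_encrypt_block(key, nonce + first_block + i);
        out[i] = in[i] ^ keystream;
    }
}
```

Calling toy_ctr_crypt() on blocks 2-3 alone (first_block = 2) produces
the same bytes as those blocks get when the whole buffer is done in one
call, which is what would let separate cores take separate ranges. There
is no equivalent shortcut for toy_cbc_encrypt().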

The implementation of multi-threaded aes-ctr encryption for OpenSSH I
referenced earlier was also presented to the OpenSSL folks:

  http://marc.info/?l=openssl-dev&m=120180007117054&w=2

but I can't find any response or follow-up to it.

So I guess the short answer is "parallel encryption is way harder to do
than it looks".

--
Craig Ringer



Re: [Bacula-users] Possibility of parallelising encryption?

2010-04-07 Thread Craig Ringer
Richard Scobie wrote:
> I have a 2.8GHz Core i7 machine backing up incompressible data spooled 
> onto an 8 drive RAID5, to LTO-4 tape.
> 
> Our requirements now dictate that data encryption must be used on the 
> tapes and having configured this, it seems that one core is saturated 
> encrypting the data and the result is that tape write speed is now about 
> 50% slower than when encryption is not used.

The process saturating a core is the file daemon, right?


> Would it be possible to optimise this task by perhaps reading data in 
> "chunks", which in turn can be encrypted by a core each, before being 
> recombined and written out to tape?


Bacula uses OpenSSL for crypto support. It doesn't seem to support any
other crypto libraries like NSS or GnuTLS.

OpenSSL supports hardware crypto acceleration for some cyphers in a
largely transparent manner. This is one option. If Bacula doesn't "just
work" with hardware crypto I'd expect it to be a one-line patch to add
support, going by what I've had to do to enable it in other software.

Some hardware, like the Via C7 series of CPUs, has built-in AES crypto
hardware (PadLock) that on a single thread can do *insane* encryption
rates. On the older C3 series CPUs I've had no problems saturating a
100MBit line with encrypted ssh data, despite the gutless 400MHz C3 CPU.

Intel has introduced similar instructions on their Xeon 5600 series:
 
http://software.intel.com/en-us/articles/boosting-openssl-aes-encryption-with-intel-ipp/


It'd be lovely to be able to use the IPP libraries in Bacula (and many
other things) for parallel crypto and many other parallel tasks, as
they're excellent even without special hardware. Unfortunately they're
rather GPL-incompatible and are only "free" for non-commercial use.

( The Intel Threading Building Blocks library *is* open source under
GPLv2, though, and would be rather nice if Bacula wasn't already using
pthreads directly: http://www.threadingbuildingblocks.org/ )


Anyway, if you need a software-only option, you'd have to do one of:

1) get OpenSSL to use multiple cores for encryption internally;
2) get Bacula to use OpenSSL to encrypt blocks using worker
   threads using a suitable block cypher; or
3) use another crypto library that automatically parallelizes.


None of these look easy by any stretch. (2) is probably most realistic,
but as OpenSSL does some internal locking and serialization it may not
be possible to encrypt on multiple threads even when using a simple
block cypher where one block doesn't depend on the next or previous. I
don't know much about OpenSSL and can't say more without a lot more
digging. For all I know it might be necessary to ask OpenSSL for the
session key, then use its low level crypto functions to encrypt blocks
rather than using the higher-level stream/session interface.
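
As a rough sketch of what (2) would involve, here's how a buffer could
be carved up for worker threads. This assumes a mode where blocks are
independent (CTR-style) and sidesteps the OpenSSL locking question
entirely. Every name here (split_blocks, encrypt_chunk, toy_keystream)
is hypothetical, and toy_keystream() is a placeholder rather than a real
cipher. In a real file daemon each encrypt_chunk() call would be handed
to its own pthread worker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One worker's share of the buffer, counted in whole cipher blocks. */
struct chunk {
    size_t first_block;  /* index of the first block in this chunk */
    size_t nblocks;      /* number of blocks in this chunk */
};

/* Split nblocks as evenly as possible across nworkers. The first
 * (nblocks % nworkers) chunks each get one extra block. */
void split_blocks(size_t nblocks, size_t nworkers, struct chunk *chunks)
{
    assert(nworkers > 0);
    size_t base = nblocks / nworkers;
    size_t extra = nblocks % nworkers;
    size_t next = 0;
    for (size_t w = 0; w < nworkers; w++) {
        chunks[w].first_block = next;
        chunks[w].nblocks = base + (w < extra ? 1 : 0);
        next += chunks[w].nblocks;
    }
}

/* Placeholder keystream, NOT a real cipher: a real implementation would
 * encrypt a nonce and the block counter with AES here. */
static uint64_t toy_keystream(uint64_t key, uint64_t counter)
{
    uint64_t x = (counter ^ key) * 0x9E3779B97F4A7C15ULL;
    return x ^ (x >> 31);
}

/* The work one worker does. It reads and writes only its own chunk, so
 * workers share no mutable state and need no locking between them. */
void encrypt_chunk(uint64_t key, const struct chunk *c,
                   const uint64_t *in, uint64_t *out)
{
    for (size_t i = c->first_block; i < c->first_block + c->nblocks; i++)
        out[i] = in[i] ^ toy_keystream(key, i);
}
```

Because each block's keystream depends only on its absolute index, the
chunks can be processed in any order, or all at once, and the combined
output is the same as a single pass over the whole buffer.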

While not directly OpenSSL related, this might also be of interest:

  http://www.psc.edu/networking/projects/hpn-ssh/
  http://www.psc.edu/networking/projects/hpn-ssh/papers/a14-rapier.pdf

It's a patch set for OpenSSH rather than OpenSSL, but at least it's
using a highly parallel AES cypher...

> I'd use the hardware encryption (which presumably has no performance 
> impact), that is an option on this autochanger, except they want $2500 
> for it...

Probably because it has a custom ASIC for the crypto algorithm in use to
allow it to go fast enough.

The trouble with this is that if your tape drive/changer dies, you
generally need another one with the same hardware crypto to restore.
This is a really, really ugly situation for disaster recovery.

( Maybe things have improved since I switched away from tape, and now
the hardware crypto is just an accelerator and you can still load your
keys for driver-based crypto instead. I doubt it, though. )

--
Craig Ringer



Re: [Bacula-users] Possibility of parallelising encryption?

2010-03-28 Thread Peter Zenge
And of course it's even worse if you have compressible data. Since data is
incompressible once encrypted, you can't leave compression to the tape drive.
So in our case, on a quad-core server, we see a single core saturated,
apparently doing both the compress and encrypt routines, while the other three
cores idle. That leads to much slower backups than the hardware is otherwise
capable of. Even having one core compress and another encrypt would be more
efficient in our case.

From: Richard Scobie [rich...@sauce.co.nz]
Sent: Sunday, March 28, 2010 11:26 AM
To: bacula-users
Subject: [Bacula-users] Possibility of parallelising encryption?

I have a 2.8GHz Core i7 machine backing up incompressible data spooled
onto an 8 drive RAID5, to LTO-4 tape.

Our requirements now dictate that data encryption must be used on the
tapes and having configured this, it seems that one core is saturated
encrypting the data and the result is that tape write speed is now about
50% slower than when encryption is not used.

Would it be possible to optimise this task by perhaps reading data in
"chunks", which in turn can be encrypted by a core each, before being
recombined and written out to tape?

I'd use the hardware encryption (which presumably has no performance
impact), that is an option on this autochanger, except they want $2500
for it...

Regards,

Richard



[Bacula-users] Possibility of parallelising encryption?

2010-03-28 Thread Richard Scobie
I have a 2.8GHz Core i7 machine backing up incompressible data spooled 
onto an 8 drive RAID5, to LTO-4 tape.

Our requirements now dictate that data encryption must be used on the 
tapes and having configured this, it seems that one core is saturated 
encrypting the data and the result is that tape write speed is now about 
50% slower than when encryption is not used.

Would it be possible to optimise this task by perhaps reading data in 
"chunks", which in turn can be encrypted by a core each, before being 
recombined and written out to tape?

I'd use the hardware encryption (which presumably has no performance 
impact), that is an option on this autochanger, except they want $2500 
for it...

Regards,

Richard
