Hi,

On Wed, May 2, 2012 at 8:59 AM, Thomas Mueller <muel...@adobe.com> wrote:
> I guess the bottleneck is calculating the content hash (SHA-1 message
> digest). That's expected. But maybe it's something else.

Another likely CPU consumer is full text indexing, especially if
you're dealing with a complex PDF or Office document.

The SHA-1 computation time should be a fraction of the IO time needed
to copy a binary, whereas full text extraction can at times take much
longer than the plain copy.
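
If you want to check this on your own hardware, here is a rough Python sketch that times a SHA-1 pass over a buffer and a plain file copy of the same bytes. It is not Jackrabbit code: the helper names time_sha1 and time_copy and the 16 MiB test size are my own assumptions for illustration. The copy timing depends heavily on the disk and the OS cache, so treat the numbers as indicative only.

```python
import hashlib
import os
import shutil
import tempfile
import time


def time_sha1(data):
    """Return (hex digest, elapsed seconds) for one SHA-1 pass over data."""
    start = time.perf_counter()
    digest = hashlib.sha1(data).hexdigest()
    return digest, time.perf_counter() - start


def time_copy(data, directory):
    """Write data to a file in directory, then time a plain copy of it.

    Returns (elapsed seconds, path of the copy). Only the copy is timed,
    not the initial write of the source file.
    """
    src = os.path.join(directory, "src.bin")
    dst = os.path.join(directory, "dst.bin")
    with open(src, "wb") as f:
        f.write(data)
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    return time.perf_counter() - start, dst


if __name__ == "__main__":
    payload = os.urandom(16 * 1024 * 1024)  # hypothetical 16 MiB binary
    _, hash_seconds = time_sha1(payload)
    with tempfile.TemporaryDirectory() as tmp:
        copy_seconds, _ = time_copy(payload, tmp)
    print(f"SHA-1: {hash_seconds:.3f}s, copy: {copy_seconds:.3f}s")
```

Full text extraction is not covered here, since it depends on the parser and the document type. Timing the extraction step the same way would give you the third number to compare.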

BR,

Jukka Zitting
