[Gluster-devel] Adding xxhash to gluster code base

2017-06-26 Thread Kotresh Hiremath Ravishankar
Hi,

We were looking for faster non-cryptographic hash to be used for the
gfid2path infra [1]
The initial testing was done with md5 128bit checksum which was a slow,
cryptographic hash
and using it makes software not complaint to FIPS [2]

On searching online a bit we found out xxhash [3] seems to be faster from
the results of
benchmark tests shared and lot of projects use it. So we have decided to us
xxHash
and added following files to gluster code base with the patch [4]

BSD 2-Clause License:
   contrib/xxhash/xxhash.c
   contrib/xxhash/xxhash.h

GPL v2 License:
   tests/utils/xxhsum.c

NOTE: We have ignored the code guideline check for these files as
maintaining it
further becomes difficult.

Please comment on the same if there are any issues around it.

[1] Issue: https://github.com/gluster/glusterfs/issues/139
[2] https://en.wikipedia.org/wiki/Federal_Information_Processing_Standards
[3] http://cyan4973.github.io/xxHash/
[4] https://review.gluster.org/#/c/17488/10



-- 
Thanks and Regards,
Kotresh H R and Aravinda VK
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-26 Thread Amar Tumballi
On Tue, Jun 27, 2017 at 12:25 PM, Kotresh Hiremath Ravishankar <
khire...@redhat.com> wrote:

> Hi,
>
> We were looking for faster non-cryptographic hash to be used for the
> gfid2path infra [1]
> The initial testing was done with md5 128bit checksum which was a slow,
> cryptographic hash
> and using it makes software not complaint to FIPS [2]
>
> On searching online a bit we found out xxhash [3] seems to be faster from
> the results of
> benchmark tests shared and lot of projects use it. So we have decided to
> us xxHash
> and added following files to gluster code base with the patch [4]
>
> BSD 2-Clause License:
>contrib/xxhash/xxhash.c
>contrib/xxhash/xxhash.h
>
> GPL v2 License:
>tests/utils/xxhsum.c
>
> NOTE: We have ignored the code guideline check for these files as
> maintaining it
> further becomes difficult.
>
> Please comment on the same if there are any issues around it.
>
> [1] Issue: https://github.com/gluster/glusterfs/issues/139
> [2] https://en.wikipedia.org/wiki/Federal_Information_Processing_Standards
> [3] http://cyan4973.github.io/xxHash/
> [4] https://review.gluster.org/#/c/17488/10
>
>
>
Just one comment at the moment. Please separate out the patches as

1. changes to get xxHash into the project
2. gfid2path feature (which can use xxHash code).

That way it will be very easy to review, and also to maintain in future.

-Amar



>
> --
> Thanks and Regards,
> Kotresh H R and Aravinda VK
>



-- 
Amar Tumballi (amarts)
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-27 Thread Kotresh Hiremath Ravishankar
Sure, I can do that.

On Tue, Jun 27, 2017 at 12:28 PM, Amar Tumballi  wrote:

>
>
> On Tue, Jun 27, 2017 at 12:25 PM, Kotresh Hiremath Ravishankar <
> khire...@redhat.com> wrote:
>
>> Hi,
>>
>> We were looking for faster non-cryptographic hash to be used for the
>> gfid2path infra [1]
>> The initial testing was done with md5 128bit checksum which was a slow,
>> cryptographic hash
>> and using it makes software not complaint to FIPS [2]
>>
>> On searching online a bit we found out xxhash [3] seems to be faster from
>> the results of
>> benchmark tests shared and lot of projects use it. So we have decided to
>> us xxHash
>> and added following files to gluster code base with the patch [4]
>>
>> BSD 2-Clause License:
>>contrib/xxhash/xxhash.c
>>contrib/xxhash/xxhash.h
>>
>> GPL v2 License:
>>tests/utils/xxhsum.c
>>
>> NOTE: We have ignored the code guideline check for these files as
>> maintaining it
>> further becomes difficult.
>>
>> Please comment on the same if there are any issues around it.
>>
>> [1] Issue: https://github.com/gluster/glusterfs/issues/139
>> [2] https://en.wikipedia.org/wiki/Federal_Information_Processing
>> _Standards
>> [3] http://cyan4973.github.io/xxHash/
>> [4] https://review.gluster.org/#/c/17488/10
>>
>>
>>
> Just one comment at the moment. Please separate out the patches as
>
> 1. changes to get xxHash into the project
> 2. gfid2path feature (which can use xxHash code).
>
> That way it will be very easy to review, and also to maintain in future.
>
> -Amar
>
>
>
>>
>> --
>> Thanks and Regards,
>> Kotresh H R and Aravinda VK
>>
>
>
>
> --
> Amar Tumballi (amarts)
>



-- 
Thanks and Regards,
Kotresh H R
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-27 Thread Niels de Vos
On Tue, Jun 27, 2017 at 12:25:11PM +0530, Kotresh Hiremath Ravishankar wrote:
> Hi,
> 
> We were looking for faster non-cryptographic hash to be used for the
> gfid2path infra [1]
> The initial testing was done with md5 128bit checksum which was a slow,
> cryptographic hash
> and using it makes software not complaint to FIPS [2]
> 
> On searching online a bit we found out xxhash [3] seems to be faster from
> the results of
> benchmark tests shared and lot of projects use it. So we have decided to us
> xxHash
> and added following files to gluster code base with the patch [4]
> 
> BSD 2-Clause License:
>contrib/xxhash/xxhash.c
>contrib/xxhash/xxhash.h
> 
> GPL v2 License:
>tests/utils/xxhsum.c
> 
> NOTE: We have ignored the code guideline check for these files as
> maintaining it
> further becomes difficult.
> 
> Please comment on the same if there are any issues around it.

How performance critical is the hashing for gfid2path? 

What is the plan to keep these files maintained? At minimal we need to
add these files to MAINTAINERS and the maintainers need to cherry-pick
updates and bugfixes from the original project. The few patches a year
makes this a recurring task that should not be forgoten. It would be
much better to use this as an external library that is provided by the
distributions. We already rely on OpenSSL, does this library not provide
an alternative 'FIPS approved' hashing that performs reasonably well?

Some distributions are very strict on bundling external projects, and we
need to inform the packagers about the additions so that they can handle
it correctly. Adding an external project to contrib/ should be mentioned
in the release notes at the very least.

Note that none of the symbols of any public functions in Gluster may
collide with functions in standard distribution libraries. This causes
for regular problems with gfapi applications. All exposed symbols that
get imported in contrib/ should have a gf_ prefix.

Thanks,
Niels


> 
> [1] Issue: https://github.com/gluster/glusterfs/issues/139
> [2] https://en.wikipedia.org/wiki/Federal_Information_Processing_Standards
> [3] http://cyan4973.github.io/xxHash/
> [4] https://review.gluster.org/#/c/17488/10
> 
> 
> 
> -- 
> Thanks and Regards,
> Kotresh H R and Aravinda VK


signature.asc
Description: PGP signature
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-27 Thread Aravinda


regards
Aravinda VK

On 06/27/2017 01:38 PM, Niels de Vos wrote:

On Tue, Jun 27, 2017 at 12:25:11PM +0530, Kotresh Hiremath Ravishankar wrote:

Hi,

We were looking for faster non-cryptographic hash to be used for the
gfid2path infra [1]
The initial testing was done with md5 128bit checksum which was a slow,
cryptographic hash
and using it makes software not complaint to FIPS [2]

On searching online a bit we found out xxhash [3] seems to be faster from
the results of
benchmark tests shared and lot of projects use it. So we have decided to us
xxHash
and added following files to gluster code base with the patch [4]

 BSD 2-Clause License:
contrib/xxhash/xxhash.c
contrib/xxhash/xxhash.h

 GPL v2 License:
tests/utils/xxhsum.c

NOTE: We have ignored the code guideline check for these files as
maintaining it
further becomes difficult.

Please comment on the same if there are any issues around it.

How performance critical is the hashing for gfid2path?
For this we need non-crypto hashing. This hash calculation will be in 
the I/O path(Affected FOPs are create, mknod, link, symlink, rename, 
unlink)


What is the plan to keep these files maintained? At minimal we need to
add these files to MAINTAINERS and the maintainers need to cherry-pick
updates and bugfixes from the original project. The few patches a year
makes this a recurring task that should not be forgoten. It would be
much better to use this as an external library that is provided by the
distributions. We already rely on OpenSSL, does this library not provide
an alternative 'FIPS approved' hashing that performs reasonably well?
I think, this library is not available in distribution packaging yet. 
Please suggest if you know any non-crypto hashing lib which is already 
available in distros.


Some distributions are very strict on bundling external projects, and we
need to inform the packagers about the additions so that they can handle
it correctly. Adding an external project to contrib/ should be mentioned
in the release notes at the very least.

Note that none of the symbols of any public functions in Gluster may
collide with functions in standard distribution libraries. This causes
for regular problems with gfapi applications. All exposed symbols that
get imported in contrib/ should have a gf_ prefix.

Thanks,
Niels



[1] Issue: https://github.com/gluster/glusterfs/issues/139
[2] https://en.wikipedia.org/wiki/Federal_Information_Processing_Standards
[3] http://cyan4973.github.io/xxHash/
[4] https://review.gluster.org/#/c/17488/10



--
Thanks and Regards,
Kotresh H R and Aravinda VK


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel

Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-27 Thread Kaleb S. KEITHLEY


xxhash doesn't seem to change much. Last update to the non-test code was 
six months ago.


bundling giant (for some definition of giant) packages/projects would be 
bad. bundling two (three if you count the test) C files doesn't seem too 
bad when you consider that there are already three or four packages in 
fedora (perl, python, R-digest, ghc (gnu haskell) that have 
implementations of xxhash or murmur but didn't bother to package a C 
implementation and use it.


I'd be for packaging it in Fedora rather than bundling it in gluster. 
But then we get to "carry" it in rhgs as we do with userspace-rcu.




On 06/27/2017 04:08 AM, Niels de Vos wrote:

On Tue, Jun 27, 2017 at 12:25:11PM +0530, Kotresh Hiremath Ravishankar wrote:

Hi,

We were looking for faster non-cryptographic hash to be used for the
gfid2path infra [1]
The initial testing was done with md5 128bit checksum which was a slow,
cryptographic hash
and using it makes software not complaint to FIPS [2]

On searching online a bit we found out xxhash [3] seems to be faster from
the results of
benchmark tests shared and lot of projects use it. So we have decided to us
xxHash
and added following files to gluster code base with the patch [4]

 BSD 2-Clause License:
contrib/xxhash/xxhash.c
contrib/xxhash/xxhash.h

 GPL v2 License:
tests/utils/xxhsum.c

NOTE: We have ignored the code guideline check for these files as
maintaining it
further becomes difficult.

Please comment on the same if there are any issues around it.


How performance critical is the hashing for gfid2path?

What is the plan to keep these files maintained? At minimal we need to
add these files to MAINTAINERS and the maintainers need to cherry-pick
updates and bugfixes from the original project. The few patches a year
makes this a recurring task that should not be forgoten. It would be
much better to use this as an external library that is provided by the
distributions. We already rely on OpenSSL, does this library not provide
an alternative 'FIPS approved' hashing that performs reasonably well?

Some distributions are very strict on bundling external projects, and we
need to inform the packagers about the additions so that they can handle
it correctly. Adding an external project to contrib/ should be mentioned
in the release notes at the very least.

Note that none of the symbols of any public functions in Gluster may
collide with functions in standard distribution libraries. This causes
for regular problems with gfapi applications. All exposed symbols that
get imported in contrib/ should have a gf_ prefix.

Thanks,
Niels




[1] Issue: https://github.com/gluster/glusterfs/issues/139
[2] https://en.wikipedia.org/wiki/Federal_Information_Processing_Standards
[3] http://cyan4973.github.io/xxHash/
[4] https://review.gluster.org/#/c/17488/10



--
Thanks and Regards,
Kotresh H R and Aravinda VK


___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-27 Thread Niels de Vos
On Tue, Jun 27, 2017 at 08:09:25AM -0400, Kaleb S. KEITHLEY wrote:
> 
> xxhash doesn't seem to change much. Last update to the non-test code was six
> months ago.
> 
> bundling giant (for some definition of giant) packages/projects would be
> bad. bundling two (three if you count the test) C files doesn't seem too bad
> when you consider that there are already three or four packages in fedora
> (perl, python, R-digest, ghc (gnu haskell) that have implementations of
> xxhash or murmur but didn't bother to package a C implementation and use it.

I prefer to have as little maintenance components in the Gluster sources
as we can. The maintenance burdon is already very high. The number of
changes to xxhash seem limited, but we still need someone to track and
pay attention to them.

> I'd be for packaging it in Fedora rather than bundling it in gluster. But
> then we get to "carry" it in rhgs as we do with userspace-rcu.

We should descide what the most maintainable solution is. Having package
maintainers with the explicit task to keep xxhash updated and current is
apealing to me. Merging (even small) projects into the Gluster codebase
will add more maintenance need to the project members. Therefor I have a
strong preference to use xxhash (or an other library) that is provided
by distributions. The more common the library is, the better it will be
maintained without our (Gluster Community's) help.

Niels


> On 06/27/2017 04:08 AM, Niels de Vos wrote:
> > On Tue, Jun 27, 2017 at 12:25:11PM +0530, Kotresh Hiremath Ravishankar 
> > wrote:
> > > Hi,
> > > 
> > > We were looking for faster non-cryptographic hash to be used for the
> > > gfid2path infra [1]
> > > The initial testing was done with md5 128bit checksum which was a slow,
> > > cryptographic hash
> > > and using it makes software not complaint to FIPS [2]
> > > 
> > > On searching online a bit we found out xxhash [3] seems to be faster from
> > > the results of
> > > benchmark tests shared and lot of projects use it. So we have decided to 
> > > us
> > > xxHash
> > > and added following files to gluster code base with the patch [4]
> > > 
> > >  BSD 2-Clause License:
> > > contrib/xxhash/xxhash.c
> > > contrib/xxhash/xxhash.h
> > > 
> > >  GPL v2 License:
> > > tests/utils/xxhsum.c
> > > 
> > > NOTE: We have ignored the code guideline check for these files as
> > > maintaining it
> > > further becomes difficult.
> > > 
> > > Please comment on the same if there are any issues around it.
> > 
> > How performance critical is the hashing for gfid2path?
> > 
> > What is the plan to keep these files maintained? At minimal we need to
> > add these files to MAINTAINERS and the maintainers need to cherry-pick
> > updates and bugfixes from the original project. The few patches a year
> > makes this a recurring task that should not be forgoten. It would be
> > much better to use this as an external library that is provided by the
> > distributions. We already rely on OpenSSL, does this library not provide
> > an alternative 'FIPS approved' hashing that performs reasonably well?
> > 
> > Some distributions are very strict on bundling external projects, and we
> > need to inform the packagers about the additions so that they can handle
> > it correctly. Adding an external project to contrib/ should be mentioned
> > in the release notes at the very least.
> > 
> > Note that none of the symbols of any public functions in Gluster may
> > collide with functions in standard distribution libraries. This causes
> > for regular problems with gfapi applications. All exposed symbols that
> > get imported in contrib/ should have a gf_ prefix.
> > 
> > Thanks,
> > Niels
> > 
> > 
> > > 
> > > [1] Issue: https://github.com/gluster/glusterfs/issues/139
> > > [2] https://en.wikipedia.org/wiki/Federal_Information_Processing_Standards
> > > [3] http://cyan4973.github.io/xxHash/
> > > [4] https://review.gluster.org/#/c/17488/10
> > > 
> > > 
> > > 
> > > -- 
> > > Thanks and Regards,
> > > Kotresh H R and Aravinda VK
> 
___
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel


Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-28 Thread Amar Tumballi
On Tue, Jun 27, 2017 at 8:46 PM, Niels de Vos  wrote:

> On Tue, Jun 27, 2017 at 08:09:25AM -0400, Kaleb S. KEITHLEY wrote:
> >
> > xxhash doesn't seem to change much. Last update to the non-test code was
> six
> > months ago.
> >
> > bundling giant (for some definition of giant) packages/projects would be
> > bad. bundling two (three if you count the test) C files doesn't seem too
> bad
> > when you consider that there are already three or four packages in fedora
> > (perl, python, R-digest, ghc (gnu haskell) that have implementations of
> > xxhash or murmur but didn't bother to package a C implementation and use
> it.
>
> I prefer to have as little maintenance components in the Gluster sources
> as we can. The maintenance burdon is already very high. The number of
> changes to xxhash seem limited, but we still need someone to track and
> pay attention to them.
>

I agree that someone should maintain it, and we should add it to
MAINTAINERS file
(or some other place, where we are tracking the dependencies).

For now, Kotresh will be looking into keeping these changes up-to-date with
upstream xxhash project, along with me.


>
> > I'd be for packaging it in Fedora rather than bundling it in gluster. But
> > then we get to "carry" it in rhgs as we do with userspace-rcu.
>
> We should descide what the most maintainable solution is. Having package
> maintainers with the explicit task to keep xxhash updated and current is
> apealing to me. Merging (even small) projects into the Gluster codebase
> will add more maintenance need to the project members. Therefor I have a
> strong preference to use xxhash (or an other library) that is provided
> by distributions. The more common the library is, the better it will be
> maintained without our (Gluster Community's) help.
>
>
While this is desirable, we didn't see any library available for xxhash (
http://cyan4973.github.io/xxHash/) in our distro.

I would recommend taking these patches with TODO to use library in future
when its available, and continue to have xxhash in 'contrib/'. It is not
new for us to take code from different libraries and use it for our need
and maintain only that part (eg. libfuse). Lets treat this as similar setup.

Regards,
Amar





> Niels
>
>
> > On 06/27/2017 04:08 AM, Niels de Vos wrote:
> > > On Tue, Jun 27, 2017 at 12:25:11PM +0530, Kotresh Hiremath Ravishankar
> wrote:
> > > > Hi,
> > > >
> > > > We were looking for faster non-cryptographic hash to be used for the
> > > > gfid2path infra [1]
> > > > The initial testing was done with md5 128bit checksum which was a
> slow,
> > > > cryptographic hash
> > > > and using it makes software not complaint to FIPS [2]
> > > >
> > > > On searching online a bit we found out xxhash [3] seems to be faster
> from
> > > > the results of
> > > > benchmark tests shared and lot of projects use it. So we have
> decided to us
> > > > xxHash
> > > > and added following files to gluster code base with the patch [4]
> > > >
> > > >  BSD 2-Clause License:
> > > > contrib/xxhash/xxhash.c
> > > > contrib/xxhash/xxhash.h
> > > >
> > > >  GPL v2 License:
> > > > tests/utils/xxhsum.c
> > > >
> > > > NOTE: We have ignored the code guideline check for these files as
> > > > maintaining it
> > > > further becomes difficult.
> > > >
> > > > Please comment on the same if there are any issues around it.
> > >
> > > How performance critical is the hashing for gfid2path?
> > >
> > > What is the plan to keep these files maintained? At minimal we need to
> > > add these files to MAINTAINERS and the maintainers need to cherry-pick
> > > updates and bugfixes from the original project. The few patches a year
> > > makes this a recurring task that should not be forgoten. It would be
> > > much better to use this as an external library that is provided by the
> > > distributions. We already rely on OpenSSL, does this library not
> provide
> > > an alternative 'FIPS approved' hashing that performs reasonably well?
> > >
> > > Some distributions are very strict on bundling external projects, and
> we
> > > need to inform the packagers about the additions so that they can
> handle
> > > it correctly. Adding an external project to contrib/ should be
> mentioned
> > > in the release notes at the very least.
> > >
> > > Note that none of the symbols of any public functions in Gluster may
> > > collide with functions in standard distribution libraries. This causes
> > > for regular problems with gfapi applications. All exposed symbols that
> > > get imported in contrib/ should have a gf_ prefix.
> > >
> > > Thanks,
> > > Niels
> > >
> > >
> > > >
> > > > [1] Issue: https://github.com/gluster/glusterfs/issues/139
> > > > [2] https://en.wikipedia.org/wiki/Federal_Information_
> Processing_Standards
> > > > [3] http://cyan4973.github.io/xxHash/
> > > > [4] https://review.gluster.org/#/c/17488/10
> > > >
> > > >
> > > >
> > > > --
> > > > Thanks and Regards,
> > > > Kotresh H R and Aravinda VK
> >
>



-- 
A

Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-28 Thread Niels de Vos
On Wed, Jun 28, 2017 at 12:51:07PM +0530, Amar Tumballi wrote:
> On Tue, Jun 27, 2017 at 8:46 PM, Niels de Vos  wrote:
> 
> > On Tue, Jun 27, 2017 at 08:09:25AM -0400, Kaleb S. KEITHLEY wrote:
> > >
> > > xxhash doesn't seem to change much. Last update to the non-test code was
> > six
> > > months ago.
> > >
> > > bundling giant (for some definition of giant) packages/projects would be
> > > bad. bundling two (three if you count the test) C files doesn't seem too
> > bad
> > > when you consider that there are already three or four packages in fedora
> > > (perl, python, R-digest, ghc (gnu haskell) that have implementations of
> > > xxhash or murmur but didn't bother to package a C implementation and use
> > it.
> >
> > I prefer to have as little maintenance components in the Gluster sources
> > as we can. The maintenance burdon is already very high. The number of
> > changes to xxhash seem limited, but we still need someone to track and
> > pay attention to them.
> >
> 
> I agree that someone should maintain it, and we should add it to
> MAINTAINERS file
> (or some other place, where we are tracking the dependencies).
> 
> For now, Kotresh will be looking into keeping these changes up-to-date with
> upstream xxhash project, along with me.

Kotresh as maintainer/owner, and Aravinda as peer?

> > > I'd be for packaging it in Fedora rather than bundling it in gluster. But
> > > then we get to "carry" it in rhgs as we do with userspace-rcu.
> >
> > We should descide what the most maintainable solution is. Having package
> > maintainers with the explicit task to keep xxhash updated and current is
> > apealing to me. Merging (even small) projects into the Gluster codebase
> > will add more maintenance need to the project members. Therefor I have a
> > strong preference to use xxhash (or an other library) that is provided
> > by distributions. The more common the library is, the better it will be
> > maintained without our (Gluster Community's) help.
> >
> >
> While this is desirable, we didn't see any library available for xxhash (
> http://cyan4973.github.io/xxHash/) in our distro.
> 
> I would recommend taking these patches with TODO to use library in future
> when its available, and continue to have xxhash in 'contrib/'. It is not
> new for us to take code from different libraries and use it for our need
> and maintain only that part (eg. libfuse). Lets treat this as similar setup.

Yes, if there is no suitable alternative available in the majority of
distributions, this is the only sensible approach. Much of the code in
contrib/ is not maintained at all. We should prevent this from happening
with new code and assigning an owner/maintainer and peer(s) just like
for other components is a must.

Thanks,
Niels


> 
> Regards,
> Amar
> 
> 
> 
> 
> 
> > Niels
> >
> >
> > > On 06/27/2017 04:08 AM, Niels de Vos wrote:
> > > > On Tue, Jun 27, 2017 at 12:25:11PM +0530, Kotresh Hiremath Ravishankar
> > wrote:
> > > > > Hi,
> > > > >
> > > > > We were looking for faster non-cryptographic hash to be used for the
> > > > > gfid2path infra [1]
> > > > > The initial testing was done with md5 128bit checksum which was a
> > slow,
> > > > > cryptographic hash
> > > > > and using it makes software not complaint to FIPS [2]
> > > > >
> > > > > On searching online a bit we found out xxhash [3] seems to be faster
> > from
> > > > > the results of
> > > > > benchmark tests shared and lot of projects use it. So we have
> > decided to us
> > > > > xxHash
> > > > > and added following files to gluster code base with the patch [4]
> > > > >
> > > > >  BSD 2-Clause License:
> > > > > contrib/xxhash/xxhash.c
> > > > > contrib/xxhash/xxhash.h
> > > > >
> > > > >  GPL v2 License:
> > > > > tests/utils/xxhsum.c
> > > > >
> > > > > NOTE: We have ignored the code guideline check for these files as
> > > > > maintaining it
> > > > > further becomes difficult.
> > > > >
> > > > > Please comment on the same if there are any issues around it.
> > > >
> > > > How performance critical is the hashing for gfid2path?
> > > >
> > > > What is the plan to keep these files maintained? At minimal we need to
> > > > add these files to MAINTAINERS and the maintainers need to cherry-pick
> > > > updates and bugfixes from the original project. The few patches a year
> > > > makes this a recurring task that should not be forgoten. It would be
> > > > much better to use this as an external library that is provided by the
> > > > distributions. We already rely on OpenSSL, does this library not
> > provide
> > > > an alternative 'FIPS approved' hashing that performs reasonably well?
> > > >
> > > > Some distributions are very strict on bundling external projects, and
> > we
> > > > need to inform the packagers about the additions so that they can
> > handle
> > > > it correctly. Adding an external project to contrib/ should be
> > mentioned
> > > > in the release notes at the very least.
> > > >
> > > > Note that none of the symbol

Re: [Gluster-devel] Adding xxhash to gluster code base

2017-06-28 Thread Kotresh Hiremath Ravishankar
That sounds good to me. I will send it as a separate patch then. And I can
maintain it. No issues.

Thanks and Regards,
Kotresh H R


On Wed, Jun 28, 2017 at 1:02 PM, Niels de Vos  wrote:

> On Wed, Jun 28, 2017 at 12:51:07PM +0530, Amar Tumballi wrote:
> > On Tue, Jun 27, 2017 at 8:46 PM, Niels de Vos  wrote:
> >
> > > On Tue, Jun 27, 2017 at 08:09:25AM -0400, Kaleb S. KEITHLEY wrote:
> > > >
> > > > xxhash doesn't seem to change much. Last update to the non-test code
> was
> > > six
> > > > months ago.
> > > >
> > > > bundling giant (for some definition of giant) packages/projects
> would be
> > > > bad. bundling two (three if you count the test) C files doesn't seem
> too
> > > bad
> > > > when you consider that there are already three or four packages in
> fedora
> > > > (perl, python, R-digest, ghc (gnu haskell) that have implementations
> of
> > > > xxhash or murmur but didn't bother to package a C implementation and
> use
> > > it.
> > >
> > > I prefer to have as little maintenance components in the Gluster
> sources
> > > as we can. The maintenance burdon is already very high. The number of
> > > changes to xxhash seem limited, but we still need someone to track and
> > > pay attention to them.
> > >
> >
> > I agree that someone should maintain it, and we should add it to
> > MAINTAINERS file
> > (or some other place, where we are tracking the dependencies).
> >
> > For now, Kotresh will be looking into keeping these changes up-to-date
> with
> > upstream xxhash project, along with me.
>
> Kotresh as maintainer/owner, and Aravinda as peer?
>
> > > > I'd be for packaging it in Fedora rather than bundling it in
> gluster. But
> > > > then we get to "carry" it in rhgs as we do with userspace-rcu.
> > >
> > > We should descide what the most maintainable solution is. Having
> package
> > > maintainers with the explicit task to keep xxhash updated and current
> is
> > > apealing to me. Merging (even small) projects into the Gluster codebase
> > > will add more maintenance need to the project members. Therefor I have
> a
> > > strong preference to use xxhash (or an other library) that is provided
> > > by distributions. The more common the library is, the better it will be
> > > maintained without our (Gluster Community's) help.
> > >
> > >
> > While this is desirable, we didn't see any library available for xxhash (
> > http://cyan4973.github.io/xxHash/) in our distro.
> >
> > I would recommend taking these patches with TODO to use library in future
> > when its available, and continue to have xxhash in 'contrib/'. It is not
> > new for us to take code from different libraries and use it for our need
> > and maintain only that part (eg. libfuse). Lets treat this as similar
> setup.
>
> Yes, if there is no suitable alternative available in the majority of
> distributions, this is the only sensible approach. Much of the code in
> contrib/ is not maintained at all. We should prevent this from happening
> with new code and assigning an owner/maintainer and peer(s) just like
> for other components is a must.
>
> Thanks,
> Niels
>
>
> >
> > Regards,
> > Amar
> >
> >
> >
> >
> >
> > > Niels
> > >
> > >
> > > > On 06/27/2017 04:08 AM, Niels de Vos wrote:
> > > > > On Tue, Jun 27, 2017 at 12:25:11PM +0530, Kotresh Hiremath
> Ravishankar
> > > wrote:
> > > > > > Hi,
> > > > > >
> > > > > > We were looking for faster non-cryptographic hash to be used for
> the
> > > > > > gfid2path infra [1]
> > > > > > The initial testing was done with md5 128bit checksum which was a
> > > slow,
> > > > > > cryptographic hash
> > > > > > and using it makes software not complaint to FIPS [2]
> > > > > >
> > > > > > On searching online a bit we found out xxhash [3] seems to be
> faster
> > > from
> > > > > > the results of
> > > > > > benchmark tests shared and lot of projects use it. So we have
> > > decided to us
> > > > > > xxHash
> > > > > > and added following files to gluster code base with the patch [4]
> > > > > >
> > > > > >  BSD 2-Clause License:
> > > > > > contrib/xxhash/xxhash.c
> > > > > > contrib/xxhash/xxhash.h
> > > > > >
> > > > > >  GPL v2 License:
> > > > > > tests/utils/xxhsum.c
> > > > > >
> > > > > > NOTE: We have ignored the code guideline check for these files as
> > > > > > maintaining it
> > > > > > further becomes difficult.
> > > > > >
> > > > > > Please comment on the same if there are any issues around it.
> > > > >
> > > > > How performance critical is the hashing for gfid2path?
> > > > >
> > > > > What is the plan to keep these files maintained? At minimal we
> need to
> > > > > add these files to MAINTAINERS and the maintainers need to
> cherry-pick
> > > > > updates and bugfixes from the original project. The few patches a
> year
> > > > > makes this a recurring task that should not be forgoten. It would
> be
> > > > > much better to use this as an external library that is provided by
> the
> > > > > distributions. We already rely on OpenSSL, does this library not
>