[ 
https://issues.apache.org/jira/browse/JENA-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010684#comment-17010684
 ] 

Andy Seaborne commented on JENA-1812:
-------------------------------------

Changing to another hash is a good idea.

The length of the hash is visible in N-Triples output (32 hex chars). 

I think keeping to a 128 bit length is better unless there is a need to change 
to a longer one.

Jena does not need cryptographically secure hashes for blank node id 
allocation. The hash is a way to generate unique ids for the {{_:a}} and {{[]}} 
forms at scale (i.e. it avoids keeping a temporary map from each label to a 
parser-unique allocated label). Each parser run seeds the hash with a 122 bit 
random number.
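
As a rough illustration of that scheme (a sketch only, not Jena's actual 
code): a per-run random seed is hashed together with the label from the 
input, so the same label always maps to the same id within one parser run 
and no label-to-id map has to be kept. The class name, the use of a version 4 
UUID as the source of the 122 random bits, and the zero-byte separator are 
all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.UUID;

// Hypothetical sketch: stateless blank node id allocation by hashing
// (per-run seed + label). Not taken from the Jena code base.
class BlankNodeLabelSketch {
    private final byte[] seed;

    // One instance per parser run. A version 4 UUID carries 122 random bits.
    BlankNodeLabelSketch() {
        this(UUID.randomUUID());
    }

    // Explicit seed, useful for repeatable tests.
    BlankNodeLabelSketch(UUID seed) {
        this.seed = seed.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Same seed and label give the same id, so no map is needed.
    String allocate(String label) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            md.update(seed);
            md.update((byte) 0); // separator between seed and label (assumption)
            md.update(label.getBytes(StandardCharsets.UTF_8));
            byte[] digest = md.digest(); // 16 bytes = 128 bits
            StringBuilder hex = new StringBuilder(32);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString(); // 32 hex chars, as seen in N-Triples output
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```

Swapping the hash only means replacing the {{MessageDigest}} step with 
another 128 bit function; the seed-plus-label structure and the 32 hex char 
output stay the same.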

A possible hash is murmur3_128, which is available in Google Guava. Jena 
already has Guava as a dependency, in shaded form.

murmur3_128 is fast but not cryptographically secure.

There may be other suitable hashes.

(MD5 is also used in TDB1 and TDB2, but there the hash is written to disk. 
It does not need to be secure in that usage either.)

> Migrate blank node hash algorithm from MD5 to SHA-256
> -----------------------------------------------------
>
>                 Key: JENA-1812
>                 URL: https://issues.apache.org/jira/browse/JENA-1812
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Nicolas Seydoux
>            Assignee: Andy Seaborne
>            Priority: Trivial
>              Labels: easyfix
>             Fix For: Jena 3.14.0
>
>   Original Estimate: 5m
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> MD5 is a deprecated hashing algorithm, and even though it is not used on 
> sensitive data in the context of Jena, its usage is flagged by security 
> software as a security flaw. This may reduce the incentive to use Jena in 
> commercial products, and computing SHA-256 hashes is not prohibitively more 
> expensive than MD5.
>  
> Therefore, I suggest migrating from MD5 hashes to SHA-256.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
