I don’t know the reason, but my hunch is that it’s a nod to Douglas Adams
(author of The Hitchhiker’s Guide to the Galaxy).

https://news.mit.edu/2019/answer-life-universe-and-everything-sum-three-cubes-mathematics-0910

> On Sep 26, 2022, at 16:59, Sean Owen <sro...@gmail.com> wrote:
> 
> 
> OK, it came to my attention today that hash functions in Spark, like
> xxhash64, actually always seed with 42:
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L655
> 
> This is an issue if you want the hash of some value in Spark to match the 
> hash you compute with xxhash64 somewhere else, and, AFAICT most any other 
> impl will start with seed=0.
> 
> I'm guessing there wasn't a great reason for this, just seemed like 42 was a 
> nice default seed. And we can't change it now without maybe subtly changing 
> program behaviors. And, I am guessing it's messy to let the function now take 
> a seed argument, esp. in SQL.
> 
> So I'm left with, I guess we should doc that? I can do it if so.
> And just a cautionary tale I guess, for hash function users.
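
To make the seed mismatch concrete, here is a rough pure-Python sketch of the XXH64 algorithm with an explicit seed parameter. It is only an illustration, not Spark's code: the names `xxh64` and `to_signed` are mine, and I am assuming (not verifying here) that the value is fed in as the same bytes Spark would hash, e.g. a long as 8 little-endian bytes.

```python
import struct

MASK = 0xFFFFFFFFFFFFFFFF  # keep all arithmetic in unsigned 64 bits
P1 = 0x9E3779B185EBCA87
P2 = 0xC2B2AE3D27D4EB4F
P3 = 0x165667B19E3779F9
P4 = 0x85EBCA77C2B2AE63
P5 = 0x27D4EB2F165667C5


def _rotl(x, r):
    return ((x << r) | (x >> (64 - r))) & MASK


def _round(acc, lane):
    acc = (acc + lane * P2) & MASK
    return (_rotl(acc, 31) * P1) & MASK


def _merge(h, v):
    h ^= _round(0, v)
    return (h * P1 + P4) & MASK


def xxh64(data: bytes, seed: int = 0) -> int:
    """XXH64 of `data` as an unsigned 64-bit integer."""
    n, i = len(data), 0
    if n >= 32:
        # Four accumulators, each consuming one 8-byte lane per 32-byte stripe.
        v1 = (seed + P1 + P2) & MASK
        v2 = (seed + P2) & MASK
        v3 = seed & MASK
        v4 = (seed - P1) & MASK
        while i + 32 <= n:
            a, b, c, d = struct.unpack_from("<4Q", data, i)
            v1, v2 = _round(v1, a), _round(v2, b)
            v3, v4 = _round(v3, c), _round(v4, d)
            i += 32
        h = (_rotl(v1, 1) + _rotl(v2, 7) + _rotl(v3, 12) + _rotl(v4, 18)) & MASK
        for v in (v1, v2, v3, v4):
            h = _merge(h, v)
    else:
        h = (seed + P5) & MASK
    h = (h + n) & MASK
    # Remaining input: 8-byte lanes, then one 4-byte lane, then single bytes.
    while i + 8 <= n:
        (lane,) = struct.unpack_from("<Q", data, i)
        h ^= _round(0, lane)
        h = (_rotl(h, 27) * P1 + P4) & MASK
        i += 8
    if i + 4 <= n:
        (lane,) = struct.unpack_from("<I", data, i)
        h ^= (lane * P1) & MASK
        h = (_rotl(h, 23) * P2 + P3) & MASK
        i += 4
    while i < n:
        h ^= (data[i] * P5) & MASK
        h = (_rotl(h, 11) * P1) & MASK
        i += 1
    # Final avalanche.
    h ^= h >> 33
    h = (h * P2) & MASK
    h ^= h >> 29
    h = (h * P3) & MASK
    h ^= h >> 32
    return h


def to_signed(h: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed long (hypothetical helper)."""
    return h - (1 << 64) if h >= (1 << 63) else h


value = struct.pack("<q", 12345)           # assumed encoding of a long
default_seed = to_signed(xxh64(value, 0))  # what most other impls compute
spark_seed = to_signed(xxh64(value, 42))   # what you'd need to match seed=42
print(default_seed, spark_seed)
```

The two printed numbers differ, which is the whole problem: unless the other side also passes seed 42 (and hashes the same bytes), the results won't line up.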
