Cassandra is both distributed and replicated. We have Replication Factor but no Distribution Factor!
A Distribution Factor (DF) would define how many nodes a column family (CF) is distributed over. Say you want to support millions of multi-tenant users in clusters with thousands of nodes, and you don't know the users' schemas in advance, so users can't share CFs. In that case you wouldn't want to spread each user's CFs over thousands of nodes! You would want something like RF=3, DF=10, i.e. distribute each CF over 10 nodes and, within those nodes, replicate 3 times. One implementation of DF would be to hash the CF name and use the same strategies defined for RF to choose the N nodes for DF=N.
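To make the idea concrete, here is a rough sketch in Python (not Cassandra code). It assumes a simple token ring, MD5 as a stand-in hash, and a clockwise walk that is only loosely modeled on how a simple replication strategy picks successive nodes. The names `token`, `walk_ring`, `df_nodes` and `replicas`, the 100-node ring and the CF name `tenant42_users` are all made up for illustration.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 used here only as a stand-in hash)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")


def walk_ring(ring, start_token, count):
    """Return `count` entries clockwise from the first node at or after start_token.

    `ring` is a list of (token, node_name) tuples sorted by token.
    """
    if not 0 < count <= len(ring):
        raise ValueError("count must be between 1 and the number of nodes")
    tokens = [t for t, _ in ring]
    # Wrap around to the first node if start_token is past the last token.
    start = bisect.bisect_left(tokens, start_token) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(count)]


def df_nodes(ring, cf_name, df):
    """Choose the DF nodes that hold a CF by hashing the CF name."""
    subset = walk_ring(ring, token(cf_name), df)
    return sorted(subset)  # keep the sub-ring ordered by token


def replicas(ring, cf_name, row_key, df, rf):
    """Choose RF replicas for a row, restricted to the CF's DF nodes."""
    sub_ring = df_nodes(ring, cf_name, df)
    return [name for _, name in walk_ring(sub_ring, token(row_key), rf)]


# Hypothetical 100-node cluster with evenly spaced tokens.
ring = sorted((i * (2**128 // 100), f"node{i}") for i in range(100))

cf_subset = df_nodes(ring, "tenant42_users", df=10)
row_replicas = replicas(ring, "tenant42_users", "row-key-1", df=10, rf=3)
```

With RF=3 and DF=10, `cf_subset` holds the 10 nodes that store the CF, and `row_replicas` is always 3 nodes drawn from those 10, so a single tenant's data never touches the rest of the cluster. A real implementation would also have to deal with rack/datacenter awareness and with nodes joining or leaving, which this sketch ignores.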