Now, is this scoring normal? I wonder whether messages with 50 to 99% bayes should really get such low scores.
Only one of those messages has 99% bayes. The others all have 50%. A message with 50% bayes is by definition undecided between spam and ham.
I think the question you should be asking yourself is why your bayes training fails to rate these spam messages as more likely spam than ham (i.e., why they do not get a bayes probability above 50%).
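To make the "50% means undecided" point concrete, here is a rough sketch of how per-token probabilities can be combined into a message probability. It uses the simple naive Bayes combining formula popularized by Paul Graham, which is an assumption for illustration only; your filter almost certainly uses a more refined method. The function name `combine` and the token probabilities are made up.

```python
from math import prod  # Python 3.8+


def combine(token_probs):
    """Combine per-token spam probabilities into one message probability.

    Illustrative naive Bayes combining, not any particular filter's method:
        P = prod(p_i) / (prod(p_i) + prod(1 - p_i))
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    if spam + ham == 0.0:
        # Contradictory evidence (some tokens at 0.0, others at 1.0):
        # treat the message as undecided.
        return 0.5
    return spam / (spam + ham)


# Tokens the database knows nothing useful about sit at 0.5,
# so the message as a whole stays at 0.5: undecided.
untrained = combine([0.5, 0.5, 0.5])

# Tokens that training has associated with spam push the result above 0.5.
trained = combine([0.9, 0.8, 0.6])

print(untrained, trained)
```

If the tokens in your spam have never been seen in trained spam, every one of them contributes roughly 0.5 and the result cannot move away from 50%, which is why the training data is the place to look.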
