I know I'm reviving an old thread but did the original poster ever find the cause of this issue and figure out what the fix was?
I am running a cluster of 18 Kafka 0.9 brokers, and three of them are behaving exactly this way once a week. It's pretty scary because they fully resign from the ISR, drop all of their partitions, and stop, then require a full rebuild when restarted. There are no smoking-gun errors in the log, just a clean shutdown triggered for no apparent reason.