Hi All, when my streaming query is running, I get the following error roughly once every 10 minutes. Many of the solutions online seem to suggest simply clearing the data directories under the datanode and namenode and restarting the HDFS cluster, but I didn't see anything that explains the cause. If it happens this frequently, what do I need to do? I use Spark standalone 2.1.1 (I don't use any resource manager like YARN or Mesos at this time).
org.apache.spark.util.TaskCompletionListenerException: File /usr/local/hadoop/metrics/state/0/5/temp-6025335567362823423 could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and no node(s) are excluded in this operation.

Thanks!