Hi,

I don't know why I receive the following message when I try to use Spark KMeans:

    WARN KMeans: The input data is not directly cached, which may hurt performance if its parent RDDs are also uncached.

Here is my code:

    df_Part = assembler.transform(df_Part)
    df_Part.cache()
    while (k <= max_cluster) and (wssse > seuilStop):
        kmeans = KMeans().setK(k)
        model = kmeans.fit(df_Part)
        wssse = model.computeCost(df_Part)
        k = k + 1

The warning says that my input (a DataFrame) is not cached. However, I printed df_Part.is_cached and got True, which means that my DataFrame is cached. So why is Spark still warning me about this?

Thank you in advance.