Hi,

Below is my PySpark code (the input file sample_fpgrowth.txt and the Python script are attached to this mail). Even though I call cache() on the input RDD, I still get this warning: Input data is not cached.

    from pyspark.mllib.fpm import FPGrowth
    import pyspark
    from pyspark.context import SparkContext
    from pyspark.sql.session import SparkSession

    sc = SparkContext('local')
    data = sc.textFile("sample_fpgrowth.txt")
    transactions = data.map(lambda line: line.strip().split(' ')).cache()
    model = FPGrowth.train(transactions, minSupport=0.2, numPartitions=10)
    result = model.freqItemsets().collect()
    print(result)

I understand that it is only a warning, but I would like to know in detail why it appears.

-- Anu
r z h k p
z y x w v u t s
s x o n r
x z y m t s q e
z
x z y r q t p