Fwd: [pyspark][MLlib] Getting WARN FPGrowth: Input data is not cached for cached data

Anu B Nair Thu, 21 Dec 2017 23:45:08 -0800

Hi,

Below is my PySpark code (the input file sample_fpgrowth.txt and the
Python script are attached to this mail). Even though I call cache() on
the input RDD, I still get the warning "Input data is not cached."

from pyspark.mllib.fpm import FPGrowth
import pyspark
from pyspark.context import SparkContext
from pyspark.sql.session import SparkSession

sc = SparkContext('local')
data = sc.textFile("sample_fpgrowth.txt")
transactions = data.map(lambda line: line.strip().split(' ')).cache()
model = FPGrowth.train(transactions, minSupport=0.2, numPartitions=10)
result = model.freqItemsets().collect()
print(result)


I understand that it is only a warning, but I would like to know in
detail why it appears even though the data is cached.

--

Anu

r z h k p
z y x w v u t s
s x o n r
x z y m t s q e
z
x z y r q t p

from pyspark.mllib.fpm import FPGrowth
from pyspark.context import SparkContext

sc = SparkContext('local')

# Each line of the input file is one transaction of space-separated items.
data = sc.textFile("sample_fpgrowth.txt")
transactions = data.map(lambda line: line.strip().split(' ')).cache()

# Mine itemsets that occur in at least 20% of the transactions.
model = FPGrowth.train(transactions, minSupport=0.2, numPartitions=10)

result = model.freqItemsets().collect()

print(result)
