Dear All,

I'm trying to implement a procedure that iteratively updates an RDD using the results of GaussianMixtureModel.predictSoft. To avoid problems with the local variable (the obtained GMM) being overwritten in each pass of the loop, I'm doing the following:

#######################################################
for i in xrange(10):
    gmm = GaussianMixture.train(rdd, 2)

    def getSafePredictor(unsafeGMM):
        return lambda x: \
            (unsafeGMM.predictSoft(x.features),
             [g.mu for g in unsafeGMM.gaussians])

    safePredictor = getSafePredictor(gmm)
    predictionsRDD = (labelledpointrddselectedfeatsNansPatched
          .map(safePredictor)
    )
    print predictionsRDD.take(1)
    (... - rest of code - update rdd with results from predictionsRdd)
#######################################################

Unfortunately this ends with:

#######################################################
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
#######################################################

Any idea why I'm getting this behaviour? My expectation would be that the GMM should be a "simple" object without a SparkContext in it. I'm using Spark 1.5.2.

 Thanks,
   Tomasz


PS As a workaround, I'm currently doing

########################
    def getSafeGMM(unsafeGMM):
        return lambda x: unsafeGMM.predictSoft(x)

    safeGMM = getSafeGMM(gmm)
    predictionsRDD = \
        safeGMM(labelledpointrddselectedfeatsNansPatched.map(lambda x: x.features))
########################
which works fine. If possible, I would like to avoid this approach, since it would require another closure over gmm.gaussians later in my code.
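
For completeness, one alternative I've been considering is to copy the model's parameters (weights and each component's mu/sigma) out of the GMM on the driver, and then compute the soft assignments with plain NumPy inside the map closure, so that nothing wrapping a SparkContext is ever captured. A rough sketch (the helper names gaussian_pdf and predict_soft are mine, not from MLlib):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    # Multivariate normal density at x, computed with plain NumPy
    # (no Spark objects involved).
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(sigma)
    norm = 1.0 / np.sqrt(((2 * np.pi) ** d) * np.linalg.det(sigma))
    return norm * np.exp(-0.5 * diff.dot(inv).dot(diff))

def predict_soft(x, weights, mus, sigmas):
    # Posterior component memberships for a single point, analogous to
    # what predictSoft returns: weighted densities, normalized to sum to 1.
    p = np.array([w * gaussian_pdf(x, m, s)
                  for w, m, s in zip(weights, mus, sigmas)])
    return p / p.sum()
```

On the driver one would extract weights = list(gmm.weights), mus = [g.mu for g in gmm.gaussians], sigmas = [g.sigma for g in gmm.gaussians], and then map a closure over predict_soft; the closure only captures plain NumPy arrays, so serialization to the workers should be unproblematic.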

