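Make sure the pyspark interpreter uses the Python installation that actually has tensorframes installed; pip install may have put the package into a different Python than the one the interpreter launches. Set PYSPARK_PYTHON to the path of the correct Python executable.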
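To confirm which Python the interpreter really runs, you could paste something like the following into a %pyspark paragraph. This is only a minimal sketch: `check_module` is a helper name I made up for illustration, not part of tensorframes or Spark, and it relies on nothing beyond the standard library.

```python
import importlib
import sys


def check_module(module_name="tensorframes"):
    """Report which Python is running and whether module_name can be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not visible to this interpreter.
        return {"executable": sys.executable, "found": False, "location": None}
    # Built-in modules have no __file__, so fall back to None.
    return {
        "executable": sys.executable,
        "found": True,
        "location": getattr(module, "__file__", None),
    }


# Prints the interpreter path and whether tensorframes is importable from it.
print(check_module())
```

If the reported executable is not the Python where you ran pip install, that mismatch would explain the ImportError, and PYSPARK_PYTHON should point to the other one.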
Spico Florin <spicoflo...@gmail.com> wrote on Wed, Aug 8, 2018 at 9:59 PM:

> Hi!
>
> I would like to use tensorframes in my pyspark notebook.
>
> I have performed the following:
>
> 1. In the spark interpreter added a new repository
> http://dl.bintray.com/spark-packages/maven
> 2. in the spark interpreter added the
> dependency databricks:tensorframes:0.2.9-s_2.11
> 3. pip install tensorframes
>
>
> In both 0.7.3 and 0.8.0:
> 1. the following code resulted in error: "ImportError: No module named
> tensorframes"
>
> %pyspark
> import tensorframes as tfs
>
> 2. the following code succeeded
> %spark
> import org.tensorframes.{dsl => tf}
> import org.tensorframes.dsl.Implicits._
> val df = spark.createDataFrame(Seq(1.0->1.1, 2.0->2.2)).toDF("a", "b")
>
> // As in Python, scoping is recommended to prevent name collisions.
> val df2 = tf.withGraph {
>     val a = df.block("a")
>     // Unlike python, the scala syntax is more flexible:
>     val out = a + 3.0 named "out"
>     // The 'mapBlocks' method is added using implicits to dataframes.
>     df.mapBlocks(out).select("a", "out")
> }
>
> // The transform is all lazy at this point, let's execute it with collect:
> df2.collect()
>
> I ran the code above directly with spark interpreter with the default
> configurations (master set up to local[*] - so not via spark-submit
> command).
>
> Also, I have installed spark home locally and ran the command
>
> $SPARK_HOME/bin/pyspark --packages databricks:tensorframes:0.2.9-s_2.11
>
> and the code below worked as expected
>
> import tensorframes as tfs
>
> Can you please help to solve this?
>
> Thanks,
>
> Florin