The error that I'm getting is:

[ERROR] /home/andy/sandbox/mahout/spark/src/main/scala/org/apache/mahout/sparkbindings/drm/CheckpointedDrmSpark.scala:169: error: value saveAsSequenceFile is not a member of org.apache.mahout.sparkbindings.DrmRdd[K]
[INFO]  rdd.saveAsSequenceFile(path)
We don't call SequenceFileRDDFunctions directly in
CheckpointedDrmSpark.dfsWrite(...), and all of the implicit evidence
should be available behind the scenes in the RDD object.
It seems that SequenceFileRDDFunctions is somehow out of scope for our
rdd (a DrmRdd), though I'm not sure how helpful that observation is.
I wonder if it has anything to do with this:
object RDD {

  // The following implicit functions were in SparkContext before 1.3 and users had to
  // `import SparkContext._` to enable them. Now we move them here to make the compiler find
  // them automatically. However, we still keep the old functions in SparkContext for backward
  // compatibility and forward to the following functions directly.

  ...

  implicit def rddToSequenceFileRDDFunctions[K, V](rdd: RDD[(K, V)])
      (implicit kt: ClassTag[K], vt: ClassTag[V],
                keyWritableFactory: WritableFactory[K],
                valueWritableFactory: WritableFactory[V])
    : SequenceFileRDDFunctions[K, V] = {
    implicit val keyConverter = keyWritableFactory.convert
    implicit val valueConverter = valueWritableFactory.convert
    new SequenceFileRDDFunctions(rdd,
      keyWritableFactory.writableClass(kt),
      valueWritableFactory.writableClass(vt))
  }

from spark.RDD?
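I think that's exactly it: the new implicit demands WritableFactory evidence for both type parameters, but inside generic code such as dfsWrite[K: ClassTag] only the ClassTag is in scope, so the conversion can never apply. A minimal sketch of the resolution failure (self-contained stand-ins for the Spark 1.3 types, not the real ones):

```scala
import scala.reflect.ClassTag

object ImplicitScopeSketch {

  // Stand-ins mirroring Spark 1.3's names; bodies are stubs for illustration.
  trait Writable
  trait WritableFactory[T]

  class SeqFileOps[K, V](rdd: Seq[(K, V)]) {
    def saveAsSequenceFile(path: String): Unit = ()
  }

  // The 1.3-style implicit: requires WritableFactory evidence for K and V.
  implicit def toSeqFileOps[K, V](rdd: Seq[(K, V)])
      (implicit kt: ClassTag[K], vt: ClassTag[V],
                kf: WritableFactory[K], vf: WritableFactory[V]): SeqFileOps[K, V] =
    new SeqFileOps(rdd)

  // Generic code in the style of CheckpointedDrmSpark.dfsWrite: only ClassTag[K]
  // is available, so the compiler cannot apply toSeqFileOps and reports that
  // saveAsSequenceFile "is not a member" of the rdd's type.
  def dfsWrite[K: ClassTag](rdd: Seq[(K, Int)], path: String): Unit = {
    // rdd.saveAsSequenceFile(path)  // <- does not compile: no WritableFactory[K] in scope
  }
}
```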
Also, it looks like a bit of refactoring of the REPL in Spark 1.3 breaks our
shell. I haven't had a chance to look at it too closely; it might be an
easy fix. At least some of the methods in SparkILoop have been made private,
and I didn't get past that.
On 03/22/2015 01:17 PM, Pat Ferrel wrote:
Due to a bug in Spark we have a nasty workaround for Spark 1.2.1, so I'm trying
1.3.0. However, they have redesigned rdd.saveAsSequenceFile in
SequenceFileRDDFunctions. The class now expects the K and V Writable classes to
be supplied to the constructor:
class SequenceFileRDDFunctions[K <% Writable: ClassTag, V <% Writable: ClassTag](
    self: RDD[(K, V)],
    _keyWritableClass: Class[_ <: Writable],    // <======== new
    _valueWritableClass: Class[_ <: Writable])  // <======== new
  extends Logging with Serializable {
as explained in the commit log:

    [SPARK-4795][Core] Redesign the "primitive type => Writable" implicit APIs
    to make them be activated automatically

    Try to redesign the "primitive type => Writable" implicit APIs to make them
    be activated automatically and without breaking binary compatibility.
    However, this PR will break source compatibility if people use
    `xxxToXxxWritable` occasionally. See the unit test in `graphx`.

    Author: zsxwing

    Closes #3642 from zsxwing/SPARK-4795 and squashes the following commits:
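If I'm reading this right, the path of least resistance may be to map keys and values to concrete Writable types before writing, since 1.3 still supplies the implicit factories for Writable subtypes. A rough sketch, untested against 1.3, with IntWritable/Text standing in for whatever our actual key and vector writables end up being:

```scala
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.spark.rdd.RDD

object SeqFileWorkaroundSketch {
  def writeAsSequenceFile(rdd: RDD[(Int, String)], path: String): Unit = {
    // Converting to concrete Writable pairs first means the implicit
    // WritableFactory evidence resolves without any generic K in the way.
    rdd.map { case (k, v) => (new IntWritable(k), new Text(v)) }
       .saveAsSequenceFile(path)
  }
}
```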
Since Andy, Gokhan, and Dmitriy have been messing with the key type recently, I
didn't want to plow ahead with this before consulting you. It appears that the
Writable classes need to be available to the constructor when the RDD is
written, which breaks every use of rdd.saveAsSequenceFile in Mahout.
Where is the best place to fix this?
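One candidate answer is CheckpointedDrmSpark.dfsWrite itself: bypass the implicit conversion entirely and construct the wrapper with the writable classes we already know at that point. Again just a sketch under the same assumptions (concrete writable types shown for illustration, not our real key types):

```scala
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.spark.rdd.{RDD, SequenceFileRDDFunctions}

object DfsWriteSketch {
  def dfsWrite(rdd: RDD[(IntWritable, Text)], path: String): Unit = {
    // Pass the key/value Writable classes explicitly, as the 1.3 constructor
    // now requires, instead of relying on rddToSequenceFileRDDFunctions.
    new SequenceFileRDDFunctions(rdd, classOf[IntWritable], classOf[Text])
      .saveAsSequenceFile(path)
  }
}
```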