NoSuchMethodError CatalogTable.copy

2017-08-24 Thread Lionel Luffy
Hi, does anyone know how to fix the error below?

java.lang.NoSuchMethodError:
org.apache.spark.sql.catalyst.catalog.CatalogTable.copy(Lorg/apache/spark/sql/catalyst/TableIdentifier;Lorg/apache/spark/sql/catalyst/catalog/CatalogTableType;Lorg/apache/spark/sql/catalyst/catalog/CatalogStorageFormat;Lorg/apache/spark/sql/types/StructType;Lscala/Option;Lscala/collection/Seq;Lscala/Option;Ljava/lang/String;JJLscala/collection/immutable/Map;Lscala/Option;Lscala/Option;Lscala/Option;Lscala/Option;Lscala/collection/Seq;Z)Lorg/apache/spark/sql/catalyst/catalog/CatalogTable;

It occurs when executing the code below:

catalogTable.copy(storage = newStorage)

The catalyst jar on the class path is spark-catalyst_2.11-2.1.0.cloudera1.jar.

CatalogTable is a case class:

case class CatalogTable(
    identifier: TableIdentifier,
    tableType: CatalogTableType,
    storage: CatalogStorageFormat,
    schema: StructType,
    provider: Option[String] = None,
    partitionColumnNames: Seq[String] = Seq.empty,
    bucketSpec: Option[BucketSpec] = None,
    owner: String = "",
    createTime: Long = System.currentTimeMillis,
    lastAccessTime: Long = -1,
    properties: Map[String, String] = Map.empty,
    stats: Option[Statistics] = None,
    viewOriginalText: Option[String] = None,
    viewText: Option[String] = None,
    comment: Option[String] = None,
    unsupportedFeatures: Seq[String] = Seq.empty,
    tracksPartitionsInCatalog: Boolean = false,
    schemaPreservesCase: Boolean = true)
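
One thing worth checking: the descriptor in the error lists 17 parameters and ends in a single boolean (Z), while the case class shown here has 18 fields and ends in two booleans. Since copy with a named argument compiles to a call that passes every field, that pattern usually points to the calling code having been compiled against a different spark-catalyst build than the jar loaded at runtime. Below is a minimal sketch for inspecting what is actually loaded, using plain Java reflection. Demo and CopySignature are hypothetical names for illustration, not part of Spark.

```scala
// Hypothetical stand-in for a case class whose field list may differ between builds.
case class Demo(a: Int, b: String = "", c: Boolean = false)

object CopySignature {
  // Parameter type names of every public method named exactly `copy` on the class.
  // The synthetic copy$default$N accessors are excluded by the exact name match.
  def copySignatures(cls: Class[_]): Seq[Seq[String]] =
    cls.getMethods.toSeq
      .filter(_.getName == "copy")
      .map(_.getParameterTypes.toSeq.map(_.getName))

  // Location (jar or directory) the class was loaded from, if the JVM reports one.
  def loadedFrom(cls: Class[_]): Option[String] =
    Option(cls.getProtectionDomain.getCodeSource)
      .flatMap(cs => Option(cs.getLocation))
      .map(_.toString)
}
```

In the failing job, passing classOf[org.apache.spark.sql.catalyst.catalog.CatalogTable] instead of classOf[Demo] would show the copy signature and the jar seen at runtime, which can then be compared with the Spark version the code was built against.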


Re: A question about rdd transformation

2017-06-22 Thread Lionel Luffy
I found the root cause: a Wrapper class held in the AnyRef is not
Serializable. But even after I changed it to implement Serializable,
'rows' still does not get any data. Any suggestions?
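
For anyone hitting the same exception, here is a minimal sketch of a local check that an object graph survives Java serialization before it reaches a shuffle. It uses only java.io. Inner, BadWrapper, GoodWrapper and SerializationCheck are hypothetical names for illustration and are not the classes from this thread.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-ins for a wrapper class stored as AnyRef.
class Inner(val value: Int)                              // does not implement Serializable
class BadWrapper(val inner: Inner) extends Serializable  // Serializable, but holds a non-serializable field
class GoodWrapper(val value: Int) extends Serializable   // fully serializable

object SerializationCheck {
  // Returns None if Java serialization succeeds; otherwise the exception message,
  // which by default contains the name of the class that could not be serialized.
  def firstNonSerializable(obj: AnyRef): Option[String] = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(obj)
      None
    } catch {
      case e: NotSerializableException => Some(e.getMessage)
    } finally {
      out.close()
    }
  }
}
```

Running the check on a sample key and a sample Array[AnyRef] row shows whether a nested field, not only the wrapper itself, is what fails to serialize.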

On Fri, Jun 23, 2017 at 10:56 AM, Lionel Luffy wrote:

> Hi there,
> I'm trying to do the steps below, but the shuffle task always throws
> java.io.NotSerializableException.
> I've checked that Array is serializable. How can I get the data of rdd in
> newRDD?
>
> step 1: val rdd: RDD[(AnyRef, Array[AnyRef])] {..}
>
> step 2: rdd
>   .partitionBy(partitioner)
>   .map(_._2)
>
> step 3: pass rdd to newRDD as prev:
> newRDD[K, V] (
>     xxx,
>     xxx,
>     xxx,
>     prev: RDD[Array[AnyRef]]) extends RDD[(K, V)] (prev) {
>
>   override protected def getPartitions() {...}
>
>   override def compute(split: Partition, context: TaskContext): Iterator[(K, V)] {...
>     val rows = firstParent[Array[AnyRef]].iterator(split, context)
>
>   }
>
> }
>
>
> Thanks,
> LL
>


A question about rdd transformation

2017-06-22 Thread Lionel Luffy
Hi there,
I'm trying to do the steps below, but the shuffle task always throws
java.io.NotSerializableException.
I've checked that Array is serializable. How can I get the data of rdd in
newRDD?

step 1: val rdd: RDD[(AnyRef, Array[AnyRef])] {..}

step 2: rdd
  .partitionBy(partitioner)
  .map(_._2)

step 3: pass rdd to newRDD as prev:
newRDD[K, V] (
    xxx,
    xxx,
    xxx,
    prev: RDD[Array[AnyRef]]) extends RDD[(K, V)] (prev) {

  override protected def getPartitions() {...}

  override def compute(split: Partition, context: TaskContext): Iterator[(K, V)] {...
    val rows = firstParent[Array[AnyRef]].iterator(split, context)

  }

}


Thanks,
LL