Re: type issue: found RDD[T] expected RDD[A]
Hi Amit,

I think the type of the data contained in your RDD needs to be a known case class, and not abstract, for createSchemaRDD. This makes sense when you consider that it needs to know about the fields in the object to create the schema. I had the same issue when I used an abstract base class for a collection of types I had.

Best regards,
Patrick

On 6 August 2014 07:58, Amit Kumar kumarami...@gmail.com wrote:

  Hi All,

  I am having some trouble trying to write generic code that uses sqlContext and RDDs. Can you suggest what might be wrong?

    class SparkTable[T : ClassTag](val sqlContext: SQLContext, val extractor: (String) => T) {
      private[this] var location: Option[String] = None
      private[this] var name: Option[String] = None
      private[this] val sc = sqlContext.sparkContext
      ...
      def makeRDD(sqlQuery: String): SchemaRDD = {
        require(this.location != None)
        require(this.name != None)
        import sqlContext._
        val rdd: RDD[String] = sc.textFile(this.location.get)
        val rddT: RDD[T] = rdd.map(extractor)
        val schemaRDD: SchemaRDD = createSchemaRDD(rddT)
        schemaRDD.registerAsTable(name.get)
        val all = sqlContext.sql(sqlQuery)
        all
      }
    }

  I use it as below:

    def extractor(line: String): POJO = {
      val splits = line.split(pattern).toList
      POJO(splits(0), splits(1), splits(2), splits(3))
    }

    val pojoTable: SparkTable[POJO] = new SparkTable[POJO](sqlContext, extractor)

    val identityData: SchemaRDD =
      pojoTable.atLocation("hdfs://location/table")
        .withName("pojo")
        .makeRDD("SELECT * FROM pojo")

  I get a compilation failure:

    inferred type arguments [T] do not conform to method createSchemaRDD's type parameter bounds [A <: Product]
    [error] val schemaRDD: SchemaRDD = createSchemaRDD(rddT)
    [error]                            ^
    [error] SparkTable.scala:37: type mismatch;
    [error]  found   : org.apache.spark.rdd.RDD[T]
    [error]  required: org.apache.spark.rdd.RDD[A]
    [error] val schemaRDD: SchemaRDD = createSchemaRDD(rddT)
    [error]                                            ^
    [error] two errors found

  I am probably missing something basic, either in Scala reflection/types or in implicits. Any hints would be appreciated.

  Thanks,
  Amit
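A minimal sketch of the failure Patrick describes, assuming the Spark 1.0-era Spark SQL API and a hypothetical abstract base class Record (the abstract class is illustrative, not from the thread):

  import org.apache.spark.rdd.RDD
  import org.apache.spark.sql.SQLContext

  abstract class Record                                         // abstract: not a Product, no concrete fields
  case class POJO(a: String, b: String, c: String, d: String)   // concrete case class: a Product with known fields

  def demo(sqlContext: SQLContext, abstractRdd: RDD[Record], concreteRdd: RDD[POJO]): Unit = {
    import sqlContext._
    // createSchemaRDD(abstractRdd)  // does not compile: Record does not satisfy [A <: Product]
    createSchemaRDD(concreteRdd)     // compiles: POJO is a case class, hence a Product
  }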
Re: type issue: found RDD[T] expected RDD[A]
Hi,

On Tue, Aug 19, 2014 at 7:01 PM, Patrick McGloin mcgloin.patr...@gmail.com wrote:

  I think the type of the data contained in your RDD needs to be a known case class, and not abstract, for createSchemaRDD. This makes sense when you consider that it needs to know about the fields in the object to create the schema.

Exactly this. The actual message pointing to that is:

  inferred type arguments [T] do not conform to method createSchemaRDD's type parameter bounds [A <: Product]

All case classes are automatically subclasses of Product, but otherwise you will have to extend Product and add the required methods yourself.

Tobias
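For reference, here is roughly what "extend Product and add the required methods yourself" looks like; a hand-rolled sketch of the boilerplate a case class would otherwise generate, using the thread's POJO with four assumed String fields:

  class POJO(val a: String, val b: String, val c: String, val d: String)
      extends Product with Serializable {
    // Product requires productArity and productElement; canEqual comes from Equals.
    def productArity: Int = 4
    def productElement(n: Int): Any = n match {
      case 0 => a
      case 1 => b
      case 2 => c
      case 3 => d
      case _ => throw new IndexOutOfBoundsException(n.toString)
    }
    def canEqual(that: Any): Boolean = that.isInstanceOf[POJO]
  }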
Re: type issue: found RDD[T] expected RDD[A]
That might not be enough. Reflection is used to determine what the fields are, so your class might actually need to have members corresponding to the fields in the table. I heard that a more generic method of inputting stuff is coming.

On Tue, Aug 19, 2014 at 6:43 PM, Tobias Pfeiffer t...@preferred.jp wrote:

  Hi,

  On Tue, Aug 19, 2014 at 7:01 PM, Patrick McGloin mcgloin.patr...@gmail.com wrote:

    I think the type of the data contained in your RDD needs to be a known case class, and not abstract, for createSchemaRDD. This makes sense when you consider that it needs to know about the fields in the object to create the schema.

  Exactly this. The actual message pointing to that is:

    inferred type arguments [T] do not conform to method createSchemaRDD's type parameter bounds [A <: Product]

  All case classes are automatically subclasses of Product, but otherwise you will have to extend Product and add the required methods yourself.

  Tobias
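To illustrate the point about reflection, a sketch of how the constructor fields become the table's columns (assuming the Spark 1.0-era API and the thread's POJO with four String fields):

  import org.apache.spark.rdd.RDD
  import org.apache.spark.sql.{SQLContext, SchemaRDD}

  case class POJO(a: String, b: String, c: String, d: String)

  // The schema is derived by reflecting on POJO's constructor fields,
  // so the registered table has exactly the columns a, b, c, d.
  def register(sqlContext: SQLContext, rows: RDD[POJO]): SchemaRDD = {
    import sqlContext._
    val schemaRDD = createSchemaRDD(rows)
    schemaRDD.registerAsTable("pojo")
    sqlContext.sql("SELECT a, d FROM pojo")  // column names resolve against the reflected fields
  }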
Re: type issue: found RDD[T] expected RDD[A]
Hi Evan, Patrick and Tobias,

So, it worked for what I needed it to do. I followed Yana's suggestion of using the parameterized type [T <: Product : ClassTag : TypeTag]. More concretely, I was trying to make the query process a bit more fluent. Some pseudocode, but with correct types:

  val table: SparkTable[POJO] = new SparkTable[POJO](sqlContext, extractor: String => POJO)

  val data = table.atLocation("hdfs://")
    .withName("tableName")
    .makeRDD("SELECT * FROM tableName")

  class SparkTable[T <: Product : ClassTag : TypeTag](val sqlContext: SQLContext, val extractor: (String) => T) {
    private[this] var location: Option[String] = None
    private[this] var name: Option[String] = None
    private[this] val sc = sqlContext.sparkContext

    def withName(name: String): SparkTable[T] = { .. }
    def atLocation(path: String): SparkTable[T] = { .. }

    def makeRDD(sqlQuery: String): SchemaRDD = {
      ...
      import sqlContext._
      val rdd: RDD[String] = sc.textFile(this.location.get)
      val rddT: RDD[T] = rdd.map(extractor)
      val schemaRDD = createSchemaRDD(rddT)
      schemaRDD.registerAsTable(name.get)
      val all = sqlContext.sql(sqlQuery)
      all
    }
  }

Best,
Amit

On Tue, Aug 19, 2014 at 9:13 PM, Evan Chan velvia.git...@gmail.com wrote:

  That might not be enough. Reflection is used to determine what the fields are, so your class might actually need to have members corresponding to the fields in the table. I heard that a more generic method of inputting stuff is coming.

  [...]
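A compact way to see why each bound in the final signature is needed; a sketch against the Spark 1.0-era API, with a hypothetical helper name:

  import scala.reflect.ClassTag
  import scala.reflect.runtime.universe.TypeTag
  import org.apache.spark.rdd.RDD
  import org.apache.spark.sql.{SQLContext, SchemaRDD}

  //   T <: Product   satisfies createSchemaRDD's [A <: Product] bound
  //   : TypeTag      lets Spark SQL reflect on T's fields to build the schema
  //   : ClassTag     is required by RDD.map to materialize an RDD[T]
  def toSchemaRDD[T <: Product : ClassTag : TypeTag](
      sqlContext: SQLContext, lines: RDD[String], extractor: String => T): SchemaRDD = {
    import sqlContext._
    createSchemaRDD(lines.map(extractor))
  }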
type issue: found RDD[T] expected RDD[A]
Hi All,

I am having some trouble trying to write generic code that uses sqlContext and RDDs. Can you suggest what might be wrong?

  class SparkTable[T : ClassTag](val sqlContext: SQLContext, val extractor: (String) => T) {
    private[this] var location: Option[String] = None
    private[this] var name: Option[String] = None
    private[this] val sc = sqlContext.sparkContext
    ...
    def makeRDD(sqlQuery: String): SchemaRDD = {
      require(this.location != None)
      require(this.name != None)
      import sqlContext._
      val rdd: RDD[String] = sc.textFile(this.location.get)
      val rddT: RDD[T] = rdd.map(extractor)
      val schemaRDD: SchemaRDD = createSchemaRDD(rddT)
      schemaRDD.registerAsTable(name.get)
      val all = sqlContext.sql(sqlQuery)
      all
    }
  }

I use it as below:

  def extractor(line: String): POJO = {
    val splits = line.split(pattern).toList
    POJO(splits(0), splits(1), splits(2), splits(3))
  }

  val pojoTable: SparkTable[POJO] = new SparkTable[POJO](sqlContext, extractor)

  val identityData: SchemaRDD =
    pojoTable.atLocation("hdfs://location/table")
      .withName("pojo")
      .makeRDD("SELECT * FROM pojo")

I get a compilation failure:

  inferred type arguments [T] do not conform to method createSchemaRDD's type parameter bounds [A <: Product]
  [error] val schemaRDD: SchemaRDD = createSchemaRDD(rddT)
  [error]                            ^
  [error] SparkTable.scala:37: type mismatch;
  [error]  found   : org.apache.spark.rdd.RDD[T]
  [error]  required: org.apache.spark.rdd.RDD[A]
  [error] val schemaRDD: SchemaRDD = createSchemaRDD(rddT)
  [error]                                            ^
  [error] two errors found

I am probably missing something basic, either in Scala reflection/types or in implicits. Any hints would be appreciated.

Thanks,
Amit