Will Berkeley has posted comments on this change. Change subject: KUDU-1533 Spark Kudu Rdd/Dataframe upsert ......................................................................
Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/3871/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala: Line 61: new KuduRelation(tableName, kuduMaster)(sqlContext) > Should the upsert flag get passed here as well? N/A. See below. Line 78: override def createRelation(sqlContext: SQLContext, mode: SaveMode, > How would you feel about removing the method and not implementing Creatable You're right we shouldn't implement CreatableRelationProvider if we can't do it right. Alternatively, couldn't createRelation take a dataframe, construct a Kudu schema from its schema, and then allow it to be persisted to Kudu? Working this way, we could support everything except Overwrite right now...buuuuut it's going to be really awkward to specify the key and the partition schema. Awkward enough I think we should remove creatableRelationProvider until we can implement it correctly as it is now (based on an existing table). All the operations will still be available on a dataframe via KuduContext. Line 94: case Overwrite => kuduRelation.insert(data, overwrite = true) > Throw Unsupported here as well for the reasons below. N/A. See above. Line 218: context.writeRows(data, tableName, overwrite, upsert) > Could you actually change this to throw a NotSupported exception when overw Done http://gerrit.cloudera.org:8080/#/c/3871/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala: Line 132: def writeRows(data: DataFrame, tableName: String, overwrite: Boolean, upsert: Boolean) { > Instead of writeRows taking two booleans, could you split this into insertR Done Line 151: def writeRows(rows: Iterator[Row], > Same here. There shouldn't be too much code duplication if you pull out th Done -- To view, visit http://gerrit.cloudera.org:8080/3871 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib8e0d50fb74dc2ce5e757e8a56fc1e863f699822 Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley <wdberke...@gmail.com> Gerrit-Reviewer: Chris George <chris.geo...@rms.com> Gerrit-Reviewer: Dan Burkert <d...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com> Gerrit-HasComments: Yes