Will Berkeley has posted comments on this change. Change subject: KUDU-1533 Spark Kudu Rdd/Dataframe upsert ......................................................................
Patch Set 2: (8 comments) http://gerrit.cloudera.org:8080/#/c/3871/2//COMMIT_MSG Commit Message: Line 13: for Kudu without support for table truncation. > My understanding is that it's more fundamental than that: it requires table Done Line 19: The change is backwards-incompatible. Syntax like > Could you add this or something like it to docs/release_notes.adoc? Maybe a silly question: should I add it as part of this patch or get it added to Todd's release notes patch (https://gerrit.cloudera.org/#/c/3802/)? http://gerrit.cloudera.org:8080/#/c/3871/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala: PS2, Line 172: insert > I liked your idea of defaulting to upsert, but allowing insert via a specia I agree people will want upsert by default, but it makes me a little nervous because the method name is insert (from the trait) and Kudu has otherwise been punctilious about the distinction between insert and upsert. This function is used for INSERT INTO, btw, so INSERT INTO will by default mean UPSERT INTO. http://gerrit.cloudera.org:8080/#/c/3871/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala: Line 132: writeRows(data, tableName, table => table.newInsert()) > this is a fantastic way to reduce the boilerplate. :D PS2, Line 179: writeRowPartition > how about 'writePartitionRows' Done Line 180: schema: StructType, > indent Done http://gerrit.cloudera.org:8080/#/c/3871/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/package.scala File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/package.scala: Line 35 > I thought this would still work because of the RelationProvider impl? Or i It's tied to CreatableRelationProvider. I crossed my fingers and tried it without the impl and got a runtime exception. As I understand it, CreatableRelationProvider lets one do things like CTAS, while InsertableRelation lets one do INSERT INTO. Also, I noticed there's no test for INSERT INTO, so added that. http://gerrit.cloudera.org:8080/#/c/3871/2/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala: Line 75: val tableName = "testcreatetable" > thank you! np. Actually, I'd also like to do a follow up patch just to clean up the code and comments a bit, after this one's done. -- To view, visit http://gerrit.cloudera.org:8080/3871 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib8e0d50fb74dc2ce5e757e8a56fc1e863f699822 Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Will Berkeley <wdberke...@gmail.com> Gerrit-Reviewer: Chris George <chris.geo...@rms.com> Gerrit-Reviewer: Dan Burkert <d...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com> Gerrit-HasComments: Yes