Will Berkeley has posted comments on this change.

Change subject: KUDU-1533 Spark Kudu Rdd/Dataframe upsert
......................................................................


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/3871/2//COMMIT_MSG
Commit Message:

Line 13: for Kudu without support for table truncation.
> My understanding is that it's more fundamental than that: it requires table
Done


Line 19: The change is backwards-incompatible. Syntax like
> Could you add this or something like it to docs/release_notes.adoc?
Maybe a silly question: should I add it as part of this patch or get it added 
to Todd's release notes patch (https://gerrit.cloudera.org/#/c/3802/)?


http://gerrit.cloudera.org:8080/#/c/3871/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File 
java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

PS2, Line 172: insert
> I liked your idea of defaulting to upsert, but allowing insert via a specia
I agree people will want upsert by default, but it makes me a little nervous 
because the method name is insert (from the trait) and Kudu has otherwise been 
punctilious about the distinction between insert and upsert. This function is 
used for INSERT INTO, btw, so INSERT INTO will by default mean UPSERT INTO.


http://gerrit.cloudera.org:8080/#/c/3871/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File 
java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

Line 132:     writeRows(data, tableName, table => table.newInsert())
> this is a fantastic way to reduce the boilerplate.
:D


PS2, Line 179: writeRowPartition
> how about 'writePartitionRows'
Done


Line 180:                 schema: StructType,
> indent
Done


http://gerrit.cloudera.org:8080/#/c/3871/2/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/package.scala
File java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/package.scala:

Line 35
> I thought this would still work because of the RelationProvider impl?  Or i
It's tied to CreatableRelationProvider. I crossed my fingers and tried it 
without the impl and got a runtime exception. As I understand it, 
CreatableRelationProvider lets one do things like CTAS, while 
InsertableRelation lets one do INSERT INTO. Also, I noticed there's no test for 
INSERT INTO, so added that.


http://gerrit.cloudera.org:8080/#/c/3871/2/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala
File 
java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/DefaultSourceTest.scala:

Line 75:     val tableName = "testcreatetable"
> thank you!
np. Actually, I'd also like to do a follow up patch just to clean up the code 
and comments a bit, after this one's done.


-- 
To view, visit http://gerrit.cloudera.org:8080/3871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8e0d50fb74dc2ce5e757e8a56fc1e863f699822
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Chris George <chris.geo...@rms.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Will Berkeley <wdberke...@gmail.com>
Gerrit-HasComments: Yes

Reply via email to