Dan Burkert has posted comments on this change.

Change subject: KUDU-1533 Spark Kudu Rdd/Dataframe upsert
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/3871/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala
File 
java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/DefaultSource.scala:

Line 61:     new KuduRelation(tableName, kuduMaster)(sqlContext)
Should the upsert flag get passed here as well?


Line 78:   override def createRelation(sqlContext: SQLContext, mode: SaveMode,
How would you feel about removing the method and not implementing 
CreatableRelationProvider?  As far as I can tell we are not, and can not, 
implement it correctly, since the main point of it is to create a new table 
when it doesn't exist.


Line 94:       case Overwrite => kuduRelation.insert(data, overwrite = true)
Throw Unsupported here as well for the reasons below.


Line 218:     context.writeRows(data, tableName, overwrite, upsert)
Could you actually change this to throw a NotSupported exception when overwrite 
= true?  I think the Spark semantics are supposed to be that overwrite will 
truncate the dataset then do inserts, which we can't yet support.  Changing it 
to throw will allow us to backwards compatibly support it later when we have 
lightweight table truncation.


http://gerrit.cloudera.org:8080/#/c/3871/1/java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala
File 
java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduContext.scala:

Line 132:   def writeRows(data: DataFrame, tableName: String, overwrite: 
Boolean, upsert: Boolean) {
Instead of writeRows taking two booleans, could you split this into insertRows, 
updateRows, and upsertRows?  I think that will be a lot more clear.  The 
boolean was already really confusing.


Line 151:   def writeRows(rows: Iterator[Row],
Same here.  There shouldn't be too much code duplication if you pull out the 
row building into a helper function.

Should this be private?


-- 
To view, visit http://gerrit.cloudera.org:8080/3871
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib8e0d50fb74dc2ce5e757e8a56fc1e863f699822
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wdberke...@gmail.com>
Gerrit-Reviewer: Chris George <chris.geo...@rms.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes

Reply via email to