Jean-Daniel Cryans has posted comments on this change.

Change subject: kudu client tools for hadoop and spark import/export(csv,parquet,avro)
......................................................................


Patch Set 3:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/ExportCsvMapper.java
File java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/ExportCsvMapper.java:

Line 69:    * converts RowResult to string.
I'd still advocate removing this javadoc section: it adds nothing, and it's a 
private method.


http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/ImportParquet.java
File java/kudu-client-tools/src/main/java/org/apache/kudu/mapreduce/tools/ImportParquet.java:

Line 97:     //pre-flight checks of input parquet schema and table schema
nit: missing space after //.
Also end your sentence with a period.


Line 99:       if (!schema.containsField(sche.getName())) {
Why do you not also check the type?


Line 101:         System.exit(0);
Calling System.exit in the code isn't good. Ideally this case would be tested, 
and if you exit, how can the caller catch the error?
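
For what it's worth, here is a rough sketch of the shape I have in mind for 
this check and the type check above. It is not the patch's code: PreflightCheck 
and validate are hypothetical names, and plain maps of column name to type name 
stand in for the real Parquet and Kudu schema objects. The point is only that a 
failed check throws, so the caller (or a test) can catch it.

```java
import java.util.Map;

// Hypothetical sketch, not the patch's code: plain maps of column name to
// type name stand in for the real Parquet and Kudu schema objects.
final class PreflightCheck {

  private PreflightCheck() {
  }

  /**
   * Verifies that every table column exists in the input file with the same type.
   * Throws instead of calling System.exit so that callers and tests can catch
   * the failure.
   */
  static void validate(Map<String, String> tableColumns, Map<String, String> fileColumns) {
    for (Map.Entry<String, String> column : tableColumns.entrySet()) {
      String name = column.getKey();
      String fileType = fileColumns.get(name);
      if (fileType == null) {
        throw new IllegalArgumentException("Input file is missing column: " + name);
      }
      if (!fileType.equals(column.getValue())) {
        throw new IllegalArgumentException("Type mismatch for column " + name
            + ": table has " + column.getValue() + ", file has " + fileType);
      }
    }
  }
}
```

A test can then assert that validate throws IllegalArgumentException for a file 
with a missing or mistyped column, which isn't possible once the JVM has exited.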


Line 104:     //Kudu does not recommend using TIMESTAMP
Well, Kudu doesn't support Parquet's TIMESTAMP; it's not about a recommendation. 
Also same nit as the comment above, and same comment regarding exit.


http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client-tools/src/test/java/org/apache/kudu/mapreduce/tools/ITExportCsv.java
File java/kudu-client-tools/src/test/java/org/apache/kudu/mapreduce/tools/ITExportCsv.java:

Line 17: package org.apache.kudu.mapreduce.tools;
nit: missing a blank line.


Line 68:     // Create a 2 lines input file
nit: end comments with a period.
Also I'm not following this comment. The next line creates a table with 4 
tablets, 3 of which have 3 rows. Where's the 2 lines input file coming from?


http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client-tools/src/test/java/org/apache/kudu/mapreduce/tools/ITImportParquet.java
File java/kudu-client-tools/src/test/java/org/apache/kudu/mapreduce/tools/ITImportParquet.java:

Line 17: package org.apache.kudu.mapreduce.tools;
nit: missing blank line.


Line 50: public class ITImportParquet extends BaseKuduTest {
I'd suggest having a separate test that specifically verifies the pre-flight 
checks that are running.


Line 107:     String[] args = new String[] { "-D" + CommandLineParser.MASTER_ADDRESSES_KEY + "=" + getMasterAddresses(),
nit: long line


Line 111:     Job job = ImportParquet.createSubmittableJob(parser.getConfiguration(), parser.getRemainingArgs());
nit: long line


Line 115:       client.newScannerBuilder(openTable(TABLE_NAME)).build()));
openTable isn't a cheap call; do it only once.


Line 116:     assertEquals(4,getTableRows(openTable(TABLE_NAME)).get(0).getInt("key"));
Use scanTableToStrings and verify all the rows instead. Better for type 
conversion checking.


Line 130:     ParquetWriter<Group> writer = new ParquetWriter<Group>(data, new GroupWriteSupport(), UNCOMPRESSED, 1024, 1024, 512,
nit: long line


Line 133:       writer.write(f.newGroup().append("key", 1).append("column1_i", 3).append("column2_d", 2.3).append("column3_s",
Those lines are all too long. Also, you could probably refactor this?


http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
File java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java:

PS3, Line 204: rowStrings
What strings?


http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/ImportExportFiles.scala
File java/kudu-spark-tools/src/main/scala/org/apache/kudu/spark/tools/ImportExportFiles.scala:

Line 119:             LOG.info(args.header+":"+args.delimiter+":"+args.path)
Forgot to remove?


http://gerrit.cloudera.org:8080/#/c/7421/3/java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/TestImportExportFiles.scala
File java/kudu-spark-tools/src/test/scala/org/apache/kudu/spark/tools/TestImportExportFiles.scala:

Line 17: package org.apache.kudu.spark.tools
nit: add a blank line.


Line 66:     //val table = kuduClient.openTable(TABLE_NAME)
?


-- 
To view, visit http://gerrit.cloudera.org:8080/7421
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If462af948651f3869b444e82151c3559fde19142
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Sandish Kumar HN <sanysand...@gmail.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Sandish Kumar HN <sanysand...@gmail.com>
Gerrit-HasComments: Yes
