----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/19257/#review37348 -----------------------------------------------------------
bin/readme.txt <https://reviews.apache.org/r/19257/#comment68876> Yes, it's (currently) always temporary. The HFiles that are written as part of the import process (but thrown away after they've been added to HBase) are written to a random directory under /tmp (in HDFS) by default, but depending on the setup of HDFS and the existence of directory permissions on it, it may be necessary to specify a different temp path to use (e.g. something under the user directory). Ideally this parameter wouldn't need to exist at all, but I think it is needed. James also mentioned a potential use case of writing HFiles for a table that doesn't exist -- I can see this being useful if you've got a separate MR cluster from your HBase cluster, and want to build up the HFiles quickly without putting a load on your HBase cluster. That's not a use case that's currently supported, but could become useful in the future. phoenix-core/src/it/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolIT.java <https://reviews.apache.org/r/19257/#comment68869> This is a file being created on HDFS (not local FS), so couldn't platform-agnostic separators cause problems here? Closing it in a finally sounds like a good plan -- that being said, again this is on a transient HDFS cluster within the test, so there is no garbage that is going to be left behind after the test stops no matter what. phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java <https://reviews.apache.org/r/19257/#comment68870> No, it's used by the unit tests, so it needs to be package private. It should be annotated with @VisibleForTesting though. phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java <https://reviews.apache.org/r/19257/#comment68871> This is a path on HDFS, so java.io.tmpdir will point to something else. The explanation on the approach for the creating of this tempdir is in a comment above. phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java <https://reviews.apache.org/r/19257/#comment68872> Again, used by unit tests, but should be annotated with @VisibleForTesting. phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java <https://reviews.apache.org/r/19257/#comment68873> Yep, this one really should be private :-) phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java <https://reviews.apache.org/r/19257/#comment68874> Yes, nice catch on that. phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java <https://reviews.apache.org/r/19257/#comment68875> Yes, good point -- I think I had a specific reason for doing it this way, but I don't remember what the exact reason was and unfortunately I didn't document it here. Should be double-checked. - Gabriel Reid On March 15, 2014, 9:16 p.m., Gabriel Reid wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/19257/ > ----------------------------------------------------------- > > (Updated March 15, 2014, 9:16 p.m.) > > > Review request for phoenix. > > > Repository: phoenix > > > Description > ------- > > Rewrite of the Phoenix MapReduce import, to follow more standard MR tool > development patterns and use standard Phoenix CSV handling functionality. > > > Diffs > ----- > > bin/csv-bulk-loader.py 385ef41 > bin/readme.txt fa23eeb > phoenix-assembly/pom.xml 9a69cab > phoenix-assembly/src/build/all.xml 9c1bc41 > phoenix-assembly/src/build/mapreduce.xml PRE-CREATION > > phoenix-core/src/it/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolIT.java > PRE-CREATION > phoenix-core/src/main/java/org/apache/phoenix/map/reduce/CSVBulkLoader.java > 0fd74e1 > phoenix-core/src/main/java/org/apache/phoenix/map/reduce/MapReduceJob.java > 9dc8032 > > phoenix-core/src/main/java/org/apache/phoenix/map/reduce/util/ConfigReader.java > 1c94d6f > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkImportUtil.java > PRE-CREATION > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvBulkLoadTool.java > PRE-CREATION > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapper.java > PRE-CREATION > > phoenix-core/src/main/java/org/apache/phoenix/mapreduce/ImportPreUpsertKeyValueProcessor.java > PRE-CREATION > phoenix-core/src/main/java/org/apache/phoenix/schema/PDataType.java 34166d3 > phoenix-core/src/main/java/org/apache/phoenix/util/CSVCommonsLoader.java > 3e86477 > phoenix-core/src/main/java/org/apache/phoenix/util/ColumnInfo.java 37238c8 > > phoenix-core/src/test/java/org/apache/phoenix/mapreduce/CsvBulkImportUtilTest.java > PRE-CREATION > > phoenix-core/src/test/java/org/apache/phoenix/mapreduce/CsvBulkLoadToolTest.java > PRE-CREATION > > phoenix-core/src/test/java/org/apache/phoenix/mapreduce/CsvToKeyValueMapperTest.java > PRE-CREATION > phoenix-core/src/test/java/org/apache/phoenix/util/ColumnInfoTest.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/19257/diff/ > > > Testing > ------- > > > Thanks, > > Gabriel Reid > >
