[ https://issues.apache.org/jira/browse/PHOENIX-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937539#comment-13937539 ]
Gabriel Reid commented on PHOENIX-129:
--------------------------------------

Thanks for the review [~prkommireddi]. I've replied to some of your comments on RB (and accidentally posted a few of those replies in a new review, which can be ignored).

> Improve MapReduce-based import
> ------------------------------
>
>                 Key: PHOENIX-129
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-129
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>         Attachments: PHOENIX-129-3.0.patch, PHOENIX-129-3.0_2.patch,
> PHOENIX-129-master.patch, PHOENIX-129-master_2.patch
>
>
> In implementing PHOENIX-66, it was noted that the current MapReduce-based
> importer implementation has a number of issues, including the following:
> * CSV handling is largely replicated from the non-MR code, with no ability to
> specify custom separators
> * No automated tests, and the code is written in a way that makes it
> difficult to test
> * Unusual custom config loading and handling instead of using
> GenericOptionsParser, ToolRunner, and friends
> The initial work towards PHOENIX-66 included refactoring the MR importer
> enough to use common code, up until the development of automated testing
> exposed the fact that the MR importer could use some major refactoring.
> This ticket is a proposal to do a relatively major rework of the MR import,
> fixing the above issues. The biggest improvements that will result from this
> are a common codebase for handling CSV input, and the addition of automated
> testing for the MR import.

--
This message was sent by Atlassian JIRA
(v6.2#6252)