Gabriel Reid created PHOENIX-129:
------------------------------------
Summary: Improve MapReduce-based import
Key: PHOENIX-129
URL: https://issues.apache.org/jira/browse/PHOENIX-129
Project: Phoenix
Issue Type: Improvement
Reporter: Gabriel Reid
In implementing PHOENIX-66, it was noted that the current MapReduce-based
importer implementation has a number of issues, including the following:
* CSV handling is largely replicated from the non-MR code, with no ability to
specify custom separators
* No automated tests, and code is written in a way that makes it difficult to
test
* Unusual custom config loading and handling instead of using
GenericOptionsParser, ToolRunner, and friends
The initial work towards PHOENIX-66 included refactoring the MR importer enough
to use common code, up until the development of automated testing exposed the
fact that the MR importer could use some major refactoring.
This ticket is a proposal to do a relatively major rework of the MR import,
fixing the above issues. The biggest improvements that will result from this
are a common codebase for handling CSV input, and the addition of automated
testing for the MR import.
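As a rough illustration of what a shared, testable CSV component with a
configurable separator might look like, here is a minimal sketch. The class
name CsvLineParser and its constructor parameters are hypothetical and are not
taken from the Phoenix codebase; it only shows the general idea of a parser
that both the MR and non-MR import paths could call.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical shared CSV line splitter (not an actual Phoenix class).
 * The separator and quote characters are supplied by the caller, so the
 * same code could serve both the MR and non-MR import paths.
 */
public class CsvLineParser {
    private final char separator;
    private final char quote;

    public CsvLineParser(char separator, char quote) {
        this.separator = separator;
        this.quote = quote;
    }

    /** Splits one line into fields, honoring quoted fields and doubled quotes. */
    public List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    // A doubled quote inside a quoted field is a literal quote.
                    current.append(quote);
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == separator && !inQuotes) {
                // An unquoted separator ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

Because the parser has no dependency on Hadoop or HBase, it can be covered by
plain unit tests, for example by checking that
new CsvLineParser('|', '"').parse("1|\"a|b\"|x") yields the three fields
1, a|b and x.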