Hi users and devs,

I'd like to add HCatalog support to Crunch. CRUNCH-340 [1] is a first attempt at it. It is simply a Source wrapper for HCatInputFormat, which lets you write Crunch pipelines that read from Hive tables.
The downside is that it depends on HCatalog and, transitively, on several Hive modules. There was some discussion in the JIRA ticket about whether to put this in a new submodule (similar to crunch-hbase) instead of adding those dependencies to crunch-core. I'm posting the question here to get more feedback. I personally prefer adding a new submodule, since this piece of work can be completely decoupled from crunch-core. Feel free to tell me if you have any ideas.

[1] https://issues.apache.org/jira/browse/CRUNCH-340
