Jan Van Besien created CRUNCH-536:
-------------------------------------
Summary: crunch jobs fail to use hbase api of secured hbase
Key: CRUNCH-536
URL: https://issues.apache.org/jira/browse/CRUNCH-536
Project: Crunch
Issue Type: Bug
Reporter: Jan Van Besien
When accessing a secured HBase from within a MapReduce job, the HBase
credentials must be initialized on the job before it is submitted. This can be
done with TableMapReduceUtil.initCredentials(job).
When the job results from using HBaseSourceTarget, crunch-hbase can take care
of this; see CRUNCH-535.
However, it is also possible to write DoFns that use the HBase API directly,
without using the HBase input/output formats. As an example use case, consider
a job that bulk-writes data to HBase by writing HFiles to HDFS, which are later
loaded into HBase. Such a job doesn't read from or write to HBase through an
input/output format, but it might still require access to other tables in
HBase, for example auxiliary tables with metadata specific to the application.
Of course, we cannot expect crunch-core to call initCredentials (which is HBase
specific) on all jobs just in case, but it would be nice to be able to register
a callback on the MRPipeline that is applied to every job before it is
submitted, to cover this use case.
I will provide a patch which will help to explain what I am suggesting here.
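To make the callback idea concrete in the meantime, here is a minimal,
self-contained sketch of the pattern. It is not the patch and not Crunch's API:
FakeJob, JobPreparer, FakePipeline and addJobPreparer are hypothetical names
invented for illustration, and the "credentials" map merely stands in for what
TableMapReduceUtil.initCredentials(job) would set up on a real Hadoop Job.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class JobHookSketch {

    /** Hypothetical stand-in for a MapReduce job; not Hadoop's Job class. */
    static final class FakeJob {
        final String name;
        final Map<String, String> credentials = new HashMap<>();
        boolean submitted = false;

        FakeJob(String name) {
            this.name = name;
        }
    }

    /** Hypothetical callback type, invoked once per job before submission. */
    interface JobPreparer {
        void prepare(FakeJob job);
    }

    /** Hypothetical stand-in for the pipeline that submits the jobs. */
    static final class FakePipeline {
        private final List<JobPreparer> preparers = new ArrayList<>();

        /** Registers a callback that is applied to every job. */
        void addJobPreparer(JobPreparer preparer) {
            preparers.add(preparer);
        }

        /** Runs all registered callbacks, then marks the job as submitted. */
        void submit(FakeJob job) {
            for (JobPreparer preparer : preparers) {
                preparer.prepare(job);
            }
            job.submitted = true;
        }
    }

    public static void main(String[] args) {
        FakePipeline pipeline = new FakePipeline();

        // Assumption: in a real pipeline this callback is where an
        // application would call TableMapReduceUtil.initCredentials(job).
        pipeline.addJobPreparer(
            job -> job.credentials.put("hbase", "token-for-" + job.name));

        FakeJob job = new FakeJob("bulk-load");
        pipeline.submit(job);
        System.out.println(job.name + " credentials: " + job.credentials);
    }
}
```

The point of the design is that the pipeline stays unaware of HBase: it only
promises to run every registered callback on each job before submitting it,
and the application decides what the callback does.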