[
https://issues.apache.org/jira/browse/PHOENIX-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315744#comment-16315744
]
Karan Mehta commented on PHOENIX-4490:
--------------------------------------
[~aertoria] FYI, this is the bug that I was talking about.
> Phoenix Spark Module doesn't pass in user properties to create connection
> -------------------------------------------------------------------------
>
> Key: PHOENIX-4490
> URL: https://issues.apache.org/jira/browse/PHOENIX-4490
> Project: Phoenix
> Issue Type: Bug
> Reporter: Karan Mehta
>
> The Phoenix Spark module does not work correctly in a Kerberos environment.
> This is because every new {{PhoenixRDD}} is built with a fresh, default
> configuration. The following piece of code in {{PhoenixRelation}} is an
> example. {{PhoenixRelation}} is the class Spark uses to create a
> {{BaseRelation}} before executing a scan.
> {code}
> new PhoenixRDD(
>   sqlContext.sparkContext,
>   tableName,
>   requiredColumns,
>   Some(buildFilter(filters)),
>   Some(zkUrl),
>   new Configuration(),
>   dateAsTimestamp
> ).toDataFrame(sqlContext).rdd
> {code}
> This works fine in most cases when the Spark code runs on the same cluster
> as HBase, because the config object picks up properties from the XML files
> on the classpath. In an external environment, however, we should take the
> user-provided properties and merge them in before creating any
> {{PhoenixRelation}} or {{PhoenixRDD}}. As per my understanding, we should
> ideally provide the properties in the {{DefaultSource#createRelation()}}
> method.
> One example of where this fails: Spark tries to get the splits, to optimize
> the MR performance when loading data in the table, in the
> {{PhoenixInputFormat#generateSplits()}} method. Ideally, it should get all
> the config parameters from the {{JobContext}} being passed in, but the
> configuration defaults to {{new Configuration()}} irrespective of what the
> user passes in. Thus it fails to create a connection.
> [~jmahonin] [[email protected]]
> Any ideas or advice? Let me know if I am missing anything obvious here.
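To make the proposed merge concrete, here is a minimal sketch of the idea, not the actual Phoenix code. It uses plain maps as a stand-in for the Hadoop {{Configuration}} object so that it runs without any cluster dependencies. The class name {{ConnectionProps}}, the method {{merge}}, and the host and property values are hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: illustrates merging user-supplied properties over defaults.
// Plain maps stand in for Hadoop's Configuration; this is not Phoenix code.
final class ConnectionProps {

    // Returns a new map holding every default, overridden by any user-provided value.
    static Map<String, String> merge(Map<String, String> defaults,
                                     Map<String, String> userProps) {
        Map<String, String> merged = new LinkedHashMap<>(defaults);
        merged.putAll(userProps); // user values win over classpath defaults
        return merged;
    }

    public static void main(String[] args) {
        // What a default config might contain when no cluster XML files are on the classpath.
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("hbase.zookeeper.quorum", "localhost");

        // What a user of an external Spark cluster might pass in (example values only).
        Map<String, String> userProps = new LinkedHashMap<>();
        userProps.put("hbase.zookeeper.quorum", "zk1.example.com");
        userProps.put("hbase.security.authentication", "kerberos");

        Map<String, String> merged = merge(defaults, userProps);
        merged.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

In the real module, the equivalent step would presumably copy each user-provided entry into the {{Configuration}} that is handed to {{PhoenixRDD}}, rather than passing {{new Configuration()}}. Where exactly that should happen is still the open question above.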
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)