[ https://issues.apache.org/jira/browse/PHOENIX-4490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315744#comment-16315744 ]
Karan Mehta commented on PHOENIX-4490:
--------------------------------------

[~aertoria] FYI, this is the bug that I was talking about.

> Phoenix Spark Module doesn't pass in user properties to create connection
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-4490
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4490
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Karan Mehta
>
> The Phoenix Spark module does not work reliably in a Kerberos environment.
> This is because whenever a new {{PhoenixRDD}} is built, it is always built
> with new, default properties. The following piece of code in
> {{PhoenixRelation}} is an example. This is the class used by Spark to create
> a {{BaseRelation}} before executing a scan.
> {code}
> new PhoenixRDD(
>   sqlContext.sparkContext,
>   tableName,
>   requiredColumns,
>   Some(buildFilter(filters)),
>   Some(zkUrl),
>   new Configuration(), // always a fresh default config; user properties are not applied
>   dateAsTimestamp
> ).toDataFrame(sqlContext).rdd
> {code}
> This works fine in most cases when the Spark code runs on the same cluster
> as HBase, because the config object picks up properties from the XML files
> on the classpath. In an external environment, however, we should use the
> user-provided properties and merge them before creating any
> {{PhoenixRelation}} or {{PhoenixRDD}}. As per my understanding, we should
> ideally provide the properties in the {{DefaultSource#createRelation()}}
> method.
> One example of where this fails: Spark tries to get the splits, to optimize
> the MR performance for loading data into the table, in the
> {{PhoenixInputFormat#generateSplits()}} method. Ideally, it should get all
> the config parameters from the {{JobContext}} being passed in, but the
> configuration defaults to {{new Configuration()}}, irrespective of what the
> user passes in. Thus it fails to create a connection.
> [~jmahonin] [~maghamraviki...@gmail.com]
> Any ideas or advice? Let me know if I am missing anything obvious here.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
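
For illustration only, the merge of user-provided properties described in the issue could look roughly like the sketch below. It is not the actual phoenix-spark fix: {{ConfigMergeSketch}} and {{mergeUserProperties}} are hypothetical names, the property values are placeholders, and the only library calls assumed are Hadoop's {{Configuration}} copy constructor, {{set}} and {{get}}.

```scala
import org.apache.hadoop.conf.Configuration

// Hypothetical helper, not part of phoenix-spark: shows one way to layer
// user-supplied properties on top of a base Hadoop Configuration.
object ConfigMergeSketch {

  // Copies the base configuration and applies the user properties on top,
  // so explicit user settings win over values loaded from the classpath.
  def mergeUserProperties(base: Configuration,
                          userProps: Map[String, String]): Configuration = {
    val merged = new Configuration(base) // copy, so the base is left untouched
    userProps.foreach { case (key, value) => merged.set(key, value) }
    merged
  }

  def main(args: Array[String]): Unit = {
    // Placeholder values; a real job would take these from the options
    // that the user passes to the data source.
    val userProps = Map(
      "hbase.security.authentication" -> "kerberos",
      "hbase.zookeeper.quorum" -> "zk1.example.com"
    )
    val conf = mergeUserProperties(new Configuration(), userProps)
    println(conf.get("hbase.security.authentication"))
  }
}
```

Under that assumption, the merged object rather than a bare {{new Configuration()}} would be what gets handed to {{PhoenixRDD}}, so that code running later (for example split generation) sees the user's Kerberos settings.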