[ https://issues.apache.org/jira/browse/PIG-4585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571795#comment-14571795 ]
Mohit Sabharwal commented on PIG-4585:
--------------------------------------

FYI: [~kellyzly], [~kexianda], [~xuefuz]

Most tests in TestHBaseStorage (27 out of 33) pass.

The remaining six fail because UDFContext (a thread local) is not populated in Spark executor threads. This will be fixed in a separate patch.

> Use newAPIHadoopRDD instead of newAPIHadoopFile
> -----------------------------------------------
>
>                 Key: PIG-4585
>                 URL: https://issues.apache.org/jira/browse/PIG-4585
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>    Affects Versions: spark-branch
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>             Fix For: spark-branch
>
>         Attachments: PIG-4585.patch
>
>
> LoadConverter currently uses SparkContext.newAPIHadoopFile, which won't work
> for non-filesystem-based input sources such as HBase.
> newAPIHadoopFile assumes a FileInputFormat and attempts to
> [verify|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L1065]
> this in the constructor, which fails for HBaseTableInputFormat (which is not
> a FileInputFormat):
> {code}
> NewFileInputFormat.setInputPaths(job, path)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
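For illustration, here is a minimal sketch of the approach named in the issue title: reading an HBase table through SparkContext.newAPIHadoopRDD, which takes the input format class and a Hadoop Configuration rather than a filesystem path. This is not the code from PIG-4585.patch. The object name HBaseLoadSketch, the helper names, and the use of the stock HBase TableInputFormat (instead of Pig's HBaseTableInputFormat) are assumptions made for the example.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical helper object, not part of Pig or Spark.
object HBaseLoadSketch {

  // Build a Hadoop Configuration that names the HBase table to scan.
  // The input source is described entirely by configuration keys;
  // no filesystem path is involved.
  def hbaseInputConf(table: String): Configuration = {
    val conf = HBaseConfiguration.create()
    conf.set(TableInputFormat.INPUT_TABLE, table)
    conf
  }

  // newAPIHadoopRDD receives the input format class and its configuration
  // directly, so the input format does not need to be a FileInputFormat.
  def loadTable(sc: SparkContext,
                table: String): RDD[(ImmutableBytesWritable, Result)] =
    sc.newAPIHadoopRDD(
      hbaseInputConf(table),
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])
}
```

Calling HBaseLoadSketch.loadTable(sc, "example_table") would need a reachable HBase cluster and the HBase client jars on the Spark classpath; the table name is a placeholder.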