Hi Josh,

Thanks for the info. Views do work, and that helps with some of our use cases anyway.
I was hoping to use Spark as a means of running a long-running query without overly taxing everything, but since both the Spark and MapReduce integrations use the same underlying JDBC connections (as far as I can tell from the code), I've instead just bumped up the timeouts as suggested in a few other threads/forum posts.

Thanks again for the help,

*Craig Roberts*
*Senior Developer*
*FrogAsia Sdn Bhd (A YTL Company)* | Unit 9, Level 2, D6 at Sentul East | 801, Jalan Sentul, 51000 Kuala Lumpur | 01125618093 | Twitter <http://www.twitter.com/FrogAsia> | Facebook <http://www.facebook.com/FrogAsia> | Website <http://www.frogasia.com/>

On 24 February 2017 at 23:06, Josh Mahonin <jmaho...@gmail.com> wrote:

> Hi Craig,
>
> I think this is an open issue in PHOENIX-2648
> (https://issues.apache.org/jira/browse/PHOENIX-2648)
>
> There seems to be a workaround by using a 'VIEW' instead, as mentioned in
> that ticket.
>
> Good luck,
>
> Josh
>
> On Thu, Feb 23, 2017 at 11:56 PM, Craig Roberts <craig.robe...@frogasia.com> wrote:
>
>> Hi all,
>>
>> I've got a (very) basic Spark application in Python that selects some
>> basic information from my Phoenix table.
>> I can't quite figure out how (or even if I can) select dynamic columns
>> through this, however.
>>
>> Here's what I have:
>>
>>     from pyspark import SparkContext, SparkConf
>>     from pyspark.sql import SQLContext
>>
>>     conf = SparkConf().setAppName("pysparkPhoenixLoad").setMaster("local")
>>     sc = SparkContext(conf=conf)
>>     sqlContext = SQLContext(sc)
>>
>>     df = sqlContext.read.format("org.apache.phoenix.spark") \
>>         .option("table", """MYTABLE("dynamic_column" VARCHAR)""") \
>>         .option("zkUrl", "127.0.0.1:2181:/hbase-unsecure") \
>>         .load()
>>
>>     df.show()
>>     df.printSchema()
>>
>> I get an "org.apache.phoenix.schema.TableNotFoundException" error for
>> the above.
>>
>> If I try to load the data frame as a table and query that with SQL:
>>
>>     sqlContext.registerDataFrameAsTable(df, "test")
>>     sqlContext.sql("""SELECT * FROM test("dynamic_column" VARCHAR)""")
>>
>> I get a bit of a strange exception:
>>
>>     py4j.protocol.Py4JJavaError: An error occurred while calling o37.sql.
>>     : java.lang.RuntimeException: [1.19] failure: ``union'' expected but `(' found
>>
>>     SELECT * FROM test("dynamic_column" VARCHAR)
>>
>> Does anybody have a pointer on whether this is supported and how I might
>> be able to query a dynamic column? I haven't found much information on the
>> wider Internet about Spark + Phoenix integration for this kind of thing;
>> simple selects are working.
>>
>> Final note: I have (rather stupidly) lower-cased my column names in
>> Phoenix, so I need to quote them when I execute a query (I'll be changing
>> this as soon as possible).
>>
>> Any assistance would be appreciated :)
>>
>> *-- Craig*
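For reference, the 'VIEW' workaround Josh points to in PHOENIX-2648 can be sketched roughly as follows. This is an untested sketch, not the exact DDL from the ticket: it reuses the `MYTABLE` / `"dynamic_column"` names from the example above and assumes Phoenix's `CREATE VIEW` and `ALTER VIEW ... ADD` statements:

```sql
-- Create a view over the base table (MYTABLE is the table from the
-- example in this thread).
CREATE VIEW MYVIEW AS SELECT * FROM MYTABLE;

-- Promote the dynamic column to a declared view column, so it shows
-- up in the schema that the phoenix-spark integration reads.
ALTER VIEW MYVIEW ADD "dynamic_column" VARCHAR;
```

The Spark side would then point at the view, e.g. `.option("table", "MYVIEW")`, rather than embedding the dynamic-column syntax in the `table` option.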
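On the final note about lower-cased column names: Phoenix, like standard SQL, folds unquoted identifiers to upper case, which is why the lower-case names have to be double-quoted in every statement. A small hypothetical helper (not part of any Phoenix or Spark API) for building such queries without hand-quoting each name:

```python
def quote_ident(name):
    """Double-quote a Phoenix identifier, escaping embedded quotes.

    Phoenix upper-cases unquoted identifiers, so lower-case column
    names such as dynamic_column must be quoted wherever they appear.
    """
    return '"' + name.replace('"', '""') + '"'


if __name__ == "__main__":
    cols = ["dynamic_column", "another_col"]
    # Build a SELECT list with every identifier safely quoted.
    query = "SELECT {} FROM MYTABLE".format(
        ", ".join(quote_ident(c) for c in cols)
    )
    print(query)  # SELECT "dynamic_column", "another_col" FROM MYTABLE
```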