Thanks for the patience, and sorry, I may have sent incomplete information. We are loading the following jars and still getting: *(executor 1): java.lang.NoClassDefFoundError: Could not initialize class org.apache.phoenix.query.QueryServicesOptions*
http://central.maven.org/maven2/org/apache/hbase/hbase-client/2.1.0/hbase-client-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-common/2.1.0/hbase-common-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-hadoop-compat/2.1.0/hbase-hadoop-compat-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-mapreduce/2.1.0/hbase-mapreduce-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/thirdparty/hbase-shaded-miscellaneous/2.1.0/hbase-shaded-miscellaneous-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-protocol/2.1.0/hbase-protocol-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-protocol-shaded/2.1.0/hbase-protocol-shaded-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/thirdparty/hbase-shaded-protobuf/2.1.0/hbase-shaded-protobuf-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/thirdparty/hbase-shaded-netty/2.1.0/hbase-shaded-netty-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-server/2.1.0/hbase-server-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-hadoop2-compat/2.1.0/hbase-hadoop2-compat-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-metrics/2.1.0/hbase-metrics-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-metrics-api/2.1.0/hbase-metrics-api-2.1.0.jar
http://central.maven.org/maven2/org/apache/hbase/hbase-zookeeper/2.1.0/hbase-zookeeper-2.1.0.jar
http://central.maven.org/maven2/org/apache/phoenix/phoenix-spark/5.0.0-HBase-2.0/phoenix-spark-5.0.0-HBase-2.0.jar
http://central.maven.org/maven2/org/apache/phoenix/phoenix-core/5.0.0-HBase-2.0/phoenix-core-5.0.0-HBase-2.0.jar
http://central.maven.org/maven2/org/apache/phoenix/phoenix-queryserver/5.0.0-HBase-2.0/phoenix-queryserver-5.0.0-HBase-2.0.jar
http://central.maven.org/maven2/org/apache/phoenix/phoenix-queryserver-client/5.0.0-HBase-2.0/phoenix-queryserver-client-5.0.0-HBase-2.0.jar
http://central.maven.org/maven2/org/apache/twill/twill-zookeeper/0.13.0/twill-zookeeper-0.13.0.jar
http://central.maven.org/maven2/org/apache/twill/twill-discovery-core/0.13.0/twill-discovery-core-0.13.0.jar

Not sure which one I could be missing?

On Fri, Sep 14, 2018 at 7:34 PM Josh Elser <els...@apache.org> wrote:

> Uh, you're definitely not using the right JARs :)
>
> You'll want the phoenix-client.jar for the Phoenix JDBC driver and the
> phoenix-spark.jar for the Phoenix RDD.
>
> On 9/14/18 1:08 PM, Saif Addin wrote:
> > Hi, I am attempting to make a connection with Spark, but no success
> > so far.
> >
> > For writing into Phoenix, I am trying this:
> >
> > tdd.toDF("ID", "COL1", "COL2", "COL3")
> >   .write.format("org.apache.phoenix.spark")
> >   .option("zkUrl", "zookeeper-host-url:2181")
> >   .option("table", htablename)
> >   .mode("overwrite")
> >   .save()
> >
> > But I am getting:
> > *java.sql.SQLException: ERROR 103 (08004): Unable to establish
> > connection.*
> >
> > For reading, on the other hand, I am attempting this:
> >
> > val hbConf = HBaseConfiguration.create()
> > val hbaseSitePath = "/etc/hbase/conf/hbase-site.xml"
> > hbConf.addResource(new Path(hbaseSitePath))
> >
> > spark.sqlContext.phoenixTableAsDataFrame("VISTA_409X68", Array("ID"),
> >   conf = hbConf)
> >
> > That gets me:
> > *java.lang.NoClassDefFoundError: Could not initialize class
> > org.apache.phoenix.query.QueryServicesOptions*
> >
> > I have added phoenix-queryserver-5.0.0-HBase-2.0.jar and
> > phoenix-queryserver-client-5.0.0-HBase-2.0.jar. Any thoughts?
> >
> > I have an hbase-site.xml file with more configuration, but I am not
> > sure how to get it read on the saving side. Thanks
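For reference, a minimal sketch of the setup Josh's reply points at: replace the long jar list above with the Phoenix client and spark jars, shipped to the executors via spark.jars. The local paths are assumptions, not taken from the thread; use wherever your Phoenix install puts them.

import org.apache.spark.sql.SparkSession

// A sketch, assuming the two jars Josh names live under /opt/phoenix.
// spark.jars ships them to the driver and every executor; the error at
// the top of the thread came from an executor, so the jars must reach
// executors too.
val spark = SparkSession.builder()
  .appName("phoenix-spark-smoke-test")
  .config("spark.jars",
    "/opt/phoenix/phoenix-5.0.0-HBase-2.0-client.jar," +
    "/opt/phoenix/phoenix-spark-5.0.0-HBase-2.0.jar")
  .getOrCreate()

A "Could not initialize class" error (as opposed to a ClassNotFoundException) usually means the class was found but its static initializer failed, often because one of its own dependencies was absent; the monolithic phoenix-client.jar bundles those dependencies.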
> > On Thu, Sep 13, 2018 at 11:38 AM Josh Elser <els...@apache.org> wrote:
> >
> > Pretty sure we ran tests with Spark 2.3 with Phoenix 5.0. Not sure if
> > Spark has already moved beyond that.
> >
> > On 9/12/18 11:00 PM, Saif Addin wrote:
> > > Thanks, we'll try the Spark Connector then. We thought it didn't
> > > support the newest Spark versions.
> > >
> > > On Wed, Sep 12, 2018 at 11:03 PM Jaanai Zhang
> > > <cloud.pos...@gmail.com> wrote:
> > >
> > > It seems the column data is missing mapping information from the
> > > schema. If you want to write to the HBase table this way, you can
> > > create the HBase table first and then use Phoenix to map it.
> > >
> > > ----------------------------------------
> > > Jaanai Zhang
> > > Best regards!
> > >
> > > On Thu, Sep 13, 2018 at 6:03 AM, Thomas D'Silva
> > > <tdsi...@salesforce.com> wrote:
> > >
> > > Is there a reason you didn't use the spark-connector to serialize
> > > your data?
> > >
> > > On Wed, Sep 12, 2018 at 2:28 PM, Saif Addin <saif1...@gmail.com> wrote:
> > >
> > > Thank you Josh! That was helpful. Indeed, there was a salt bucket on
> > > the table, and the key column now shows correctly.
> > >
> > > However, the problem still persists in that the rest of the columns
> > > show as completely empty in Phoenix (they appear correctly in HBase).
> > > We'll be looking into this, but if you have any further advice, it is
> > > appreciated.
> > >
> > > Saif
> > >
> > > On Wed, Sep 12, 2018 at 5:50 PM Josh Elser <els...@apache.org> wrote:
> > >
> > > Reminder: Using Phoenix internals forces you to understand exactly
> > > how the version of Phoenix that you're using serializes data. Is
> > > there a reason you're not using SQL to interact with Phoenix?
> > >
> > > Sounds to me like Phoenix is expecting more data at the head of your
> > > rowkey. Maybe a salt bucket that you've defined on the table but not
> > > created?
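For reference, a minimal sketch of the SQL route Josh mentions, going through the Phoenix JDBC driver so Phoenix itself manages the salt byte at the head of the rowkey. The table definition, salt-bucket count, and ZooKeeper quorum below are assumptions for illustration only.

import java.sql.DriverManager

// Sketch only: create a salted table and write through Phoenix SQL,
// letting Phoenix handle rowkey serialization. All names are placeholders.
val conn = DriverManager.getConnection("jdbc:phoenix:zookeeper-host-url:2181")
val stmt = conn.createStatement()
stmt.execute(
  """CREATE TABLE IF NOT EXISTS VISTA_409X68 (
    |  ID VARCHAR PRIMARY KEY,
    |  COL1 VARCHAR,
    |  COL2 VARCHAR,
    |  COL3 VARCHAR
    |) SALT_BUCKETS = 4""".stripMargin)
stmt.executeUpdate("UPSERT INTO VISTA_409X68 VALUES ('key1', 'a', 'b', 'c')")
conn.commit()  // Phoenix connections do not auto-commit by default
conn.close()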
> > > On 9/12/18 4:32 PM, Saif Addin wrote:
> > > > Hi all,
> > > >
> > > > We're trying to write tables with all string columns from Spark.
> > > > We are not using the Spark Connector; instead, we are writing byte
> > > > arrays directly from RDDs.
> > > >
> > > > The process works fine, HBase receives the data correctly, and the
> > > > content is consistent.
> > > >
> > > > However, when reading the table from Phoenix, we notice the first
> > > > character of each string is missing. This sounds like a byte
> > > > encoding issue, but we're at a loss. We're using PVarchar to
> > > > generate the bytes.
> > > >
> > > > Here's the snippet of code creating the RDD:
> > > >
> > > > val tdd = pdd.flatMap(x => {
> > > >   val rowKey = PVarchar.INSTANCE.toBytes(x._1)
> > > >   for (i <- 0 until cols.length) yield {
> > > >     // other stuff for other columns ...
> > > >     ...
> > > >     (rowKey, (column1, column2, column3))
> > > >   }
> > > > })
> > > >
> > > > ...
> > > >
> > > > We then create the following output to be written down in HBase:
> > > >
> > > > val output = tdd.map(x => {
> > > >   val rowKeyByte: Array[Byte] = x._1
> > > >   val immutableRowKey = new ImmutableBytesWritable(rowKeyByte)
> > > >   val kv = new KeyValue(rowKeyByte,
> > > >     PVarchar.INSTANCE.toBytes(column1),
> > > >     PVarchar.INSTANCE.toBytes(column2),
> > > >     PVarchar.INSTANCE.toBytes(column3)
> > > >   )
> > > >   (immutableRowKey, kv)
> > > > })
> > > >
> > > > By the way, we are using *KryoSerializer* in order to be able to
> > > > serialize all the classes necessary for HBase (KeyValue,
> > > > BytesWritable, etc.).
> > > >
> > > > The key of this table is the one missing data when queried from
> > > > Phoenix, so we guess something is wrong with the byte serialization.
> > > >
> > > > Any ideas? Appreciated!
> > > > Saif
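Finally, a minimal sketch of the connector-based write Thomas asks about: phoenix-spark serializes the rowkey and columns itself, salt byte included, which avoids exactly the kind of shifted-by-one-byte reads described above. The table, columns, and zkUrl are placeholders, and sc is assumed to be an existing SparkContext.

import org.apache.phoenix.spark._

// Sketch: write an RDD of tuples through the phoenix-spark connector
// rather than hand-building KeyValues with PVarchar. Names are
// placeholders; the column list must match the Phoenix table schema.
val data = sc.parallelize(Seq(
  ("key1", "a", "b", "c"),
  ("key2", "d", "e", "f")))
data.saveToPhoenix(
  "VISTA_409X68",
  Seq("ID", "COL1", "COL2", "COL3"),
  zkUrl = Some("zookeeper-host-url:2181"))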