Hi All,

I've finished my GSoC project, but I've run into a problem. I implemented a
Spark backend for Gora and wrote a word count test class for it.

Here is the test method in question:

https://github.com/kamaci/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/mapreduce/TestHBaseStoreWordCount.java#L65
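
In outline, the relevant part of that test does the following (simplified
from my branch; WebPage here just stands in for the actual value bean, and
conf/sc are the test's Hadoop Configuration and JavaSparkContext):

import org.apache.gora.spark.GoraSparkEngine;
import org.apache.gora.store.DataStore;
import org.apache.gora.store.DataStoreFactory;
import org.apache.spark.api.java.JavaPairRDD;

// Open the input store against the (mini) HBase cluster.
DataStore<String, WebPage> inStore =
    DataStoreFactory.getDataStore(String.class, WebPage.class, conf);

// Wrap the store as a Spark RDD. The GoraInputSplits created here are
// serialized to the executors, and their deserialization is exactly
// where the failure below happens (GoraInputSplit.readFields).
GoraSparkEngine<String, WebPage> engine =
    new GoraSparkEngine<>(String.class, WebPage.class);
JavaPairRDD<String, WebPage> rdd = engine.initialize(sc, inStore);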


When I run the test there should be no need to start up an external HBase
cluster, because Spark is supposed to connect to the dummy (mini) cluster
that the test starts. However, when I run the test method it throws an
error. Here is part of the stack trace:

2015-08-27 01:03:29,602 WARN  [Executor task launch worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn (ClientCnxn.java:run(1089)) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

2015-08-27 01:03:29,704 WARN  [Executor task launch worker-0] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:retryOrThrow(276)) - Possibly transient ZooKeeper, quorum=localhost:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server

2015-08-27 01:03:29,704 ERROR [Executor task launch worker-0] zookeeper.RecoverableZooKeeper (RecoverableZooKeeper.java:retryOrThrow(278)) - ZooKeeper exists failed after 4 attempts

2015-08-27 01:03:29,704 WARN  [Executor task launch worker-0] zookeeper.ZKUtil (ZKUtil.java:watchAndCheckExists(434)) - catalogtracker-on-hconnection-0x18cd6fd6, quorum=localhost:2181, baseZNode=/hbase Unable to set watcher on znode /hbase/meta-region-server
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:425)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
at org.apache.hadoop.hbase.catalog.CatalogTracker.start(CatalogTracker.java:199)
at org.apache.hadoop.hbase.client.HBaseAdmin.startCatalogTracker(HBaseAdmin.java:261)
at org.apache.hadoop.hbase.client.HBaseAdmin.getCatalogTracker(HBaseAdmin.java:234)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:305)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:321)
at org.apache.gora.hbase.store.HBaseStore.schemaExists(HBaseStore.java:197)
at org.apache.gora.hbase.store.HBaseStore.createSchema(HBaseStore.java:170)
at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:147)
at org.apache.gora.store.impl.DataStoreBase.readFields(DataStoreBase.java:213)
at org.apache.gora.query.impl.QueryBase.readFields(QueryBase.java:215)
at org.apache.gora.query.impl.PartitionQueryImpl.readFields(PartitionQueryImpl.java:151)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
at org.apache.gora.util.IOUtils.deserialize(IOUtils.java:228)
at org.apache.gora.util.IOUtils.deserialize(IOUtils.java:248)
at org.apache.gora.mapreduce.GoraInputSplit.readFields(GoraInputSplit.java:76)
at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:77)
at org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:43)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1138)
at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:94)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:185)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

When I check the logs, I see that the mini cluster and Spark start up correctly:

2015-08-27 01:02:29,067 INFO  [main] hdfs.MiniDFSCluster (MiniDFSCluster.java:waitActive(2055)) - Cluster is active

2015-08-27 01:02:29,181 INFO  [main] zookeeper.MiniZooKeeperCluster (MiniZooKeeperCluster.java:startup(200)) - Started MiniZK Cluster and connect 1 ZK server on client port: 63668

2015-08-27 01:02:45,233 INFO  [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 60494.

I realized that when I start up an HBase instance from the command line, my
Spark test method connects to it! So the job does not connect to the dummy
cluster; it tries to connect to the default one instead (localhost:2181,
while the MiniZK cluster is actually listening on port 63668). The stack
trace also shows that the failure happens while the executor deserializes a
GoraInputSplit, i.e. HBaseStore.initialize() runs on the executor with a
Configuration that still points at the default quorum.
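
If that is the cause, I suppose the fix is to push the mini cluster's
ZooKeeper client port into the Configuration that gets serialized with the
job, before any splits are created. Roughly like this (just a sketch;
clientPort stands for whatever port MiniZooKeeperCluster reports, 63668 in
the log above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// Point the job Configuration at the test cluster instead of the
// default localhost:2181 before creating/serializing any input splits.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "localhost");
conf.setInt("hbase.zookeeper.property.clientPort", clientPort);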

Any ideas about solving that connection problem?

PS 1: I've disabled the test in my GitHub repository.
PS 2: I don't think the problem is on the Spark side.
PS 3: I'll upload the full stack trace to
https://issues.apache.org/jira/browse/GORA-386

Kind Regards,
Furkan KAMACI
