I'm using the Python Thrift bindings to HBase, generated as per
http://yannramin.com/2008/07/19/using-facebook-thrift-with-python-and-hbase/
and following the example usage in
hbase-0.2.0/src/examples/thrift/DemoClient.py.
Everything below is with Hadoop 0.17.1 and HBase 0.2.0.
The bindings don't seem to cope with errors very well. For example, when
doing a batch update on a row, if I address a non-existent column, the
following appears in the logs and Python hangs indefinitely. No
exception is passed back to the client.
2008-08-12 14:37:32,646 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 51153, call batchUpdate([EMAIL PROTECTED], row => 0000-0000, {column => type2:, value => '...'}) from 127.0.0.1:52183: error: java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.validateValuesLength(HRegionServer.java:1173)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1148)
    at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
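As a stopgap, a client-side check before sending the update might avoid triggering this kind of failure at all. The following is only a sketch: `check_mutations` and `known_families` are hypothetical names, the `Mutation` namedtuple is a stand-in for the Thrift-generated class (assumed to expose `.column` and `.value`), and how the set of valid families is obtained is left to the caller.

```python
from collections import namedtuple

# Stand-in for the Thrift-generated Mutation class, assumed to expose
# .column and .value; defined here only so the sketch runs without the
# generated bindings.
Mutation = namedtuple("Mutation", ["column", "value"])


def check_mutations(mutations, known_families):
    """Raise ValueError if a mutation has no column or an unknown family.

    known_families holds family names without the trailing colon,
    e.g. {"type1"}. This is a hypothetical helper, not part of the bindings.
    """
    for m in mutations:
        if not m.column:
            raise ValueError("mutation has no column name")
        # Column names have the form "family:qualifier".
        family = m.column.split(":", 1)[0]
        if family not in known_families:
            raise ValueError("unknown column family: %r" % family)
```

Calling `check_mutations(mutations, known_families)` just before the update turns a bad column into an ordinary Python exception on the client. It does not fix the server-side behaviour.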
Similarly, if I create an empty Mutation with no column name, and
issue it as an update, then the following Java exception is generated:
java.lang.NullPointerException
    at org.apache.hadoop.io.Text.validateUTF8(Text.java:437)
    at org.apache.hadoop.hbase.thrift.ThriftServer$HBaseHandler.getText(ThriftServer.java:154)
    at org.apache.hadoop.hbase.thrift.ThriftServer$HBaseHandler.mutateRowTs(ThriftServer.java:423)
    at org.apache.hadoop.hbase.thrift.ThriftServer$HBaseHandler.mutateRow(ThriftServer.java:394)
    at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.process(Hbase.java:1620)
    at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor.process(Hbase.java:1348)
    at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)
This one at least is propagated back to Python:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "hbase/Hbase.py", line 435, in mutateRow
    self.recv_mutateRow()
  File "hbase/Hbase.py", line 448, in recv_mutateRow
    (fname, mtype, rseqid) = self._iprot.readMessageBegin()
  File "thrift/protocol/TBinaryProtocol.py", line 113, in readMessageBegin
    sz = self.readI32()
  File "thrift/protocol/TBinaryProtocol.py", line 190, in readI32
    buff = self.trans.readAll(4)
  File "thrift/transport/TTransport.py", line 45, in readAll
    chunk = self.read(sz-have)
  File "thrift/transport/TTransport.py", line 142, in read
    self.__rbuf = StringIO(self.__trans.read(max(sz, self.DEFAULT_BUFFER)))
  File "thrift/transport/TSocket.py", line 81, in read
    raise TTransportException('TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: None
That is not the most helpful error message. Worse, because HBase has
died and been relaunched, all further communication over the existing
transport fails until I issue another "transport.open()" from Python.
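For now the reopen step can be wrapped up roughly as follows. This is a minimal sketch, not part of the bindings: `call_with_reopen` is a hypothetical helper, and it assumes the transport object offers `open()` and `close()`.

```python
def call_with_reopen(transport, call, retry_on=(Exception,)):
    """Run call(); on a retry_on error, reopen the transport and retry once.

    Hypothetical helper: transport is assumed to offer open() and close().
    """
    try:
        return call()
    except retry_on:
        # The old connection is assumed dead: drop it and open a fresh one.
        try:
            transport.close()
        except Exception:
            pass  # closing a broken socket may itself fail; ignore that
        transport.open()
        return call()
```

With the real bindings, `retry_on` would presumably be `(TTransportException,)` and `call` something like `lambda: client.mutateRow(...)`. The retry only helps when the failure came from a stale connection. A call that itself triggers the server-side exception will simply fail again on the second attempt.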
Is this really the expected behaviour in the face of errors, or is
this a deficiency in the Python Thrift bindings?
Toby