I'm using the Python Thrift bindings to HBase (generated as per http://yannramin.com/2008/07/19/using-facebook-thrift-with-python-and-hbase/, and following the example usage in hbase-0.2.0/src/examples/thrift/DemoClient.py).

(Everything below is with Hadoop 0.17.1 and HBase 0.2.0.)

This setup doesn't seem to cope with errors very well. For example, when I do a batch update on a row and address a non-existent column, the following appears in the logs and Python hangs indefinitely; no exception is passed back.

2008-08-12 14:37:32,646 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 51153, call batchUpdate([EMAIL PROTECTED], row => 0000-0000, {column => type2:, value => '...'}) from 127.0.0.1:52183: error: java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.hbase.regionserver.HRegionServer.validateValuesLength(HRegionServer.java:1173)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1148)
        at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:473)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
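
One possible client-side guard against the indefinite hang is a read timeout on the socket. The sketch below is not Thrift code: it uses only the standard socket module against a local server that accepts a connection and never replies, and the helper names (start_silent_server, read_with_timeout) are made up for illustration. I believe the Python TSocket offers a setTimeout method taking milliseconds, but that is an assumption I have not checked against this Thrift version.

```python
import socket
import threading


def start_silent_server():
    """Listen on an ephemeral local port, accept one connection, never reply."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    held = []  # keep the accepted connection open so the client sees silence, not EOF

    def accept_once():
        try:
            conn, _ = srv.accept()
            held.append(conn)
        except OSError:
            pass  # listening socket was closed before anyone connected

    worker = threading.Thread(target=accept_once)
    worker.daemon = True
    worker.start()
    return srv, held


def read_with_timeout(port, seconds):
    """Try to read 4 bytes; give up after `seconds` instead of blocking forever."""
    client = socket.create_connection(("127.0.0.1", port))
    client.settimeout(seconds)
    try:
        client.recv(4)
        return "got data"
    except socket.timeout:
        return "timed out"
    finally:
        client.close()


srv, held = start_silent_server()
result = read_with_timeout(srv.getsockname()[1], 0.2)
for conn in held:
    conn.close()
srv.close()
```

A timeout only turns the hang into an exception on the client; it does nothing about the NullPointerException on the server side.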


Similarly, if I create an empty Mutation with no column name, and issue it as an update, then the following Java exception is generated:

java.lang.NullPointerException
        at org.apache.hadoop.io.Text.validateUTF8(Text.java:437)
        at org.apache.hadoop.hbase.thrift.ThriftServer$HBaseHandler.getText(ThriftServer.java:154)
        at org.apache.hadoop.hbase.thrift.ThriftServer$HBaseHandler.mutateRowTs(ThriftServer.java:423)
        at org.apache.hadoop.hbase.thrift.ThriftServer$HBaseHandler.mutateRow(ThriftServer.java:394)
        at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$mutateRow.process(Hbase.java:1620)
        at org.apache.hadoop.hbase.thrift.generated.Hbase$Processor.process(Hbase.java:1348)
        at com.facebook.thrift.server.TThreadPoolServer$WorkerProcess.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

This one at least is propagated back to Python:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "hbase/Hbase.py", line 435, in mutateRow
    self.recv_mutateRow()
  File "hbase/Hbase.py", line 448, in recv_mutateRow
    (fname, mtype, rseqid) = self._iprot.readMessageBegin()
  File "thrift/protocol/TBinaryProtocol.py", line 113, in readMessageBegin
    sz = self.readI32()
  File "thrift/protocol/TBinaryProtocol.py", line 190, in readI32
    buff = self.trans.readAll(4)
  File "thrift/transport/TTransport.py", line 45, in readAll
    chunk = self.read(sz-have)
  File "thrift/transport/TTransport.py", line 142, in read
    self.__rbuf = StringIO(self.__trans.read(max(sz, self.DEFAULT_BUFFER)))
  File "thrift/transport/TSocket.py", line 81, in read
    raise TTransportException('TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: None

Not the most helpful error message. Worse, because HBase has died and been relaunched, all further communication over the existing transport fails until I issue another "transport.open()" from Python.
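
As a workaround for the dead transport, the reopen step could be wrapped around each call. This is only a sketch under assumptions: FakeTransport and TransportError are hypothetical stand-ins for the real Thrift transport and TTransportException, so that the example runs without Thrift installed, and call_with_reconnect is a made-up helper name.

```python
class TransportError(Exception):
    """Hypothetical stand-in for thrift's TTransportException."""


class FakeTransport(object):
    """Hypothetical stand-in for a Thrift transport (open/close/isOpen only)."""

    def __init__(self):
        self.opened = False
        self.open_count = 0

    def open(self):
        self.opened = True
        self.open_count += 1

    def close(self):
        self.opened = False

    def isOpen(self):
        return self.opened


def call_with_reconnect(transport, func, retriable=(TransportError,), retries=1):
    """Call func(); on a transport error, reopen the transport and retry.

    The transport is reopened even when the retries are used up, so that
    later calls do not fail on a dead connection. The last error is re-raised.
    """
    attempt = 0
    while True:
        try:
            return func()
        except retriable:
            transport.close()
            transport.open()
            if attempt >= retries:
                raise
            attempt += 1


# Demo: a call that fails once (as if the server side had gone away), then works.
transport = FakeTransport()
transport.open()
calls = []


def flaky_mutate():
    calls.append(1)
    if len(calls) == 1:
        raise TransportError("TSocket read 0 bytes")
    return "ok"


result = call_with_reconnect(transport, flaky_mutate)
```

Retrying only helps with transient failures. A request that itself brings the server down, like the empty Mutation above, would presumably fail again on retry, so passing retries=0 (reopen, then re-raise) seems the safer choice there.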

Is this really the expected behaviour in the face of errors, or is it a deficiency in the Python Thrift bindings?

Toby
