[hypertable-dev] Re: python bindings to hypertable

Masha Sat, 13 Feb 2010 15:15:34 -0800

Hi

Of course, your feedback will be appreciated!


What I failed to do, and would like to ask the other developers about: how
to compile all six .so files into one big .so?

Just adding -static to the linker flags (extra_link_args = ["-static"] in
setup.py) did not help.
Linking failed with this error:

/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/4.4.1/crtbeginT.o:
relocation R_X86_64_32 against `__DTOR_END__' can not be used when
making a shared object; recompile with -fPIC
/usr/lib/gcc/x86_64-linux-gnu/4.4.1/crtbeginT.o: could not read
symbols: Bad value

One big .so would simplify the deployment of mapreduce workers.
Actually, six libraries are not enough: "apt-get install boost log4cpp
libdb4 ..." must also be executed on each worker machine in advance (I
compile hypertable against the versions from the standard Ubuntu
repositories).
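
Until a single .so works, a worker can at least check that the files
copied from the full installation are all present before importing the
bindings. A minimal sketch, assuming all of them sit in one directory
(that layout and the helper name missing_files are made up for
illustration; the file names are the ones listed in the quoted message
below). It does not check the system packages such as boost or log4cpp.

```python
import os

# File names taken from the quoted message; the single-directory layout
# is an assumption of this sketch.
REQUIRED_FILES = [
    "ht.so",
    "libHyperComm.so",
    "libHyperCommon.so",
    "libHyperTools.so",
    "libHyperspace.so",
    "libHypertable.so",
    "hypertable.cfg",
]


def missing_files(directory, required=REQUIRED_FILES):
    """Return the required file names that are not present in directory."""
    return [
        name
        for name in required
        if not os.path.isfile(os.path.join(directory, name))
    ]
```

A worker start-up script could call missing_files() on its deployment
directory and refuse to start if the returned list is not empty.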

With one big .so we can avoid version mismatches when the worker
machine, for example, already has a boost that is too new or too old.
To be safe from version hell, many more than 6 libraries must be deployed :(
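
On the linker error above: -static asks gcc for a fully static link,
which pulls in crtbeginT.o, and that object is not position independent,
hence the R_X86_64_32 complaint. A possible alternative (untested against
the Hypertable build, so only a sketch) is to keep the output shared and
switch GNU ld to static archives just for the Hypertable libraries with
-Wl,-Bstatic ... -Wl,-Bdynamic. It assumes .a archives of those libraries
exist and were compiled with -fPIC; the helper name static_link_args and
the library order are made up for illustration.

```python
def static_link_args(static_libs, dynamic_libs=()):
    """Build GNU ld flags that take static_libs from .a archives while
    the extension itself is still linked as a shared object."""
    args = []
    if static_libs:
        args.append("-Wl,-Bstatic")
        args.extend("-l" + name for name in static_libs)
        # Switch back so libc, libpython etc. are linked dynamically.
        args.append("-Wl,-Bdynamic")
    args.extend("-l" + name for name in dynamic_libs)
    return args


# Library names come from the .so list in the quoted message; the order
# is an assumption and may need adjusting to satisfy dependencies.
HYPERTABLE_LIBS = [
    "Hypertable",
    "HyperTools",
    "Hyperspace",
    "HyperComm",
    "HyperCommon",
]

link_args = static_link_args(HYPERTABLE_LIBS)
```

In setup.py the resulting list would be passed as extra_link_args instead
of ["-static"]. Boost, log4cpp and the other third-party libraries could
be added to the static list the same way, provided PIC archives of them
are available.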


2010/2/13, Mateusz Berezecki <[email protected]>:
> Hi Masha
>
> This is awesome news. I'll check it out and prepare patches if you
> don't mind.
>
> Thanks for a great job!
>
> And yes, thrift does not feel solid at all!;-)
>
> Mateusz
>
> On Feb 13, 2010, at 0:14, Masha <[email protected]> wrote:
>
>> Hello
>>
>> I have fixed the Python bindings to reflect the modern Hypertable and
>> boost versions.
>>
>> Using the python bindings, 'select *' over a large dataset is about 20
>> times faster than using Thrift (I tested it on a single Linux-x64
>> server; the Thrift client eats a lot of CPU).
>>
>>
>> Also, the API is slightly improved:
>> 1. TableScanner can act as an iterable object emitting Cell objects
>>
>> # how it was
>>
>> scanner = table.create_scanner(scan_spec)
>> cell = ht.Cell()
>> while scanner.next(cell):
>>  print "%s:%s %s" % (cell.row_key, cell.column_family, cell.value())
>>
>> # how it is
>>
>> for cell in table.create_scanner(scan_spec):
>>  print "%s:%s %s" % (cell.row_key, cell.column_family, cell.value)
>>
>> # or even simpler
>>
>> for cell in client.hql("select * from table"):
>>  print "%s:%s %s" % (cell.row_key, cell.column_family, cell.value)
>>
>> #--------------------------
>>
>> 2. client.hql("select ...") returns a TableScanner
>>   client.hql("show tables") returns a python list; both of them are
>> iterable
>>
>> 3. cell.value is now a getter; the parentheses are not required.
>>
>> 4. The parameter of the Client constructor is the path to
>> 'hypertable.cfg', not the path to the installation directory.
>>   Internally, the Hypertable libraries use the path to the executable
>> as a starting point to find 'hypertable.cfg'.
>>   This fails if the executable is '/usr/bin/python'.
>>
>>   As it is intended to be used on a client, it must work without a full
>> Hypertable installation, and must work with more than one hypertable
>> server.
>>
>>   The files to copy from the full installation are: 'ht.so'
>> 'libHyperComm.so' 'libHyperCommon.so' 'libHyperTools.so'
>> 'libHyperspace.so' 'libHypertable.so'
>>   And, of course, 'hypertable.cfg'
>>
>> It is my first experience with boost::python and I'm not sure if it is
>> correct to wrap pointers (TablePtr, TableMutatorPtr) instead of the
>> objects.
>> So I suppose there could be some memory leaks; I have not investigated
>> that yet.
>> (I tried to wrap the objects - Table, TableMutator, TableScanner -
>> but then I did not know how to return either a TableScanner or a list
>> from client.hql(). With the pointers it is easy, so I went back to
>> using them.)
>>
>> Compiling the python bindings does not depend on the hypertable
>> compilation process and can be done independently later.
>> Just run 'python setup.py build'.
>> But note that the hypertable libraries must be compiled with
>> -DBUILD_SHARED_LIBS=ON (the precompiled binaries from hypertable.org
>> are not).
>>
>> I have put the code here for now (sorry, I do not know how to use
>> git):
>> http://code.google.com/p/python-hypertable/source/browse/trunk/
>
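
For readers wondering how the old next(cell) loop in point 1 maps onto a
plain for loop: on the Python side a generator is enough. This is only an
illustration of the pattern, not necessarily how the bindings implement
it; FakeCell, FakeScanner and iter_cells are hypothetical stand-ins so
that the sketch runs without Hypertable.

```python
class FakeCell(object):
    """Hypothetical stand-in for ht.Cell, for illustration only."""

    def __init__(self):
        self.row_key = None
        self.column_family = None
        self.value = None


class FakeScanner(object):
    """Hypothetical stand-in for a scanner with the old next(cell) API."""

    def __init__(self, rows):
        self._rows = iter(rows)

    def next(self, cell):
        # Fill the caller's cell; return False once the scan is exhausted.
        for row_key, column_family, value in self._rows:
            cell.row_key = row_key
            cell.column_family = column_family
            cell.value = value
            return True
        return False


def iter_cells(scanner, cell_factory=FakeCell):
    """Adapt a next(cell)-style scanner to a Python iterable."""
    while True:
        cell = cell_factory()  # fresh cell each time, so results do not alias
        if not scanner.next(cell):
            return
        yield cell


rows = [("r1", "cf", "v1"), ("r2", "cf", "v2")]
lines = [
    "%s:%s %s" % (c.row_key, c.column_family, c.value)
    for c in iter_cells(FakeScanner(rows))
]
```

Creating a new cell per step costs an allocation, but it keeps a list
built from the iterator from ending up with the same object repeated.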

