Re: Migration from memcachedb to riak

2013-07-10 Thread Andrew Thompson
On Wed, Jul 10, 2013 at 08:19:23AM -0700, Howard Chu wrote:
> If you only need a pure key/value store, you should consider
> memcacheDB using LMDB as its backing store. It's far faster than
> memcacheDB using BerkeleyDB.
>   http://symas.com/mdb/memcache/
> 
> I doubt LevelDB accessed through any interpreted language will be
> anywhere near its performance either, though I haven't tested. (Is
> there a LevelDB backend for modular memcache yet?)

Except that the comparison was not memcache vs leveldb, it was memcache
vs Riak. Yes, Riak does impose some overhead, but on the other hand, I
don't see anything about scaling LMDB to more nodes as your dataset grows,
which was one of the goals listed in the original email.

Also, if you think Erlang is an interpreted language, you probably need
to look again.

Andrew

___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: Migration from memcachedb to riak

2013-07-10 Thread Howard Chu

On 10 July 2013 10:49, Edgar Veiga <edgarmve...@gmail.com> wrote:

 Hello all!

 I have a couple of questions that I would like to address to all of
 you guys, in order to start this migration as well as possible.

 Context:
 - I'm responsible for the migration of a pure key/value store that
 for now is stored in memcacheDB.
 - We're serializing PHP objects and storing them.
 - The total size occupied is ~2TB.

 - The idea is to migrate this data to a Riak cluster with the
 eleveldb backend (starting with 6 nodes, 256 partitions; this
 thing is scaling very fast).
 - We only need to access the information by key. *We won't need
 map/reduce, search, or secondary indexes*. It's a pure
 key/value store!

 My questions are:
 - Do you have any Riak fine-tuning tips for this use case
 (given that we will only use the key/value capabilities
 of Riak)?


If you only need a pure key/value store, you should consider memcacheDB using 
LMDB as its backing store. It's far faster than memcacheDB using BerkeleyDB.

http://symas.com/mdb/memcache/

I doubt LevelDB accessed through any interpreted language will be anywhere 
near its performance either, though I haven't tested. (Is there a LevelDB 
backend for modular memcache yet?)


Also, if you're serializing language objects, you should consider using LMDB as
an embedded data store. With the FIXEDMAP option you can copy objects into the
store and then use them directly from the store on future retrievals, with no
deserialization required.



 - It's expected that those 2TB will be reduced by LevelDB
 compression. Do you think we should also compress our objects on the
 client?


--
  -- Howard Chu
  CTO, Symas Corp.   http://www.symas.com
  Director, Highland Sun http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/



Re: Migration from memcachedb to riak

2013-07-10 Thread damien krotkine
Hi,

Indeed, you're using very big keys. If you can't change the keys, then yes,
you'll have to use LevelDB. However, I wonder why you need keys that long :)


On 10 July 2013 13:04, Edgar Veiga  wrote:

> Hi Damien,
>
> Well let's dive into this a little bit.
>
> I told you guys that Bitcask was not an option due to a bad past
> experience with couchbase (sorry, in the previous post I wrote couchdb),
> which uses the same architecture as Bitcask: keys in memory and values on
> disk.
>
> We started the migration to couchbase, and were already using 3 physical
> nodes and had only imported 5% of the real data! That was one of the main
> reasons for choosing a solution like Riak + LevelDB: to remove the
> keys-must-fit-in-memory bottleneck...
>
> Now, here's a typical key (it's large :D, x=letter and 0=numbers):
>
> xxx___0x_00_000_000xx
>
> We are using PHP's native serialize function!
>
> Best regards
>
>


Re: Migration from memcachedb to riak

2013-07-10 Thread Edgar Veiga
Hi Damien,

Well let's dive into this a little bit.

I told you guys that Bitcask was not an option due to a bad past
experience with couchbase (sorry, in the previous post I wrote couchdb),
which uses the same architecture as Bitcask: keys in memory and values on
disk.

We started the migration to couchbase, and were already using 3 physical
nodes and had only imported 5% of the real data! That was one of the main
reasons for choosing a solution like Riak + LevelDB: to remove the
keys-must-fit-in-memory bottleneck...

Now, here's a typical key (it's large :D, x=letter and 0=numbers):

xxx___0x_00_000_000xx

We are using PHP's native serialize function!

Best regards





Re: Migration from memcachedb to riak

2013-07-10 Thread Guido Medina
For the sake of using the right capacity planner, use the link for the
latest GA Riak version, which is 1.3.2, and perhaps come back after 1.4 is
fully released, which should happen really soon. Also check the release
notes between 1.3.2 and 1.4; they might give you ideas/good news.


http://docs.basho.com/riak/1.3.2/references/appendices/Bitcask-Capacity-Planning/

This link should change soon:
http://docs.basho.com/riak/1.4.0rc1/references/appendices/Bitcask-Capacity-Planning/

Guido.



Re: Migration from memcachedb to riak

2013-07-10 Thread damien krotkine
On 10 July 2013 11:03, Edgar Veiga  wrote:

> Hi Guido.
>
> Thanks for your answer!
>
> Bitcask isn't an option due to the amount of RAM needed... We would need
> a lot more physical nodes, so more money spent...
>

Why is it not an option?

If you use Bitcask, then each node needs to store its keys in memory. It's
usually not a lot. In a previous email I asked you the average length of
*keys*, but you gave us the average length of *values* :)

We have 1 billion keys and they fit on a 5-node ring (check out
http://docs.basho.com/riak/1.2.0/references/appendices/Bitcask-Capacity-Planning/).
Our bucket names are 1 letter; our keys are 10 chars long.
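As a rough illustration of that capacity planning (the per-entry overhead constant here is an assumption, not Basho's exact figure; see the capacity-planning page linked above), the keydir RAM needed per node can be estimated like this:

```python
# Rough Bitcask keydir RAM estimate. The static per-entry overhead
# (assumed ~40 bytes here) is an approximation; check the Bitcask
# capacity-planning docs for the real constants.
def bitcask_ram_per_node(n_keys, avg_key_len, bucket_len, n_nodes, n_val=3, overhead=40):
    per_entry = overhead + bucket_len + avg_key_len
    total = n_keys * n_val * per_entry   # every replica keeps a keydir entry
    return total / n_nodes               # bytes of RAM per node

# 1 billion keys, 10-char keys, 1-char bucket, 5 nodes, n_val=3:
gib = bitcask_ram_per_node(1_000_000_000, 10, 1, 5) / 2**30
print(f"{gib:.1f} GiB per node")  # prints: 28.5 GiB per node
```

With much longer keys (like the ~100-char keys discussed in this thread), the per-node figure grows roughly threefold, which is why key length matters so much for Bitcask.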

What does a typical key look like? Also, what are you using to serialize
your PHP objects? Maybe you could paste a typical value somewhere as well.

Damien




Re: Migration from memcachedb to riak

2013-07-10 Thread Edgar Veiga
Guido, we're not using Java, and that won't be an option.

The technology stack is PHP and/or Node.js.

Thanks anyway :)

Best regards




Re: Migration from memcachedb to riak

2013-07-10 Thread Edgar Veiga
Hi Damien,

We have ~11 keys and we are using ~2TB of disk space.
(The average object length will be ~2000 bytes.)

This is a lot to fit in memory (we have bad past experiences with
couchDB...).

Thanks for the rest of the tips!




Re: Migration from memcachedb to riak

2013-07-10 Thread Guido Medina
If you are using Java, you could store Riak keys as binaries using the
Jackson Smile format; supposedly it compresses faster and better than
default Java serialization. We use it for very large keys (say, a key
with a large collection of entries). The drawback is that you won't be
able to easily read that key with other clients; say, write in Java and
read from JavaScript, for example.

Application compression usually comes at the cost of performance and CPU
usage; you surely want to compress without taxing the CPU by a lot.

Reference: http://www.cowtowncoder.com/blog/blog.html

HTH,

Guido.




Re: Migration from memcachedb to riak

2013-07-10 Thread damien krotkine
( first post here, hi everybody... )

If you don't need MR, 2i, etc., then Bitcask will be faster. You just need
to make sure all your keys fit in memory, which should not be a problem.
How many keys do you have, and what's their average length?

About the values, you can save a lot of space by choosing an appropriate
serialization. We use Sereal [1] to serialize our data, and it's small
enough that we don't need to compress it further (it can automatically use
snappy to compress more). There is a PHP client [2].

If you use LevelDB, it can compress using snappy, but I've been a bit
disappointed by snappy, because it didn't work well with our data. If you
serialize your PHP objects as verbose strings (I don't know what the usual
way to serialize PHP objects is), then you should probably benchmark
different compression algorithms on the application side.


[1]: https://github.com/Sereal/Sereal/wiki/Sereal-Comparison-Graphs
[2]: https://github.com/tobyink/php-sereal/tree/master/PHP
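A minimal sketch of that application-side benchmark, using only stdlib codecs (the sample value is a made-up stand-in for a verbose PHP-serialized object; swap in snappy and your real values for meaningful numbers):

```python
import bz2, lzma, zlib

# Stand-in for a verbose PHP-serialized object; repeat it to get a
# realistically sized, fairly repetitive payload.
value = ('a:3:{s:4:"name";s:5:"Edgar";s:5:"email";s:16:"user@example.org";'
         's:4:"tags";a:2:{i:0;s:3:"php";i:1;s:4:"riak";}}') * 20
data = value.encode()

# Compare compressed sizes; on real data you would also time each codec.
for name, compress in [("zlib", zlib.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)]:
    out = compress(data)
    print(f"{name}: {len(data)} -> {len(out)} bytes ({len(out) / len(data):.0%})")
```

Ratios on real serialized objects will differ a lot from this repetitive sample, which is exactly why benchmarking on your own data matters.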



Re: Migration from memcachedb to riak

2013-07-10 Thread Guido Medina

Hi Edgar,

You don't need to compress your objects; LevelDB will do that for you,
and if you are using Protocol Buffers the network traffic stays compact
too, without compromising performance or any CPU-bound process. There
isn't anything special about the LevelDB config; I would suggest you try
the Riak defaults, which will work for 95%+ of the cases, start from
there, and see how that works for you.
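For reference, a minimal sketch of what selecting the eleveldb backend looks like in a Riak 1.x app.config (the data_root path is a placeholder; leaving the rest of the eleveldb section untouched keeps the defaults mentioned above):

```erlang
%% app.config fragment (Riak 1.x) -- sketch, path is a placeholder
{riak_kv, [
    %% use eleveldb as the storage backend
    {storage_backend, riak_kv_eleveldb_backend}
]},
{eleveldb, [
    %% data_root is usually the only setting to touch;
    %% compression is on by default
    {data_root, "/var/lib/riak/leveldb"}
]}
```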


HTH,

Guido.

On 10/07/13 10:03, Edgar Veiga wrote:

Hi Guido.

Thanks for your answer!

Bitcask it's not an option due to the amount of ram needed.. We would 
need a lot more of physical nodes so more money spent...


Instead we're using less machines with SSD disks to improve elevelDB 
performance.


Best regards



On 10 July 2013 09:58, Guido Medina > wrote:


Well, I rushed my answer before, if you want performance, you
probably want Bitcask, if you want compression then LevelDB, the
following links should help you decide better:

http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Bitcask/
http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/LevelDB/

Or multi, use one as default and then the other for specific buckets:

http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Multi/

HTH,

Guido.



On 10/07/13 09:53, Guido Medina wrote:

Then you are better off with Bitcask, that will be the fastest in
your case (no 2i, no searches, no M/R)

HTH,

Guido.

On 10/07/13 09:49, Edgar Veiga wrote:

Hello all!

I have a couple of questions that I would like to put to all of
you, in order to start this migration as well as possible.

Context:
- I'm responsible for the migration of a pure key/value store
that is currently stored in memcacheDB.
- We're serializing PHP objects and storing them.
- The total size occupied is ~2TB.

- The idea is to migrate this data to a Riak cluster with the
eLevelDB backend (starting with 6 nodes, 256 partitions; this
thing is scaling very fast).
- We only need to access the information by key. *We won't need
map/reduce, search, or secondary indexes*. It's a pure
key/value store!

My questions are:
- Do you have any Riak fine-tuning tips for this use case
(given that we will only use the key/value
capabilities of Riak)?
- It's expected that those 2TB will shrink thanks to
LevelDB compression. Do you think we should also compress our
objects on the client?

Best regards,
Edgar Veiga
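[Editor's note] The "6 nodes, 256 partitions" setup above can be pictured with a toy consistent-hashing sketch (an illustration of the idea only, not Riak's actual ring or preference-list code): each key hashes to one of 256 fixed partitions, and partitions are dealt out to nodes, so growing the cluster re-deals partitions rather than rehashing keys.

```python
import hashlib

PARTITIONS = 256
NODES = 6

def partition_for(key: str) -> int:
    # Hash the key onto the 256-partition ring.
    h = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(h[:4], "big") % PARTITIONS

def node_for(partition: int) -> int:
    # Toy placement: deal partitions round-robin across nodes.
    return partition % NODES

counts = [0] * NODES
for i in range(10_000):
    counts[node_for(partition_for(f"key-{i}"))] += 1
print(counts)  # roughly even spread across the 6 nodes
```

Note that 256 is not divisible by 6, so some nodes own one more partition than others; power-of-two ring sizes pair most evenly with node counts that divide them.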


___
riak-users mailing list
riak-users@lists.basho.com  
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com







Re: Migration from memcachedb to riak

2013-07-10 Thread Edgar Veiga
Hi Guido.

Thanks for your answer!

Bitcask isn't an option due to the amount of RAM needed; we would need
a lot more physical nodes, and therefore more money spent...

Instead we're using fewer machines with SSD disks to improve eLevelDB
performance.

Best regards



On 10 July 2013 09:58, Guido Medina  wrote:

>  Well, I rushed my answer before. If you want performance, you probably
> want Bitcask; if you want compression, then LevelDB. The following links
> should help you decide better:
>
> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Bitcask/
> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/LevelDB/
>
> Or multi, use one as default and then the other for specific buckets:
>
> http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Multi/
>
> HTH,
>
> Guido.
>
>
>
> On 10/07/13 09:53, Guido Medina wrote:
>
> Then you are better off with Bitcask; that will be the fastest in your
> case (no 2i, no searches, no M/R).
>
> HTH,
>
> Guido.
>
> On 10/07/13 09:49, Edgar Veiga wrote:
>
> Hello all!
>
>  I have a couple of questions that I would like to put to all of you, in
> order to start this migration as well as possible.
>
>  Context:
> - I'm responsible for the migration of a pure key/value store that is
> currently stored in memcacheDB.
> - We're serializing PHP objects and storing them.
> - The total size occupied is ~2TB.
>
>  - The idea is to migrate this data to a Riak cluster with the eLevelDB
> backend (starting with 6 nodes, 256 partitions; this thing is scaling very
> fast).
> - We only need to access the information by key. *We won't need
> map/reduce, search, or secondary indexes*. It's a pure key/value store!
>
>  My questions are:
> - Do you have any Riak fine-tuning tips for this use case (given that we
> will only use the key/value capabilities of Riak)?
>  - It's expected that those 2TB will shrink thanks to LevelDB
> compression. Do you think we should also compress our objects on the client?
>
>  Best regards,
> Edgar Veiga
>
>


Re: Migration from memcachedb to riak

2013-07-10 Thread Guido Medina
Well, I rushed my answer before. If you want performance, you probably 
want Bitcask; if you want compression, then LevelDB. The following links 
should help you decide better:


http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Bitcask/
http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/LevelDB/

Or multi, use one as default and then the other for specific buckets:

http://docs.basho.com/riak/1.2.0/tutorials/choosing-a-backend/Multi/

HTH,

Guido.


On 10/07/13 09:53, Guido Medina wrote:
Then you are better off with Bitcask; that will be the fastest in your 
case (no 2i, no searches, no M/R).


HTH,

Guido.

On 10/07/13 09:49, Edgar Veiga wrote:

Hello all!

I have a couple of questions that I would like to put to all of you, 
in order to start this migration as well as possible.


Context:
- I'm responsible for the migration of a pure key/value store that 
is currently stored in memcacheDB.

- We're serializing PHP objects and storing them.
- The total size occupied is ~2TB.

- The idea is to migrate this data to a Riak cluster with the eLevelDB 
backend (starting with 6 nodes, 256 partitions; this thing is scaling 
very fast).
- We only need to access the information by key. *We won't need 
map/reduce, search, or secondary indexes*. It's a pure 
key/value store!


My questions are:
- Do you have any Riak fine-tuning tips for this use case (given 
that we will only use the key/value capabilities of Riak)?
- It's expected that those 2TB will shrink thanks to LevelDB 
compression. Do you think we should also compress our objects on the 
client?


Best regards,
Edgar Veiga




Re: Migration from memcachedb to riak

2013-07-10 Thread Guido Medina
Then you are better off with Bitcask; that will be the fastest in your 
case (no 2i, no searches, no M/R).


HTH,

Guido.

On 10/07/13 09:49, Edgar Veiga wrote:

Hello all!

I have a couple of questions that I would like to put to all of you, 
in order to start this migration as well as possible.


Context:
- I'm responsible for the migration of a pure key/value store that is 
currently stored in memcacheDB.

- We're serializing PHP objects and storing them.
- The total size occupied is ~2TB.

- The idea is to migrate this data to a Riak cluster with the eLevelDB 
backend (starting with 6 nodes, 256 partitions; this thing is scaling 
very fast).
- We only need to access the information by key. *We won't need 
map/reduce, search, or secondary indexes*. It's a pure 
key/value store!


My questions are:
- Do you have any Riak fine-tuning tips for this use case (given 
that we will only use the key/value capabilities of Riak)?
- It's expected that those 2TB will shrink thanks to LevelDB 
compression. Do you think we should also compress our objects on the client?


Best regards,
Edgar Veiga

