Also, something I forgot in my reply: make sure your Riak client is connected to every node and not only to a single node (the cluster config doesn't work that well, so try HAProxy, and make sure you are using protocol buffers).

HAProxy sample config: https://gist.github.com/gburd/1507077

Then use a single PB config like this one, which connects to the HAProxy load balancer, assuming it is running on localhost and is itself connected to each node:

final PBClientConfig clientConfig = new PBClientConfig.Builder().withHost("127.0.0.1").withPort(8087).withPoolSize(N).build();

Guido.

On 13/02/13 10:29, Guido Medina wrote:
Are you transferring using a single thread? If so, I would recommend using a ThreadPoolExecutor and scheduling each write as a task, tracking the failures (if any) with either an AtomicInteger or a concurrent/synchronized list of the keys that failed.

No matter how much you tune, a single-threaded transfer won't help you at all. We have done transfers many times and, depending on the size of the DB table, we use either a single thread or a thread pool service. Try 8 threads and see the difference, assuming you have N connections in your Riak client where N > max thread pool size.
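
A minimal sketch of the thread-pool approach described above, assuming Java 8 or later. The ParallelLoader class and its Writer interface are hypothetical names introduced for illustration; Writer stands in for whatever performs the actual write (for example the bucket.store(...) call quoted later in this thread) and is not part of the Riak client. Failures are counted with an AtomicInteger and the failed keys are collected in a concurrent queue so they can be retried later.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ParallelLoader {

    /** Hypothetical stand-in for the real write operation (not a Riak client type). */
    interface Writer {
        void write(String key, String value) throws Exception;
    }

    private final AtomicInteger failureCount = new AtomicInteger();
    private final Queue<String> failedKeys = new ConcurrentLinkedQueue<>();

    /** Submits one task per record to a fixed-size pool and waits for all of them to finish. */
    void load(Map<String, String> records, Writer writer, int threads) throws InterruptedException {
        // Executors.newFixedThreadPool is backed by a ThreadPoolExecutor.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (Map.Entry<String, String> entry : records.entrySet()) {
            pool.submit(() -> {
                try {
                    writer.write(entry.getKey(), entry.getValue());
                } catch (Exception ex) {
                    // Record the failure; the key can be retried after the run.
                    failureCount.incrementAndGet();
                    failedKeys.add(entry.getKey());
                }
            });
        }
        pool.shutdown();                           // accept no new tasks
        pool.awaitTermination(1, TimeUnit.HOURS);  // the timeout is an arbitrary assumption
    }

    int failureCount() {
        return failureCount.get();
    }

    Queue<String> failedKeys() {
        return failedKeys;
    }
}
```

For a very large load, the records would more likely be streamed from the source table in batches than held in one Map, since the pool's work queue is unbounded in this sketch.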

You might want to remove pw=1 when using multi-threading so Riak doesn't fall too far behind (eLevelDB catching up? whatever that's called); pw=1 adds more risk than the benefit you gain.

Hope that helps,

Guido.

On 13/02/13 09:44, Bogdan Flueras wrote:
Ok, so I've done something like this:
Bucket bucket = client.createBucket("foo"); // lastWriteWins(true) doesn't work for Protobuf

when I insert I have:
bucket.store(someKey, someValue).withoutFetch().pw(1).execute();

It looks like it's 20% faster than before. Is there anything else I could tweak?

ing. Bogdan Flueras



On Wed, Feb 13, 2013 at 10:19 AM, Bogdan Flueras <[email protected] <mailto:[email protected]>> wrote:

    Each thread has its own bucket instance (pointing to the same
    location) and I don't re-fetch the bucket per insert.
    Thank you very much!

    ing. Bogdan Flueras



    On Wed, Feb 13, 2013 at 10:14 AM, Russell Brown
    <[email protected] <mailto:[email protected]>> wrote:


        On 13 Feb 2013, at 08:07, Bogdan Flueras
        <[email protected] <mailto:[email protected]>>
        wrote:

        > How to set the bucket to last write? Is it in the builder?

        Something like:

            Bucket b =
        client.createBucket("my_bucket").lastWriteWins(true);

        Also, after you've created the bucket, do you use it from all
        threads? You don't re-fetch the bucket per-insert operation,
        do you?

        But the "withoutFetch()" option is probably going to be the
        biggest performance increase, and safe if you are only doing
        inserts.

        Cheers

        Russell

        > I'll have a look..
        > Yes, I use more threads and the bucket is configured to
        spread the load across all nodes.
        >
        > Thanks, I'll have a deeper look into the API and let you
        know about my results.
        >
        > ing. Bogdan Flueras
        >
        >
        >
        > On Wed, Feb 13, 2013 at 10:02 AM, Russell Brown
        <[email protected] <mailto:[email protected]>> wrote:
        > Hi,
        >
        > On 13 Feb 2013, at 07:37, Bogdan Flueras
        <[email protected] <mailto:[email protected]>>
        wrote:
        >
        > > Hello all,
        > > I've got a 5 node cluster with Riak 1.2.1, all machines
        are multicore,
        > > with min 4GB RAM.
        > >
        > > I want to insert something like 50 million records in
        Riak with the java client (Protobuf used) with default
        settings.  I've tried also with HTTP protocol and set w = 1
        but got some problems.
        > >
        > > However the process is very slow: it doesn't write more
        than 6GB/hour or approx. 280 KB/second.
        > > To have all my data filled in, it would take approx. 2 days!
        > >
        > > What can I do to have the data filled into Riak ASAP?
        > > How should I configure the cluster ? (vm.args/
        app.config) I don't care so much about consistency at this point.
        >
        > If you are certain to be only inserting new data setting
        your bucket(s) to last write wins will speed things up. Also,
        are you using multiple threads for the Java client insert?
        Spreading the load across all five nodes? Are you using the
        "withoutFetch()" option on the java client?
        >
        > Cheers
        >
        > Russell
        >
        > >
        > > Thank you,
        > > ing. Bogdan Flueras
        > >
        > > _______________________________________________
        > > riak-users mailing list
        > > [email protected]
        <mailto:[email protected]>
        > >
        http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
        >
        >





_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

