Re: Slow performance using linkwalk, help wanted

Alexander Sicular Mon, 08 Nov 2010 05:27:10 -0800

Try distributing your reads in a round robin against all your nodes.
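
A minimal sketch of what round-robin reads could look like on the client side. The node addresses are hypothetical, and the port 8098 and the /riak/ URL prefix are assumed to match the default HTTP interface; adjust both to your own cluster.

```python
from itertools import cycle

# Hypothetical node addresses; replace with the HTTP endpoints of your nodes.
NODES = [
    "http://riak1.example.com:8098",
    "http://riak2.example.com:8098",
    "http://riak3.example.com:8098",
    "http://riak4.example.com:8098",
]


def round_robin_urls(nodes, bucket):
    """Return a function that builds read URLs, rotating through the nodes."""
    node_cycle = cycle(nodes)

    def url_for(key):
        # Each call advances to the next node, so consecutive reads
        # are spread evenly over the whole cluster.
        return f"{next(node_cycle)}/riak/{bucket}/{key}"

    return url_for


if __name__ == "__main__":
    url_for = round_robin_urls(NODES, "document")
    for key in ["doc1", "doc2", "doc3", "doc4", "doc5"]:
        print(url_for(key))
```

The fifth URL wraps around to the first node again, so no single machine coordinates every request.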


Best, Alexander


@siculars on twitter
http://siculars.posterous.com

Sent from my iPhone

On Nov 8, 2010, at 6:36, Jan Buchholdt <[email protected]> wrote:

We are evaluating Riak for a project, but are having a hard time making it fast enough for our needs.
Our model is very simple and looks like this:

---------------------                         * ---------------------
|       Person      | ------------------------> |   Document        |
---------------------                           ---------------------

We have a set of persons and each person can have many documents.

Our typical queries are:
Get an overview of all the person's documents. This query returns the person along with a subset of data from all the person's documents.
Get document by id.
Our requirements are that these queries should complete in under 100 ms at a load of 10 requests per second or less.
The size of the data:
A document is approximately 1 kb
No data for a person except the person identifier
Around 6 million persons.
Each person has from 0 to a couple of thousand documents.
All in all we have 120 million documents.
Most persons don't have more than 1 to 10 documents, but we have a few "heavy" persons with 500 to 1000 documents.
Riak setup:
4 Nodes.
Hardware configuration for each node:
HP ProLiant DL360 G7
18 GB RAM
SAS discs
Intel(R) Xeon(R) CPU E5620 @ 2.40GHz Proc 1
Solaris 10 update 9

We use the default bitcask storage engine
We replicate data to 3 machines when it is written.
Reads are served from just one machine
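
A rough back-of-the-envelope estimate of the data volume per node, using only the figures above. It assumes data is spread evenly over the nodes and ignores keys, metadata and storage overhead, so treat the result as an order of magnitude rather than a measurement.

```python
# Figures taken from the description above.
DOCUMENTS = 120_000_000   # total number of documents
DOC_SIZE_KB = 1           # approximate size of one document
REPLICAS = 3              # each write is replicated to 3 machines
NODES = 4                 # machines in the cluster
RAM_PER_NODE_GB = 18      # memory per machine

# Assumption: decimal units (1 GB = 1,000,000 KB) and an even spread over nodes.
raw_gb = DOCUMENTS * DOC_SIZE_KB / 1_000_000
replicated_gb = raw_gb * REPLICAS
per_node_gb = replicated_gb / NODES

if __name__ == "__main__":
    print(f"raw data:        {raw_gb:.0f} GB")
    print(f"with replicas:   {replicated_gb:.0f} GB")
    print(f"per node:        {per_node_gb:.0f} GB")
    print(f"data/RAM ratio:  {per_node_gb / RAM_PER_NODE_GB:.1f}x")
```

Under these assumptions each node holds several times more document data than it has RAM, so a large share of document reads would have to go to disk.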
We tried implementing our data model using Riak links as described below:
Persons are stored in a person bucket using their person identifier as key: /person/{personid}
Documents are saved in another bucket: /document/{documentid}
At each person we store links to the person's documents.
We are having problems with the query that fetches all the documents for a person. Reading all the documents for a person is done using a link walk. The link walk starts by reading all the document keys using the personid. It then fetches all the documents. For persons with 1 - 5 documents the response times are often over 100 ms. And for the "heavy" persons with many documents, response times are several seconds. But we are very new to Riak and are probably using a wrong approach.
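
For reference, the link setup and link walk described above could be expressed over the HTTP interface roughly as follows. This is a sketch under assumptions: the /riak/ prefix, the link tag "document" and the helper names are illustrative, not taken from the setup described here.

```python
def link_header(doc_ids, bucket="document", tag="document"):
    """Build the Link header value stored on a person object.

    One entry per document; the tag name is a hypothetical choice.
    """
    return ", ".join(
        f'</riak/{bucket}/{doc_id}>; riaktag="{tag}"' for doc_id in doc_ids
    )


def link_walk_url(base_url, person_id, bucket="document"):
    """Build a link-walk URL of the form bucket,tag,keep.

    "_" matches any tag and the trailing 1 asks for the linked
    objects of this step to be returned in the response.
    """
    return f"{base_url}/riak/person/{person_id}/{bucket},_,1"


if __name__ == "__main__":
    print(link_header(["doc1", "doc2"]))
    print(link_walk_url("http://localhost:8098", "person42"))
```

A single walk request of this form returns every linked document in one multipart response, so the cost grows with the number of links on the person.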
Below are our thoughts (having almost no experience with Riak):
The chosen data model is good for writes. Writing a new document results in 3 operations against Riak: writing the document using its id as key, reading the person to get all the person's document links, and appending the new document's key to the person's links and writing the person back.
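
The three operations can be sketched against an in-memory stand-in for the store. CountingStore and add_document are hypothetical names used only for illustration; a real client would issue the equivalent get and put requests to Riak.

```python
class CountingStore:
    """In-memory stand-in for a key/value store that counts operations."""

    def __init__(self):
        self.data = {}
        self.ops = 0

    def get(self, bucket, key, default=None):
        self.ops += 1
        return self.data.get((bucket, key), default)

    def put(self, bucket, key, value):
        self.ops += 1
        self.data[(bucket, key)] = value


def add_document(store, person_id, doc_id, body):
    # 1. Write the document under its own id.
    store.put("document", doc_id, body)
    # 2. Read the person to get the current list of document links.
    links = store.get("person", person_id, [])
    # 3. Append the new key and write the person back.
    store.put("person", person_id, links + [doc_id])


if __name__ == "__main__":
    store = CountingStore()
    add_document(store, "person42", "doc1", "first document")
    print(store.ops)  # three operations per added document
```

Steps 2 and 3 form a read-modify-write cycle, so two clients adding documents to the same person at the same time could overwrite each other's link unless conflicts are detected and merged.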
Reading, using a link walk, is slow because it is expensive to fetch many documents, even though the link walk can read their keys right away by reading the links for the person. Even though we have 4 nodes and link walks are parallelized, many documents need to be retrieved from one node. Having to fetch, for example, 100 documents on one node (one disc) is expensive. We do not know how data is stored, but are afraid Riak is doing a lot of disk seeks.
We are considering another, more denormalized approach where we write all the documents for a person in one "blob". But then we are afraid our writes will become slow, because when adding a new document the blob must be read, the new document inserted, and the blob written back.
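
A minimal sketch of the blob approach, again against an in-memory stand-in (a plain dict). The JSON encoding and the function name are assumptions made for illustration. The function returns how many characters were read and written, which shows how the cost of one insert grows with the number of documents the person already has.

```python
import json


def add_document_to_blob(store, person_id, doc_id, body):
    """Read the person's blob, insert one document, write the blob back.

    Returns (chars_read, chars_written) for this single insert.
    """
    raw = store.get(person_id)
    docs = json.loads(raw) if raw else {}
    docs[doc_id] = body
    new_raw = json.dumps(docs)
    store[person_id] = new_raw
    return len(raw or ""), len(new_raw)


if __name__ == "__main__":
    store = {}
    body = "x" * 1000  # roughly 1 kb per document
    for i in range(1000):
        read, written = add_document_to_blob(store, "heavy", f"doc{i}", body)
    # For a "heavy" person the last insert moves about 1 MB each way.
    print(read, written)
```

For the typical person with 1 to 10 documents the blob stays around 10 kb, so the extra write cost would mostly affect the few heavy persons, while reading the overview becomes a single fetch.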
We could really use some input. Are our assumptions wrong? (We have not yet dug into the problems.) Is there a good data model for our requirements? We haven't looked at Riak Search at all. Maybe it could solve some of our problems.
--
Jan Buchholdt
Software Pilot
Trifork A/S
Cell +45 50761121

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com

