On 28/09/14 12:48, snacktime wrote:
I'm looking for some feedback on the design I'm using for a basic
key/value storage using postgres.
Just some quick background. This design is for large scale games that
can get up to 10K writes per second or more. The storage will be
behind a distributed memory cache that is built on top of Akka, and
has a write behind caching mechanism to cut down on the number of
writes when you have many updates in a short time period of the same
key, which is common for a lot of multiplayer type games.
I have been using Couchbase, but this is an open source project, and
Couchbase is basically a commercial product for all intents and
purposes, which is problematic. I will still support Couchbase, but I
don't want it have to tell people if you really want to scale,
couchbase is the only option.
The schema is that a key is a string, and the value is a string or
binary. I am actually storing protocol buffer messages, but the
library gives me the ability to serialize to native protobuf or to
json. Json is useful at times especially for debugging.
This is my current schema:
CREATE TABLE entities
(
id character varying(128) NOT NULL,
value bytea,
datatype smallint,
CONSTRAINT entities_pkey PRIMARY KEY (id)
);
CREATE OR REPLACE RULE entities_merge AS
ON INSERT TO entities
WHERE (EXISTS ( SELECT 1
FROM entities entities_1
WHERE entities_1.id::text = new.id::text)) DO INSTEAD
UPDATE entities SET value = new.value, datatype = new.datatype
WHERE entities.id::text = new.id::text;
Additional functionality I want is to do basic fuzzy searches by key.
Currently I'm using a left anchored LIKE query. This works well
because keys are left prefixed with a scope, a delimiter, and then the
actual key for the data. These fuzzxy searches would never be used in
game logic, they would be admin only queries for doing things like
obtaining a list of players. So they should be infrequent.
The scope of the query ability will not expand in the future. I
support multiple backends for the key/value storage so I'm working
with the lowest common denominator. Plus I have a different approach
for data that you need to do complex queries on (regular tables and an
ORM).
Chris
Note:
I suspect that what I suggest below will probably NOT improve
performance, and may not necessarily be appropriate for your use case.
However, they may facilitate a wider range of queries, and might be
easier to understand.
Note the comment about using 'PRIMARY KEY' in
http://www.postgresql.org/docs/9.2/static/sql-createtable.html
[...]
The primary key constraint specifies that a column or columns of a
table can contain only unique (non-duplicate), nonnull values.
Technically, PRIMARY KEY is merely a combination of UNIQUE and NOT
NULL, but identifying a set of columns as primary key also provides
metadata about the design of the schema, as a primary key implies
that other tables can rely on this set of columns as a unique
identifier for rows.
[...]
My first thought was to simplify the table create, though I think the
length check on the id is best done in the software updating the databased:
CREATE TABLE entities
(
id text PRIMARY KEY,
value bytea,
datatype smallint,
CONSTRAINT id_too_long CHECK (length(id) <= 128)
);
Then I noticed that your id is actually a compound key, and probably
would be better modelled as:
CREATE TABLE entities
(
scope text,
key text,
value bytea,
datatype smallint,
CONSTRAINT entities_pkey PRIMARY KEY (scope, key)
);
I suspect that making 'datatype' an 'int' would improve performance, but
only by a negligible amount!
Cheers,
Gavin