On 28/09/14 12:48, snacktime wrote:
I'm looking for some feedback on the design I'm using for a basic key/value storage using postgres.

Just some quick background. This design is for large-scale games that can reach 10K writes per second or more. The storage will sit behind a distributed memory cache built on top of Akka, which has a write-behind caching mechanism to cut down on the number of writes when there are many updates to the same key in a short time period, which is common for a lot of multiplayer games.

I have been using Couchbase, but this is an open source project, and Couchbase is basically a commercial product for all intents and purposes, which is problematic. I will still support Couchbase, but I don't want to have to tell people that if they really want to scale, Couchbase is the only option.

The schema is that a key is a string, and the value is a string or binary. I am actually storing protocol buffer messages, but the library gives me the ability to serialize to native protobuf or to JSON. JSON is useful at times, especially for debugging.

This is my current schema:

CREATE TABLE entities
(
  id character varying(128) NOT NULL,
  value bytea,
  datatype smallint,
  CONSTRAINT entities_pkey PRIMARY KEY (id)
);

-- rule name is illustrative
CREATE OR REPLACE RULE entities_merge AS
    ON INSERT TO entities
   WHERE (EXISTS (SELECT 1
           FROM entities entities_1
          WHERE entities_1.id::text = new.id::text)) DO INSTEAD UPDATE entities SET value = new.value, datatype = new.datatype
  WHERE entities.id::text = new.id::text;
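
As a rough illustration of what such a rule is meant to achieve, a plain INSERT on an existing id should end up updating the row instead of failing on the primary key. The key 'player:alice' and the byte values are made up for this sketch, and it assumes the entities table and an insert rule of this shape are already in place:

```sql
-- First write: no row with this id yet, so the row is inserted.
INSERT INTO entities (id, value, datatype)
VALUES ('player:alice', decode('0a03', 'hex'), 1);

-- Second write to the same id: the rule should turn this into an UPDATE.
INSERT INTO entities (id, value, datatype)
VALUES ('player:alice', decode('0a04', 'hex'), 1);

-- There should be a single row for the key, carrying the second value.
SELECT id, encode(value, 'hex') AS value_hex, datatype
  FROM entities
 WHERE id = 'player:alice';
```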

Additional functionality I want is to do basic fuzzy searches by key. Currently I'm using a left-anchored LIKE query. This works well because keys are prefixed with a scope, a delimiter, and then the actual key for the data. These fuzzy searches would never be used in game logic; they would be admin-only queries for doing things like obtaining a list of players, so they should be infrequent.
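
A minimal sketch of such a left-anchored search, assuming a scope named 'player' and ':' as the delimiter (both invented here for illustration, as is the index name). If the database does not use the "C" locale, the default btree index on id generally cannot serve LIKE prefix matches, so an extra index with the varchar_pattern_ops operator class may be needed:

```sql
-- Optional: index that supports LIKE 'prefix%' in non-C locales.
-- The index name is hypothetical.
CREATE INDEX entities_id_pattern_idx
    ON entities (id varchar_pattern_ops);

-- List every key in the (assumed) 'player' scope.
-- Note that '_' and '%' inside the prefix are LIKE wildcards
-- and would need escaping if they can occur in scope names.
SELECT id
  FROM entities
 WHERE id LIKE 'player:%'
 ORDER BY id;
```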

The scope of the query ability will not expand in the future. I support multiple backends for the key/value storage so I'm working with the lowest common denominator. Plus I have a different approach for data that you need to do complex queries on (regular tables and an ORM).

I suspect that what I suggest below will probably NOT improve performance, and may not necessarily be appropriate for your use case. However, it may facilitate a wider range of queries, and might be easier to understand.

Note the comment about using 'PRIMARY KEY' in http://www.postgresql.org/docs/9.2/static/sql-createtable.html

   The primary key constraint specifies that a column or columns of a
   table can contain only unique (non-duplicate), nonnull values.
   Technically, PRIMARY KEY is merely a combination of UNIQUE and NOT
   NULL, but identifying a set of columns as primary key also provides
   metadata about the design of the schema, as a primary key implies
   that other tables can rely on this set of columns as a unique
   identifier for rows.

My first thought was to simplify the CREATE TABLE, though I think the length check on the id is best done in the software updating the database:

   CREATE TABLE entities
   (
        id         text PRIMARY KEY,
        value      bytea,
        datatype   smallint,
        CONSTRAINT id_too_long CHECK (length(id) <= 128)
   );
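
To illustrate the effect of the named constraint, here is a small sketch assuming the table above has been created; the test ids are generated with repeat() purely for demonstration:

```sql
-- A 128-character id satisfies the CHECK and is accepted.
INSERT INTO entities (id, value, datatype)
VALUES (repeat('x', 128), NULL, 0);

-- A 129-character id violates the CHECK, so this statement is rejected
-- with an error that names the constraint id_too_long.
INSERT INTO entities (id, value, datatype)
VALUES (repeat('x', 129), NULL, 0);
```

Naming the constraint is what makes the error message point directly at the rule that was broken.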

Then I noticed that your id is actually a compound key, and probably would be better modelled as:

   CREATE TABLE entities
   (
        scope      text,
        key        text,
        value      bytea,
        datatype   smallint,
        CONSTRAINT entities_pkey PRIMARY KEY (scope, key)
   );
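
With the compound key, the admin-style listing becomes an equality test on scope rather than a LIKE on a concatenated id. A sketch, where the scope 'player' and the key 'alice' are invented for illustration:

```sql
-- List all keys in one scope; the (scope, key) primary key index
-- has scope as its leading column, so it can serve this lookup.
SELECT key
  FROM entities
 WHERE scope = 'player'
 ORDER BY key;

-- Fetch a single entity by its full compound key.
SELECT value, datatype
  FROM entities
 WHERE scope = 'player'
   AND key = 'alice';
```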

I suspect that making 'datatype' an 'int' would improve performance, but only by a negligible amount!

