On Wed, Aug 10, 2011 at 16:30, Jeremiah Peschka
<[email protected]> wrote:
> Ciprian's response this morning has some interesting information in it - I
> have suspicions about where the PostgreSQL and MySQL performance problems lie.
> Unfortunately, my suspicion also highlights why some people don't like using
> RDBMSes: when you get to higher performance scenarios you frequently need
> specialized knowledge to tune the RDBMS to run well.
Indeed, you need highly qualified DBAs to tune the RDBMS, and in my
case I've failed to do so successfully :). (I've asked here and there
for feedback, but nothing I have done has "revived" the insert speed.)
> Within an RDBMS, you'll get good performance by splitting out your writes
> into a partitioned table where you either group up writes by some constant
> value (modulus of sensor location, for example) or else you group writes by
> month. You can then spread your writes out across multiple storage devices,
> gaining some of the benefits of a system like Riak, and still getting the
> benefit of an RDBMS's sequential scans.
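>
> Roughly, in PostgreSQL's inheritance-partitioning style (the table and
> tablespace names here are just illustrative):
>
> CREATE TABLE sensor_readings (
>     sensor_id INT       NOT NULL,
>     read_at   TIMESTAMP NOT NULL,
>     value     DOUBLE PRECISION
> );
>
> -- One child table per month; the CHECK constraint lets the planner
> -- skip partitions that cannot match a range query.
> CREATE TABLE sensor_readings_2011_08 (
>     CHECK (read_at >= '2011-08-01' AND read_at < '2011-09-01')
> ) INHERITS (sensor_readings);
>
> -- Each partition can live in its own tablespace, i.e. on its own
> -- storage device (assumes the tablespace already exists).
> ALTER TABLE sensor_readings_2011_08 SET TABLESPACE disk2;
>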
> ---
> Jeremiah Peschka - Founder, Brent Ozar PLF, LLC
> Microsoft SQL Server MVP
But then the solution gets quite complicated: you end up constructing
a specialized "data-store" on top of a cluster of RDBMSes, when you
could have just started from the beginning on top of some lightweight
storage engine.
Ciprian.
> On Aug 9, 2011, at 6:38 PM, Paul O wrote:
>
>> Jeremiah, with all the feedback indicating that Riak is not such a
>> straight fit for the problem, even accepting batching compromises, I'll
>> have to strongly reconsider RDBMS solutions, though they come with
>> their own tradeoffs, as discussed here today. What about storing data
>> using innostore instead of bitcask? Would that be a "best of both
>> worlds" situation?
>>
>> Thanks for these and the previous valuable comments, again,
>>
>> Paul
>>
>> On Tue, Aug 9, 2011 at 11:14 AM, Jeremiah Peschka
>> <[email protected]> wrote:
>> Excellent points, Alex.
>>
>> When you compare Riak's storage overhead to something like an RDBMS,
>> where you have maybe 24 bytes of row overhead (as is the case for
>> PostgreSQL), you'll find that the RDBMS side offers a tremendous
>> economy of space.
>>
>> Riak is going to excel in applications where reads and writes will be truly
>> random. Yes, there are MapReduce features, but randomly organized data is
>> still randomly organized data.
>>
>> If you look at RDBMSes, horizontally partitioning your system through RDBMS
>> features (SQL Server's partitioned tables, PostgreSQL's partitioned views,
>> for example) gives you the ability to take advantage of many known
>> quantities in that world - range queries can take advantage of sequential
>> scan speeds across rotational disks.
>>
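>> As an illustration (assuming a table range-partitioned by month on a
>> read_at column, with PostgreSQL's constraint_exclusion setting
>> enabled), a query like
>>
>> SELECT sensor_id, AVG(value)
>> FROM sensor_readings
>> WHERE read_at >= '2011-07-01' AND read_at < '2011-08-01'
>> GROUP BY sensor_id;
>>
>> only touches the partitions whose constraints overlap the range, and
>> each of those can be read with fast sequential scans.
>>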
>> You can even avoid the overhead of a traditional RDBMS altogether and use
>> the InnoDB APIs or something like HandlerSocket to write data very quickly.
>>
>> ---
>> Jeremiah Peschka - Founder, Brent Ozar PLF, LLC
>> Microsoft SQL Server MVP
>>
>> On Aug 9, 2011, at 7:43 AM, Alexander Sicular wrote:
>>
>> > A couple of thoughts:
>> >
>> > -disk io
>> > -total keys versus memory
>> > -data on disk overhead
>> >
>> > As Jeremiah noted, disk io is crucial. Thankfully, Riak's distributed mesh
>> > gives you access to a number of spindles limited only by your budget. I
>> > think that is a critical bonus of a distributed system like Riak that is
>> > often not fully appreciated. Here Riak is a win for you.
>> > Bitcask needs all keys to fit in memory. We are talking something like:
>> >
>> > (key length + overhead) * number of keys * replicas < cluster max
>> > available ram.
>> >
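>> > For illustration only (all numbers are assumptions): with 20-byte
>> > keys, ~40 bytes of in-memory overhead per key, 100 million keys, and
>> > 3 replicas, that is (20 + 40) * 100M * 3 = ~18 GB of RAM across the
>> > cluster.
>> >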
>> > There is a tool on the wiki which should help figure this out. What that
>> > basically means for you is that you will have to batch your data by some
>> > sensor/time granularity. Let's say every minute: at 10 Hz, that is a
>> > 600x reduction in total keys (for instance, one object per sensor per
>> > minute, under a key like sensor42/201108081530 - the key scheme is only
>> > an illustration - whose value holds that minute's 600 samples). Of
>> > course, this doesn't come for free: your application middleware will
>> > have to accommodate it, and you could lose up to one granularity batch,
>> > i.e. a minute of sensor data, should your application fail. Here Riak
>> > is neutral to negative.
>> >
>> > Riak's data structure is not friendly towards small values. Sensors
>> > generally spit out integers or other small tuples. If you search the
>> > list archives you will find a magnificent data-overhead writeup; IIRC,
>> > it was something on the order of 450 bytes per object, so an 8-byte
>> > reading would be roughly 98% overhead. What that basically tells you
>> > is that you can't use bitcask for small values if disk space is a
>> > concern, as I imagine it is in this case. Also, sensor data is
>> > generally write-only, i.e. never deleted or modified, so compaction
>> > should not be a concern when using bitcask. Here Riak is a strong
>> > negative.
>> >
>> > Data retrieval issues aside (which between Riak Search/secondary
>> > indexes/third party indexes should not be a major concern), I am of the
>> > opinion that Riak is not a good fit for high frequency sensor data
>> > applications.
>> >
>> > Cheers,
>> > Alexander
>> >
>> > Sent from my rotary phone.
>> >
>> > On Aug 8, 2011 9:40 PM, "Paul O" <[email protected]> wrote:
>> > > Quite a few interesting points, thanks!
>> > >
>> > > On Mon, Aug 8, 2011 at 5:53 PM, Jeremiah Peschka
>> > > <[email protected]
>> > >> wrote:
>> > >
>> > >> Responses inline
>> > >>
>> > >> On Aug 8, 2011, at 1:25 PM, Paul O wrote:
>> > >>
>> > >> Will any existing data be imported? If this is totally greenfield, then
>> > >> you're free to do whatever zany things you want!
>> > >
>> > >
>> > > Almost totally greenfield, yes. Some data will need to be imported but
>> > > it's
>> > > already in the format described.
>> > >
>> > >> Ah, so you need IOPS throughput, not storage capacity. On the hardware
>> > >> side
>> > >> make sure your storage subsystem can keep up - don't cheap out on disks
>> > >> just
>> > >> because you have a lot of nodes. A single rotational HDD can only handle
>> > >> about 180 IOPS on average. There's a lot you can do on the storage
>> > >> backend
>> > >> to make sure you're able to keep up there.
>> > >>
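>> > >> Back-of-the-envelope (the sensor count is a made-up example):
>> > >> 1,000 sensors at 10 Hz is 10,000 writes/s, which at ~180 IOPS per
>> > >> rotational disk is on the order of 50+ disks if every sample were
>> > >> a random write - exactly why batching matters.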
>> > >
>> > > Indeed, storage capacity is also an issue but IOPS would be important,
>> > > too.
>> > > I assume that sending batches to Riak (opaque blobs) would help a lot
>> > > with
>> > > the quantity of writes, but it's still a very important point.
>> > >
>> > >> You may want to look into ways to force Riak to clean up the bitcask
>> > >> files.
>> > >> I don't entirely remember how it's going to handle cleaning up deleted
>> > >> records, but you might run into some tricky situations where compactions
>> > >> aren't occurring.
>> > >>
>> > >
>> > > Hm, any references regarding that? It would be a major snag in the
>> > > whole scheme if Riak doesn't properly reclaim space for deleted records.
>> > >
>> > >> Riak is pretty constant-time with Bitcask. The tricky part with the
>> > >> amount of
>> > >> data you're describing is that Bitcask requires (I think) that all keys
>> > >> fit
>> > >> into memory. As your data volume increases, you'll need to do a
>> > >> combination
>> > >> of scaling up and scaling out. Scale up RAM in the nodes and then add
>> > >> additional nodes to handle load. RAM will help with data volume, more
>> > >> nodes
>> > >> will help with write throughput.
>> > >>
>> > >
>> > > Indeed, for high-frequency sources that would create lots of bundles, even
>> > > the MaxN-to-1 reduction for key names might still generate loads of keys.
>> > > Any idea how much RAM Riak requires per record, or a reference that would
>> > > point me to it?
>> > >
>> > >> Since you're searching on time series, mostly, you could build time
>> > >> indexes in your RDBMS. The nice thing is that querying temporal data
>> > >> is well documented in the relational world, especially in the data
>> > >> warehousing world. In your case, I'd create a dates table and have a
>> > >> foreign key relating to my RDBMS index table to make it easy to
>> > >> search for dates. Querying your time table will be fast, which
>> > >> reduces the need for scans in your index table.
>> > >>
>> > >> EXAMPLE:
>> > >>
>> > >> CREATE TABLE timeseries (
>> > >>     time_key    INT PRIMARY KEY,  -- needed for the REFERENCES below
>> > >>     date        TIMESTAMP,
>> > >>     datestring  VARCHAR(30),
>> > >>     year        SMALLINT,
>> > >>     month       TINYINT,  -- TINYINT is SQL Server/MySQL; use SMALLINT on PostgreSQL
>> > >>     day         TINYINT,
>> > >>     day_of_week TINYINT
>> > >>     -- etc
>> > >> );
>> > >>
>> > >> CREATE TABLE riak_index (
>> > >>     id       INT NOT NULL,
>> > >>     time_key INT NOT NULL REFERENCES timeseries (time_key),
>> > >>     riak_key VARCHAR(100) NOT NULL
>> > >> );
>> > >>
>> > >> SELECT ri.riak_key
>> > >> FROM timeseries ts
>> > >> JOIN riak_index ri ON ts.time_key = ri.time_key
>> > >> WHERE ts.date BETWEEN '20090702' AND '20100702';
>> > >>
>> > >
>> > > My plan was to have the riak_index contain something like: (id,
>> > > start_time, end_time, source_id, record_count).
>> > >
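>> > > Spelled out as DDL, in the style of your example (the types are my
>> > > guesses):
>> > >
>> > > CREATE TABLE riak_index (
>> > >     id           INT NOT NULL,
>> > >     source_id    INT NOT NULL,
>> > >     start_time   TIMESTAMP NOT NULL,
>> > >     end_time     TIMESTAMP NOT NULL,
>> > >     record_count INT NOT NULL
>> > > );
>> > >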
>> > >> Without going too much into RDBMS fun, this pattern can get your RDBMS
>> > >> running pretty quickly and then you can combine that with Riak's
>> > >> performance
>> > >> and have a really good idea of how quick any query will be.
>> > >
>> > >
>> > > That's roughly the plan, thanks again for your contributions to the
>> > > discussion!
>> > >
>> > > Paul
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com