Sorry, that was not meant to be sent to the list...

On Tue, Apr 28, 2009 at 3:27 PM, tim robertson
<timrobertson...@gmail.com> wrote:
> Hi,
>
> [Ankur]: How can I make sure this happens?
> -- "show processlist" is how we spot it... on our set-up the offending
> query literally takes hours, so it is easy to find.
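>
> (For reference, that just means watching the output of
>
>    show full processlist;
>
> and scanning the Time column -- a query that has already been running
> for thousands of seconds stands out immediately.)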
>
> So we ended up with 2 DBs
> - DB1, which we insert into, prepare, and do batch processing on
> - DB2, which serves the read-only web app
>
> Periodically we dump DB1, point the read-only web app at DB1, load
> the dump into DB2, and then DB2 becomes the write DB... and so on, in
> an endless cycle.  We can afford that data latency in our system though.
>
> Why did we end up with this?  Because of locking on writes that kills
> reads, as you say... basically you can't insert while a read is
> happening on MyISAM, as it locks the whole table.  InnoDB has row-level
> locking to get around this, but in our experience (at the time we had
> 130 million records) it just didn't work either.  We spent €10,000 on
> the supposed "European expert" on MySQL from their professional
> services and were unfortunately very disappointed.  It seems such large
> tables are just problematic with MySQL.  We are now very much looking
> into Lucene technologies for search, and Hadoop for reporting and
> data-mining type operations.  SOLR does a lot of what our DB does for
> us.
>
> So with myisam... here is what we learnt:
>
> Only the very latest MySQL versions (still in beta, I think) support
> more than 4G of memory for indexes (you really, really need the index
> in memory, and where possible the FKs used in joins in the index too).
> MySQL has differing join strategies between InnoDB and MyISAM, so be
> aware.
> A lesser-known feature of MyISAM is that you can create dedicated
> memory buffers (named key caches) for single indexes.
> In the my.cnf:
>     # for some reason you have to stay a little under 4G
>     taxon_concept_cache.key_buffer_size = 3990M
>
> then in the DB run:
>    cache index taxon_concept in taxon_concept_cache;
>    load index into cache taxon_concept;
>
> This lets you make sure that a given index really does end up in memory.
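>
> (If you would rather not touch my.cnf, the same named cache can be
> created at runtime with the structured-variable syntax -- roughly like
> this, from memory, so double-check against the manual:)
>
>    set global taxon_concept_cache.key_buffer_size = 3990*1024*1024;
>    select @@taxon_concept_cache.key_buffer_size;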
>
> And here are our main my.cnf changes:
>
> table_cache = 512
> sort_buffer_size = 128M
> read_buffer_size = 64M
> read_rnd_buffer_size = 300M
> thread_cache_size = 8
> query_cache_type = 0
> query_cache_size = 0
> thread_concurrency = 8
> # this is used for repair by sorting and is faster than keycache by far
> myisam_max_sort_file_size = 200G
> myisam_max_extra_sort_file_size = 200G
> # this is used for repair by sorting and is limited to 4G
> myisam_sort_buffer_size = 3990M
> # maximum value for this is 4G for now, but see the section above on
> # per-key caches for large indexes
> key_buffer_size=3990M
> tmp_table_size=512M
> max_heap_table_size=512M
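>
> (A quick sanity check, if you want to see whether reads are actually
> being served from the key buffer -- the ratio of Key_reads to
> Key_read_requests should stay very small:)
>
>    show global status like 'Key_read%';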
>
>
> I guess just archive this mail away in case you ever do try MyISAM for
> something... these params are the result of a lot of trial and error
> and days of waiting!
>
> Our DB has about 44G of index, which we can keep in memory, but we are
> hitting limits - hence we're on the Hadoop train.
>
> Cheers
>
> Tim
>
>
>
>
>
> On Tue, Apr 28, 2009 at 2:49 PM, Wang Zhong <wangzhong....@gmail.com> wrote:
>> Hi,
>>
>> That's really cool. It seems that Hadoop can work with SQL DBs like
>> MySQL in bearable time. I had thought that when inserting data into
>> MySQL the communication overhead was always a big problem, but you
>> found a method to reduce that cost.
>>
>> Using a distributed database like HBase is another good choice. It is
>> based on Hadoop and HDFS, so the communication overhead is not a
>> problem there.
>>
>>
>>
>> On Tue, Apr 28, 2009 at 6:46 PM, Ankur Goel <ankur.g...@corp.aol.com> wrote:
>>>
>>> hello hadoop users,
>>> Recently I had a chance to lead a team building a log-processing system 
>>> that uses Hadoop and MySQL. The system's goal was to process the incoming 
>>> information as quickly as possible (real time or near real time), and make 
>>> it available for querying in MySQL. I thought it would be good to share the 
>>> experience and the challenges with the community. Couldn't think of a 
>>> better place than these mailing lists as I am not much of a blogger :-)
>>>
>>> The information flow in the system looks something like
>>>
>>> [Apache-Servers] -> [Hadoop] -> [MySQL-shards] -> [Query-Tools]
>>>
>>> Transferring from Apache-Servers to Hadoop was quite easy, as we just had to
>>> organize the data into timely buckets (directories). Once that was running
>>> smoothly, we had to make sure that map-reduce jobs were fired at regular
>>> intervals and picked up the right data. The jobs would then
>>> process/aggregate the data and dump the results into MySQL shards from the
>>> reducers [we have our own DB partitioning set up]. This is where we hit the
>>> major bottlenecks [any surprises? :-)]
>>>
>>> The table engine used was InnoDB, as there was a need for fast replication
>>> and writes but only moderate reads (it should eventually support high read
>>> rates). The data would take quite a while to load completely - far from
>>> being near real time. And so our optimization journey began.
>>>
>>> 1. We tried to optimize/tune InnoDB parameters, like increasing the buffer
>>> pool size to 75% of available RAM. This helped, but only while the DBs
>>> were lightly loaded, i.e. while InnoDB had a sufficient buffer pool to hold
>>> the data and indexes.
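>>>
>>> (For the curious, the standard way to eyeball the buffer pool is something
>>> like the following -- nothing here is specific to our setup:)
>>>
>>> SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
>>> -- the "BUFFER POOL AND MEMORY" section of this output shows free buffers
>>> -- and how many pages are being read from disk:
>>> SHOW ENGINE INNODB STATUS\G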
>>>
>>> 2. We also realized that InnoDB has considerable locking overhead, because
>>> of which write concurrency is really bad when you have a large number of
>>> concurrent threads doing writes. Our thread concurrency was set to the
>>> default of no_of_cpu * 2 = 8, which is what the official documentation
>>> advises as the optimal limit. So we limited the number of reduce tasks, and
>>> consequently the number of concurrent writes, and boy, the performance
>>> improved 4x. We were almost there :-)
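>>>
>>> (The MySQL-side knob in question is innodb_thread_concurrency; checking and
>>> changing it looks roughly like this -- 8 is just the value that matched our
>>> 4-CPU boxes:)
>>>
>>> SHOW VARIABLES LIKE 'innodb_thread_concurrency';
>>> SET GLOBAL innodb_thread_concurrency = 8;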
>>>
>>> 3. The next thing we tried was standard DB optimization techniques, like
>>> de-normalizing the schema and dropping constraints. This gave only a minor
>>> performance improvement, nothing earth-shattering. Note that we were
>>> already caching connections to each MySQL shard in the reducers, and the
>>> partitioning logic was embedded in the reducers.
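>>>
>>> (Purely illustrative -- the table and constraint names below are made up,
>>> not our real schema:)
>>>
>>> ALTER TABLE click_log DROP FOREIGN KEY fk_click_log_campaign;
>>> ALTER TABLE click_log DROP INDEX idx_click_log_referrer;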
>>>
>>> 4. Still falling short of our performance objectives, we finally decided
>>> to get rid of JDBC writes from the reducers and work on an alternative that
>>> uses MySQL's LOAD utility.
>>> - The processing would partition the data into MySQL shard specific files 
>>> resident in HDFS.
>>> - A script would then spawn processes via ssh on different physical 
>>> machines to download this data.
>>> - Each spawned process just downloads the data for the shard it should 
>>> upload to.
>>> - All the processes then start uploading data in parallel into their
>>> respective MySQL shards using LOAD DATA INFILE (sketched below).
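>>>
>>> Each loader process essentially just runs something like the following
>>> (the file path, table name, and delimiters are made up for illustration --
>>> the real ones are per-shard):
>>>
>>> LOAD DATA INFILE '/data/shards/shard_03.tsv'
>>>   INTO TABLE page_views
>>>   FIELDS TERMINATED BY '\t'
>>>   LINES TERMINATED BY '\n';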
>>>
>>> This proved to be the fastest approach, even in the face of increasing data
>>> loads. The entire processing/loading would complete in less than 6 minutes.
>>> The system has been holding up quite well so far, even though we've had to
>>> limit the number of days for which we keep the data, or else the MySQL
>>> shards get overwhelmed.
>>>
>>> Hope this is helpful to people.
>>>
>>> Regards
>>> -Ankur
>>>
>>
>>
>>
>> --
>> Wang Zhong
>>
>
