Hi Fabio,

Congratulations. I'm impressed by your in-depth tuning of robinhood, 
MySQL and your system.

Robinhood should dump stats in its logs at regular intervals.
It would help identify bottlenecks if you could send an extract of 
them, like this:


2014/05/21 21:08:34 [13362/4] STATS | ======== FS scan statistics =========
2014/05/21 21:08:34 [13362/4] STATS | scan is running:
2014/05/21 21:08:34 [13362/4] STATS |      started at : 2014/05/21 09:08:26 (12.0h ago)
2014/05/21 21:08:34 [13362/4] STATS |      last action: 2014/05/21 21:08:33 (01s ago)
2014/05/21 21:08:34 [13362/4] STATS |      progress   : 6349237 entries scanned (0 errors)
2014/05/21 21:08:34 [13362/4] STATS |      avg. speed : 103.84 ms/entry/thread -> 154.09 entries/sec
2014/05/21 21:08:34 [13362/4] STATS |      inst. speed: 41.30 ms/entry/thread -> 387.39 entries/sec
2014/05/21 21:08:34 [13362/4] STATS | ==== EntryProcessor Pipeline Stats ===
2014/05/21 21:08:34 [13362/4] STATS | Idle threads: 15
2014/05/21 21:08:34 [13362/4] STATS | Id constraints count: 10000 (hash min=0/max=7/avg=1.3)
2014/05/21 21:08:34 [13362/4] STATS | Stage | Wait | Curr | Done |     Total | ms/op |
2014/05/21 21:08:34 [13362/4] STATS |  0: GET_FID |    0 |    0 |    0 |         0 |  0.00 |
2014/05/21 21:08:34 [13362/4] STATS |  1: GET_INFO_DB |    0 |    0 |    0 |   6603345 |  0.25 |
2014/05/21 21:08:34 [13362/4] STATS |  2: GET_INFO_FS |    0 |    0 |    0 |   6603345 |  0.03 |
2014/05/21 21:08:34 [13362/4] STATS |  3: REPORTING |    0 |    0 |    0 |   6603345 |  0.00 |
2014/05/21 21:08:34 [13362/4] STATS |  4: PRE_APPLY |    0 |    0 |    0 |   6603345 |  0.00 |
2014/05/21 21:08:34 [13362/4] STATS |  5: DB_APPLY | 9999 |    1 |    0 |   6593345 |  6.54 | 98.35% batched (avg batch size: 85.1)
2014/05/21 21:08:34 [13362/4] STATS |  6: RM_OLD_ENTRIES |    0 |    0 |    0 |         0 |  0.00 |
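
If it helps when reading such a dump, here is a minimal Python sketch 
that picks out the pipeline stage with the deepest queue. The regular 
expression, the function names (parse_stages, busiest_stage) and the 
reading of a large "Wait" count as the sign of a bottleneck are my own 
assumptions for illustration; none of this is part of robinhood itself.

```python
import re

# Pipeline rows copied from the extract above, log prefix stripped.
SAMPLE = """\
 0: GET_FID |    0 |    0 |    0 |         0 |  0.00 |
 1: GET_INFO_DB |    0 |    0 |    0 |   6603345 |  0.25 |
 2: GET_INFO_FS |    0 |    0 |    0 |   6603345 |  0.03 |
 3: REPORTING |    0 |    0 |    0 |   6603345 |  0.00 |
 4: PRE_APPLY |    0 |    0 |    0 |   6603345 |  0.00 |
 5: DB_APPLY | 9999 |    1 |    0 |   6593345 |  6.54 | 98.35% batched (avg batch size: 85.1)
 6: RM_OLD_ENTRIES |    0 |    0 |    0 |         0 |  0.00 |
"""

# "<idx>: <NAME> | wait | curr | done | total | ms/op |"
# search() is used so the pattern also matches lines that still carry
# the "date time [pid/thread] STATS |" prefix.
STAGE_RE = re.compile(
    r"(\d+):\s*(\w+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\d+)"
    r"\s*\|\s*(\d+)\s*\|\s*([\d.]+)\s*\|"
)


def parse_stages(text):
    """Return one dict per pipeline stage row found in `text`."""
    stages = []
    for line in text.splitlines():
        m = STAGE_RE.search(line)
        if m is None:
            continue  # header row or unrelated log line
        stages.append({
            "index": int(m.group(1)),
            "name": m.group(2),
            "wait": int(m.group(3)),
            "curr": int(m.group(4)),
            "done": int(m.group(5)),
            "total": int(m.group(6)),
            "ms_per_op": float(m.group(7)),
        })
    return stages


def busiest_stage(stages):
    """Stage with the most waiting entries; ties broken by ms/op."""
    return max(stages, key=lambda s: (s["wait"], s["ms_per_op"]))


if __name__ == "__main__":
    worst = busiest_stage(parse_stages(SAMPLE))
    print(worst["name"], worst["wait"], worst["ms_per_op"])
```

On the extract above this reports DB_APPLY (9999 entries waiting, 
6.54 ms/op), which suggests the database was the limiting stage in 
that run.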

I noticed that you disabled DB batching to use multithreaded DB 
operations. Did you get better results this way?

Your hardware looks appropriate. The stats dump will give us more 
information.

Regards


On 06/06/14 09:29, Verzelloni Fabio wrote:
> Hello folks,
>     I'm running some tests of robinhood in our environment. Some details 
> regarding the hardware in use:
>
> ## Robinhood server ##
>
> IBM M3550 X4
> 128Gb RAM
> 256G HD SSD for mysql
> 2* Intel Xeon Processor E5-2650v2 8C 2.6GHz 20MB Cache 1866MHz
> Lustre 2.5.1
> Mysql 5.5
>
> ## Lustre version in production ##
> lustre 2.1
>
> ## Robinhood.conf ##
>
> FS_Scan {
>          nb_threads_scan = 32;
>          nb_prealloc_tasks=10000;
> }
>
> EntryProcessor {
>          nb_threads = 32;
>          STAGE_GET_FID_threads_max = 16;
>          STAGE_GET_INFO_DB_threads_max = 4;
>          STAGE_GET_INFO_FS_threads_max = 4;
>          STAGE_REPORTING_threads_max = 1;
>          STAGE_DB_APPLY_threads_max = 16;
>          STAGE_CHGLOG_CLR_threads_max = 1;
>          STAGE_RM_OLD_ENTRIES_threads_max = 1;
>          max_pending_operations = 1000;
>          max_batch_size=1;
> }
>
> ## My.cnf ##
>
> [mysqld]
> large-pages
> datadir=/var/lib/mysql
> socket=/var/lib/mysql/mysql.sock
> user=mysql
> # Disabling symbolic-links is recommended to prevent assorted security risks
> symbolic-links=0
> innodb_flush_log_at_trx_commit = 0
> # possibly the most important setting
> max_connections= 512
> innodb_buffer_pool_size= 60G
> # ~50% of memory
> innodb_max_dirty_pages_pct= 15
> innodb_thread_concurrency= 32
> innodb_log_file_size= 100M
> innodb_log_buffer_size= 50M
> innodb_data_file_path= ibdata1:1G:autoextend
> # kernel must be configured for support
> table-open-cache= 2000
> sort-buffer-size= 32M
> read-buffer-size= 16M
> read-rnd-buffer-size= 4M
> thread-cache-size= 128
> query-cache-size= 40M
> query-cache-limit= 1M
> tmp-table-size= 16M
>
> [mysqld_safe]
> log-error=/var/log/mysqld.log
> pid-file=/var/run/mysqld/mysqld.pid
>
> ## vm.nr_hugepages ##
>
> vm.nr_hugepages = 50000
> vm.nr_hugepages_mempolicy = 50000
> vm.hugetlb_shm_group = 27
> vm.hugepages_treat_as_movable = 0
> vm.nr_overcommit_hugepages = 0
>
> ## sysctl ##
>
> kernel.shmmax = 118111600640
> kernel.shmall = 118111600640
>
> ## limits.conf ##
>
> mysql hard memlock unlimited
> mysql soft memlock unlimited
> ---
>
> I'm trying to find the configuration that gives the best "entries/sec" rate, 
> and with this configuration the best I can get is about 2600 entries/sec. 
> Given the hardware in use, do you think it is possible to improve the speed 
> of the scan?
> What is the best practice for configuring the server to get the best 
> scan speed?
>
> While I'm running the initial scan I see a lot of the following messages:
>
> ...
> 2014/06/06 08:34:20 [10535/15] ListMgr | Retryable DB error in ListMgr_Insert 
> l.218. Restarting transaction in 1 sec...
> 2014/06/06 08:34:21 [10535/15] ListMgr | DB deadlock detected
> ...
>
> I was hoping to reach 4000 to 5000 entries/sec. Do you think I can reach 
> that with the hardware I have available? Suggestions or questions are 
> welcome.
>
> Regards
> Fabio
>
>
> --
> - Fabio Verzelloni - CSCS - Swiss National Supercomputing Centre
> via Trevano 131 - 6900 Lugano, Switzerland
> Tel: +41 (0)91 610 82 04
>
>
> _______________________________________________
> robinhood-support mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/robinhood-support

