That's good if you get the performance you expect with MyISAM.
There is no problem using MyISAM, except that it does not support 
transactions,
so you run a higher risk of corrupting the DB or creating 
inconsistencies if the DB host crashes.

As you can see, the DB is not the limiting factor in your case (the 
pipeline is empty).
For the next scans, you can try decreasing the number of scan threads 
(you currently use 8):
in some cases, I have seen the FS client saturate with that many threads,
so you might get better performance with just 2 or 4 scan threads.
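
To make the trade-off concrete, here is a rough sketch of the arithmetic
behind the "avg. speed" line in the quoted stats. It assumes the aggregate
rate is simply threads / (time per entry per thread); the function names are
made up for illustration and are not part of robinhood.

```python
def entries_per_sec(ms_per_entry_per_thread: float, threads: int) -> float:
    """Aggregate scan rate if every thread needs the given time per entry."""
    return threads * 1000.0 / ms_per_entry_per_thread


def breakeven_ms(ms_per_entry_per_thread: float, threads: int,
                 fewer_threads: int) -> float:
    """Per-entry latency that `fewer_threads` must reach to match the
    rate currently obtained with `threads`."""
    return ms_per_entry_per_thread * fewer_threads / threads


if __name__ == "__main__":
    avg_ms, threads = 3.02, 8  # values taken from the quoted scan statistics
    rate = entries_per_sec(avg_ms, threads)
    print(f"{threads} threads at {avg_ms} ms/entry/thread: {rate:.0f} entries/sec")
    for n in (2, 4):
        target = breakeven_ms(avg_ms, threads, n)
        print(f"{n} threads match that rate if latency drops below {target:.3f} ms/entry")
```

If the FS client is saturated, the per-entry latency should drop when fewer
threads compete for it, so comparing the measured ms/entry/thread of the next
scan against these break-even values tells whether the smaller thread count
is a win.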

Regarding InnoDB, I'm still surprised that the performance is so low.
If you give it another try one day, I would be interested in getting the 
pipeline statistics
and the output of http://mysqltuner.pl after some scan time, to analyze 
the DB stats.
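
For comparing pipeline statistics between runs, something like the sketch
below could help. It assumes each stage line has been unwrapped onto a single
line (the mail client folded them in the quote below) and uses a regular
expression written for this example, not anything shipped with robinhood.

```python
import re

# Matches one "STAGE_xxx | Wait: n | Curr: n | ... Total: n ... | ms/op: x" line.
STAGE_RE = re.compile(
    r"(?P<name>STAGE_\w+)\s*\|\s*Wait:\s*(?P<wait>\d+)\s*\|\s*Curr:\s*(?P<curr>\d+)"
    r".*?Total:\s*(?P<total>\d+).*?ms/op:\s*(?P<ms>[\d.]+)"
)

# Three stage lines copied from the quoted log, unwrapped.
SAMPLE_LOG = """\
STATS |  1: STAGE_GET_INFO_DB    | Wait:     0 | Curr:   0 | Done:   0 | Total: 5219385 | ms/op: 0.59
STATS |  2: STAGE_GET_INFO_FS    | Wait:     0 | Curr:   0 | Done:   0 | Total: 5219385 | ms/op: 0.01
STATS |  5: STAGE_DB_APPLY       | Wait:     0 | Curr:   0 | Done:   0 | Total: 5219385, batches: 95027 (avg size: 52) | ms/op: 0.33
"""


def parse_stages(log_text):
    """Return one dict per pipeline stage line found in log_text."""
    stages = []
    for line in log_text.splitlines():
        m = STAGE_RE.search(line)
        if m:
            stages.append({
                "name": m.group("name"),
                "wait": int(m.group("wait")),
                "curr": int(m.group("curr")),
                "total": int(m.group("total")),
                "ms_per_op": float(m.group("ms")),
            })
    return stages


if __name__ == "__main__":
    stages = parse_stages(SAMPLE_LOG)
    backlog = sum(s["wait"] + s["curr"] for s in stages)
    slowest = max(stages, key=lambda s: s["ms_per_op"])
    print("entries queued or in progress:", backlog)
    print("slowest stage:", slowest["name"], slowest["ms_per_op"], "ms/op")
```

A backlog of zero, as in the quoted stats, means the pipeline is empty and
the DB is keeping up with the scan; a growing "Wait" count on the DB stages
would point at the DB instead.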

Regards,
Thomas

On 01/13/14 23:12, Brock Palen wrote:
> Oh also current thread stats from the log, interesting nothing is waiting:
>
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS | ======== FS scan 
> statistics =========
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS | current scan 
> interval = 7.0d
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS | scan is running:
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |      started at : 
> 2014/01/13 16:33:01 (32.9min ago)
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |      last action: 
> 2014/01/13 17:05:52 (01s ago)
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |      progress   : 
> 5214490 entries scanned (2 errors)
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |      avg. speed : 
> 3.02 ms/entry/thread -> 2647.49 entries/sec
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |      inst. speed: 
> 4.68 ms/entry/thread -> 1710.82 entries/sec
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS | ==== 
> EntryProcessor Pipeline Stats ===
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS | Threads waiting: 8
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS | Id constraints 
> count: 0 (hash min=0/max=0/avg=0.0)
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |  0: STAGE_GET_FID  
>       | Wait:     0 | Curr:   0 | Done:   0 | Total:      0 | ms/op: 0.00
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |  1: 
> STAGE_GET_INFO_DB    | Wait:     0 | Curr:   0 | Done:   0 | Total: 5219385 | 
> ms/op: 0.59
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |  2: 
> STAGE_GET_INFO_FS    | Wait:     0 | Curr:   0 | Done:   0 | Total: 5219385 | 
> ms/op: 0.01
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |  3: 
> STAGE_REPORTING      | Wait:     0 | Curr:   0 | Done:   0 | Total: 5219385 | 
> ms/op: 0.00
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |  4: 
> STAGE_PRE_APPLY      | Wait:     0 | Curr:   0 | Done:   0 | Total: 5219385 | 
> ms/op: 0.00
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |  5: STAGE_DB_APPLY 
>       | Wait:     0 | Curr:   0 | Done:   0 | Total: 5219385, batches: 95027 
> (avg size: 52) | ms/op: 0.33
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS |  6: 
> STAGE_RM_OLD_ENTRIES | Wait:     0 | Curr:   0 | Done:   0 | Total:      0 | 
> ms/op: 0.00
> 2014/01/13 17:05:53 robinhood@flux-xfer1[22595/2]: STATS | DB ops: 
> get=5172474/ins=39076/upd=5180309/rm=0
>
> Brock Palen
> www.umich.edu/~brockp
> CAEN Advanced Computing
> XSEDE Campus Champion
> [email protected]
> (734)936-1985
>
>
>
> On Jan 10, 2014, at 4:27 AM, LEIBOVICI Thomas <[email protected]> wrote:
>
>> Sorry Brock, I noticed an issue with the latest code you got, which results 
>> in a wrong default config for the entry processor.
>> It only impacts you if you have no EntryProcessor block at all in your 
>> config file.
>> A simple workaround is to define an empty block.
>>
>> Or if you want, the fix is available in git (commit 
>> 33cfa01416d7d55a275a35a34fab2beed66f7dc8).
>>
>> I believe that 2.5 may be slower than 2.3, as it introduces new information 
>> in the DB.
>> However, I'm still surprised it is 2-3x slower.
>>
>> I understood you tried both MyISAM and InnoDB with rbh 2.5.
>> What performance difference do you get between the 2 engines?
>>
>> Regards,
>> Thomas
>>
>>
>> On 01/09/14 22:45, Brock Palen wrote:
>>> Thomas,
>>>
>>> I tried the new build, while it was slightly faster (200/sec rather than 
>>> 50/sec) and the errors from the current 2.5.0 beta were resolved, I enabled 
>>> innodb=disabled,
>>>
>>> And I am getting still lower performance than 2.3 by about 2-3x  but it is 
>>> still 2000-3000/second  and this will work for us.
>>>
>>> Brock Palen
>>> www.umich.edu/~brockp
>>> CAEN Advanced Computing
>>> XSEDE Campus Champion
>>> [email protected]
>>> (734)936-1985
>>>
>>>
>>>
>>> On Jan 9, 2014, at 10:31 AM, LEIBOVICI Thomas <[email protected]> 
>>> wrote:
>>>
>>>> Hi Brock,
>>>>
>>>> On 01/09/14 15:49, Brock Palen wrote:
>>>>> Adam,
>>>>>
>>>>> I tried adding innodb=disabled;
>>>>>
>>>>> To ListManager but 2.5.0 beta barfs on it:
>>>>>
>>>>> Config Check | WARNING: unknown parameter 'innodb' in block 'ListManager' 
>>>>> line 74
>>>> This parameter must be specified in the MySQL subblock (not directly in 
>>>> ListManager), as it is MySQL related:
>>>>
>>>> ListManager {
>>>> ...
>>>>     MySQL  {
>>>>         ...
>>>>         innodb = disabled ;
>>>>     }
>>>> }
>>>>
>>>>> I sadly found out that due to a mistake on my part 2.5.0 was built 
>>>>> without lustre support so no stripe info was being collected,
>>>>>
>>>>> When I rebuilt, performance was only 50 entries/ms  :-(     Scans never 
>>>>> finish, you can't run queries.
>>>>>
>>>>> My solution for now is to roll all the way back to 2.3.x, the last 
>>>>> version of robinhood that worked at full speed for us.  We will look at 
>>>>> this all again when we have lustre + changelogs.   It is sad because we 
>>>>> really wanted the post scan script hook; we are working on a system where 
>>>>> we use sqoop to pull all the data from robinhood into hive/pig and then 
>>>>> run a bunch of stats on our system over time.   The problem is robinhood 
>>>>> isn't fast enough on the hardware I have available (7200RPM sata).
>>>> I suggest you try the latest version of the code from the git 
>>>> repository.
>>>> If you run it, drop any previous tuning in the EntryProcessor block. You 
>>>> can also re-enable acct_* parameters.
>>>>
>>>> With the current code, we get a nice scan speed (6.7M entries per hour, 
>>>> ~1800/sec) and no longer see deadlock issues.
>>>> It reached a changelog processing speed of 36k/sec without any sign of 
>>>> saturation.
>>>> The DB is also on a local spinning disk and we use the innodb engine.
>>>>
>>>> Here are the tunings in /etc/my.cnf.
>>>> We recently found that "innodb_log_file_size" has a major impact (see URL 
>>>> below).
>>>>
>>>> key_buffer_size=512M
>>>> thread_cache_size=64
>>>> query_cache_size=512M
>>>> query_cache_limit=512M
>>>> sort_buffer_size=512M
>>>> read_rnd_buffer_size=1M
>>>> table_cache=8K
>>>> tmp_table_size=1G
>>>> max_heap_table_size=1G
>>>>
>>>> innodb_file_per_table
>>>> innodb_buffer_pool_size=128G  # up to 80% of physical memory
>>>> innodb_max_dirty_pages_pct=20
>>>>
>>>> # see the following tutorial to tune innodb_log_file_size:
>>>> #http://www.mysqlperformanceblog.com/2008/11/21/how-to-calculate-a-good-innodb-log-file-size
>>>> innodb_log_file_size=900M
>>>>
>>>> Regards
>>>> Thomas
>>>>
>>>>


_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support
