Hi Aurélien,
         You are right, the DB takes most of the time. What if each node had 
its own database and, once the scanning process has finished, another program 
took care of all of them, for example to report the state of the file system? 
Would that speed up the process?
         I have implemented the part that assigns jobs to the nodes 
automatically using Torque: robinhood can choose whether to add a directory to 
its own job stack and scan it by itself, or to use Torque's API to submit that 
directory as a job so that it is scanned on another node.
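
To make the dispatch decision above concrete, here is a minimal sketch in 
Python. It is only an illustration, not the actual implementation: it swaps 
Torque's API for the qsub command line, and the function name plan_scan, the 
max_local threshold and the job name rbh_scan are hypothetical. The robinhood 
command line is the one quoted later in this thread. The sketch only builds 
the command string and does not run it.

```python
import shlex

# Scan command as used in this thread; "myconf" is the config name from the thread.
SCAN_TEMPLATE = "robinhood --scan --once -f myconf -L stdout -F {path}"


def plan_scan(directory, local_stack, max_local=2):
    """Hypothetical policy: keep a directory on the local job stack while
    there is room, otherwise return a shell command that submits it to Torque.

    Returns None when the directory was queued locally.
    """
    if len(local_stack) < max_local:
        local_stack.append(directory)
        return None
    scan_cmd = SCAN_TEMPLATE.format(path=shlex.quote(directory))
    # qsub reads the job script from standard input when no script file is given;
    # -N sets the job name.
    return "echo {} | qsub -N rbh_scan".format(shlex.quote(scan_cmd))


if __name__ == "__main__":
    stack = []
    for d in ["/A/B", "/A/C", "/A/D"]:
        cmd = plan_scan(d, stack, max_local=1)
        print(d, "->", cmd if cmd else "scanned locally")
```

The real policy would presumably look at the load of the local node rather 
than a fixed stack size; the threshold here just keeps the example short.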
 
 
PS: Torque is an open source program and is widely used. If anything I wrote 
is confusing, let me know; I believe I have made a lot of grammatical errors.
 
Thank you all!
Best regards,
Lucas

At 2014-01-14 16:34:32,"DEGREMONT Aurelien" <[email protected]> wrote:
>Hi,
>
>I think this could be useful if you have a very fast MySQL backend (or
>maybe you plan to work on another kind of backend too ?;)).
>As Adam said, the connection between RBH processes and MySQL server
>could become a bottleneck too.
>
>You can check what your current bottleneck is by looking at the RBH
>pipeline log.
>Look at what is taking the longest time in the pipeline and where most
>of the entries are waiting to be processed.
>Most of the time, it is the DB backend.
>
>
>Aurélien
>
>On 14/01/2014 08:30, Adam Brenner wrote:
>> Lucas,
>>
>> I think the overhead of putting it on to a different node will not
>> show any performance gain. This includes the network traffic to your
>> MySQL server rather than using a socket on the localhost.
>>
>> If you are trying to improve performance, have you tried using
>> multiple cores with RBH? Some tuning tips that Thomas showed are here:
>> http://sourceforge.net/apps/trac/robinhood/wiki/TunePipeline
>>
>> You can also optimize MySQL with mysqltuner.pl
>>
>>>        Node1: robinhood --scan --once -f myconf -L stdout -F /A/B
>>>        Node1: robinhood --scan --once -f myconf -L stdout -F /A/C
>>>        Node1: robinhood --scan --once -f myconf -L stdout -F /A/D
>> If you just want to scan parts of your filesystem, have a unique
>> myconf file that includes an Ignore block
>>
>> FS_Scan {
>>      # FS Specific
>>      Ignore
>>      {
>>          not tree == "/A/B"
>>      }
>> }
>>
>> Do this for each of your folders on your filesystem.
>>
>>
>> Does that help?
>> -Adam
>>
>> --
>> Adam Brenner
>> Computer Science, Undergraduate Student
>> Donald Bren School of Information and Computer Sciences
>>
>> Research Computing Support
>> Office of Information Technology
>> http://www.oit.uci.edu/rcs/
>>
>> University of California, Irvine
>> www.ics.uci.edu/~aebrenne/
>> [email protected]
>>
>>
>> On Mon, Jan 13, 2014 at 11:09 PM, Lucas <[email protected]> wrote:
>>> Hello,
>>>         I have been thinking about running robinhood on multiple nodes
>>> to scan the file system. The first step is to split the file system into many
>>> sub-filesystems (I have found a way to implement it), then run robinhood on
>>> many other nodes and update the ENTRIES table of the MySQL database.
>>>         I tested the -F parameter; here we suppose the structure of the
>>> database is:
>>>         /A
>>>         |-- B
>>>         |-- C
>>>         |-- D
>>>        And there are three nodes using robinhood to scan the file system:
>>>        Node1: robinhood --scan --once -f myconf -L stdout -F /A/B
>>>        Node1: robinhood --scan --once -f myconf -L stdout -F /A/C
>>>        Node1: robinhood --scan --once -f myconf -L stdout -F /A/D
>>> Apparently it does not work, but what if I modified the source code of
>>> list_mgr to only update the ENTRIES table?
>>> Is that a good idea?
>>>
>>> Best regards,
>>> Lucas
>>>
>>> _______________________________________________
>>> robinhood-devel mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/robinhood-devel
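
For reference, Adam's suggestion in the quoted thread (one config file per 
folder, each with its own Ignore block) could be generated rather than written 
by hand. The following is only a sketch in Python: the function names 
ignore_block and configs_for are hypothetical, the block text is copied from 
Adam's message, and it has not been checked here how robinhood evaluates the 
rule for the parent directory /A.

```python
def ignore_block(subtree):
    """Return the FS_Scan block from Adam's message for one subtree.

    Everything outside `subtree` is ignored, so a scan using this
    config is restricted to that part of the filesystem.
    """
    return (
        "FS_Scan {\n"
        "     # FS Specific\n"
        "     Ignore\n"
        "     {\n"
        '         not tree == "%s"\n'
        "     }\n"
        "}\n"
    ) % subtree


def configs_for(subtrees):
    """Map each subtree path to the text of its own FS_Scan block."""
    return {path: ignore_block(path) for path in subtrees}


if __name__ == "__main__":
    # Directory layout taken from the example in the thread.
    for path, text in configs_for(["/A/B", "/A/C", "/A/D"]).items():
        print("# config fragment for", path)
        print(text)
```

Each generated fragment would still have to be merged into a complete myconf 
file for the node that scans that folder.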
------------------------------------------------------------------------------
CenturyLink Cloud: The Leader in Enterprise Cloud Services.
Learn Why More Businesses Are Choosing CenturyLink Cloud For
Critical Workloads, Development Environments & Everything In Between.
Get a Quote or Start a Free Trial Today. 
http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk
_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support
