Hi Aurélien,
You are right, the DB takes most of the time. What if each node had its own
database and, once the scanning process has finished, I used another program to
process all of them, for example to report the state of the file system? Would
that speed up the process?
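A minimal sketch of the per-node database idea, assuming each node writes its scan results to a separate database and a separate program combines them afterwards. Everything here is hypothetical: sqlite3 stands in for MySQL so the example is self-contained, and the `entries(path, size)` table is an invented layout, not robinhood's real schema.

```python
import sqlite3


def make_node_db(rows):
    """Create an in-memory stand-in for one node's scan database.

    The entries(path, size) layout is hypothetical, not robinhood's schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (path TEXT PRIMARY KEY, size INTEGER)")
    conn.executemany("INSERT INTO entries VALUES (?, ?)", rows)
    conn.commit()
    return conn


def combined_report(node_dbs):
    """Sum the entry count and total size over all per-node databases."""
    total_entries = 0
    total_size = 0
    for conn in node_dbs:
        count, size = conn.execute(
            "SELECT COUNT(*), COALESCE(SUM(size), 0) FROM entries"
        ).fetchone()
        total_entries += count
        total_size += size
    return {"entries": total_entries, "bytes": total_size}


if __name__ == "__main__":
    # Two nodes, each having scanned a different subtree.
    node_b = make_node_db([("/A/B/f1", 100), ("/A/B/f2", 250)])
    node_c = make_node_db([("/A/C/f1", 50)])
    print(combined_report([node_b, node_c]))
```

The reporting step only reads from each database, so under this assumption the nodes never contend for one shared DB connection while scanning.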
I have implemented the part that automatically assigns jobs to the nodes using
Torque: robinhood can choose whether to add a directory to its own job stack and
scan it itself, or to use Torque's API to submit that directory as a job to be
scanned on another node.
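The local-or-remote decision described above could be sketched as follows. This is an illustration only: it builds command strings and does not run them. It uses Torque's `qsub` command line instead of the Torque API mentioned above, and the `local_slots` policy and the `plan_jobs` and `scan_command` names are hypothetical. The robinhood options are the ones quoted later in this thread.

```python
import shlex


def scan_command(directory, conf="myconf"):
    """Build the robinhood partial-scan command quoted in this thread."""
    return ["robinhood", "--scan", "--once", "-f", conf,
            "-L", "stdout", "-F", directory]


def plan_jobs(directories, local_slots):
    """Keep the first `local_slots` directories local; hand the rest to Torque.

    The slot-count policy is a hypothetical placeholder for whatever rule
    decides between the local job stack and a remote node.
    """
    plan = []
    for index, directory in enumerate(directories):
        cmd = " ".join(shlex.quote(part) for part in scan_command(directory))
        if index < local_slots:
            plan.append(("local", cmd))
        else:
            # qsub reads the job script from standard input
            # when no script file is given on the command line.
            plan.append(("torque", "echo {} | qsub".format(shlex.quote(cmd))))
    return plan


if __name__ == "__main__":
    for target, command in plan_jobs(["/A/B", "/A/C", "/A/D"], local_slots=1):
        print(target, command)
```

Printing the plan instead of executing it makes it easy to check which directories would go to which node before submitting anything.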
PS: Torque is an open-source program and is widely used. If anything
confuses you, let me know; I am sure I have made a lot of grammatical errors.
Thank you all!
Best regards,
Lucas
At 2014-01-14 16:34:32,"DEGREMONT Aurelien" <[email protected]> wrote:
>Hi,
>
>I think this could be useful if you have a very fast MySQL backend (or
>maybe you plan to work on another kind of backend too? ;)).
>As Adam said, the connection between RBH processes and the MySQL server
>could become a bottleneck too.
>
>You can check what your current bottleneck is by looking at the RBH
>pipeline log.
>Look at what is taking the longest time in the pipeline and where most
>of the entries are waiting to be processed.
>Most of the time, it is the DB backend.
>
>
>Aurélien
>
>On 14/01/2014 08:30, Adam Brenner wrote:
>> Lucas,
>>
>> I think that, with the overhead of putting it on a different node, you will
>> not see any performance gain. This includes the network traffic to your
>> MySQL server rather than using a socket on localhost.
>>
>> If you are trying to improve performance, have you tried using
>> multiple cores with RBH? Some tuning tips that Thomas showed are here:
>> http://sourceforge.net/apps/trac/robinhood/wiki/TunePipeline
>>
>> You can also optimize MySQL with mysqltuner.pl
>>
>>> Node1: robinhood --scan --once -f myconf -L stdout -F /A/B
>>> Node1: robinhood --scan --once -f myconf -L stdout -F /A/C
>>> Node1: robinhood --scan --once -f myconf -L stdout -F /A/D
>> If you just want to scan parts of your filesystem, have a unique
>> myconf file that includes an Ignore block
>>
>> FS_Scan {
>>     # FS Specific
>>     Ignore
>>     {
>>         not tree == "/A/B"
>>     }
>> }
>>
>> Do this for each of the folders on your filesystem.
>>
>>
>> Does that help?
>> -Adam
>>
>> --
>> Adam Brenner
>> Computer Science, Undergraduate Student
>> Donald Bren School of Information and Computer Sciences
>>
>> Research Computing Support
>> Office of Information Technology
>> http://www.oit.uci.edu/rcs/
>>
>> University of California, Irvine
>> www.ics.uci.edu/~aebrenne/
>> [email protected]
>>
>>
>> On Mon, Jan 13, 2014 at 11:09 PM, Lucas <[email protected]> wrote:
>>> Hello,
>>> I have been thinking about running robinhood on multiple nodes
>>> to scan the file system. The first step is to split the file system into many
>>> sub-filesystems (I have found a way to implement this), then run robinhood on
>>> many other nodes and update the ENTRIES table of the MySQL database.
>>> I tested the -F parameter. Suppose the structure of the
>>> database is:
>>> /A
>>> |-- B
>>> |-- C
>>> |-- D
>>> And there are three nodes using robinhood to scan the file system:
>>> Node1: robinhood --scan --once -f myconf -L stdout -F /A/B
>>> Node1: robinhood --scan --once -f myconf -L stdout -F /A/C
>>> Node1: robinhood --scan --once -f myconf -L stdout -F /A/D
>>> Apparently this does not work, but what if I modified the source code of
>>> list_mgr so that it only updates the ENTRIES table?
>>> Is that a good idea?
>>>
>>> Best regards,
>>> Lucas
>>>
>>> _______________________________________________
>>> robinhood-devel mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/robinhood-devel
_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support