On 09/05/13 05:20 PM, Dev Priya wrote:
Hi,
I have been working on a persistent log storage solution for Autotest
and would like to discuss my thoughts with you, seek your advice, and
find out whether any prior work or solution exists on this front. As of
now, Autotest in its default config stores the logs locally on the
results server. This configuration gives us neither redundancy nor very
large storage capacity. To tackle this issue, I am thinking of
implementing a variant of ResultsArchiver that archives the log files
and stores them on HDFS.
That's a great idea. I have some comments to make below.
The proposed changes are as follows:
1. The config file will dictate whether to use local storage or HDFS.
2. All HDFS-related configs will be in the global config file.
3. ResultsArchiver's HDFS implementation can either use Python libraries
or wrap command-line tools to push a file to HDFS. I am also planning to
explore HttpFS for Hadoop.
4. For reading the files, the Apache file handler currently handles the
file rendering. We can use HttpFS to access the files directly from HDFS,
which will need some alteration to the file URLs. I think this can be
achieved with some rewrite rules.
5. Another solution, which will be better performance-wise but harder to
implement, is to cache the files locally and then deliver them through
the Apache file handler as we are doing now. The details of this
implementation are yet to be sorted out; again, your feedback will be
valuable here.
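
To make points 1 to 4 a bit more concrete, here is a rough Python sketch,
not a real ResultsArchiver implementation. The [RESULTS_ARCHIVER] section,
every key name in it, the host name and the helper function names are all
hypothetical. The only things taken from Hadoop itself are the
"hadoop fs -put" command and the WebHDFS-style "/webhdfs/v1/<path>?op=OPEN"
URL that HttpFS serves (14000 is its usual default port).

```python
import configparser
import os
import posixpath
from urllib.parse import quote, urlencode

# Hypothetical global config fragment; section and key names are assumptions.
SAMPLE_CONFIG = """
[RESULTS_ARCHIVER]
storage_backend = hdfs
hdfs_results_root = /autotest/results
httpfs_base_url = http://httpfs.example.com:14000
hdfs_user = autotest
"""


def load_settings(text):
    """Parse the (hypothetical) archiver section into a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["RESULTS_ARCHIVER"])


def hdfs_put_command(local_path, settings):
    """Build the 'hadoop fs -put' argument list for one archived file."""
    dest = posixpath.join(settings["hdfs_results_root"],
                          os.path.basename(local_path))
    return ["hadoop", "fs", "-put", local_path, dest]


def plan_upload(local_path, settings):
    """Return the upload command for HDFS, or None to keep the file local."""
    if settings.get("storage_backend", "local") == "hdfs":
        return hdfs_put_command(local_path, settings)
    return None


def httpfs_open_url(relative_path, settings):
    """Map a path under the results root to an HttpFS read (OPEN) URL."""
    hdfs_path = posixpath.join(settings["hdfs_results_root"], relative_path)
    query = urlencode({"op": "OPEN", "user.name": settings["hdfs_user"]})
    base = settings["httpfs_base_url"].rstrip("/")
    return "%s/webhdfs/v1%s?%s" % (base, quote(hdfs_path), query)
```

The command returned by plan_upload() could be handed to
subprocess.check_call(), and the URL built by httpfs_open_url() is the kind
of target an Apache rewrite rule would need to produce. Keeping both as
pure functions means the backend switch can be tested without a Hadoop
cluster.
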
Has a solution to this storage problem been attempted in the past
Not that I'm aware of. There's the drone architecture in Autotest that
lets us spread the load of autoserv processes across several machines,
which we call drones, but no special treatment is given to log files.
or do we
have any existing solution inside Autotest already that I might have
missed? If not, then does my proposed plan look good and will it be
something we would like to see in Autotest?
Yes, I definitely want to see it in Autotest, as we sometimes have
trouble with our internal test grid logs. One thing I was thinking is
that GlusterFS might be an interesting option here. It even has a
drop-in compatibility library that lets GlusterFS replace HDFS, so
that's something interesting to explore.
If you feel like it, we could post the design as a GitHub issue or
something, so it can be tracked and people can help with tasks.
Cheers,
Lucas
_______________________________________________
Autotest-kernel mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/autotest-kernel