Title: Message Title
Hi George:

Looks like a great optimization for virus checking in submission, but I'm not sure it really relates to curation tasks. Tasks have to operate in any lifecycle context, not just submission (e.g. virus checking 6 months later, when content is in the archive), and in these cases there is no java.io.File access to content (just streams), so your optimized clamdscan-based method can't be called. (There are many other cases: SWORD deposit, workflow, etc.) There are also OS-based limitations (I haven't checked, but I think this optimization will break on Windows).

What I'd suggest is disentangling this work from the ClamScan task and creating a simple 'ClamAVFileScanner' class that has the logic to invoke clamdscan. Then create a configuration property for its use ('webui.submit.scanfiles'? or some such), and favor it in the submission code, in the sense that if that property is set, you simply instantiate and call ClamAVFileScanner.scanFile(), etc. *instead* of the task. If it's *not* set, then the code should work as before (i.e. use the task if configured). This approach cleanly separates the concerns, and allows (favors) use of the optimization if desired.

As to your point (1), I'm not sure I follow the reasoning: the ClamScan task understandably makes the most conservative assumption possible (scan all files), but this has nothing to do with curation tasks as such. You could easily extend the task to filter bitstreams (e.g. only run on the ORIGINAL bundle). The point is that such filtering requires knowledge of how you store content, and what it is, which may not be generally applicable.

Thanks for the contribution!

Richard R
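
A minimal sketch of the separation suggested above, not a definitive implementation. The class name 'ClamAVFileScanner', the method scanFile() and the property 'webui.submit.scanfiles' come from the message; everything else (the Result enum, fromExitCode(), useFileScanner(), and reading the property from a plain java.util.Properties rather than the DSpace configuration) is a hypothetical stand-in. It assumes clamdscan is on the PATH and follows its documented exit codes (0 = clean, 1 = virus found, anything else = error), and it needs Java 9 or later.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper that scans a local file by invoking the clamdscan client. */
final class ClamAVFileScanner {

    /** Outcome of a scan, derived from the clamdscan exit code. */
    enum Result { CLEAN, INFECTED, ERROR }

    // Command prefix; the file path is appended per scan.
    private final List<String> command;

    ClamAVFileScanner() {
        this(List.of("clamdscan")); // assumption: clamdscan is on the PATH
    }

    ClamAVFileScanner(List<String> command) {
        this.command = new ArrayList<>(command);
    }

    /** Maps a clamdscan exit code to a Result: 0 = clean, 1 = virus found, else error. */
    static Result fromExitCode(int code) {
        switch (code) {
            case 0:  return Result.CLEAN;
            case 1:  return Result.INFECTED;
            default: return Result.ERROR;
        }
    }

    /** Runs the scanner on one file, discards its output, and interprets the exit code. */
    Result scanFile(File file) throws IOException, InterruptedException {
        List<String> cmd = new ArrayList<>(command);
        cmd.add(file.getAbsolutePath());
        Process process = new ProcessBuilder(cmd)
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        return fromExitCode(process.waitFor());
    }

    /** True only if the (assumed) property is explicitly set to "true". */
    static boolean useFileScanner(Properties config) {
        return Boolean.parseBoolean(config.getProperty("webui.submit.scanfiles", "false"));
    }
}
```

In the submission code, the idea would be to check useFileScanner(config) first: if it returns true, call new ClamAVFileScanner().scanFile(file) and reject the upload on INFECTED; otherwise fall through to the existing curation task path unchanged.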
The current virus checking during ingest is problematic for two reasons:
1) Using the curation task to check for viruses means that all bitstreams are virus checked each time a bitstream is uploaded; see ClamScan.perform.
2) Using TCP to communicate with the clamav daemon is painfully slow for large bitstreams.
This change allows the ingest to be con...