Re: Virus scan during file submission

2012-06-20 Thread Samuele Kaplun
Hi Theodoros,

On Wednesday, 20 June 2012 10:40:23, Theodoros Theodoropoulos wrote:
> I was recently asked to think of a way to implement a virus scan
> procedure during the file submission step. Come to think of it, it's
> not a bad idea... Of course, it does not apply to PDF and GIF files,
> but it could come in useful for ZIP/RAR/EXE/MSOffice files...

Actually, I believe PDFs can contain malicious code as well :-) (on the
other hand, I am not sure AVs can already spot this).

> With that in mind, I have the following comments/questions:
> - There are some free antivirus packages for Linux. AVG 
> (http://free.avg.com/us-en/download.prd-alf) is one of them, but I have 
> never tried it. Do you have any better suggestions?

In the past I have investigated ClamAV, an open-source antivirus that
has all sorts of CLI tools and API bindings. There is also a Python
package :-)

http://xael.org/norman/python/pyclamd/
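
For the CLI route, here is a minimal sketch of how a wrapper could turn the scanner's exit status into a verdict. It assumes the clamscan binary from ClamAV is on the PATH and relies on its documented exit statuses (0 = no virus found, 1 = virus found, 2 = error). The function names are made up for illustration. Note that clamscan has no separate "suspicious" status, so that category would need extra logic.

```python
import subprocess

# Exit statuses as documented in the clamscan man page.
CLAMSCAN_STATUS = {0: "clean", 1: "infected", 2: "error"}


def classify_exit_code(returncode):
    """Map a clamscan exit status to a verdict; unknown codes count as errors."""
    return CLAMSCAN_STATUS.get(returncode, "error")


def scan_file(path, clamscan="clamscan"):
    """Scan one file and return a (verdict, scanner_output) tuple."""
    try:
        proc = subprocess.run(
            [clamscan, "--no-summary", path],
            capture_output=True,
            text=True,
            check=False,  # a non-zero status is a result here, not a failure
        )
    except OSError:
        # The scanner binary is missing or cannot be executed.
        return "error", "could not run %s" % clamscan
    return classify_exit_code(proc.returncode), proc.stdout.strip()
```

Calling scan_file("/some/uploaded/file.zip") would then give back the verdict together with whatever clamscan printed for that file.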

> - The chosen antivirus program should have a CLI that should take the 
> file(s) in question and reply with a code that determines whether the 
> file is infected, suspicious, clean, etc.
> - Some websubmit function (probably Create_Upload_Files_Interface.py ??) 
> must be modified to check the files-to-be-uploaded and reject them, or 
> warn the user accordingly

Yep. Given that there are several ways of uploading files in WebSubmit,
it might be better to simply add a new WebSubmit function that can be
added to any workflow.
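
To make the idea concrete, here is a rough sketch of what such a function could look like. Everything in it is an assumption to be checked against the real code: the name Scan_Uploaded_Files, the (parameters, curdir, form, user_info) signature, the "files" subdirectory of curdir, and the SubmissionStop class, which only stands in for whatever exception WebSubmit uses to halt a submission. The scan argument is a hypothetical callable that takes a path and returns "clean", "infected" or "error".

```python
import os


class SubmissionStop(Exception):
    """Stand-in for the exception WebSubmit uses to halt a submission."""


def Scan_Uploaded_Files(parameters, curdir, form, user_info=None, scan=None):
    """Hypothetical WebSubmit function rejecting submissions with infected files.

    `scan` is a callable (assumed, not part of Invenio) that takes a file
    path and returns "clean", "infected" or "error".
    """
    if scan is None:
        raise ValueError("a scan callable is required")
    # Assumed layout: uploaded files live somewhere below <curdir>/files.
    files_dir = os.path.join(curdir, "files")
    infected = []
    for root, _dirs, names in os.walk(files_dir):
        for name in names:
            path = os.path.join(root, name)
            if scan(path) == "infected":
                infected.append(os.path.relpath(path, files_dir))
    if infected:
        raise SubmissionStop(
            "Infected files rejected: " + ", ".join(sorted(infected))
        )
    return ""
```

Keeping the scanner as a parameter means the same function could work with the CLI or with a daemon-based binding, and it can be tested without an antivirus installed.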

> - Maybe, in addition to this, there could be a scheduled task that would 
> periodically search the /opt/invenio/var/data/files/ (say once per day), 
> and run a bibdocfile --delete for the definitely infected files 
> (probably also sending a warning email to the admin and/or original 
> submitter), and just a warning for the suspicious ones.

Nice.
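
The periodic sweep could be sketched roughly as follows. This is only an illustration under assumptions: sweep and format_warning are made-up names, and scan is a hypothetical callable returning "clean", "infected" or "error" for a path. The sketch only builds a report. Deleting through bibdocfile and sending the mail are deliberately left out.

```python
import os


def sweep(root_dir, scan):
    """Walk root_dir and group every file path by the verdict scan() returns."""
    report = {"clean": [], "infected": [], "error": []}
    for dirpath, _dirs, names in os.walk(root_dir):
        for name in names:
            path = os.path.join(dirpath, name)
            # Unknown verdicts get their own bucket instead of being dropped.
            report.setdefault(scan(path), []).append(path)
    for paths in report.values():
        paths.sort()
    return report


def format_warning(report):
    """Build the body of a warning message for the administrator."""
    lines = ["Infected files: %d" % len(report["infected"])]
    lines.extend("  " + path for path in report["infected"])
    lines.append("Files that could not be scanned: %d" % len(report["error"]))
    lines.extend("  " + path for path in report["error"])
    return "\n".join(lines)
```

A daily job could call sweep("/opt/invenio/var/data/files/", scan) and pass the result to format_warning before deciding what to delete.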

> What do you think? Do you have a similar procedure at CERN? If not, do 
> you know of any Invenio installation that incorporates it? 

At CERN, all our submissions are open only to trusted authors uploading
mainly PDFs created with LaTeX, so in principle virus-free. For other
Invenio installations, I'll let other maintainers answer this :-)

> If not, would you be interested in implementing it? I think I could 
> contribute (with ideas, tests and maybe some very basic code) :)

Indeed, it would be a nice feature, especially for installations that
archive binaries and executables.

I'll ticketize it.

Cheers!
Sam

-- 
Samuele Kaplun
Invenio Developer



Virus scan during file submission

2012-06-20 Thread Theodoros Theodoropoulos

Dear Invenio devs,

I was recently asked to think of a way to implement a virus scan
procedure during the file submission step. Come to think of it, it's
not a bad idea... Of course, it does not apply to PDF and GIF files,
but it could come in useful for ZIP/RAR/EXE/MSOffice files...


With that in mind, I have the following comments/questions:
- There are some free antivirus packages for Linux. AVG 
(http://free.avg.com/us-en/download.prd-alf) is one of them, but I have 
never tried it. Do you have any better suggestions?
- The chosen antivirus program should have a CLI that should take the 
file(s) in question and reply with a code that determines whether the 
file is infected, suspicious, clean, etc.
- Some websubmit function (probably Create_Upload_Files_Interface.py ??) 
must be modified to check the files-to-be-uploaded and reject them, or 
warn the user accordingly
- Maybe, in addition to this, there could be a scheduled task that would 
periodically search the /opt/invenio/var/data/files/ (say once per day), 
and run a bibdocfile --delete for the definitely infected files 
(probably also sending a warning email to the admin and/or original 
submitter), and just a warning for the suspicious ones.


What do you think? Do you have a similar procedure at CERN? If not, do 
you know of any Invenio installation that incorporates it? If not, would 
you be interested in implementing it? I think I could contribute (with 
ideas, tests and maybe some very basic code) :)


Best regards,
Theodoros Theodoropoulos