Re: [clamav-users] Question About MaxFileSize / news of upcoming Large Archive Scanner tool

2023-11-13 Thread Paul Kosinski via clamav-users
Large archive files may be the most obvious case, especially if things like 
disk images and installation images are included, but make sure that large 
multimedia files are also handled.

In today's Internet environment, there are probably far, far more large video 
files floating around than traditional archives. And in some sense multimedia 
"container" files (like MP4, MOV, AVI etc.) are archives of their media streams 
(like H.264/5, AAC, etc.) -- but these archives are, of course, interleaved for 
real-time playback.

I might add that there have been recent reports of malformed (perhaps 
malicious) multimedia files causing crashes or unwanted code execution in 
software such as FFmpeg.


On Mon, 13 Nov 2023 20:32:38 +
"Micah Snyder (micasnyd) via clamav-users" wrote:

> In case anyone else is looking into this, I wanted to share some news.
> 
> We have been getting some help to create a tool to recursively unpack (or 
> mount) and scan large archives (greater than 2000MB).
> 
> This effort has progressed to the point where we've started code review and 
> writing documentation. I'm not entirely sure how we will package it for 
> people to use.  I'll share more when we go to open source it. I wanted to 
> share the news now in case anyone else was going to work on it and so they're 
> not as frustrated when it turns out we've done the same.
> 
> I don't have a specific release date in mind.  It likely won't be until early 
> next year.  While we've started code review and testing, the developer that 
> has built the tool for us is now working on adding the allmatch-mode feature 
> support.
> 
> Best regards,
> Micah
> 
> 
> Micah Snyder
> ClamAV Development
> Talos
> Cisco Systems, Inc.
> 
> 
> From: Andrew C Aitchison 
> Sent: Thursday, June 8, 2023 6:25 PM
> To: Micah Snyder (micasnyd) 
> Cc: ClamAV users ML 
> Subject: Re: [clamav-users] Question About MaxFileSize
> 
> On Thu, 8 Jun 2023, Micah Snyder (micasnyd) wrote:
> 
> > I agree with you.  I suspect the majority of cases today is when
> > people have a large archive of files to scan.
> >
> > I think the best-case scenario for people who need to scan files
> > larger than the present internal 2GB limit is that archives larger
> > than 2GB are decompressed and then the files inside are scanned, but
> > without actually scanning the very large outer archive.
> >
> > The way to do this as things work today is to script something
> > around clamscan or clamdscan that, if the file is too large, handles
> > some assorted file types:
> >
> >  1.  if file is a tar.gz, un-tar.gz it and then scan the files within.
> >  2.  if file is a zip, un-zip it and then scan the files within.
> >  3.  etc.
> >
> > I think everyone would like if clamav could do this automatically
> > for select archive types. And I think the advantage would be that we
> > would perhaps keep the extracted files in memory, or else at least
> > delete the temp files as we go without extracting all of it to disk
> > before starting to scan.
> >
> > However, it would be far easier to make a shell script or a python
> > script that wraps clamscan/clamdscan and uses native tools like
> > "tar", "unzip", etc.  
> 
> Good idea.
> 
> Simply untarring or unzipping into a pipe does not separate the packed files.
> However, at least tar does have an option which allows us to write a one-liner:
> (tar xf ~/viruses.tar --to-command='clamdscan -v - || echo "  found in $TAR_REALNAME\n\n---"' ) |& egrep -i found
> stream: Eicar-Signature FOUND
>    found in viruses/EICAR.COM.TAR
> stream: Eicar-Signature FOUND
>    found in viruses/eicar.com.txt
> stream: Eicar-Signature FOUND
>    found in viruses/URLEICAR.COM.TAR
> stream: Eicar-Signature FOUND
>    found in viruses/4DOSBOX/EICAR.COM
> stream: Eicar-Signature FOUND
>    found in viruses/EICAR.COM
> 
> The echo is needed to show the name of the file inside the archive.
> 
> This appears not to write the unpacked files to disk.
> 
> --
> Andrew C. Aitchison  Kendal, UK
> and...@aitchison.me.uk
___

Manage your clamav-users mailing list subscription / unsubscribe:
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/Cisco-Talos/clamav-documentation

https://docs.clamav.net/#mailing-lists-and-chat


Re: [clamav-users] Question About MaxFileSize / news of upcoming Large Archive Scanner tool

2023-11-13 Thread Micah Snyder (micasnyd) via clamav-users
In case anyone else is looking into this, I wanted to share some news.

We have been getting some help to create a tool to recursively unpack (or 
mount) and scan large archives (greater than 2000MB).

This effort has progressed to the point where we've started code review and 
writing documentation. I'm not entirely sure how we will package it for people 
to use.  I'll share more when we go to open source it. I wanted to share the 
news now in case anyone else was going to work on it and so they're not as 
frustrated when it turns out we've done the same.

I don't have a specific release date in mind.  It likely won't be until early 
next year.  While we've started code review and testing, the developer that has 
built the tool for us is now working on adding the allmatch-mode feature 
support.

Best regards,
Micah


Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.


From: Andrew C Aitchison 
Sent: Thursday, June 8, 2023 6:25 PM
To: Micah Snyder (micasnyd) 
Cc: ClamAV users ML 
Subject: Re: [clamav-users] Question About MaxFileSize

On Thu, 8 Jun 2023, Micah Snyder (micasnyd) wrote:

> I agree with you.  I suspect the majority of cases today is when
> people have a large archive of files to scan.
>
> I think the best-case scenario for people who need to scan files
> larger than the present internal 2GB limit is that archives larger
> than 2GB are decompressed and then the files inside are scanned, but
> without actually scanning the very large outer archive.
>
> The way to do this as things work today is to script something
> around clamscan or clamdscan that, if the file is too large, handles
> some assorted file types:
>
>  1.  if file is a tar.gz, un-tar.gz it and then scan the files within.
>  2.  if file is a zip, un-zip it and then scan the files within.
>  3.  etc.
>
> I think everyone would like if clamav could do this automatically
> for select archive types. And I think the advantage would be that we
> would perhaps keep the extracted files in memory, or else at least
> delete the temp files as we go without extracting all of it to disk
> before starting to scan.
>
> However, it would be far easier to make a shell script or a python
> script that wraps clamscan/clamdscan and uses native tools like
> "tar", "unzip", etc.
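
The wrapper described in the quoted message could be sketched in Python
roughly as follows. This is only a sketch under stated assumptions, not the
tool mentioned in the announcement: the names classify, scan and SIZE_LIMIT
are made up for illustration, the limit is assumed to be 2000 MiB, and
"clamscan -r --infected" is just one possible scanner invocation. It unpacks
into a temporary directory that is deleted afterwards, so it still needs disk
space for the extracted files, and it does not recurse into oversized files
found inside the archive.

```python
import os
import subprocess
import tarfile
import tempfile
import zipfile

# Assumption: 2000 MiB, taken from the "2000MB" figure quoted on the list.
SIZE_LIMIT = 2000 * 1024 * 1024


def classify(path, limit=SIZE_LIMIT):
    """Return 'direct', 'tar', 'zip' or 'unsupported' for the given file."""
    if os.path.getsize(path) <= limit:
        return "direct"
    if tarfile.is_tarfile(path):  # also recognises compressed tars such as .tar.gz
        return "tar"
    if zipfile.is_zipfile(path):
        return "zip"
    return "unsupported"


def scan(path, scanner=("clamscan", "-r", "--infected"), limit=SIZE_LIMIT):
    """Scan path, unpacking an oversized tar or zip first.

    Returns the scanner's exit status.
    """
    kind = classify(path, limit)
    if kind == "direct":
        return subprocess.run([*scanner, path]).returncode
    if kind == "unsupported":
        raise ValueError(f"{path}: over the size limit and not a tar or zip archive")
    # The temporary directory and everything in it is removed on exit.
    with tempfile.TemporaryDirectory() as tmp:
        if kind == "tar":
            with tarfile.open(path) as tf:
                # Use the safer "data" extraction filter where this Python has it.
                if hasattr(tarfile, "data_filter"):
                    tf.extractall(tmp, filter="data")
                else:
                    tf.extractall(tmp)
        else:
            with zipfile.ZipFile(path) as zf:
                zf.extractall(tmp)
        return subprocess.run([*scanner, tmp]).returncode
```

Calling scan("/path/to/big.tar.gz") would then return whatever exit status the
scanner reports for the unpacked tree.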

Good idea.

Simply untarring or unzipping into a pipe does not separate the packed files.
However, at least tar does have an option which allows us to write a one-liner:
(tar xf ~/viruses.tar --to-command='clamdscan -v - || echo "  found in $TAR_REALNAME\n\n---"' ) |& egrep -i found
stream: Eicar-Signature FOUND
   found in viruses/EICAR.COM.TAR
stream: Eicar-Signature FOUND
   found in viruses/eicar.com.txt
stream: Eicar-Signature FOUND
   found in viruses/URLEICAR.COM.TAR
stream: Eicar-Signature FOUND
   found in viruses/4DOSBOX/EICAR.COM
stream: Eicar-Signature FOUND
   found in viruses/EICAR.COM

The echo is needed to show the name of the file inside the archive.

This appears not to write the unpacked files to disk.
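
The same per-member streaming can be sketched in Python, which may help where
tar's --to-command is not available. This is a minimal sketch rather than a
tested tool: the function name scan_tar_members and the returned dictionary
are illustrative, and the only ClamAV behaviour assumed is that "clamdscan -"
scans standard input (as in the one-liner) and exits with 0 for clean, 1 for
a match and 2 for an error. Each member is piped to the scanner without being
written to disk.

```python
import shutil
import subprocess
import tarfile


def scan_tar_members(archive, scanner=("clamdscan", "-")):
    """Pipe every regular file in a tar archive to the scanner's stdin.

    Returns a dict mapping member name to the scanner's exit status.
    """
    results = {}
    # "r|*" reads the archive as a forward-only stream, with any compression.
    with tarfile.open(archive, "r|*") as tf:
        for member in tf:
            if not member.isfile():
                continue  # skip directories, links, devices
            proc = subprocess.Popen(
                list(scanner),
                stdin=subprocess.PIPE,
                stdout=subprocess.DEVNULL,
            )
            try:
                with tf.extractfile(member) as data:
                    shutil.copyfileobj(data, proc.stdin)
                proc.stdin.close()
            except BrokenPipeError:
                pass  # the scanner stopped reading before the member ended
            results[member.name] = proc.wait()
    return results
```

Members with a status of 1 are the ones the scanner flagged, so the member
name plays the role of $TAR_REALNAME in the one-liner.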

--
Andrew C. Aitchison  Kendal, UK
and...@aitchison.me.uk