On 12/02/2015 12:06 PM, Raymond Bakker wrote:
> Hello,
> 
> ==Summary== 
> We are experiencing different DLP behavior for complex RegEx between two 
> installations. 
> 
> 
> ==System==
> Version:  ciphermail-virtual-appliance-2.10.0-3.
>   1. Ubuntu pre-made virtual appliance (on my laptop)
>   2. Red Hat & CentOS gateway package (on a test server)
> 
> 
> ==Configuration==
> DLP: several triggers with "Must Encrypt"
> Settings: Encrypt Mode "No Encryption"
> Settings: DLP Patterns added
> 
> 
> ==Example==
> We want to search a message for [any text][four numbers][any text]
> So we try this RegEx: *.\d{4}.*
> 
> This works perfectly on the Ubuntu VA, but it encrypts EVERY message on 
> CentOS.
> Everything is back to normal when we disable the complex RegEx on CentOS.
> 
> We also tried to search for a little more simple like: [0-9][0-9][0-9][0-9]
> Ubuntu version is fine, CentOS version encrypts every message.
> 
> 
> ==DLP Trigger Comparison ==
> Ubuntu version:
>   - Single words work as expected
>   - Mail header works as expected
>   - Complex *.\d{4}.* works as expected
> 
> CentOS version:
>   - Single words work as expected
>   - Mail header works as expected
>   - Complex *.\d{4}.* works DIFFERENT
> 
> 
> Does anyone have experience with this situation?
> 
> Is our installation perhaps incorrect?

It is quite likely that any given message contains four consecutive
digits somewhere. Could it be that the mail going through the CentOS
gateway is sent with a different mail client than the mail going through
the virtual appliance?

The DLP text extractor also extracts header values. So for example a
date header will also be extracted. Since almost all mails contain a
date header, almost any mail will contain 4 digits.
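
To illustrate the point, here is a minimal sketch in Python (not
CipherMail code). It assumes, purely for illustration, that the extracted
text is simply the header values plus the body; the real extractor
normalizes the text and may differ. The sample message and addresses are
made up.

```python
import re
from email import message_from_string

# Made-up sample message; the body itself contains no digits at all.
RAW = """\
From: alice@example.com
To: bob@example.com
Date: Wed, 02 Dec 2015 12:06:00 +0100
Subject: Lunch

See you at noon.
"""

msg = message_from_string(RAW)
four_digits = re.compile(r"\d{4}")

# Hypothetical stand-in for the extractor: header values and body text.
header_text = "\n".join(str(value) for value in msg.values())
body_text = msg.get_payload()

body_match = four_digits.search(body_text)      # no digits in the body
header_match = four_digits.search(header_text)  # the year in the Date header

print("body match:  ", body_match)
print("header match:", header_match.group() if header_match else None)
```

The body never matches, but the year in the Date header does, so a
pattern that only asks for four digits fires on this message anyway.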

If you have the "raw" MIME content, you can see what text the DLP
engine sees during scanning by uploading the MIME message to the
"extract text" tool (Admin -> other -> extract text). The tool returns
the normalized text.

> So we try this RegEx: *.\d{4}.*

If you want to trigger on 4 digits, use \d{4} on its own, i.e., leave
out the .* parts. The .* is not needed and only makes scanning slower.
The regular expression is not required to match the complete text; it
triggers when it is found anywhere in the text, so .* is in effect
implicitly added to any regular expression.
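
As a rough analogy, Python's re.search has the same "find anywhere"
behaviour. The sketch below is Python, not the engine used by the
gateway, so details of the regular expression flavour may differ; the
sample strings are made up.

```python
import re

with_digits = "Invoice number 4821 is attached."
without_digits = "Call me at 12:30 tomorrow."

bare = re.compile(r"\d{4}")        # recommended form
padded = re.compile(r".*\d{4}.*")  # same result, more work for the engine

# search() looks for the pattern anywhere in the text, so the
# surrounding .* changes what is captured, not whether it matches.
bare_hit = bare.search(with_digits)
padded_hit = padded.search(with_digits)
bare_miss = bare.search(without_digits)
padded_miss = padded.search(without_digits)

print(bare_hit.group())        # just the four digits
print(padded_hit.group())      # the whole line
print(bare_miss, padded_miss)  # neither pattern matches
```

Both patterns agree on which texts match; the padded one only captures
more of the line and forces extra backtracking.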

Kind regards,

CipherMail support

> Cheers,
> 
> Raymond Bakker | Integration Consultant
> 
> T    +31 (0)10 288 1600
> M   +31 (0)6 2222 5515
> E    raymond.bak...@vanadgroup.com
> 
> VANAD Enovation
> Rivium Westlaan 1
> 2909 LD Capelle aan den IJssel
> The Netherlands
> 
> 
> This e-mail is personal. For our disclaimer, please visit 
> www.vanadgroup.com/disclaimer
> 
> _______________________________________________
> Users mailing list
> Users@lists.djigzo.com
> https://lists.djigzo.com/lists/listinfo/users

-- 
CipherMail email encryption

Email encryption with support for S/MIME, OpenPGP, PDF encryption and
secure webmail pull.

https://www.ciphermail.com

Twitter: http://twitter.com/CipherMail
