Don't use the image processing in the OCR plugin; use only the PDF 
processing.
This issue seems to be related to a thread semaphore problem. I'll fix 
it if I find the time.
I plan to remove the image processing from this plugin by the end of this 
year.
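
To illustrate the semaphore hypothesis: the pattern reported in the quoted message below (with a limit of N, the worker sticks at image N+1) is what a counting semaphore would produce if its slots were never released. The following is only a minimal Python sketch of that idea, not the plugin's Perl code. The function process_images, the release_slot flag and the timeout are hypothetical names introduced for this illustration, and a leaked slot is only one possible cause.

```python
import threading

def process_images(image_names, max_processes, release_slot, timeout=0.2):
    """Return the images processed before no free slot could be acquired."""
    slots = threading.BoundedSemaphore(max_processes)
    processed = []
    for name in image_names:
        # The timeout stands in for the watchdog; a real worker would block here.
        if not slots.acquire(timeout=timeout):
            break
        try:
            processed.append(name)  # placeholder for the actual OCR work
        finally:
            if release_slot:  # the assumed bug: this release never happens
                slots.release()
    return processed

images = ["logo.png", "warning.png", "success.png", "error.png"]
leaky_3 = process_images(images, 3, release_slot=False)    # stalls at PNG #4
leaky_1 = process_images(images, 1, release_slot=False)    # stalls at PNG #2
leaky_10 = process_images(images, 10, release_slot=False)  # all four pass
fixed_3 = process_images(images, 3, release_slot=True)     # all four pass
```

With the release in place, the limit only bounds concurrency and no longer caps the number of images per mail.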

>Why does the documentation say
>ASSP_OCRocrmaxprocesses should be less than the number of cpu cores?

OCR (ImageMagick + tesseract) will use 100% of one core per concurrently 
processed image until it has finished. 
I expect at most 0.05% better spam detection from the OCR image 
processing. For up to 95% higher CPU usage, that is far too little.
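
Since each concurrent OCR job saturates one core, a limit below the core count leaves CPU time for the rest of the mail processing. As a rough sketch only: suggested_ocr_limit is a hypothetical helper, and "cores minus one" is an assumed rule of thumb, not a value taken from the ASSP documentation.

```python
import os

def suggested_ocr_limit(cpu_cores):
    """Leave one core free (assumed rule of thumb), but allow at least one job."""
    return max(1, cpu_cores - 1)

cores = os.cpu_count() or 1  # cpu_count() may return None
print(f"{cores} cores: at most {suggested_ocr_limit(cores)} concurrent OCR jobs")
```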

It is much more efficient to use 'ASSP_AFCDetectSpamAttachRe' in the 
ASSP_AFC plugin.

Thomas



From:    "Dirk Kulmsee" <d.kulm...@netgroup.de>
To:      <assp-test@lists.sourceforge.net>
Date:    02.09.2014 16:27
Subject: [Assp-test] ASSP_OCR & stuck workers question



Hi,

I am currently running ASSP 2.4.4 (14241) on Debian Linux with Perl 5.20.
The ASSP_OCR module is 2.18.

All worker processes got stuck in ASSP_OCR, one after another:

 

2014-09-02 10:59:21 [Main_Thread] Info: Loop in Worker_1 was not active for 461 seconds
2014-09-02 10:59:21 [Main_Thread] Info: Worker_1 : last sigoff in ASSP_OCR, /opt/assp/Plugins/ASSP_OCR.pm, 282, main::sigoffTry, 1, , ,  at 14-2-8 10:51:40 1409647900.23592 - 282
2014-09-02 10:59:21 [Main_Thread] Info: Worker_1 : last sigon in main, sub main::URIBLok, 15, main::URIBLok_Run, 1, , ,  at 14-2-8 10:51:40 1409647900.2248 - 272
2014-09-02 10:59:21 [Main_Thread] Info: Worker_1 : last action was : call Plugin ASSP_OCR with
2014-09-02 10:59:21 [Main_Thread] Warning: try to terminate inactive/stucking Worker_1

2014-09-02 11:19:26 [Main_Thread] Info: Loop in Worker_2 was not active for 466 seconds
2014-09-02 11:19:26 [Main_Thread] Info: Worker_2 : last sigoff in ASSP_OCR, /opt/assp/Plugins/ASSP_OCR.pm, 282, main::sigoffTry, 1, , ,  at 14-2-8 11:11:40 1409649100.27879 - 282
2014-09-02 11:19:26 [Main_Thread] Info: Worker_2 : last sigon in main, sub main::URIBLok, 15, main::URIBLok_Run, 1, , ,  at 14-2-8 11:11:40 1409649100.26713 - 241
2014-09-02 11:19:26 [Main_Thread] Info: Worker_2 : last action was : call Plugin ASSP_OCR with
2014-09-02 11:19:26 [Main_Thread] Warning: try to terminate inactive/stucking Worker_2

2014-09-02 11:36:11 [Main_Thread] Info: Loop in Worker_3 was not active for 271 seconds
2014-09-02 11:36:11 [Main_Thread] Info: Worker_3 : last sigoff in ASSP_OCR, /opt/assp/Plugins/ASSP_OCR.pm, 282, main::sigoffTry, 1, , ,  at 14-2-8 11:31:40 1409650300.57724 - 282
2014-09-02 11:36:11 [Main_Thread] Info: Worker_3 : last sigon in main, sub main::URIBLok, 15, main::URIBLok_Run, 1, , ,  at 14-2-8 11:31:40 1409650300.56076 - 241
2014-09-02 11:36:11 [Main_Thread] Info: Worker_3 : last action was : call Plugin ASSP_OCR with
2014-09-02 11:36:11 [Main_Thread] Warning: try to terminate inactive/stucking Worker_3

2014-09-02 13:49:57 [Main_Thread] Info: Loop in Worker_4 was not active for 196 seconds
2014-09-02 13:49:57 [Main_Thread] Info: Worker_4 : last sigoff in ASSP_OCR, /opt/assp/Plugins/ASSP_OCR.pm, 282, main::sigoffTry, 1, , ,  at 14-2-8 13:46:41 1409658401.38248 - 282
2014-09-02 13:49:57 [Main_Thread] Info: Worker_4 : last sigon in main, sub main::URIBLok, 15, main::URIBLok_Run, 1, , ,  at 14-2-8 13:46:41 1409658401.36525 - 241
2014-09-02 13:49:57 [Main_Thread] Info: Worker_4 : last action was : call Plugin ASSP_OCR with
2014-09-02 13:49:57 [Main_Thread] Warning: try to terminate inactive/stucking Worker_4

 

Later I found a live example of this. A simple status-report email
containing four small PNG icons got the worker process stuck, leaving log
lines like these:

 

2014-09-02 13:59:26 m1-59166-11063 [Worker_1] [Plugin] 88.198.3.4 [OIP: 81.209.171.97] <server2...@someone.de> to: al...@mydomain.de ASSP_OCR: (att) file text1.ecelp9600 found in mime part 1
2014-09-02 13:59:26 m1-59166-11063 [Worker_1] [Plugin] 88.198.3.4 [OIP: 81.209.171.97] <server2...@someone.de> to: al...@mydomain.de ASSP_OCR: (att) file logo.png found in mime part 2
2014-09-02 13:59:26 m1-59166-11063 [Worker_1] [Plugin] 88.198.3.4 [OIP: 81.209.171.97] <server2...@someone.de> to: al...@mydomain.de ASSP_OCR: processing (attatched) file logo.png
2014-09-02 13:59:26 m1-59166-11063 [Worker_1] [Plugin] 88.198.3.4 [OIP: 81.209.171.97] <server2...@someone.de> to: al...@mydomain.de ASSP_OCR: (att) file warning.png found in mime part 3
2014-09-02 13:59:26 m1-59166-11063 [Worker_1] [Plugin] 88.198.3.4 [OIP: 81.209.171.97] <server2...@someone.de> to: al...@mydomain.de ASSP_OCR: processing (attatched) file warning.png
2014-09-02 13:59:26 m1-59166-11063 [Worker_1] [Plugin] 88.198.3.4 [OIP: 81.209.171.97] <server2...@someone.de> to: al...@mydomain.de ASSP_OCR: (att) file success.png found in mime part 4
2014-09-02 13:59:26 m1-59166-11063 [Worker_1] [Plugin] 88.198.3.4 [OIP: 81.209.171.97] <server2...@someone.de> to: al...@mydomain.de ASSP_OCR: processing (attatched) file success.png
2014-09-02 13:59:26 m1-59166-11063 [Worker_1] [Plugin] 88.198.3.4 [OIP: 81.209.171.97] <server2...@someone.de> to: al...@mydomain.de ASSP_OCR: (att) file error.png found in mime part 5
2014-09-02 13:59:26 m1-59166-11063 [Worker_1] [Plugin] 88.198.3.4 [OIP: 81.209.171.97] <server2...@someone.de> to: al...@mydomain.de ASSP_OCR: processing (attatched) file error.png

 

I looked into the config for ASSP_OCR and found ASSP_OCRocrmaxprocesses
set to its default value of three.

Here comes the funny part:

When ASSP_OCRocrmaxprocesses is set to 3, the worker gets stuck as soon as
it hits PNG #4.

When ASSP_OCRocrmaxprocesses is set to 1, the worker gets stuck as soon as
it hits PNG #2.

When ASSP_OCRocrmaxprocesses is set to 10, this email gets through, and I
have had no stuck worker processes since (at least for the last two
hours :) ).

 

Can anyone confirm this? Could it be that ASSP_OCR goes mad when it finds
"ASSP_OCRocrmaxprocesses + 1" images? Why does the documentation say
ASSP_OCRocrmaxprocesses should be less than the number of cpu cores?

 

Best regards

Dirk

 

------------------------------------------------------------------------------
Slashdot TV. 
Video for Nerds.  Stuff that matters.
http://tv.slashdot.org/
_______________________________________________
Assp-test mailing list
Assp-test@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/assp-test





DISCLAIMER:
*******************************************************
This email and any files transmitted with it may be confidential, legally 
privileged and protected in law and are intended solely for the use of the 
individual to whom it is addressed.
This email was multiple times scanned for viruses. There should be no 
known virus in this email!
*******************************************************
