date:20150220

[jira] [Created] (TIKA-1553) Let's add an evil parser to be used in testing parser drivers

2015-02-20 Thread Tim Allison (JIRA)

Tim Allison created TIKA-1553:
-

 Summary: Let's add an evil parser to be used in testing parser 
drivers
 Key: TIKA-1553
 URL: https://issues.apache.org/jira/browse/TIKA-1553
 Project: Tika
  Issue Type: Test
Reporter: Tim Allison
Assignee: Tim Allison
Priority: Minor


As part of TIKA-1302 and as part of making Tika more robust generally, it would 
be useful to have an evil parser that will throw exceptions/errors and hang for 
lengths of time.  

This will allow us to test timeouts and handling of exceptions and errors in 
tika-server and in tika-batch.  

We could also use this for tests with ForkParser.
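
As a rough illustration of the kind of driver-side check such a parser would enable, here is a minimal sketch that runs a misbehaving task under a timeout and classifies the result. It uses only java.util.concurrent and does not touch any Tika API; the class name EvilTaskDemo, the Outcome enum, and runWithTimeout are hypothetical names made up for this example, and the lambdas merely stand in for a parser that throws or sleeps.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class EvilTaskDemo {

    /** Hypothetical outcomes a driver might record for one document. */
    enum Outcome { OK, EXCEPTION, ERROR, TIMEOUT }

    /** Runs the task on a worker thread and classifies how it ended. */
    static Outcome runWithTimeout(Callable<String> task, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> future = pool.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Outcome.OK;
        } catch (TimeoutException e) {
            // The task did not finish in time; ask the worker to stop.
            future.cancel(true);
            return Outcome.TIMEOUT;
        } catch (ExecutionException e) {
            // The task threw; distinguish Errors from ordinary exceptions.
            return (e.getCause() instanceof Error) ? Outcome.ERROR : Outcome.EXCEPTION;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Outcome.EXCEPTION;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for a well-behaved parse, a thrown exception,
        // a thrown error, and a parse that sleeps past the timeout.
        System.out.println(runWithTimeout(() -> "parsed", 1000));
        System.out.println(runWithTimeout(() -> { throw new NullPointerException("evil"); }, 1000));
        System.out.println(runWithTimeout(() -> { throw new OutOfMemoryError("fake oom"); }, 1000));
        System.out.println(runWithTimeout(() -> { Thread.sleep(5000); return "late"; }, 100));
    }
}
```

Note that interrupting the worker only stops tasks that respond to interruption, such as a sleep. A task stuck in a busy loop would keep running in this sketch, which is one reason a driver may prefer to isolate parsing in a separate process.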



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Resolved] (TIKA-1553) Let's add an evil parser to be used in testing parser drivers

2015-02-20 Thread Tim Allison (JIRA)


 [ 
https://issues.apache.org/jira/browse/TIKA-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison resolved TIKA-1553.
---
Resolution: Fixed

r1661129


[jira] [Commented] (TIKA-1553) Let's add an evil parser to be used in testing parser drivers

2015-02-20 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14328991#comment-14328991
 ] 

Hudson commented on TIKA-1553:
--

SUCCESS: Integrated in tika-trunk-jdk1.7 #499 (See 
[https://builds.apache.org/job/tika-trunk-jdk1.7/499/])
TIKA-1553: add an EvilParser for testing purposes (tallison: 
http://svn.apache.org/viewvc/tika/trunk/?view=rev&rev=1661129)
* /tika/trunk/CHANGES.txt
* /tika/trunk/tika-parsers/src/test/java/org/apache/tika/TikaTest.java
* /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/evil
* /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/evil/EvilParser.java
* /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/evil/EvilParserTest.java
* /tika/trunk/tika-parsers/src/test/resources/META-INF
* /tika/trunk/tika-parsers/src/test/resources/META-INF/services
* /tika/trunk/tika-parsers/src/test/resources/META-INF/services/org.apache.tika.parser.Parser
* /tika/trunk/tika-parsers/src/test/resources/org
* /tika/trunk/tika-parsers/src/test/resources/org/apache
* /tika/trunk/tika-parsers/src/test/resources/org/apache/tika
* /tika/trunk/tika-parsers/src/test/resources/org/apache/tika/mime
* /tika/trunk/tika-parsers/src/test/resources/org/apache/tika/mime/custom-mimetypes.xml
* /tika/trunk/tika-parsers/src/test/resources/test-documents/evil
* /tika/trunk/tika-parsers/src/test/resources/test-documents/evil/fake_oom.evil
* /tika/trunk/tika-parsers/src/test/resources/test-documents/evil/heavy_hang.evil
* /tika/trunk/tika-parsers/src/test/resources/test-documents/evil/nothing_bad.evil
* /tika/trunk/tika-parsers/src/test/resources/test-documents/evil/null_pointer.evil
* /tika/trunk/tika-parsers/src/test/resources/test-documents/evil/null_pointer_no_msg.evil
* /tika/trunk/tika-parsers/src/test/resources/test-documents/evil/real_oom.evil
* /tika/trunk/tika-parsers/src/test/resources/test-documents/evil/sleep.evil



[jira] [Commented] (TIKA-1557) Create TesseractOCR Option to Never Run

2015-02-20 Thread Uwe Schindler (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329523#comment-14329523
 ] 

Uwe Schindler commented on TIKA-1557:
-

I would not make this a special option only for tesseract. As said on 
TIKA-1555, it would be better to have a general way to blacklist some parsers 
through TikaConfig.

Currently you have to maintain the whole list of parsers (or parse META-INF 
yourself) and pass the full list to TikaConfig / AutodetectParser / 
CompositeParser. I would like to have an option in TIKA config to blacklist 
parsers. Ideally this should work alos for subclasses, so one could disable all 
ForkParser subclasses by adding ForkParser to blacklist.
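
To make the subclass-aware blacklist idea concrete, here is a minimal sketch of the filtering step only. It is not Tika code and assumes nothing about how TikaConfig would expose such an option; ParserExcludeDemo, the Demo* types, and the filter method are hypothetical stand-ins invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ParserExcludeDemo {

    // Hypothetical stand-ins for parser types; these are not Tika classes.
    interface DemoParser {}
    static class DemoParentParser implements DemoParser {}
    static class DemoChildParser extends DemoParentParser {}
    static class DemoTextParser implements DemoParser {}

    /**
     * Returns the discovered parsers minus any that are instances of an
     * excluded class, so excluding a parent class also drops its subclasses.
     */
    static List<DemoParser> filter(List<DemoParser> discovered, Set<Class<?>> excluded) {
        List<DemoParser> kept = new ArrayList<>();
        for (DemoParser parser : discovered) {
            boolean blocked = false;
            for (Class<?> type : excluded) {
                if (type.isInstance(parser)) {
                    blocked = true;
                    break;
                }
            }
            if (!blocked) {
                kept.add(parser);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<DemoParser> discovered = Arrays.asList(
                new DemoParentParser(), new DemoChildParser(), new DemoTextParser());
        Set<Class<?>> excluded = new HashSet<>();
        excluded.add(DemoParentParser.class);

        // Only the parser unrelated to the excluded parent class remains.
        for (DemoParser parser : filter(discovered, excluded)) {
            System.out.println(parser.getClass().getSimpleName());
        }
    }
}
```

Matching with Class.isInstance rather than comparing class names is what gives the subclass behaviour described above; an exact-name match would require listing every subclass separately.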

 Create TesseractOCR Option to Never Run
 ---

 Key: TIKA-1557
 URL: https://issues.apache.org/jira/browse/TIKA-1557
 Project: Tika
  Issue Type: New Feature
  Components: parser
Reporter: Tyler Palsulich
Assignee: Tyler Palsulich
Priority: Minor
 Fix For: 1.8


 As brought up in TIKA-1555, TesseractOCRParser should have an option to never 
 be run. So, we can add an {{enabled}} option to the Config.




[jira] [Commented] (TIKA-1557) Create TesseractOCR Option to Never Run

2015-02-20 Thread Luis Filipe Nassif (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329547#comment-14329547
 ] 

Luis Filipe Nassif commented on TIKA-1557:
--

I think the same problem that happens with TesseractOCRParser can occur with 
any ExternalParser, like StringsParser or ffmpeg. Maybe it will be better to 
add this option to ExternalParser?


[jira] [Commented] (TIKA-1557) Create TesseractOCR Option to Never Run

2015-02-20 Thread David Pilato (JIRA)


[ 
https://issues.apache.org/jira/browse/TIKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329509#comment-14329509
 ] 

David Pilato commented on TIKA-1557:


Thanks! I'd not qualify it as a bug though. :)

 Create TesseractOCR Option to Never Run
 ---

 Key: TIKA-1557
 URL: https://issues.apache.org/jira/browse/TIKA-1557
 Project: Tika
  Issue Type: Bug
  Components: parser
Reporter: Tyler Palsulich
Assignee: Tyler Palsulich
Priority: Minor
 Fix For: 1.8


 As brought up in TIKA-1555, TesseractOCRParser should have an option to never 
 be run. So, we can add an {{enabled}} option to the Config.




[jira] [Comment Edited] (TIKA-1557) Create TesseractOCR Option to Never Run

2015-02-20 Thread Uwe Schindler (JIRA)

[
https://issues.apache.org/jira/browse/TIKA-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329523#comment-14329523
]

Uwe Schindler edited comment on TIKA-1557 at 2/20/15 9:05 PM:
--

I would not make this a special option only for tesseract. As said on
TIKA-1555, it would be better to have a general way to blacklist some parsers
through TikaConfig.

Currently you have to maintain the whole list of parsers (or parse META-INF
yourself) and pass the full list to TikaConfig / AutodetectParser /
CompositeParser. I would like to have an option in TIKA config to blacklist
parsers. Ideally this should also work for subclasses, so one could disable all
ExternalParser subclasses by adding ExternalParser to blacklist.

was (Author: thetaphi):
I would not make this a special option only for tesseract. As said on
TIKA-1555, it would be better to have a general way to blacklist some parsers
through TikaConfig.

Currently you have to maintain the whole list of parsers (or parse META-INF
yourself) and pass the full list to TikaConfig / AutodetectParser /
CompositeParser. I would like to have an option in TIKA config to blacklist
parsers. Ideally this should also work for subclasses, so one could disable all
ForkParser subclasses by adding ForkParser to blacklist.

