Jan Burkhardt created TIKA-2447:
-----------------------------------

Summary: PSDParser creates unnecessary large byte array and discards it
Key: TIKA-2447
URL: https://issues.apache.org/jira/browse/TIKA-2447
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.16, 1.15
Environment: openjdk version "1.8.0_131"
few memory (currently using 256M xmx)
Reporter: Jan Burkhardt
Priority: Critical

PSD (Adobe Photoshop) files are split into resource blocks (ResourceBlock) that contain different kinds of data, but currently only caption blocks are extracted into the description.

When parsing a file with very large blocks, e.g. for image data, a byte array the size of the block is allocated:
https://github.com/justsocialapps/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/image/PSDParser.java#L191
even though it is discarded right afterwards:
https://github.com/justsocialapps/tika/blob/master/tika-parsers/src/main/java/org/apache/tika/parser/image/PSDParser.java#L117
and following lines.

I will prepare a pull request to fix that.
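
To illustrate the idea, here is a minimal sketch of reading only the wanted block and skipping the payload of all others instead of allocating a buffer for it. This is not the actual PSDParser code and not the planned patch: the class name, the `skipFully` helper, the `MAX_CAPTION_BYTES` limit and the simplified block layout (2-byte ID, 4-byte size, then data, with no signature, name or padding) are assumptions made for the example, and `ID_CAPTION` is only a placeholder value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ResourceBlockSketch {
    // Placeholder ID for the one block type we keep (assumption for this sketch).
    static final int ID_CAPTION = 0x03F0;
    // Hypothetical upper bound so a corrupt size field cannot force a huge allocation.
    static final int MAX_CAPTION_BYTES = 1 << 20;

    /** Skips exactly n bytes or throws; InputStream.skip may skip fewer than requested. */
    static void skipFully(InputStream in, long n) throws IOException {
        while (n > 0) {
            long skipped = in.skip(n);
            if (skipped <= 0) {
                // skip() made no progress: read one byte to tell EOF from a stalled skip.
                if (in.read() < 0) {
                    throw new EOFException("stream ended inside a block");
                }
                skipped = 1;
            }
            n -= skipped;
        }
    }

    /** Returns the first caption found, or null if the stream has none. */
    static String readCaption(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        while (true) {
            int id;
            try {
                id = in.readUnsignedShort();
            } catch (EOFException e) {
                return null; // no more blocks
            }
            long size = in.readInt() & 0xFFFFFFFFL; // treat the size as unsigned
            if (id == ID_CAPTION) {
                if (size > MAX_CAPTION_BYTES) {
                    throw new IOException("caption block too large: " + size);
                }
                // Only here is a buffer allocated, and only for data we actually use.
                byte[] data = new byte[(int) size];
                in.readFully(data);
                return new String(data, StandardCharsets.US_ASCII);
            }
            // Every other block: move past the payload without buffering it.
            skipFully(in, size);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeShort(0x03ED);           // some other block type
        out.writeInt(1_000_000);          // large payload that the reader never buffers
        out.write(new byte[1_000_000]);
        out.writeShort(ID_CAPTION);
        out.writeInt(5);
        out.write("hello".getBytes(StandardCharsets.US_ASCII));

        System.out.println(readCaption(new ByteArrayInputStream(buf.toByteArray())));
        // prints "hello"
    }
}
```

With this shape the memory used while parsing depends on the size of the caption, not on the size of the largest block in the file, which matters with a small heap such as the 256M mentioned above.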