[ https://issues.apache.org/jira/browse/TIKA-529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radek updated TIKA-529:
-----------------------

    Attachment:     (was: isLamAlef.diff)

> IBM420 charset detection's isLamAlef is allocation-happy
> --------------------------------------------------------
>
>                 Key: TIKA-529
>                 URL: https://issues.apache.org/jira/browse/TIKA-529
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8
>            Reporter: Radek
>            Assignee: Ken Krugler
>            Priority: Minor
>         Attachments: isLamAlef.diff
>
>
> The two IBM420 charset detectors (rtl and ltr) call isLamAlef() for each byte of
> the detection buffer.
> The method allocates and fills a byte array on every call, which
> makes it responsible for approximately 70% of all object allocations in my
> current test case (many text files).
> Since the array is identical every time, and the same check can be done
> without any array, this is wasteful.
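
A minimal sketch of the allocation pattern described above and two ways to avoid it. This is not the attached patch. The class name, the helper method names, and the byte values used for the shaped Lam-Alef forms are assumptions made for illustration; the issue text does not list them.

```java
final class LamAlefCheck {

    // ASSUMPTION: placeholder byte values for the shaped Lam-Alef forms.
    // The issue does not state the real values.
    private static final byte[] SHAPED_LAM_ALEF = {
        (byte) 0xb2, (byte) 0xb3, (byte) 0xb4, (byte) 0xb5, (byte) 0xb7, (byte) 0xb8
    };

    // The pattern the issue complains about: a new array is allocated
    // and filled on every call, once per byte of the detection buffer.
    static boolean isLamAlefAllocating(byte b) {
        byte[] shaped = {
            (byte) 0xb2, (byte) 0xb3, (byte) 0xb4, (byte) 0xb5, (byte) 0xb7, (byte) 0xb8
        };
        for (byte s : shaped) {
            if (b == s) {
                return true;
            }
        }
        return false;
    }

    // Option 1: hoist the identical array into a shared constant,
    // so it is allocated once when the class is loaded.
    static boolean isLamAlefShared(byte b) {
        for (byte s : SHAPED_LAM_ALEF) {
            if (b == s) {
                return true;
            }
        }
        return false;
    }

    // Option 2: no array at all. With the assumed values the set is the
    // unsigned range 0xb2..0xb8 with 0xb6 excluded.
    static boolean isLamAlef(byte b) {
        int v = b & 0xff; // treat the byte as unsigned
        return v >= 0xb2 && v <= 0xb8 && v != 0xb6;
    }

    public static void main(String[] args) {
        // Check that all three variants agree on every possible byte value.
        for (int i = 0; i < 256; i++) {
            byte b = (byte) i;
            boolean expected = isLamAlefAllocating(b);
            if (isLamAlefShared(b) != expected || isLamAlef(b) != expected) {
                throw new AssertionError("mismatch at byte value " + i);
            }
        }
    }
}
```

Both options return the same result as the allocating version for every byte value. Option 2 also avoids the loop, which may matter because the method runs once per byte of the buffer.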