[ https://issues.apache.org/jira/browse/NUTCH-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746160#comment-16746160 ]
Hudson commented on NUTCH-2687:
-------------------------------

SUCCESS: Integrated in Jenkins build Nutch-trunk #3599 (See [https://builds.apache.org/job/Nutch-trunk/3599/])
NUTCH-2687 Regex for reading title from Content-Disposition is wrong (markus: [https://github.com/apache/nutch/commit/9cc076f33746c34acfdeef8b3007bb5b0dec736d])
* (edit) src/plugin/index-more/src/java/org/apache/nutch/indexer/more/MoreIndexingFilter.java


> Regex for reading title from Content-Disposition is wrong
> ---------------------------------------------------------
>
>                 Key: NUTCH-2687
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2687
>             Project: Nutch
>          Issue Type: Bug
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Major
>             Fix For: 1.16
>
>         Attachments: NUTCH-2687.patch
>
>
> Given URL:
> https://www.amuse-project.org/file/download/default/E6D0537647AF1204656076943F4729B0/Koopstra2016_5fOntologically%20classifying%20ERP%20feature,%20the%20NEXT%20method_5fFinal.pdf
> And regex: \\bfilename=['\"](.+)['\"]
> We get the following title:
> Koopstra2016_Ontologically classifying ERP feature, the NEXT method_Final.pdf"; filename*=utf-8'
> Changed regex to: \\bfilename=['\"]([^\"]+) fixes it
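
The difference between the two expressions quoted in the issue can be reproduced with a small standalone sketch. The two patterns are copied from the issue text. The class name, the helper method, and the sample header value are hypothetical: the issue does not show the actual Content-Disposition header, so a shortened one with both a filename and a filename* parameter is assumed here. This is not the code from MoreIndexingFilter.java.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContentDispositionTitle {
    // Original pattern from the issue: the greedy (.+) runs on to the LAST quote in the header.
    static final Pattern GREEDY = Pattern.compile("\\bfilename=['\"](.+)['\"]");

    // Changed pattern from the issue: the capture stops at the first double quote.
    static final Pattern FIXED = Pattern.compile("\\bfilename=['\"]([^\"]+)");

    // Hypothetical helper: returns capture group 1 of the first match, or null if none.
    static String extract(Pattern pattern, String header) {
        Matcher m = pattern.matcher(header);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Assumed header value, shortened for illustration.
        String header = "attachment; filename=\"report.pdf\"; filename*=utf-8''report.pdf";

        // Greedy pattern captures past the closing quote, into the filename* parameter.
        System.out.println(extract(GREEDY, header)); // report.pdf"; filename*=utf-8'

        // Negated character class stops at the closing double quote.
        System.out.println(extract(FIXED, header));  // report.pdf
    }
}
```

Note that the changed pattern only stops at a double quote, so a value wrapped in single quotes would presumably still capture more than the filename.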