[jira] [Comment Edited] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192711#comment-17192711 ] Markus Kalkbrenner edited comment on SOLR-14768 at 9/9/20, 7:54 AM: I just want to mention that I created this issue because it has been caught by automated tests ;) [https://github.com/solariumphp/solarium/actions] and [https://github.com/mkalkbrenner/search_api_solr/actions] Beside PDF extraction we test things like the Collections API, Streaming Expressions, Facets, JSON Facets, ... Therefore we run Solr docker containers. Maybe we can add a test that includes a build of a docker container using the latest Solr dev version? BTW both test suites allways use the latest versions of Solr 7 and 8. At the moment we had to break this rule and force Solr 8.5 instead of the latest version to get the tests to pass again to not interrupt the development of these frameworks. was (Author: mkalkbrenner): I just want to mention that I created this issue because it has been caught by automated tests ;) [https://github.com/solariumphp/solarium/actions] and [https://github.com/mkalkbrenner/search_api_solr/actions] Beside PDF extraction we test things like the Collections API, Streaming Expressions, Facets, JSON Facets, ... Therefore we run Solr docker containers. Maybe we can add a test that includes a build of a docker container using the latest Solr dev version? > Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Assignee: David Smiley >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, > Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with > multipart MIME in commands.eml > > Time Spent: 10m > Remaining Estimate: 0h > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP library and the integration tests > of the Search API Solr Drupal module both fail on PDF extraction if executed > on Solr 8.6. > They still work on Solr 8.5.1 an earlier versions. > {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] > o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr > path=/update/extract > params=\{json.nl=flat=0=false=testpdf.pdf=extract-test=true=false=attr_=json}{add=[extract-test > (1675547519474466816)],commit=} 0 957 > solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] > o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => > java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) > ~[?:?] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) > ~[jetty-security-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227]
[jira] [Comment Edited] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192450#comment-17192450 ] Ishan Chattopadhyaya edited comment on SOLR-14768 at 9/8/20, 8:59 PM: -- bq. That's good news, but looking from a distance it is not yet convincing. Please have a closer look. {quote}I have volunteered my Linux based search service (PHP based) for this, found on [https://netlab1.net/] as the Solr/Lucene Search Service. That takes less than an hour to build up completely. As its command line crawler does the work of interest here it easily blends into the needed final assembly and test operation. {quote} If you rely on Solr for this functionality and also believe your tool would take an hour to build up, why don't you set it up for your own automated testing? Alternately, feel free to open a separate JIRA/submit a patch there to help us do so. bq. What would help us is using external agents ... Please feel free to chime in in this dev list discussion. https://lists.apache.org/thread.html/r27d742357f9e1342cce1c9aaa215a10d94f64b43b2e681f19fa8d150%40%3Cdev.lucene.apache.org%3E was (Author: ichattopadhyaya): bq. That's good news, but looking from a distance it is not yet convincing. Please have a closer look. {quote}I have volunteered my Linux based search service (PHP based) for this, found on [https://netlab1.net/] as the Solr/Lucene Search Service. That takes less than an hour to build up completely. As its command line crawler does the work of interest here it easily blends into the needed final assembly and test operation. {quote} If you rely on Solr for this functionality and also believe your tool would take an hour to build up, why don't you set it up for your own automated testing? Alternately, feel free to submit a patch to help us do so. > Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Assignee: David Smiley >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml, > Solr v8.6.x fails with multipart MIME in commands.eml, Solr v8.6.x fails with > multipart MIME in commands.eml > > Time Spent: 10m > Remaining Estimate: 0h > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP library and the integration tests > of the Search API Solr Drupal module both fail on PDF extraction if executed > on Solr 8.6. > They still work on Solr 8.5.1 an earlier versions. > {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] > o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr > path=/update/extract > params=\{json.nl=flat=0=false=testpdf.pdf=extract-test=true=false=attr_=json}{add=[extract-test > (1675547519474466816)],commit=} 0 957 > solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] > o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => > java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) > ~[?:?] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) > ~[jetty-security-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at >
[jira] [Comment Edited] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185070#comment-17185070 ] Joe Doupnik edited comment on SOLR-14768 at 8/26/20, 10:09 AM: --- After some review of the source code of Solr 8.6.1, file SolrRequestParsers.java, and then asking Google about MultiParts, one finds that the function cleanupMultipartFiles (which is the locus of the present error) is coupled to functions (MultiPartInputStream etc) which are depreciated and even parts actively blocked. Also such a cleanup process itself is vulnerable to side effects ([https://github.com/eclipse/jetty.project/issues/4383). |https://github.com/eclipse/jetty.project/issues/4383).**]and ([https://github.com/eclipse/jetty.project/issues/4350|https://github.com/eclipse/jetty]) . These are from Nov 2019. A comment in the first reference is useful to note: "A simple fix was added in 9.4.26 to avoid the NPE ([#4479|https://github.com/eclipse/jetty.project/pull/4479]). A more substantial fix using synchronization has been merged to jetty-10.0.x ([#4498|https://github.com/eclipse/jetty.project/pull/4498]) which also ensures the parts are always cleaned up properly when parsing the multipart form asynchronously." The second reference, 4350, is has topic title of "Depreciated MultiPartInputStreamParser still used in jetty-server (MultiPartsUtilParser) but OSGi ExportPackage suppressed." Thus the present functions need to be replaced, and perhaps they are with jetty 10. In any case, the material is broken in Solr 8.6.x, and any fixes need adequate testing. Thanks, Joe D. was (Author: jdoupnik): After some review of the source code of Solr 8.6.1, file SolrRequestParsers.java, and then asking Google about MultiParts, one finds that the function cleanupMultipartFiles (which is the locus of the present error) is coupled to functions (MultiPartInputStream etc) which are depreciated and even parts actively blocked. Also such a cleanup process itself is vulnerable to side effects ([https://github.com/eclipse/jetty.project/issues/4383).**] A comment in that reference is useful to note: "A simple fix was added in 9.4.26 to avoid the NPE ([#4479|https://github.com/eclipse/jetty.project/pull/4479]). A more substantial fix using synchronization has been merged to jetty-10.0.x ([#4498|https://github.com/eclipse/jetty.project/pull/4498]) which also ensures the parts are always cleaned up properly when parsing the multipart form asynchronously." Thus the present functions need to be replaced, and perhaps they are with jetty 10. In any case, the material is broken in Solr 8.6.x, and any fixes need adequate testing. Thanks, Joe D. > Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml > > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP library and the integration tests > of the Search API Solr Drupal module both fail on PDF extraction if executed > on Solr 8.6. > They still work on Solr 8.5.1 an earlier versions. > {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] > o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr > path=/update/extract > params=\{json.nl=flat=0=false=testpdf.pdf=extract-test=true=false=attr_=json}{add=[extract-test > (1675547519474466816)],commit=} 0 957 > solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] > o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => > java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) > ~[?:?] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at
[jira] [Comment Edited] (SOLR-14768) Error 500 on PDF extraction: java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts
[ https://issues.apache.org/jira/browse/SOLR-14768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183515#comment-17183515 ] Joe Doupnik edited comment on SOLR-14768 at 8/25/20, 9:00 AM: -- My observations were described with pictures added, thus to see them please refer to the user's list, 23 August. This include how to reproduce code snippets. I have added my report (from solr-users) as an email attachment to the main item. Joe D. was (Author: jdoupnik): My observations were described with pictures added, thus to see them please refer to the user's list, 23 August. This include how to reproduce code snippets. Joe D. > Error 500 on PDF extraction: java.lang.NoClassDefFoundError: > org/eclipse/jetty/server/MultiParts > > > Key: SOLR-14768 > URL: https://issues.apache.org/jira/browse/SOLR-14768 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6, 8.6.1 >Reporter: Markus Kalkbrenner >Priority: Major > Attachments: Solr v8.6.x fails with multipart MIME in commands.eml > > > See [https://www.mail-archive.com/solr-user@lucene.apache.org/msg152182.html] > The integration tests of the solarium PHP library and the integration tests > of the Search API Solr Drupal module both fail on PDF extraction if executed > on Solr 8.6. > They still work on Solr 8.5.1 an earlier versions. > {quote}2020-08-20 12:30:35.279 INFO (qtp855700733-19) [ x:5f3e6ce2810ef] > o.a.s.u.p.LogUpdateProcessorFactory [5f3e6ce2810ef] webapp=/solr > path=/update/extract > params=\{json.nl=flat=0=false=testpdf.pdf=extract-test=true=false=attr_=json}{add=[extract-test > (1675547519474466816)],commit=} 0 957 > solr8_1 | 2020-08-20 12:30:35.280 WARN (qtp855700733-19) [ ] > o.e.j.s.HttpChannel /solr/5f3e6ce2810ef/update/extract => > java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > solr8_1 | java.lang.NoClassDefFoundError: org/eclipse/jetty/server/MultiParts > solr8_1 | at > org.apache.solr.servlet.SolrRequestParsers.cleanupMultipartFiles(SolrRequestParsers.java:624) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:443) > ~[?:?] > solr8_1 | at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) > ~[?:?] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590) > ~[jetty-security-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1610) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1300) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485) > ~[jetty-servlet-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1580) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1215) > ~[jetty-server-9.4.27.v20200227.jar:9.4.27.v20200227] > solr8_1 | at >