[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17086434#comment-17086434 ] ASF subversion and git services commented on SOLR-14013: Commit c2cd10b923cf2dca0030f2b1c304038bd8267b4e in lucene-solr's branch refs/heads/branch_7_7 from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c2cd10b ] SOLR-14259: Back port javabin performance regression fixes from SOLR-14013 > javabin performance regressions > --- > > Key: SOLR-14013 > URL: https://issues.apache.org/jira/browse/SOLR-14013 > Project: Solr > Issue Type: Bug >Affects Versions: 7.7 >Reporter: Yonik Seeley >Assignee: Yonik Seeley >Priority: Blocker > Fix For: 8.4 > > Attachments: SOLR-14013.patch, SOLR-14013.patch, TestQuerySpeed.java, > test.json > > > As noted by [~rrockenbaugh] in SOLR-13963, javabin also recently became > orders of magnitude slower in certain cases since v7.7. The cases identified > so far include large numbers of values in a field. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17086432#comment-17086432 ] ASF subversion and git services commented on SOLR-14013: Commit 6b1263a035cf1ff01c868dac5b32b2421aa74f1f in lucene-solr's branch refs/heads/branch_7_7 from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=6b1263a ] SOLR-14259: Back port javabin performance regression fixes from SOLR-14013
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083861#comment-17083861 ] ASF subversion and git services commented on SOLR-14013: Commit 5d3dfbd0ce8a2ad990635e71144615f1c4815d22 in lucene-solr's branch refs/heads/branch_7_7 from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5d3dfbd ] SOLR-14013: trying to port to SOlr 7.7 (#1254)
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17035589#comment-17035589 ] Noble Paul commented on SOLR-14013: --- I've opened SOLR-14259
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17035499#comment-17035499 ]

Houston Putman commented on SOLR-14013:
---

[~noble.paul], at the very least I think we should backport this to 7_7. If we leave the latest release of 7 in a state with a significant regression/bug in it, then we are basically asking people to either:

* Know that 7.6 is the last stable release of Solr for people wanting to use multiValued fields in a sharded collection
* Upgrade to Solr 8.4

In my opinion, neither of those is a good option, because users are always going to go with the most up-to-date version of Solr that works for their index, and upgrading to a new major version is a very tough process for a lot of people.

This isn't a bug that existed throughout the entirety of Solr 7; it was introduced in the last minor release. A lot of people are very comfortable with Solr 7, and trust it. People also trust that the last minor/patch version of something is going to be the most stable version. We should make sure that the latest release of our second-to-last major version (7) is stable and maintains the trust that users have in it and in Solr in general.

It is very little work to backport this, and also probably not a whole lot of work to do another patch or minor release (7.8 or 7.7.3). With that work we will be providing a significantly better user experience for our community.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026170#comment-17026170 ] Karl Stoney commented on SOLR-14013: Please could this be backported to 7_7? We build that branch from source anyway so I'd really appreciate it!
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026046#comment-17026046 ] Houston Putman commented on SOLR-14013: --- I think backporting would be a good idea, even if a release isn't planned yet.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026032#comment-17026032 ] Noble Paul commented on SOLR-14013: --- I can port it to 7.x, but no release is planned.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025919#comment-17025919 ] Florent Sithi commented on SOLR-14013: -- Do you plan to fix it in the 7.X series as well, or do we have to migrate to 8.4.0?
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995402#comment-16995402 ] ASF subversion and git services commented on SOLR-14013: Commit 422de99acf7cfab004e1e976c1ab47870dc6cfba in lucene-solr's branch refs/heads/branch_8_4 from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=422de99 ] SOLR-14013: javabin performance regressions
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16995400#comment-16995400 ] ASF subversion and git services commented on SOLR-14013: Commit 9717540b8ecb6f5e142aaef8e27464690684a0f9 in lucene-solr's branch refs/heads/branch_8x from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=9717540 ] SOLR-14013: javabin performance regressions
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994888#comment-16994888 ] Houston Putman commented on SOLR-14013: --- [~noble.paul], my speed tests weren't completely scientific, but I tried to make the scenarios as similar between the setups as possible. I think the main takeaways were that the queries were significantly faster (30 seconds -> .1 seconds). The smaller differences between the ingest speeds were less of a concern to me. I can redo the tests and try to make them more scientific & accurate if these numbers give you pause. I've reviewed the patch and run the test on the updated master, and everything looks good to me.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994651#comment-16994651 ] ASF subversion and git services commented on SOLR-14013: Commit 4d5df0e20ac3f2ac0a050241b3e124667ea1f812 in lucene-solr's branch refs/heads/gradle-master from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4d5df0e ] SOLR-14013: FIX: javabin performance regressions
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994650#comment-16994650 ] ASF subversion and git services commented on SOLR-14013: Commit b35f1debe33e69dcfb94d295324ca7fa85a6b5d7 in lucene-solr's branch refs/heads/gradle-master from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b35f1de ] SOLR-14013: javabin performance regressions
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994621#comment-16994621 ] Noble Paul commented on SOLR-14013: --- [~jpountz] It's not yet pushed to branch_8x. I'll do it once it is reviewed. This has to go in {{8.4}}
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994614#comment-16994614 ] ASF subversion and git services commented on SOLR-14013: Commit 4d5df0e20ac3f2ac0a050241b3e124667ea1f812 in lucene-solr's branch refs/heads/master from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=4d5df0e ] SOLR-14013: FIX: javabin performance regressions
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994610#comment-16994610 ]

Noble Paul commented on SOLR-14013:
---

I accidentally pushed the fix to master instead of pushing it to a branch and raising a PR.

[~ysee...@gmail.com] [~houston] please review.

The changes are:
# Perf optimizations are eliminated from HttpShardHandler & JavabinLoader
# The bug is fixed
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994603#comment-16994603 ] ASF subversion and git services commented on SOLR-14013: Commit b35f1debe33e69dcfb94d295324ca7fa85a6b5d7 in lucene-solr's branch refs/heads/master from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=b35f1de ] SOLR-14013: javabin performance regressions
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994367#comment-16994367 ] Noble Paul commented on SOLR-14013: --- I shall merge a fix soon. This fix is important
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994350#comment-16994350 ] Adrien Grand commented on SOLR-14013: - I'm deferring to you as to whether this patch is safe to get in so close to the release, but if you think it's better to get it in than not, then to me the question is about how long you think we need to get it merged. If it's a matter of one or two additional days it's fine. If it's weeks, I'll have a preference for targeting it for the next release.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994233#comment-16994233 ]

Ishan Chattopadhyaya commented on SOLR-14013:
-

bq. Probably too close of a call to get it into 8.4

This is a blocker for 8.4. If [~jpountz] feels we shouldn't wait for this one, then we can have an 8.4.1 with this.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993963#comment-16993963 ]

Noble Paul commented on SOLR-14013:
---

Thanks [~houston]

I didn't understand the difference between the two results below. The perf (with the patch) should be the same on both 8.x and master, correct?

{code:java}
patch (8.x)
Ingest - 1.2 seconds
Sharded Query - 0.4 seconds
Non-Distrib Javabin Query - 0.17 seconds
Non-Distrib JSON Query - 0.13 seconds

patch (master)
Ingest - .87 seconds
Sharded Query - .3 seconds
Non-Distrib Javabin Query - 0.06 seconds
Non-Distrib JSON Query - 0.08 seconds
{code}
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993844#comment-16993844 ]

Houston Putman commented on SOLR-14013:
---

I've created a patch that adds in [~noble.paul]'s test and fix (without the large file), and reverts the conversion for indexing and deserializing responses. I think I incorporated all of the places that you mentioned [~ysee...@gmail.com]. The only issue in the tests was in the langid contrib module, which was reverting a later-made fix.

Probably too close of a call to get it into 8.4. What do y'all think of backporting this to 7.7.x, since it is such a serious regression?

The solr/lucene tests pass on 8x and master. I've used [~rrockenbaugh]'s testing method mentioned above, for all branches that I could think would be relevant. The results are below:

*patch (8.x)*
Ingest - 1.2 seconds
Sharded Query - 0.4 seconds
Non-Distrib Javabin Query - 0.17 seconds
Non-Distrib JSON Query - 0.13 seconds

*patch (master)*
Ingest - .87 seconds
Sharded Query - .3 seconds
Non-Distrib Javabin Query - 0.06 seconds
Non-Distrib JSON Query - 0.08 seconds

*7.6*
Ingest - 1.6 seconds
Sharded Query - 0.3 seconds
Non-Distrib Javabin Query - 0.12 seconds
Non-Distrib JSON Query - 0.15 seconds

*7.x*
Ingest - 1.3 seconds
Sharded Query - 36 seconds
Non-Distrib Javabin Query - 30 seconds
Non-Distrib JSON Query - 0.3 seconds

*8.x*
Ingest - 1.18 seconds
Sharded Query - 21 seconds
Non-Distrib Javabin Query - 20 seconds
Non-Distrib JSON Query - 0.07 seconds

*master*
Ingest - 2.6 seconds
Sharded Query - 35 seconds
Non-Distrib Javabin Query - 35 seconds
Non-Distrib JSON Query - .16 seconds
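To put the timings above in perspective, here is a minimal sketch that turns a pair of reported timings into a slowdown factor. The class and method names are hypothetical and chosen only for illustration; the numbers are copied from the results in the comment, and the comparison pairs are just one reasonable choice.

```java
public class RegressionFactor {

    /** Returns how many times slower the regressed timing is than the baseline (both in seconds). */
    static double slowdown(double regressedSeconds, double baselineSeconds) {
        if (baselineSeconds <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return regressedSeconds / baselineSeconds;
    }

    public static void main(String[] args) {
        // Timings as reported in the comment: 7.x (regressed) against 7.6 (baseline).
        System.out.printf("Non-distrib javabin query, 7.x vs 7.6: %.0fx%n", slowdown(30, 0.12));
        System.out.printf("Sharded query, 7.x vs 7.6: %.0fx%n", slowdown(36, 0.3));
        // JSON responses barely moved, which points at javabin rather than the query itself.
        System.out.printf("Non-distrib JSON query, 7.x vs 7.6: %.0fx%n", slowdown(0.3, 0.15));
    }
}
```

With these inputs the javabin paths come out at roughly 250x and 120x slower, while JSON is only about 2x slower, which is consistent with the "orders of magnitude" wording in the issue description.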
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991041#comment-16991041 ]

Yonik Seeley commented on SOLR-14013:
-

Please don't commit that huge JSON file... a doc matching that can be created with a few lines of java in the test. I'm not sure the test belongs as a unit test anyway as it's more of a performance benchmark, but I don't care much either way as long as it's quick to run.

In general, what I think should be done is:
- the auto-convert changes should be removed (in SolrDocument, SolrInputField, MaskCharSeqSolrDocument)
- if there are parts of the code base that can't handle CharSequence, then disable reading Strings as CharSequence and look at whether those other pieces of code can be fixed to handle CharSequence.

> javabin performance regressions
> ---
>
> Key: SOLR-14013
> URL: https://issues.apache.org/jira/browse/SOLR-14013
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Affects Versions: 7.7
> Reporter: Yonik Seeley
> Assignee: Yonik Seeley
> Priority: Major
> Attachments: SOLR-14013.patch, TestQuerySpeed.java, test.json
>
> As noted by [~rrockenbaugh] in SOLR-13963, javabin also recently became orders of magnitude slower in certain cases since v7.7. The cases identified so far include large numbers of values in a field.
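The suggestion to generate the test document in code rather than committing a large JSON file could look roughly like the sketch below. It is only an illustration: it uses a plain `Map` as a stand-in for a Solr document so that it runs without SolrJ on the classpath, and the field name `many_values_ss`, the value pattern, and the count are all hypothetical rather than taken from the attached test.json.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ManyValuesDoc {

    /** Builds a document-like map whose single multi-valued field holds {@code count} short strings. */
    static Map<String, Object> build(int count) {
        List<String> values = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            values.add("value_" + i);
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("many_values_ss", values); // hypothetical multi-valued field name
        return doc;
    }

    public static void main(String[] args) {
        // The count is arbitrary; pick whatever is large enough to expose the slowdown.
        Map<String, Object> doc = build(100_000);
        List<?> values = (List<?>) doc.get("many_values_ss");
        System.out.println("values in field: " + values.size());
    }
}
```

In an actual Solr test the same loop would presumably add values to a `SolrInputDocument` instead of a `Map`; the point is only that a field with a large number of values takes a handful of lines to produce.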
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991031#comment-16991031 ]

Noble Paul commented on SOLR-14013:
-----------------------------------

I have submitted a bug fix for the perf degradation.

There are 3 places where the optimizations are done:
# Writing out responses
# Indexing
# Deserializing responses during inter-node communications

The changes are minimal for #1, while #2 and #3 are complex. I would recommend reverting #2 and #3 and letting #1 stay in place with the bug fix (and more auditing) I have just submitted.

I'll go with the decision of the community and do the necessary work.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991014#comment-16991014 ]

Yonik Seeley commented on SOLR-14013:
-------------------------------------

I worked up a quick-n-dirty patch to disable the charseq optimization stuff to test my hypothesis on slower indexing speed:
{code}
git diff
diff --git a/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandler.java b/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandler.java
index 69da3948fe9..620fffb1303 100644
--- a/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandler.java
+++ b/solr/core/src/java/org/apache/solr/handler/component/HttpShardHandler.java
@@ -146,7 +146,7 @@ public class HttpShardHandler extends ShardHandler {
   private static final BinaryResponseParser READ_STR_AS_CHARSEQ_PARSER = new BinaryResponseParser() {
     @Override
     protected JavaBinCodec createCodec() {
-      return new JavaBinCodec(null, stringCache).setReadStringAsCharSeq(true);
+      return new JavaBinCodec(null, stringCache).setReadStringAsCharSeq(false);
     }
   };
 
diff --git a/solr/core/src/java/org/apache/solr/response/DocsStreamer.java b/solr/core/src/java/org/apache/solr/response/DocsStreamer.java
index 3d1976e143c..056dc08d963 100644
--- a/solr/core/src/java/org/apache/solr/response/DocsStreamer.java
+++ b/solr/core/src/java/org/apache/solr/response/DocsStreamer.java
@@ -148,9 +148,7 @@ public class DocsStreamer implements Iterator {
     // because that doesn't include extra fields needed by transformers
     final Set fieldNamesNeeded = fields.getLuceneFieldNames();
 
-    final SolrDocument out = ResultContext.READASBYTES.get() == null ?
-        new SolrDocument() :
-        new BinaryResponseWriter.MaskCharSeqSolrDocument();
+    final SolrDocument out = new SolrDocument();
 
     // NOTE: it would be tempting to try and optimize this to loop over fieldNamesNeeded
     // when it's smaller then the IndexableField[] in the Document -- but that's actually *less* effecient
diff --git a/solr/solrj/src/java/org/apache/solr/common/util/ByteArrayUtf8CharSequence.java b/solr/solrj/src/java/org/apache/solr/common/util/ByteArrayUtf8CharSequence.java
index 7a4abe2c303..53cfbee320f 100644
--- a/solr/solrj/src/java/org/apache/solr/common/util/ByteArrayUtf8CharSequence.java
+++ b/solr/solrj/src/java/org/apache/solr/common/util/ByteArrayUtf8CharSequence.java
@@ -209,8 +209,11 @@ public class ByteArrayUtf8CharSequence implements Utf8CharSequence {
     }
     return vals;
   }
-
   public static Object convertCharSeq(Object o) {
+    return o;  // nocommit
+  }
+
+  public static Object _convertCharSeq(Object o) {
     if (o == null) return null;
     if (o instanceof Utf8CharSequence) return ((Utf8CharSequence) o).toString();
     if (o instanceof Collection) return convertCharSeq((Collection) o);
{code}

I also hacked up the unit test I used to find the N^2 issue... it's obviously not good for benchmarking (being a unit test, etc), but good enough to detect anything major. I tested with a single value per string field (and many fields per doc).. it would be worse for multiple values per field.

Results:

= master, single valued string fields
[junit4] 2> INDEX TIME=10293
[junit4] 2> QUERY TIME=891 xml
[junit4] 2> QUERY TIME=415 javabin
[junit4] 2> QUERY TIME=600 json
[junit4] 2> INDEX TIME=10313
[junit4] 2> QUERY TIME=872 xml
[junit4] 2> QUERY TIME=389 javabin
[junit4] 2> QUERY TIME=579 json
[junit4] 2> INDEX TIME=10307
[junit4] 2> QUERY TIME=858 xml
[junit4] 2> QUERY TIME=410 javabin
[junit4] 2> QUERY TIME=570 json
[junit4] 2> INDEX TIME=10318
[junit4] 2> QUERY TIME=915 xml
[junit4] 2> QUERY TIME=382 javabin
[junit4] 2> QUERY TIME=600 json
[junit4] 2> INDEX TIME=10579
[junit4] 2> QUERY TIME=843 xml
[junit4] 2> QUERY TIME=386 javabin
[junit4] 2> QUERY TIME=570 json

= patch disabling charseq stuff, single valued string fields
[junit4] 2> INDEX TIME=8547
[junit4] 2> QUERY TIME=881 xml
[junit4] 2> QUERY TIME=396 javabin
[junit4] 2> QUERY TIME=576 json
[junit4] 2> INDEX TIME=9428
[junit4] 2> QUERY TIME=821 xml
[junit4] 2> QUERY TIME=374 javabin
[junit4] 2> QUERY TIME=543 json
[junit4] 2> INDEX TIME=9181
[junit4] 2> QUERY TIME=812 xml
[junit4] 2> QUERY TIME=382 javabin
[junit4] 2> QUERY TIME=533 json
[junit4] 2> INDEX TIME=9455
[junit4] 2> QUERY TIME=863 xml
[junit4] 2> QUERY TIME=395 javabin
[junit4] 2> QUERY TIME=613 json
[junit4] 2> INDEX TIME=9530
[junit4] 2> QUERY TIME=863 xml
[junit4] 2> QUERY TIME=385 javabin
[junit4] 2> QUERY TIME=559 json

So the charseq stuff (or rather probably the extra work to auto-convert-to-string) did cause slower indexing speed. There is enough noise that I don't think one can draw any conclusions about query speed.
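
As a rough sanity check on the INDEX TIME figures quoted above, the following sketch averages the five runs in each group. It is only arithmetic on the posted numbers: the class name `IndexTimeSummary` is made up for illustration, and the unit (presumably milliseconds) is an assumption, since the log output does not state it.

```java
import java.util.Arrays;

// Illustrative only: averages the INDEX TIME values quoted in the comment above.
// The unit is not stated in the log output; milliseconds is an assumption.
final class IndexTimeSummary {
    static final long[] MASTER = {10293, 10313, 10307, 10318, 10579};
    static final long[] PATCHED = {8547, 9428, 9181, 9455, 9530};

    // Arithmetic mean of the samples (NaN for an empty array).
    static double mean(long[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    // Relative reduction of the "after" mean versus the "before" mean, in percent.
    static double reductionPercent(long[] before, long[] after) {
        return 100.0 * (mean(before) - mean(after)) / mean(before);
    }

    public static void main(String[] args) {
        System.out.println("master mean  = " + mean(MASTER));
        System.out.println("patched mean = " + mean(PATCHED));
        System.out.println("reduction %  = " + reductionPercent(MASTER, PATCHED));
    }
}
```

On these samples the means come to 10362.0 (master) and 9228.2 (patched), a reduction of about 11% in indexing time. With only five runs per group this is indicative rather than conclusive.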
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990926#comment-16990926 ]

Yonik Seeley commented on SOLR-14013:
-------------------------------------

Those benchmarks look like they are testing different settings, not a before-vs-after patch scenario?
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990919#comment-16990919 ]

Ishan Chattopadhyaya commented on SOLR-14013:
---------------------------------------------

bq. Perf improvements should be backed by benchmarks.

FYI, https://issues.apache.org/jira/browse/SOLR-12885?focusedCommentId=16709641&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16709641
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990868#comment-16990868 ]

Adrien Grand commented on SOLR-14013:
-------------------------------------

I'm also a bit disappointed that SOLR-12885 changed Field's constructor from String to CharSequence silently.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990802#comment-16990802 ]

Jan Høydahl commented on SOLR-14013:
------------------------------------

+1 Perf improvements should be backed by benchmarks.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990752#comment-16990752 ]

Ishan Chattopadhyaya commented on SOLR-14013:
---------------------------------------------

bq. At this point I think the best thing to do is roll it back.

+1
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990579#comment-16990579 ]

Yonik Seeley commented on SOLR-14013:
-------------------------------------

Even without the O(N^2) bug, which wouldn't be that hard to work around, this auto-check-and-convert on access is quite a trap (as seen above) that would be constantly biting devs forever. It's also almost assuredly the case that after just handling the N^2 bug, things will be slower overall (often with more memory usage) than before this attempt to save utf-8 conversion.

At this point I think the best thing to do is roll it back.

I support the idea of trying to use more CharSequence... but it's hard in practice and we need to be careful. The original fault lies with Java of course, which introduced CharSequence long after String, and was never fully converted/adopted ;-)

In the future, we should certainly benchmark any changes that are meant to improve performance.
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990572#comment-16990572 ]

Yonik Seeley commented on SOLR-14013:
-------------------------------------

OK, found the primary issue: it's an N^2 bug.

h3. Background:
SOLR-12885 changed SolrDocument (among other things) so that, when used through certain interfaces, gets would auto-check-and-convert keys & values of type Utf8CharSequence to String. This happened in methods getFieldValueMap.get(), getFieldValueMap.keySet(), remove() and getValues().

h3. Issues:
SolrDocument does not auto-convert through many of its other methods, such as .get(), .put(), .keySet(), so depending on this anywhere is extremely fragile and will break if you change how you access SolrDocument.

SolrInputField has some of the same issues as SolrDocument: the mere act of doing a .get() on a multi-valued field (which should be O(1)) scans the entire list for CharSequence and, if it finds one, creates a new list and iterates over the whole thing again to convert each element.

h3. Client side indexing:
And it's worse, because it looks like this auto-check-and-convert logic is even triggered when SolrJ is using JavaBinCodec to send documents... so even if some field values were Utf8CharSequence to begin with, they would still be converted to String before being converted back to utf8 by JavaBinCodec!

h3. Server side indexing:
Then on the server side, JavaBinCodec parses String values as Utf8CharSequence, and we start going through the update processor chain.

FieldMutatingUpdateProcessor (used in our _default config to remove blank values) asks each SolrInputField for its value, which again triggers iteration over the complete list. Also, for *any* string values (single valued too), FieldMutatingUpdateProcessor replaces those Utf8CharSequence objects with String objects (destroying any attempted re-serializing optimization).

Then comes NestedUpdateProcessorFactory, which triggers the auto-check-and-convert *twice*, because getValue() returned a pointer previously, which would have been optimized away. Both lines below iterate over all values, *before* the actual iteration by the explicit "for" loop:
{code:java}
boolean isSingleVal = !(field.getValue() instanceof Collection);
for (Object val : field) {
{code}
Then isAtomicUpdate(), and then finally writeSolrInputDocument() to convert to JavaBin for the transaction log, both trigger the extra iterate-over-all-values with each inspection.

If FieldMutatingUpdateProcessor hadn't overwritten Utf8CharSequence already, all of these accesses would have also triggered a new collection creation each time (and an additional iteration to create the new collection) for every multi-valued string field.

h3. Server side query:
On the query side, we get a Lucene Document, and then convert it into a SolrDocument. BinaryResponseWriter uses MaskCharSeqSolrDocument, which inherits from SolrDocument, to do the auto-convert-on-access stuff more thoroughly.
{code:java}
for (IndexableField f : doc.getFields()) {
  final String fname = f.name();
  if (null == fieldNamesNeeded || fieldNamesNeeded.contains(fname) ) {
    // Make sure multivalued fields are represented as lists
    Object existing = out.get(fname);
{code}
For multi-valued fields, what we get back from Lucene is actually a flat list of all the values in the whole document. We need to collect all values with the same field into a list. So if there are 1000 values in a field, the outer loop executes over 1000 times. Then in the inner loop we retrieve any existing value for the field by calling "out.get(fname)", which triggers the auto-convert-on-access, which scans all the values so far (on average 1000/2), and hence we have the O(N^2) behavior that the original poster reported.

h3. Other:
It took a really long time to review some of this code (and I've only reviewed some), often because of a lack of comments around non-obvious things. I thought there might be lifetime/sharing bugs with BytesBlock for example, until I realized that strings are appended in the block rather than placed at the start. Same issue for FastInputStream.readDirectUtf8... since it looked like it was sharing the internal buffer, I thought there was a possible lifetime issue there. A single line comment in both of those cases would have saved me quite a bit.

Actually... looking at it again, there still may be a subtle sharing bug in this new FastInputStream.readDirectUtf8. I can't say I quite understand the logic for when you can't share the internal buffer.
{code:java}
if (in !=null || end < pos + len) return false;
{code}
You can only share the buffer when the bytes you want are right up at the end of the buffer? I'm not sure I understand the logic around that, but ChannelFastInputStream (used by TransactionLog) derives from
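
The quadratic cost described in the "Server side query" part of the comment above can be reproduced in isolation. The sketch below is not Solr code: `ScanOnGetField` is a hypothetical stand-in for a multi-valued field whose accessor walks every stored value on each call, which is the assumed shape of the auto-check-and-convert step. It only counts how many elements get scanned while values are collected one at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not Solr code: a multi-valued field whose get()
// walks every stored value (as a check-and-convert step would) before returning.
final class ScanOnGetField {
    private final List<Object> values = new ArrayList<>();
    long elementsScanned = 0; // total work done by all scans so far

    // Mimics "auto-check-and-convert on access": O(size) per call instead of O(1).
    List<Object> get() {
        for (Object ignored : values) {
            elementsScanned++;
        }
        return values;
    }

    void add(Object v) {
        values.add(v);
    }

    // Collects n values the way the loop quoted above does:
    // look up the existing values first, then append one more.
    static long scansForCollecting(int n) {
        ScanOnGetField field = new ScanOnGetField();
        for (int i = 0; i < n; i++) {
            field.get();          // scans the i values collected so far
            field.add("v" + i);
        }
        return field.elementsScanned;
    }

    public static void main(String[] args) {
        System.out.println(scansForCollecting(1000)); // prints 499500
    }
}
```

For n = 1000 the counter reaches 499500, i.e. n(n-1)/2, which matches the "on average 1000/2 per access" estimate; doubling n roughly quadruples the work.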
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16990124#comment-16990124 ]

Noble Paul commented on SOLR-14013:
-----------------------------------

[~ysee...@gmail.com] You can just avoid the call to {{JavaBinCodec#setReadStringAsCharSeq()}} and get the old behavior
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989860#comment-16989860 ]

Yonik Seeley commented on SOLR-14013:
-------------------------------------

Just an update... I tried speeding things up by skipping most of the above, and it did get faster, but it's still much slower. Still digging...
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16989121#comment-16989121 ]

Yonik Seeley commented on SOLR-14013:
-------------------------------------

I did some digging... the current code is certainly more complex than it used to be.

So for a multi-valued field, the values internally are now IndexableField (for each value). For each of those values:
- we call JavaBinCodec.writeVal, which tries writeKnownTypes
- which tests if it's a member of ~10 primitive types via instanceof, failing
- then tests against 19 other types, failing
- falls back to the object resolver, which tries against 4 other types, finally matching it up to IndexableField
- the schema is used to look up the SchemaField based on name (2 more hash lookups)
- then we call DocsStreamer.getValue(), which does another hash lookup based on the .getClass()
- and then calls FieldType.toObject(), which calls toExternal(), which finally calls stringValue() on the IndexableField
- and now that we have our object, JavaBinCodec can try writeKnownTypes() again

And this is now the common case!
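
To make the cost of that fall-through concrete, here is a deliberately tiny model of the pattern; it is a sketch, not JavaBinCodec. `FallThroughDispatch`, `Wrapped`, and the five-entry type list are all hypothetical (the comment above mentions about 29 tests in the fast path and 4 more in the resolver). It counts the type tests a value incurs when it arrives directly versus wrapped in a type the fast path does not recognise.

```java
import java.util.List;

// Hypothetical sketch, not JavaBinCodec: counts type tests for a value that is
// written directly versus one wrapped in a type unknown to the fast path.
final class FallThroughDispatch {
    // Stand-in for a stored-field wrapper around the real value.
    static final class Wrapped {
        final Object inner;
        Wrapped(Object inner) { this.inner = inner; }
    }

    // A deliberately short list of "known" types, checked in order.
    private static final List<Class<?>> KNOWN =
        List.of(Integer.class, Long.class, Double.class, Boolean.class, String.class);

    static int typeTests = 0;

    // Fast path: returns true as soon as the value matches a known type.
    static boolean writeKnownType(Object o) {
        for (Class<?> c : KNOWN) {
            typeTests++;
            if (c.isInstance(o)) return true;
        }
        return false;
    }

    static void writeVal(Object o) {
        if (writeKnownType(o)) return;   // every known-type test ran and failed if we continue
        typeTests++;                     // one more test in the fallback "resolver"
        if (o instanceof Wrapped) {
            writeVal(((Wrapped) o).inner); // unwrap, then run the fast path again
            return;
        }
        throw new IllegalArgumentException("unsupported value: " + o);
    }

    // Resets the counter, writes one value, and reports the tests performed.
    static int testsFor(Object o) {
        typeTests = 0;
        writeVal(o);
        return typeTests;
    }

    public static void main(String[] args) {
        System.out.println(testsFor("abc"));              // prints 5
        System.out.println(testsFor(new Wrapped("abc"))); // prints 11
    }
}
```

In this toy setup the wrapped string costs 11 tests against 5 for the bare string: a full failed pass, one resolver test, then the successful pass. The real chain also adds the hash lookups listed above, which this model leaves out.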
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988961#comment-16988961 ]

Ryan Rockenbaugh commented on SOLR-14013:
-----------------------------------------

Here are my initial steps to reproduce:

extract solr 7.6.0

start solr:
{noformat}
bin/solr start -e cloud -noprompt
{noformat}
index record:
{noformat}
curl -X POST "http://localhost:8983/solr/gettingstarted/update/json/docs?commit=true" -T test.json
{noformat}
query record:
{noformat}
curl "http://localhost:8983/solr/gettingstarted/select?q=id:1"
{noformat}
Results are returned in milliseconds (10-20 ms for me)

Then do the same for solr 7.7.2:

extract solr 7.7.2

start solr:
{noformat}
bin/solr start -e cloud -noprompt
{noformat}
index record:
{noformat}
curl -X POST "http://localhost:8983/solr/gettingstarted/update/json/docs?commit=true" -T test.json
{noformat}
query record:
{noformat}
curl "http://localhost:8983/solr/gettingstarted/select?q=id:1"
{noformat}
Results are returned in seconds (20-30 seconds for me)

I had the same behavior in 8.0.0, 8.1.0, 8.2.0, 8.3.0

Note: If I query a specific shard and set distrib=false, javabin format is not used, and return times are milliseconds.
{noformat}
curl "http://localhost:8983/solr/gettingstarted_shard1_replica_n1/select?q=id:1&distrib=false"
{noformat}
If I add wt=javabin:
{noformat}
curl "http://localhost:8983/solr/gettingstarted_shard1_replica_n1/select?q=id:1&distrib=false&wt=javabin"
{noformat}
Results are returned in seconds (20-30 seconds for me)
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988249#comment-16988249 ]

Noble Paul commented on SOLR-14013:
-----------------------------------

[~rrockenbaugh] can you add relevant details here as well please
[jira] [Commented] (SOLR-14013) javabin performance regressions
[ https://issues.apache.org/jira/browse/SOLR-14013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987986#comment-16987986 ]

Yonik Seeley commented on SOLR-14013:
-------------------------------------

The original reporter suspects that this may be caused by SOLR-12983, so I'll link it as such for now.