That's it! I hand-edited the file that says you are not supposed to edit it and removed that copyField. Indexing performance is now back to expected levels.
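For anyone who would rather script that edit than do it by hand, here is a minimal sketch in Python. It is not taken from the thread: the sample schema, the function name, and the exact `source="*"` / `dest="_text"` attributes are assumptions based on Yonik's description of the catch-all copyField, so check them against your own schema file first.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for the schemaless schema file.
# The real file has many more fields; only the shape matters here.
SAMPLE_SCHEMA = """<schema name="example" version="1.5">
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="_text" type="text_general" indexed="true" stored="false" multiValued="true"/>
  <copyField source="*" dest="_text"/>
  <copyField source="title" dest="title_str"/>
</schema>"""


def remove_catch_all_copy_fields(schema_xml, dest="_text"):
    """Drop every copyField that copies source="*" into `dest`.

    Returns a (new_xml, removed_count) tuple. Other copyField rules
    and all field definitions are left untouched.
    """
    root = ET.fromstring(schema_xml)
    removed = 0
    # Assumption: copyField elements are direct children of <schema>.
    for node in list(root.findall("copyField")):
        if node.get("source") == "*" and node.get("dest") == dest:
            root.remove(node)
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed


if __name__ == "__main__":
    new_xml, removed = remove_catch_all_copy_fields(SAMPLE_SCHEMA)
    print("removed", removed, "copyField rule(s)")
    print(new_xml)
```

Note that ElementTree discards XML comments when parsing, so the "do not edit" banner and any other comments would be lost if you wrote the result back over the original file. Keep a backup, and expect to reindex before the smaller index shows up.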
I created an issue for this, https://issues.apache.org/jira/browse/SOLR-7284

--Mike

On Sun, Mar 22, 2015 at 3:29 PM, Yonik Seeley <ysee...@gmail.com> wrote:
> I took a quick look at the stock schemaless configs... unfortunately
> they contain a performance trap.
> There's a copyField by default that copies *all* fields to a catch-all
> field called "_text".
>
> IMO, that's not a great default. Double the index size (well, the
> "index" portion of it at least... not stored fields), and slower
> indexing performance.
>
> The other unfortunate thing is the name. Nowhere else in solr (that
> I know of) do we have a single underscore field name. _text looks
> more like a dynamicField pattern. Our other fields with underscores
> look like _version_ and _root_. If we're going to start a new naming
> convention (or expand the naming conventions) we need to have some
> consistency and logic behind it.
>
> -Yonik
>
> On Sun, Mar 22, 2015 at 12:32 PM, Mike Murphy <mmurphy3...@gmail.com> wrote:
>> I start up solr schemaless and index a bunch of data, and it takes a
>> lot longer to finish indexing.
>> No configuration changes, just straight schemaless.
>>
>> --Mike
>>
>> On Sun, Mar 22, 2015 at 12:27 PM, Erick Erickson
>> <erickerick...@gmail.com> wrote:
>>> Please review: http://wiki.apache.org/solr/UsingMailingLists
>>>
>>> You haven't quantified the slowdown. Or given any details on how
>>> you're measuring the "slowdown". Or how you've configured your setups
>>> in 4.10 and 5.0. Or... As Hossman would say "details matter".
>>>
>>> Best,
>>> Erick
>>>
>>> On Sun, Mar 22, 2015 at 8:35 AM, Mike Murphy <mmurphy3...@gmail.com> wrote:
>>>> I'm trying out schemaless in solr 5.0, but the indexing seems quite a
>>>> bit slower than it did in the past on 4.10. Any pointers?
>>>>
>>>> --Mike