You won't see indexing improvements there because the dataset in question is Wikipedia, and the benchmark is mostly indexing full text. I think it may have one measly numeric field.

On Thu, Apr 14, 2016 at 6:25 PM, Otis Gospodnetić <[email protected]> wrote:

> (replying to my original email because I didn't get people's replies, even
> though I see in the archives people replied)
>
> Re BJ and beast2 upgrade. Yeah, I saw that, but....
> * if there is no indexing throughput improvement after that, does that mean
> that those particular indexing tests happen to be disk bound and not CPU
> bound? (I'm assuming beast2 has more cores than the previous hardware....
> oh, I see, 72 cores vs. only 20 indexing threads)
> * the metrics for GC times are sums across all CPUs, not averages per CPU?
> Would the latter be more useful?
>
> What I was fishing for was something in that indexing chart that would show
> me this little nugget:
>
> *Lucene 6 brings a major new feature called Dimensional Points: a new
> tree-based data structure which will be used for numeric, date, and
> geospatial fields. Compared to the existing field format, this new
> structure uses half the disk space, is twice as fast to index, and
> increases search performance by 25%.*
>
> How come the charts on
> http://home.apache.org/~mikemccand/lucenebench/indexing.html don't show the
> 2x faster indexing and various query performance charts don't show 25%
> improvement in search performance?
>
> Thanks,
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Thu, Apr 14, 2016 at 1:13 PM, Otis Gospodnetić <
> [email protected]> wrote:
>
>> Hi,
>>
>> I was looking at Mike's
>> http://home.apache.org/~mikemccand/lucenebench/indexing.html secretly
>> hoping to spot some recent improvements in indexing throughput.... but
>> instead it looks like:
>>
>> * indexing throughput hasn't really gone up in the last ~5 years
>> * indexing was faster in 2014, but then dropped to pre-2014 speed in early
>> 2015
>> * indexing rate dropped some more in early 2016, and that seems to roughly
>> correlate to a *big* jump in Young GC in late 2015
>>
>> Does anyone know what happened in late 2015 that causes that big Young GC
>> jump?
>> Or does that big jump just look scary in that chart, but is not actually a
>> big concern in practice?
>>
>> Thanks,
>> Otis
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
