For some reason I guess this didn't go through, and that caused all the confusion.

||Seg size||Query||Tot hits||Sort||Top N||QPS old||QPS new||Pct change||
|log|<all>|1000000|rand string|10|91.76|108.63|{color:green}18.4%{color}|
|log|<all>|1000000|rand string|25|92.39|106.79|{color:green}15.6%{color}|
|log|<all>|1000000|rand string|50|91.30|104.02|{color:green}13.9%{color}|
|log|<all>|1000000|rand string|500|86.16|63.27|{color:red}-26.6%{color}|
|log|<all>|1000000|rand string|1000|76.92|64.85|{color:red}-15.7%{color}|
|log|<all>|1000000|country|10|92.42|108.78|{color:green}17.7%{color}|
|log|<all>|1000000|country|25|92.60|106.26|{color:green}14.8%{color}|
|log|<all>|1000000|country|50|92.64|103.76|{color:green}12.0%{color}|
|log|<all>|1000000|country|500|83.92|50.30|{color:red}-40.1%{color}|
|log|<all>|1000000|country|1000|74.78|46.59|{color:red}-37.7%{color}|
|log|<all>|1000000|rand int|10|114.03|114.85|{color:green}0.7%{color}|
|log|<all>|1000000|rand int|25|113.77|112.92|{color:red}-0.7%{color}|
|log|<all>|1000000|rand int|50|113.36|109.56|{color:red}-3.4%{color}|
|log|<all>|1000000|rand int|500|103.90|66.29|{color:red}-36.2%{color}|
|log|<all>|1000000|rand int|1000|89.52|70.67|{color:red}-21.1%{color}|

On Thu, Oct 22, 2009 at 7:43 PM, John Wang <john.w...@gmail.com> wrote:

> Mike:
> I did just post with what I saw, feel free to read and comment on
> it.
>
> I am simply trying to work with Michael on this and trying to
> understand the code.
>
> As I have expressed previously, I have seen a difference between 1.5
> and 1.6 that is significant. Since Mike has posted some numbers on jdk 1.6,
> I was hoping to eliminate all variables relating to the index and
> environment and see if he sees the same thing.
>
> I guess I should be more clear in the email.
>
> -John
>
> On Thu, Oct 22, 2009 at 7:39 PM, Mark Miller <markrmil...@gmail.com> wrote:
>
>> I am patient :) And I'm not speaking for Mike, I'm speaking for me. I'm
>> wondering what you're seeing. Asking Mike to rerun the tests without
>> giving any further info (you didn't even say that you're seeing something
>> different) is unfair to the rest of us ;)
>>
>> Giving 0 info along with your request just makes 0 sense to me and I
>> said as much.
>>
>> John Wang wrote:
>> > Mark:
>> >
>> > Please be patient with me. I am seeing a difference and was
>> > wondering if Mike would see the same thing. I thought Michael would be
>> > willing to because he expressed interest in understanding what the
>> > performance discrepancies are.
>> >
>> > Again, it is only a request. It is perfectly fine if Michael
>> > refuses to. But it would be great if Michael speaks for himself.
>> >
>> > Thanks
>> >
>> > -John
>> >
>> > On Thu, Oct 22, 2009 at 7:29 PM, Mark Miller <markrmil...@gmail.com> wrote:
>> >
>> > Why? What might he find? What's with the cryptic request?
>> >
>> > Why would Java 1.5 perform better than 1.6? It erases 20 and 40%
>> > gains?
>> >
>> > I know point 2 certainly doesn't. Cards on the table?
>> >
>> > John Wang wrote:
>> > > Hey Michael:
>> > >
>> > > Would you mind rerunning the test you have with jdk1.5?
>> > >
>> > > Also, if you would, change the comparator method to avoid
>> > > branching for int and string comparators, e.g.
>> > >
>> > >
>> > > return index.order[i.doc] - index.order[j.doc];
>> > >
>> > >
>> > > Thanks
>> > >
>> > >
>> > > -John
>> > >
>> > >
>> > > On Thu, Oct 22, 2009 at 2:38 AM, Michael McCandless
>> > > <luc...@mikemccandless.com> wrote:
>> > >
>> > > On Thu, Oct 22, 2009 at 2:17 AM, John Wang <john.w...@gmail.com> wrote:
>> > >
>> > > > I have been playing with the patch, and I think I have some
>> > > > information that you might like.
>> > > > Let me spend some time and gather some more numbers and
>> > > > update in jira.
>> > >
>> > > Excellent!
>> > >
>> > > > say bottom has ords 23, 45, 76, each corresponding to a
>> > > > string. When moving to the next segment, you need to make
>> > > > bottom to have ords that can be comparable to other docs in
>> > > > this new segment, so you would need to find the new ords for
>> > > > the values in 23,45 and 76, don't you? To find it, assuming
>> > > > the values are s1,s2,s3, you would do a bin. search on the
>> > > > new val array, and find index for s1,s2,s3.
>> > >
>> > > It's that inversion (from ord->Comparable in first seg, and
>> > > Comparable->ord in second seg) that I'm trying to avoid (w/ this new
>> > > proposal).
>> > >
>> > > > Which is 3 bin searches per convert, I am not sure
>> > > > how you can short circuit it. Are you suggesting we call
>> > > > Comparable on compareBottom until some doc beats it?
>> > >
>> > > I'm saying on seg transition you indeed get the Comparable for current
>> > > bottom, but, don't attempt to invert it.
>> > > Instead, as seg 2 finds a
>> > > hit, you get that hit's Comparables and compare to bottom. If it
>> > > beats bottom, it goes into the queue. If it does not, you use the ord
>> > > (in seg 2's ord space) to "learn" a bottom in the ord space of seg 2.
>> > >
>> > > > That would hurt performance a lot though, no?
>> > >
>> > > Yeah I think likely it would, since we're talking about a binary
>> > > search on transition VS having to do possibly many
>> > > upgrade-to-Comparable and compare-Comparables to slowly learn the
>> > > equivalent ord in the new segment. I was proposing it for cases where
>> > > inversion is very difficult. But realistically, since you must keep
>> > > around the full ord -> Comparable for every segment anyway (in order to
>> > > merge in the end), inversion shouldn't ever actually be "difficult" --
>> > > it'd just be a binary search on presumably in-RAM storage.
>> > >
>> > > Mike
>> > >
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
>> > > For additional commands, e-mail: java-dev-h...@lucene.apache.org
>> > >
>> >
>> > --
>> > - Mark
>> >
>> > http://www.lucidimagination.com
>> >
>>
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
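
For anyone following the quoted thread, here is a rough sketch of the ord inversion on segment transition (ord -> value in the old segment, then value -> ord in the new one via binary search). It is only an illustration under my own assumptions, not the patch's code: `BottomOrdRemap`, `remap`, and the sample arrays are hypothetical names, and each segment's values are assumed to be a sorted `String[]` indexed by ord.

```java
import java.util.Arrays;

// Hypothetical illustration; none of these names come from Lucene or the patch.
final class BottomOrdRemap {

    // Converts each queue entry's ord from the old segment's ord space into the
    // new segment's ord space, using one binary search per entry.
    static int[] remap(int[] queueOrds, String[] oldValues, String[] newValues) {
        int[] remapped = new int[queueOrds.length];
        for (int k = 0; k < queueOrds.length; k++) {
            // ord -> value in the old segment
            String value = oldValues[queueOrds[k]];
            // value -> ord in the new segment; a negative result encodes the
            // insertion point as -(insertionPoint) - 1 when the value is absent
            remapped[k] = Arrays.binarySearch(newValues, value);
        }
        return remapped;
    }

    public static void main(String[] args) {
        String[] seg1 = {"apple", "banana", "cherry", "date"}; // assumed sorted
        String[] seg2 = {"banana", "cherry", "fig"};           // assumed sorted
        int[] queueOrds = {1, 3};                              // "banana", "date"
        System.out.println(Arrays.toString(remap(queueOrds, seg1, seg2)));
    }
}
```

A negative result means the value does not occur in the new segment; the encoded insertion point still says where that entry falls relative to the new segment's ords.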
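
On the comparator change requested in the quoted thread, here is a small sketch of the branching and subtraction forms side by side. `OrdCompare` and its method names are hypothetical; only the `order[...] - order[...]` shape comes from the thread. One caveat I am sure of: the subtraction form is only safe when both operands are non-negative (as ords are), because subtracting arbitrary int values can overflow and flip the sign.

```java
// Hypothetical illustration; the class and method names are not from Lucene.
final class OrdCompare {

    // Branching form: explicit comparisons, safe for any int values.
    static int compareBranching(int[] order, int docI, int docJ) {
        int a = order[docI];
        int b = order[docJ];
        if (a < b) {
            return -1;
        }
        if (a > b) {
            return 1;
        }
        return 0;
    }

    // Subtraction form: no branches, but correct only when the difference
    // cannot overflow, e.g. when both values are non-negative ords.
    static int compareSubtract(int[] order, int docI, int docJ) {
        return order[docI] - order[docJ];
    }

    public static void main(String[] args) {
        int[] order = {23, 45, 76, 45}; // assumed per-document ords
        System.out.println(Integer.signum(compareSubtract(order, 0, 1)));
        System.out.println(compareBranching(order, 0, 1));

        // Overflow case with raw int values rather than ords:
        int[] raw = {Integer.MAX_VALUE, -1};
        System.out.println(compareSubtract(raw, 0, 1) < 0); // sign is wrong here
    }
}
```

For non-negative ords the two forms always agree in sign, so the choice between them is purely a performance question; for raw int sort values the branching form (or an equivalent overflow-safe comparison) is the safer one.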