If the response time from each shard shows decent figures, then the
aggregator seems to be the bottleneck. Btw, do you have a lot of
concurrent users?
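One quick way to confirm where the time goes is to fire the same query
at a single shard and at the aggregator and compare QTime. A minimal
SolrJ sketch (untested; I'm assuming SolrJ 3.x to match your Solr 3.4,
and the aggregator URL is a placeholder for your "common" application):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class QTimeCompare {
    public static void main(String[] args) throws Exception {
        // One real shard from your list, plus the aggregator
        // (placeholder host -- replace with your "common" server).
        String[] urls = {
            "http://192.168.1.85:8080/solr1",
            "http://AGGREGATOR_HOST:8080/solr"
        };
        for (String url : urls) {
            CommonsHttpSolrServer server = new CommonsHttpSolrServer(url);
            SolrQuery q = new SolrQuery("barium"); // a slow query from your log
            q.setRows(2000);
            q.set("fl", "*,score");
            QueryResponse rsp = server.query(q);
            System.out.println(url + " -> QTime=" + rsp.getQTime()
                    + " ms, wall clock=" + rsp.getElapsedTime() + " ms");
        }
    }
}

If the shard numbers stay in the 200-1500 ms range you mentioned while
the aggregator call takes tens of seconds, the merge step is where to look.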
On Wed, Nov 23, 2011 at 4:38 PM, Artem Lokotosh <arco...@gmail.com> wrote:
> > Is this log from the frontend SOLR (aggregator) or from a shard?
> from aggregator
>
> > Can you merge, e.g. 3 shards together or is it much effort for your team?
> Yes, we can merge. We'll try to do this and see how it works.
> Thanks, Dmitry
>
> Any other ideas?
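If you do try merging, one low-level option is Lucene's
IndexWriter.addIndexes, run offline against copies of the shard index
directories. A rough sketch (the paths are made up, not your actual
layout; I'm assuming Lucene 3.4 to match your Solr version):

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class MergeThreeShards {
    public static void main(String[] args) throws Exception {
        // Target must be a fresh, empty directory; source indexes must
        // not be open for writing while the merge runs.
        Directory target = FSDirectory.open(new File("/opt/search/merged/index"));
        IndexWriter writer = new IndexWriter(target,
                new IndexWriterConfig(Version.LUCENE_34,
                        new StandardAnalyzer(Version.LUCENE_34)));
        writer.addIndexes(
                FSDirectory.open(new File("/opt/search/shard1/index")),
                FSDirectory.open(new File("/opt/search/shard2/index")),
                FSDirectory.open(new File("/opt/search/shard3/index")));
        writer.optimize(2); // you already optimize shards to 2 segments
        writer.close();
        target.close();
    }
}

Solr's CoreAdmin mergeindexes action does roughly the same thing without
custom code, if you prefer that route.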
> On Wed, Nov 23, 2011 at 4:01 PM, Dmitry Kan <dmitry....@gmail.com> wrote:
> > Hello,
> >
> > Is this log from the frontend SOLR (aggregator) or from a shard?
> > Can you merge, e.g. 3 shards together or is it much effort for your team?
> >
> > In our setup we currently have 16 shards with ~30GB each, but we rarely
> > search in all of them at once.
> >
> > Best,
> > Dmitry
> >
> > On Wed, Nov 23, 2011 at 3:12 PM, Artem Lokotosh <arco...@gmail.com> wrote:
> >> Hi!
> >>
> >> * Data:
> >> - Solr 3.4;
> >> - 30 shards, ~13GB and 27-29M docs per shard.
> >>
> >> * Machine parameters (Ubuntu 10.04 LTS):
> >> user@Solr:~$ uname -a
> >> Linux Solr 2.6.32-31-server #61-Ubuntu SMP Fri Apr 8 19:44:42 UTC 2011
> >> x86_64 GNU/Linux
> >> user@Solr:~$ cat /proc/cpuinfo
> >> processor : 0 - 3
> >> vendor_id : GenuineIntel
> >> cpu family : 6
> >> model : 44
> >> model name : Intel(R) Xeon(R) CPU X5690 @ 3.47GHz
> >> stepping : 2
> >> cpu MHz : 3458.000
> >> cache size : 12288 KB
> >> fpu : yes
> >> fpu_exception : yes
> >> cpuid level : 11
> >> wp : yes
> >> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> >> pat pse36 clflush dts acpi mmx fxsr sse sse2 ss syscall nx rdtscp lm
> >> constant_tsc arch_perfmon pebs bts rep_good xtopology tsc_reliable
> >> nonstop_tsc aperfmperf pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt
> >> aes hypervisor lahf_lm ida arat
> >> bogomips : 6916.00
> >> clflush size : 64
> >> cache_alignment : 64
> >> address sizes : 40 bits physical, 48 bits virtual
> >> power management:
> >> user@Solr:~$ cat /proc/meminfo
> >> MemTotal: 16992680 kB
> >> MemFree: 110424 kB
> >> Buffers: 9976 kB
> >> Cached: 11588380 kB
> >> SwapCached: 41952 kB
> >> Active: 9860764 kB
> >> Inactive: 6198668 kB
> >> Active(anon): 4062144 kB
> >> Inactive(anon): 398972 kB
> >> Active(file): 5798620 kB
> >> Inactive(file): 5799696 kB
> >> Unevictable: 0 kB
> >> Mlocked: 0 kB
> >> SwapTotal: 46873592 kB
> >> SwapFree: 46810712 kB
> >> Dirty: 36 kB
> >> Writeback: 0 kB
> >> AnonPages: 4424756 kB
> >> Mapped: 940660 kB
> >> Shmem: 40 kB
> >> Slab: 362344 kB
> >> SReclaimable: 350372 kB
> >> SUnreclaim: 11972 kB
> >> KernelStack: 2488 kB
> >> PageTables: 68568 kB
> >> NFS_Unstable: 0 kB
> >> Bounce: 0 kB
> >> WritebackTmp: 0 kB
> >> CommitLimit: 55369932 kB
> >> Committed_AS: 5740556 kB
> >> VmallocTotal: 34359738367 kB
> >> VmallocUsed: 350532 kB
> >> VmallocChunk: 34359384964 kB
> >> HardwareCorrupted: 0 kB
> >> HugePages_Total: 0
> >> HugePages_Free: 0
> >> HugePages_Rsvd: 0
> >> HugePages_Surp: 0
> >> Hugepagesize: 2048 kB
> >> DirectMap4k: 10240 kB
> >> DirectMap2M: 17299456 kB
> >>
> >> - Apache Tomcat 6.0.32:
> >> <!-- java arguments -->
> >> -XX:+DisableExplicitGC
> >> -XX:PermSize=512M
> >> -XX:MaxPermSize=512M
> >> -Xmx12G
> >> -Xms3G
> >> -XX:NewSize=128M
> >> -XX:MaxNewSize=128M
> >> -XX:+UseParNewGC
> >> -XX:+UseConcMarkSweepGC
> >> -XX:+CMSClassUnloadingEnabled
> >> -XX:CMSInitiatingOccupancyFraction=50
> >> -XX:GCTimeRatio=9
> >> -XX:MinHeapFreeRatio=25
> >> -XX:MaxHeapFreeRatio=25
> >> -verbose:gc
> >> -XX:+PrintGCTimeStamps
> >> -Xloggc:/opt/search/tomcat/logs/gc.log
> >>
> >> Our search setup is:
> >> - 5 servers with the configuration above;
> >> - one Tomcat 6 application on each server, with 6 Solr applications per Tomcat.
> >>
> >> - Full addresses are:
> >> 1) http://192.168.1.85:8080/solr1, http://192.168.1.85:8080/solr2, ...,
> >> http://192.168.1.85:8080/solr6
> >> 2) http://192.168.1.86:8080/solr7, http://192.168.1.86:8080/solr8, ...,
> >> http://192.168.1.86:8080/solr12
> >> ...
> >> 5) http://192.168.1.89:8080/solr25, http://192.168.1.89:8080/solr26, ...,
> >> http://192.168.1.89:8080/solr30
> >> - On another server there is an additional "common" application (the
> >> aggregator) with the shards parameter:
> >> <requestHandler name="search" class="solr.SearchHandler" default="true">
> >>   <lst name="defaults">
> >>     <str name="echoParams">explicit</str>
> >>     <str name="shards">192.168.1.85:8080/solr1,192.168.1.85:8080/solr2,...,192.168.1.89:8080/solr30</str>
> >>     <int name="rows">10</int>
> >>   </lst>
> >> </requestHandler>
> >> - schema and solrconfig are identical for all shards (see the
> >> attachment for the first shard);
> >> - these servers handle search only; indexing happens on another server
> >> (shards are optimized to 2 segments and replicated with ssh/rsync scripts).
> >>
> >> The major problem now is the huge response time of distributed search.
> >> Take a look at these logs, for example. This is on 30 shards:
> >> INFO: [] webapp=/solr path=/select params={fl=*,score&ident=true&start=0&q=(barium)&rows=2000} status=0 QTime=40712
> >> INFO: [] webapp=/solr path=/select params={fl=*,score&ident=true&start=0&q=(pittances)&rows=2000} status=0 QTime=36097
> >> INFO: [] webapp=/solr path=/select params={fl=*,score&ident=true&start=0&q=(reliability)&rows=2000} status=0 QTime=75756
> >> INFO: [] webapp=/solr path=/select params={fl=*,score&ident=true&start=0&q=(blessing's)&rows=2000} status=0 QTime=30342
> >> INFO: [] webapp=/solr path=/select params={fl=*,score&ident=true&start=0&q=(reiterated)&rows=2000} status=0 QTime=55690
> >>
> >> Sometimes QTime is more than 150000 ms. But when we run identical
> >> queries on one shard separately, QTime is between 200 and 1500.
> >> Is distributed Solr search really this slow, or is our architecture
> >> suboptimal? Or do we perhaps need some third-party applications?
> >> Thanks for any replies.
> >>
> >> --
> >> Best regards,
> >> Artem
>
> --
> Best regards,
> Artem Lokotosh mailto:arco...@gmail.com

--
Regards,
Dmitry Kan
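P.S. Since the shards list is hardcoded in the aggregator's
requestHandler defaults, you can override it per request to see how
QTime scales with fan-out (10 vs 20 vs 30 shards). A sketch along the
same lines as above (aggregator URL again a placeholder):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FanoutTest {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer server =
                new CommonsHttpSolrServer("http://AGGREGATOR_HOST:8080/solr");
        // First 10 of the 30 shards; extend to 20 and 30 and compare.
        String shards = "192.168.1.85:8080/solr1,192.168.1.85:8080/solr2,"
                + "192.168.1.85:8080/solr3,192.168.1.85:8080/solr4,"
                + "192.168.1.85:8080/solr5,192.168.1.85:8080/solr6,"
                + "192.168.1.86:8080/solr7,192.168.1.86:8080/solr8,"
                + "192.168.1.86:8080/solr9,192.168.1.86:8080/solr10";
        SolrQuery q = new SolrQuery("reliability");
        q.setRows(2000);
        q.set("shards", shards); // per-request override of the default
        QueryResponse rsp = server.query(q);
        System.out.println("10 shards -> QTime=" + rsp.getQTime() + " ms");
    }
}

Also keep in mind that rows=2000 with fl=*,score is expensive in
distributed mode: every shard returns its top 2000 ids to the
aggregator, which merges 30 x 2000 entries and then fetches the 2000
stored documents in a second pass.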