Thanks, Walter, for the info. We will disable optimize and do more testing.

Regards,
Yandong
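
As a rough aid for that testing, here is a minimal Java sketch of the free-space rule of thumb from Walter's reply quoted below (free space equal to the index size is usually enough, double the index size in the worst case). The class name MergeHeadroom, the helper names, and the default path "data/index" are assumptions made up for illustration. They are not part of Solr or SolrJ. The sketch uses only java.io.File and assumes the index is a flat directory of files.

```java
import java.io.File;

final class MergeHeadroom {

    /** Sums the sizes of the regular files directly inside indexDir. */
    static long indexSizeBytes(File indexDir) {
        File[] files = indexDir.listFiles();
        if (files == null) {
            return 0L; // path is not a directory, or it cannot be read
        }
        long total = 0L;
        for (File f : files) {
            if (f.isFile()) {
                total += f.length();
            }
        }
        return total;
    }

    /** Returns true if freeBytes is at least factor times indexBytes. */
    static boolean hasHeadroom(long indexBytes, long freeBytes, double factor) {
        return freeBytes >= (long) Math.ceil(indexBytes * factor);
    }

    public static void main(String[] args) {
        // Hypothetical default path; pass the real index directory as the first argument.
        File indexDir = new File(args.length > 0 ? args[0] : "data/index");
        long size = indexSizeBytes(indexDir);
        long free = indexDir.getUsableSpace(); // usable bytes on that partition
        System.out.println("index bytes: " + size + ", usable bytes: " + free);
        System.out.println("usual case (1x index size): " + hasHeadroom(size, free, 1.0));
        System.out.println("worst case (2x index size): " + hasHeadroom(size, free, 2.0));
    }
}
```

Running it as `java MergeHeadroom /path/to/index` before a large reindex gives a quick yes/no for both thresholds. The 1x and 2x factors come straight from the rule of thumb and are not exact limits.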

2013/2/22 Walter Underwood <wun...@wunderwood.org>

> That seems fairly fast. We index about 3 million documents in about half
> that time. We are probably limited by the time it takes to get the data
> from MySQL.
>
> Don't optimize. Solr automatically merges index segments as needed.
> Optimize forces a full merge. You'll probably never notice the difference,
> either in disk space or speed.
>
> It might make sense to force merge (optimize) if you reindex everything
> once per day and have no updates in between. But even then it may be a
> waste of time.
>
> You need lots of free disk space for merging, whether a forced merge or
> automatic. Free space equal to the size of the index is usually enough, but
> the worst case can need double the size of the index.
>
> wunder
>
> On Feb 21, 2013, at 9:20 AM, Yandong Yao wrote:
>
> > Hi Guys,
> >
> > I am using Solr 4.1 and have indexed 18M documents using solrj
> > ConcurrentUpdateSolrServer (each document contains 5 fields, and average
> > length is less than 1k).
> >
> > 1) It takes 70 minutes to index those documents without optimize on my
> > Mac (OS X 10.8). How is that performance: slow, fast, or typical?
> >
> > 2) It takes about 40 minutes to optimize those documents. The following is
> > the top output, which shows lots of FAULTS. What does this mean?
> >
> > Processes: 118 total, 2 running, 8 stuck, 108 sleeping, 719 threads  00:56:52
> > Load Avg: 1.48, 1.56, 1.73  CPU usage: 6.63% user, 6.40% sys, 86.95% idle
> > SharedLibs: 31M resident, 0B data, 6712K linkedit.
> > MemRegions: 34734 total, 5801M resident, 39M private, 638M shared.
> > PhysMem: 982M wired, 3600M active, 3567M inactive, 8150M used, 38M free.
> > VM: 254G vsize, 1285M framework vsize, 1469887(368) pageins, 1095550(0) pageouts.
> > Networks: packets: 14842595/9661M in, 14777685/9395M out.
> > Disks: 820048/43G read, 523814/53G written.
> >
> > PID   COMMAND  %CPU  TIME      #TH  #WQ  #POR  #MRE  RPRVT   RSHRD  RSIZE
> > 4585  java     11.7  02:52:01  32   1    483   342   3866M+  6724K  3856M+
> >
> > VPRVT  VSIZE  PGRP  PPID  STATE    UID  FAULTS    COW  MSGSENT   MSGRECV
> > 4246M  6908M  4580  4580  sleepin  501  1490340+  402  3000781+  231785+
> >
> > SYSBSD     SYSMACH
> > 15044055+  10033109+
> >
> > 3) If I don't run optimize, what is the impact? A bigger index on disk, or
> > slower query performance?
> >
> > The following is my index config in solrconfig.xml:
> >
> > <ramBufferSizeMB>100</ramBufferSizeMB>
> > <mergeFactor>10</mergeFactor>
> > <autoCommit>
> >       <maxDocs>100000</maxDocs>    <!-- 100K docs -->
> >       <maxTime>300000</maxTime>    <!-- 5 minutes -->
> >       <openSearcher>false</openSearcher>
> > </autoCommit>
> >
> > Thanks very much in advance!
> >
> > Regards,
> > Yandong