Hi testn, I have tested Filter, it is pretty fast, but still take a lot of CPU resource, Maybe it could due to the number of filter I run.
Thank you eChuang, Chew -----Original Message----- From: testn [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 07, 2007 10:37 PM To: java-user@lucene.apache.org Subject: RE: High CPU usage duing index and search Check out Filter class. You can create a separate filter for each field and then chain them together using ChainFilter. If you cache the filter, it will be pretty fast. Chew Yee Chuang wrote: > > Greetings, > > Yes, process a little bit and stop for a while really reduce the CPU > usage, > but I need to find out a balance so that the indexing or searching will > not > have so much delay. > > Execute 20,000 queries at a time is because the process is generating the > aggregation data for reporting, > E.g Gender (M,F), Department (Accounting, R&D, Financial,...etc), > 1Q - Gender:M AND Department: Accounting > 2Q - Gender:M AND Department: R&D > 3Q - Gender:M AND Department: Financial > 4Q - Gender:F AND Department: Accounting > 5Q - .... > Thus, the more combination, the more query need to run. For now, I still > can't get any idea on how to reduce it, just thinking maybe there is a > different way to index it so that I can get It easily. > > Any help would be appreciated. > > Thanks > eChuang, Chew > > -----Original Message----- > From: karl wettin [mailto:[EMAIL PROTECTED] > Sent: Thursday, August 02, 2007 7:11 AM > To: java-user@lucene.apache.org > Subject: Re: High CPU usage duing index and search > > It sounds like you have a fairly busy system, perhaps 100% load on the > process is not that strange, at least not during short periods of time. > > A simpler solution would be to nice the process a little bit in order to > give your background jobs some more time to think. > > Running a profiler is still the best advice I can think of. It should > clearly show you what is going on when you run out of CPU. > > -- > karl > > 1 aug 2007 kl. 04.29 skrev Chew Yee Chuang: > >> Hi, >> >> Thanks for the link provided, actually I've go through those >> article when I >> developing the index and search function for my application. I >> haven’t try >> profiler yet, but I monitor the CPU usage and notice that whatever >> index or >> search performing, the CPU usage raise to 100%. Below I will try to >> elaborate more on what my application is doing and how I index and >> search. >> >> There are many concurrent process running, first, the application >> will write >> records that received into a text file with tab separated each >> different >> field. Application will point to a new file every 10mins and start >> writing >> to it. So every file will contains only 10mins record, approximate >> 600,000 >> records per file. Then, the indexing process will check whether >> there is a >> text file to be index, if it is, the thread will wake up and start >> perform >> indexing. >> >> The indexing process will first add documents to RAMDir, Then >> later, add >> RAMDir into FSDir by calling addIndexNoOptimize() when there is >> 100,000 >> documents(32 fields per doc) in RAMDir. There is only 1 IndexWriter >> (FSDir) >> was created but a few IndexWriter(RAMDir) was created during the whole >> process. Below are some configuration for IndexWriters that I >> mentioned:- >> >> IndexWriter (RAMDir) >> - SimpleAnalyzer >> - setMaxBufferedDocs(10000) >> - Filed.Store.YES >> - Field.Index.NO_NORMS >> >> IndexWriter (FSDir) >> - SimpleAnalyzer >> - setMergeFactor(20) >> - addIndexesNoOptimize() >> >> For the searching, because there are many queries(20,000) run >> continuously >> to generate the aggregate table for reporting purpose. All this >> queries is >> run in nested loop, and there is only 1 Searcher created, I try >> searcher and >> filter as well, filter give me better result, but both also utilize >> lots of >> CPU resources. >> >> Hope this info will help and sorry for my bad English. >> >> Thanks >> eChuang, Chew >> >> -----Original Message----- >> From: karl wettin [mailto:[EMAIL PROTECTED] >> Sent: Tuesday, July 31, 2007 5:54 PM >> To: java-user@lucene.apache.org >> Subject: Re: High CPU usage duing index and search >> >> >> 31 jul 2007 kl. 05.25 skrev Chew Yee Chuang: >>> But just notice that when Lucene performing search or index, >>> the CPU usage on my machine raise to 100%, because of this issue, >>> some of my >>> others backend process will slow down eventually. Just want to know >>> does >>> anyone face this problem before ? and is it any idea on how to >>> overcome this >>> problem ? >> >> Did you run a profiler to see what it is that consume all the >> resources? >> It is very hard to guess based on the information you supplied. Start >> here: >> >> http://wiki.apache.org/lucene-java/BasicsOfPerformance >> http://wiki.apache.org/lucene-java/ImproveIndexingSpeed >> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed >> >> >> -- >> karl >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [EMAIL PROTECTED] >> For additional commands, e-mail: [EMAIL PROTECTED] >> >> >> No virus found in this incoming message. >> Checked by AVG Free Edition. >> Version: 7.5.476 / Virus Database: 269.11.0/927 - Release Date: >> 7/30/2007 >> 5:02 PM >> >> >> No virus found in this outgoing message. >> Checked by AVG Free Edition. >> Version: 7.5.476 / Virus Database: 269.11.0/929 - Release Date: >> 7/31/2007 >> 5:26 PM >> >> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [EMAIL PROTECTED] >> For additional commands, e-mail: [EMAIL PROTECTED] >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > No virus found in this incoming message. > Checked by AVG Free Edition. > Version: 7.5.476 / Virus Database: 269.11.2/933 - Release Date: 8/2/2007 > 2:22 PM > > > No virus found in this outgoing message. > Checked by AVG Free Edition. > Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007 > 4:53 PM > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > -- View this message in context: http://www.nabble.com/High-CPU-usage-duing-index-and-search-tf4190756.html#a12035676 Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] No virus found in this incoming message. Checked by AVG Free Edition. Version: 7.5.476 / Virus Database: 269.11.8/940 - Release Date: 8/6/2007 4:53 PM No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.5.476 / Virus Database: 269.11.15/949 - Release Date: 8/12/2007 11:03 AM --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]