RE: solr indexing takes a long time and is not reponsive to abort command

2010-06-25 Thread Ya-Wen Hsu
Thanks for the response. I double-checked that we don't have the core open
multiple times. The complete index size is about 200M (around 1,060,000
documents). During the indexing process, 26 files were created. The core admin
interface indicated that no queries or processes were running after roughly 5
hours, but the Time Elapsed counter was still going. We have the indexDefaults
settings as follows:

<useCompoundFile>false</useCompoundFile>
<mergeFactor>10</mergeFactor>

Do you think lowering mergeFactor to 5 and setting useCompoundFile to true
would help? I'll try it out on Monday.

Thanks again!


-Original Message-
From: Don Werve [mailto:d...@madwombat.com] 
Sent: Thursday, June 24, 2010 9:09 PM
To: solr-user@lucene.apache.org
Subject: Re: solr indexing takes a long time and is not reponsive to abort 
command

2010/6/25 Ya-Wen Hsu y...@eline.com

> This situation doesn't happen consistently. When we ran only the
> problematic core, the indexing took significantly longer than usual (4 hrs -
> 11 hrs). It ran successfully in the end. When we ran indexing for all cores
> at the same time, the problematic core never finished indexing, so we had to
> kill the process. This has happened twice already. I'm running it in
> parallel again to see if the problem still persists.


Off the top of my head:

Have you accidentally opened this core multiple times within the same JVM?
I had the same thing happen to me when I was testing out a Solr interface I
had written under JRuby; that was loads of fun to track down...

How physically large is the core ('du -sh' if you're on Unix), and how many
files does the index contain?  I've run into issues where frequent updates
created a lot of index files, which slowed down all core access.

If you've got a lot of index files, has the problem core been optimized?
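
For anyone who wants to answer the size and file-count question from a script
rather than a shell, a minimal Python sketch follows. It assumes the core's
index is a flat directory of files on local disk; the function name
index_stats and any path passed to it are hypothetical and not part of Solr.

```python
import os


def index_stats(index_dir):
    """Return (file_count, total_bytes) for regular files directly in index_dir.

    Assumption: the index directory is flat, so subdirectories are ignored.
    """
    count = 0
    total = 0
    with os.scandir(index_dir) as entries:
        for entry in entries:
            if entry.is_file():
                count += 1
                total += entry.stat().st_size
    return count, total
```

Calling index_stats on the problem core's index directory gives a file count
that can be compared between a normal run and a slow one, alongside the total
size that 'du -sh' would report in human-readable form.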


RE: solr indexing takes a long time and is not reponsive to abort command

2010-06-24 Thread Ya-Wen Hsu
This situation doesn't happen consistently. When we ran only the problematic
core, the indexing took significantly longer than usual (4 hrs - 11 hrs). It
ran successfully in the end. When we ran indexing for all cores at the same
time, the problematic core never finished indexing, so we had to kill the
process. This has happened twice already. I'm running it in parallel again to
see if the problem still persists.

I also noticed one thing: in the dataimport UI, Total Documents Processed is
missing for the problematic core but appears for the other cores. Does anyone
know why? Thanks!

Wen

-Original Message-
From: Lance Norskog [mailto:goks...@gmail.com] 
Sent: Friday, June 18, 2010 5:38 PM
To: solr-user@lucene.apache.org
Subject: Re: solr indexing takes a long time and is not reponsive to abort 
command

Does this happen over and over? Does it happen every time?

On Fri, Jun 18, 2010 at 1:19 PM, Ya-Wen Hsu y...@eline.com wrote:
> I don’t see my last email in the mailing list, so I’m sending it again.
> Below is the original email.
>
> Hi,
>
> I have a multi-core Solr setup. All cores but one finished indexing in a
> reasonable time. I looked at the dataimport info for the one that’s hanging.
> The process is still in the busy state, but no requests are being made and
> no rows fetched. The database side just shows the process waiting for a
> future command and doing nothing. The attempt to abort the process doesn’t
> really work. Does anyone know what’s happening here? Thanks!
>
> Wen




-- 
Lance Norskog
goks...@gmail.com


solr indexing takes a long time and is not reponsive to abort command

2010-06-18 Thread Ya-Wen Hsu
Hi,

I have a multi-core Solr setup. All cores but one finished indexing in a
reasonable time. I looked at the dataimport info for the one that's hanging.
The process is still in the busy state, but no requests are being made and no
rows fetched. The database side just shows the process waiting for a future
command and doing nothing. The attempt to abort the process doesn't really
work. Does anyone know what's happening here? Thanks!

Wen
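
The status check and abort attempt described in this message can also be
issued from a script, which makes it easier to poll a hanging import. The
Python sketch below is only an illustration: it assumes the DataImportHandler
is registered at /dataimport under each core and that Solr is reachable at a
base URL such as http://localhost:8983/solr. The core name core0 and the
helper names are made up. status and abort are the handler's own commands.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def dih_url(base_url, core, command, handler="/dataimport"):
    """Build the DataImportHandler URL for one core and one command.

    Assumption: the handler path is /dataimport; adjust it to match the
    requestHandler name in that core's solrconfig.xml.
    """
    query = urlencode({"command": command})
    return f"{base_url.rstrip('/')}/{core}{handler}?{query}"


def run_dih_command(base_url, core, command, timeout=10):
    """Send the command over HTTP and return the raw response body as text."""
    with urlopen(dih_url(base_url, core, command), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Example calls (hypothetical host and core name, not executed here):
#   run_dih_command("http://localhost:8983/solr", "core0", "status")
#   run_dih_command("http://localhost:8983/solr", "core0", "abort")
```

Polling status at intervals and saving each response shows whether the
counters are still moving. It does not by itself explain why an abort request
is ignored.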


RE: performance sorting multivalued field

2010-06-18 Thread Ya-Wen Hsu
Hi,

I have sorted on a multivalued field with the field collapse plugin. Solr
always uses the first value it gets from the search result when sorting
multivalued fields. I might be wrong, but I vaguely remember it's the smallest
value.

Wen

-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com] 
Sent: Friday, June 18, 2010 10:32 AM
To: solr-user@lucene.apache.org
Subject: Re: performance sorting multivalued field

do you mean sorting facets?  or sorting search results?   you can't  
sort search results by a multivalued field - which value would it use?

Erik

On Jun 18, 2010, at 12:45 PM, Marc Sturlese wrote:


> Hey there!
> Can someone explain how having multivalued fields affects sorting?
> I have read in other threads how it affects faceting, but I couldn't
> find any info on the impact when sorting.
> Thanks in advance



RE: solr indexing takes a long time and is not reponsive to abort command

2010-06-18 Thread Ya-Wen Hsu
Sorry if you received duplicate emails from me.

I checked the log; there are no errors and no write-lock messages in it. Where
else can I check for more information? Can I see if any query is running?

I finally killed the process and ran it again. This situation has happened a
couple of times in our production and QA environments. It usually works after
we kill and restart the process. However, we would like to figure out what
happened in the first place. Thanks!

Wen
-Original Message-
From: Peter Karich [mailto:peat...@yahoo.de] 
Sent: Friday, June 18, 2010 1:04 PM
To: solr-user@lucene.apache.org
Subject: Re: solr indexing takes a long time and is not reponsive to abort 
command

Did you kill the process, or does a reload help afterwards?
Did you look into the logs? Are there errors mentioning a write lock?

Peter.

> Hi,
>
> I have a multi-core Solr setup. All cores but one finished indexing in a
> reasonable time. I looked at the dataimport info for the one that's hanging.
> The process is still in the busy state, but no requests are being made and
> no rows fetched. The database side just shows the process waiting for a
> future command and doing nothing. The attempt to abort the process doesn't
> really work. Does anyone know what's happening here? Thanks!
>
> Wen



dismax and WordDelimiterFilterFactory with PreserveOriginal = 1

2010-03-11 Thread Ya-Wen Hsu
Hi all,

I'm facing the same issue as a previous post here:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg19511.html. Since no
one answered that post, I thought I'd ask again. In my case, I use the setting
below at index time:

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="1" catenateNumbers="1"
        catenateAll="0" splitOnCaseChange="0" preserveOriginal="1"/>

and this one at query time:

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
        generateNumberParts="1" catenateWords="0" catenateNumbers="0"
        catenateAll="0" splitOnCaseChange="0" preserveOriginal="1"/>

When I query for the word ain't, no results are returned. When I turned on
logging, I found the word is interpreted as (ain't ain) t.

0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited clause(s)
0.0 = no match on required clause ((description:(ain't ain) t^2.0 | 
name:(ain't ain) t^3.0 | search_keywords:(ain't ain) t)~0.1)

Does anyone know why ain't is parsed as (ain't ain) t, and how to fix it so it
can match documents that include ain't in the name? Thanks in advance!

Wen
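
The grouping (ain't ain) t in the log is what one would expect if the filter
splits the token at the apostrophe and, because preserveOriginal=1, also
keeps the unsplit token at the first position. The Python sketch below only
approximates that behaviour to make the positions visible. It is not Solr's
implementation: it ignores possessive stripping, case changes, letter/number
transitions and the catenate options, and the function name is hypothetical.

```python
import re


def word_delimiter_sketch(token, preserve_original=True):
    """Return (term, position) pairs for one input token.

    Simplifying assumption: every run of non-alphanumeric ASCII characters
    is a delimiter, and each resulting part takes the next position.
    """
    parts = [p for p in re.split(r"[^0-9A-Za-z]+", token) if p]
    terms = []
    # The original token is kept only when splitting actually changed it,
    # and it shares position 0 with the first part.
    if preserve_original and parts != [token]:
        terms.append((token, 0))
    for position, part in enumerate(parts):
        terms.append((part, position))
    return terms


print(word_delimiter_sketch("ain't"))
```

Under these assumptions ain't and ain share position 0 and t sits at position
1, which a query parser would render as (ain't ain) followed by t. The sketch
shows where the shape of the parsed query can come from. It does not show why
the required clause then fails to match.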