Re: Nwebie Question on boosting

2008-12-11 Thread ayyanar


Thanks Rob. Can you please provide some sample documents (Lucene) for
title-based boosting?
-- 
View this message in context: 
http://www.nabble.com/Nwebie-Question-on-boosting-tp20950286p20952532.html
Sent from the Solr - User mailing list archive at Nabble.com.



Taxonomy Support on Solr

2008-12-11 Thread Jana, Kumar Raja
Hi,
 
Any plans of supporting user-defined classifications on Solr? Is there
any component which returns all the children of a node (till the leaf
node) when I search for any node?
 
Maybe this would help:
 
Say I have a few SolrDocuments classified as:
 
        A
      /   \
     B     C
   / | \  / \
  1  2  3 8  9
 
(i.e., A has two child nodes, B and C; B has three child nodes, 1, 2 and 3;
and C has two child nodes, 8 and 9.)
When my search criteria match B, my results should contain B as well
as 1, 2 and 3.
Search for A would return all the nodes mentioned above.
 
-Kumar
 
   


Make it more performant - solr 1.3 - 1200msec respond time.

2008-12-11 Thread sunnyfr

Hi,

I'm doing a stress test on Solr.
I have around 8.5M documents, and the size of my data directory is 5.6G.

I've re-indexed my data to make it faster and applied all the latest
patches.
My index stores just two fields: id and text (which is a copy of three
fields).
But it still seems very slow to me; what do you think?

At 50 requests/sec for 40 minutes, my average response time is 1235msec
(49430 requests).

When I run the test at 100 requests/sec for 10 minutes and then another 10
minutes at 50 requests/sec, my average response time is 1600msec. Don't you
think that is a bit long?

Should I partition this index further, or what else should I do to make it
faster?
I have read posts from people getting just 300msec responses on 300GB of
partitioned index.
The query that collects all this data is quite complex, with a lot of
joined tables; maybe it would be faster if I created a CSV file?

The server that I'm using for the test has 8G of memory.
4 CPUs: Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
Tomcat 5.5: -Xms2000m -Xmx4000m
Solr 1.3.

What can I change to make it more performant: memory, indexing, ...?
Could it come from my query against the MySQL database, which joins too
many tables?

Thanks a lot for your help,
Johanna


-- 
View this message in context: 
http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20953079.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Nwebie Question on boosting

2008-12-11 Thread Steffen B.

Hi,

ayyanar wrote:
 
 Thanks Rob. Can you plz provide some sample documents (lucene) for title
 bassed boosting?
 

I'm not sure about the Lucene part (this is the Solr mailing list after
all), but if you want index time boosting of certain fields, you have to add
documents like this:
<add>
  <doc>
    <field name="id">123</field>
    <field name="title" boost="2.0">My boosted field</field>
    <field name="text">Here goes the text, yadda yadda</field>
  </doc>
</add>

In this document the title field will be boosted by a factor of 2, provided it
has norms enabled (omitNorms=false). You can read more about this here:
http://wiki.apache.org/solr/UpdateXmlMessages#head-63c170cad521de8d1e9be0d76025774cf3b66dfc
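
For reference, a minimal schema sketch of keeping norms on the boosted field
(this field definition is illustrative, not taken from any particular schema):

  <!-- schema.xml: index-time boosts only take effect if norms are kept -->
  <field name="title" type="text" indexed="true" stored="true" omitNorms="false"/>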
-- 
View this message in context: 
http://www.nabble.com/Nwebie-Question-on-boosting-tp20950286p20953736.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Make it more performant - solr 1.3 - 1200msec respond time.

2008-12-11 Thread Shalin Shekhar Mangar
Are each of those queries unique?

First time queries are slower. They are cached by Solr and the same query
again will return results very quickly because it won't need to hit the file
system.

On Thu, Dec 11, 2008 at 4:08 PM, sunnyfr [EMAIL PROTECTED] wrote:


 Hi,

 I'm doing a stress test on solr.
 I've around 8,5M of doc, the size of my data's directory is 5,6G.

 I've  indexed again my data to make it faster, and applied all the last
 patch.
 My index data store just two field : id and text (which is a copy of three
 fiels)
 But I still think it's very long, what do you think?

 For 50request/sec during 40mn, my average  respond time : 1235msec.
 49430request.

 When I make this test with 100request second during 10mn and 10 other
 minutes with 50 request : my average respond time is 1600msec. Don't you
 think it's a bit long.

 Should I partition this index more ? or what should I do to make this work
 faster.
 I can read post with people who have just 300msec request for 300Go of
 index
 partionned ?
 My request for collecting all this book is quite complex and have lot of
 table linked together, maybe it would be faster if I create a csv file ?

 The server that I'm using for the test has 8G of memory.
 4CPU : Intel(R) Xeon(R) CPU5160  @ 3.00GHz
 Tomcat55 : -Xms2000m -Xmx4000m
 Solr 1.3.

 What can I modify to make it more performant ? memory, indexation ...?
 Does it can come from my request to the mysql database which is too much
 linked ?

 Thanks a lot for your help,
 Johanna


 --
 View this message in context:
 http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20953079.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Regards,
Shalin Shekhar Mangar.


Re: Make it more performant - solr 1.3 - 1200msec respond time.

2008-12-11 Thread sunnyfr

So, according to you and everything explained in my post, have I already done
my best to optimize it?
Yes, the queries are unique. I will try it again with caching activated.

What do you mean by "hit the file system"?
Thanks a lot




Shalin Shekhar Mangar wrote:
 
 Are each of those queries unique?
 
 First time queries are slower. They are cached by Solr and the same query
 again will return results very quickly because it won't need to hit the
 file
 system.
 
 On Thu, Dec 11, 2008 at 4:08 PM, sunnyfr [EMAIL PROTECTED] wrote:
 

 Hi,

 I'm doing a stress test on solr.
 I've around 8,5M of doc, the size of my data's directory is 5,6G.

 I've  indexed again my data to make it faster, and applied all the last
 patch.
 My index data store just two field : id and text (which is a copy of
 three
 fiels)
 But I still think it's very long, what do you think?

 For 50request/sec during 40mn, my average  respond time : 1235msec.
 49430request.

 When I make this test with 100request second during 10mn and 10 other
 minutes with 50 request : my average respond time is 1600msec. Don't you
 think it's a bit long.

 Should I partition this index more ? or what should I do to make this
 work
 faster.
 I can read post with people who have just 300msec request for 300Go of
 index
 partionned ?
 My request for collecting all this book is quite complex and have lot of
 table linked together, maybe it would be faster if I create a csv file ?

 The server that I'm using for the test has 8G of memory.
 4CPU : Intel(R) Xeon(R) CPU5160  @ 3.00GHz
 Tomcat55 : -Xms2000m -Xmx4000m
 Solr 1.3.

 What can I modify to make it more performant ? memory, indexation ...?
 Does it can come from my request to the mysql database which is too much
 linked ?

 Thanks a lot for your help,
 Johanna


 --
 View this message in context:
 http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20953079.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 

-- 
View this message in context: 
http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20954392.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Sum of Fields and Record Count

2008-12-11 Thread John Martyniak

Hi Otis,

Thanks for the info and help.  I started reading up about it (on
Markmail, nice site), and it looks like there is some activity to put
it into 1.4.  I will try to apply the patch and see how that works.
It seems like a couple of people are using it in a production
environment already, without grief.  So that is a good thing.


-John

On Dec 11, 2008, at 1:24 AM, Otis Gospodnetic wrote:


Hi John,

It's not in the current release, but the chances are it will make it  
into 1.4.  You can try one of the recent patches and apply it to  
your Solr 1.3 sources.  Check list archives for more discussion,  
this field collapsing was just discussed again today/yesterday.   
markmail.org is a good one.



Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: John Martyniak [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Wednesday, December 10, 2008 10:51:57 PM
Subject: Re: Sum of Fields and Record Count

Otis,

Thanks for the information.  It looks like the field collapsing is  
similar to
what I am looking.  But is that in the current release?  Is it  
stable?


Is there anyway to do it in Solr 1.3?

-John

On Dec 10, 2008, at 9:59 PM, Otis Gospodnetic wrote:


Hi John,

This sounds a lot like field collapsing functionality that a few  
people are

working on in SOLR-236:


https://issues.apache.org/jira/browse/SOLR-236

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: John Martyniak
To: solr-user@lucene.apache.org
Sent: Wednesday, December 10, 2008 6:16:21 PM
Subject: Sum of Fields and Record Count

Hi,

I am a new solr user.

I have an application that I would like to show the results but  
one result

may
be the part of larger set of results.  So for example result #1  
might also

have

10 other results that are part of the same data set.

Hopefully this makes sense.

What I would like to find out is if there is a way within Solr to  
show the
result that matched with the query, and then to also show that  
this result is

part of a collection of 10 items.

I have thought about doing it using some sort of external process  
that runs,

and
with doing multiple queries, so get the list of items and then  
query against

each item.  But those don't seem elegant.

So I would like to find out if there is a way to do it within  
Solr that is a
little more elegant, and hopefully without having to write  
additional code.


Thank you in advance for the help.

-John








Re: Make it more performant - solr 1.3 - 1200msec respond time.

2008-12-11 Thread sunnyfr

Actually I just noticed that a lot of requests didn't bring back a correct
answer, but "No read Solr server available", so my JMeter didn't count that as
an error. It was obviously out of memory, and a gc.log file is created with:
0.054: [GC [PSYoungGen: 5121K->256K(298688K)] 5121K->256K(981376K),
0.0020630 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
0.056: [Full GC (System) [PSYoungGen: 256K->0K(298688K)] [PSOldGen:
0K->180K(682688K)] 256K->180K(981376K) [PSPermGen: 3002K->3002K(21248K)],
0.0055170 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

So far my tomcat55 file is configured like this:
JAVA_OPTS="-Xms1000m -Xmx4000m -XX:+HeapDumpOnOutOfMemoryError
-Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"


Thanks for your help



Shalin Shekhar Mangar wrote:
 
 Are each of those queries unique?
 
 First time queries are slower. They are cached by Solr and the same query
 again will return results very quickly because it won't need to hit the
 file
 system.
 
 On Thu, Dec 11, 2008 at 4:08 PM, sunnyfr [EMAIL PROTECTED] wrote:
 

 Hi,

 I'm doing a stress test on solr.
 I've around 8,5M of doc, the size of my data's directory is 5,6G.

 I've  indexed again my data to make it faster, and applied all the last
 patch.
 My index data store just two field : id and text (which is a copy of
 three
 fiels)
 But I still think it's very long, what do you think?

 For 50request/sec during 40mn, my average  respond time : 1235msec.
 49430request.

 When I make this test with 100request second during 10mn and 10 other
 minutes with 50 request : my average respond time is 1600msec. Don't you
 think it's a bit long.

 Should I partition this index more ? or what should I do to make this
 work
 faster.
 I can read post with people who have just 300msec request for 300Go of
 index
 partionned ?
 My request for collecting all this book is quite complex and have lot of
 table linked together, maybe it would be faster if I create a csv file ?

 The server that I'm using for the test has 8G of memory.
 4CPU : Intel(R) Xeon(R) CPU5160  @ 3.00GHz
 Tomcat55 : -Xms2000m -Xmx4000m
 Solr 1.3.

 What can I modify to make it more performant ? memory, indexation ...?
 Does it can come from my request to the mysql database which is too much
 linked ?

 Thanks a lot for your help,
 Johanna


 --
 View this message in context:
 http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20953079.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 

-- 
View this message in context: 
http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20955210.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Make it more performant - solr 1.3 - 1200msec respond time.

2008-12-11 Thread sunnyfr

OK, sorry. I just added the parameter -XX:+UseParallelGC and it no longer
seems to go OOM.




sunnyfr wrote:
 
 Actually I just notices, lot of request didnt bring back correct answer,
 but  No read Solr server available so my jmeter didn't take that for an
 error. Obviously out of memory, and a file gc.log is created with :
 0.054: [GC [PSYoungGen: 5121K-256K(298688K)] 5121K-256K(981376K),
 0.0020630 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
 0.056: [Full GC (System) [PSYoungGen: 256K-0K(298688K)] [PSOldGen:
 0K-180K(682688K)] 256K-180K(981376K) [PSPermGen: 3002K-3002K(21248K)],
 0.0055170 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
 
 so far my tomcat55 file is configurate like that :
 JAVA_OPTS=-Xms1000m -Xmx4000m -XX:+HeapDumpOnOutOfMemoryError
 -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
 
 My error:
 Dec 11 14:16:27 solr-test jsvc.exec[30653]: Dec 11, 2008 2:16:27 PM
 org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} hits=0 status=0 QTime=1  Dec 11, 2008 2:16:27
 PM org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} status=0 QTime=1  Dec 11, 2008 2:16:27 PM
 org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} hits=0 status=0 QTime=1  Dec 11, 2008 2:16:27
 PM org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} status=0 QTime=2
 Dec 11 14:16:27 solr-test jsvc.exec[30653]: java.lang.OutOfMemoryError: GC
 overhead limit exceeded Dumping heap to java_pid30655.hprof ...
 
 
 
 
 Thanks for your help
 
 
 
 Shalin Shekhar Mangar wrote:
 
 Are each of those queries unique?
 
 First time queries are slower. They are cached by Solr and the same query
 again will return results very quickly because it won't need to hit the
 file
 system.
 
 On Thu, Dec 11, 2008 at 4:08 PM, sunnyfr [EMAIL PROTECTED] wrote:
 

 Hi,

 I'm doing a stress test on solr.
 I've around 8,5M of doc, the size of my data's directory is 5,6G.

 I've  indexed again my data to make it faster, and applied all the last
 patch.
 My index data store just two field : id and text (which is a copy of
 three
 fiels)
 But I still think it's very long, what do you think?

 For 50request/sec during 40mn, my average  respond time : 1235msec.
 49430request.

 When I make this test with 100request second during 10mn and 10 other
 minutes with 50 request : my average respond time is 1600msec. Don't you
 think it's a bit long.

 Should I partition this index more ? or what should I do to make this
 work
 faster.
 I can read post with people who have just 300msec request for 300Go of
 index
 partionned ?
 My request for collecting all this book is quite complex and have lot of
 table linked together, maybe it would be faster if I create a csv file ?

 The server that I'm using for the test has 8G of memory.
 4CPU : Intel(R) Xeon(R) CPU5160  @ 3.00GHz
 Tomcat55 : -Xms2000m -Xmx4000m
 Solr 1.3.

 What can I modify to make it more performant ? memory, indexation ...?
 Does it can come from my request to the mysql database which is too much
 linked ?

 Thanks a lot for your help,
 Johanna


 --
 View this message in context:
 http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20953079.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20955479.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Make it more performant - solr 1.3 - 1200msec respond time.

2008-12-11 Thread sunnyfr

Actually I still get this error: "No read Solr server available".



sunnyfr wrote:
 
 Ok sorry I just add the parameter -XX:+UseParallelGC and it seems to don't
 go oom.
 
 
 
 
 sunnyfr wrote:
 
 Actually I just notices, lot of request didnt bring back correct answer,
 but  No read Solr server available so my jmeter didn't take that for an
 error. Obviously out of memory, and a file gc.log is created with :
 0.054: [GC [PSYoungGen: 5121K-256K(298688K)] 5121K-256K(981376K),
 0.0020630 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
 0.056: [Full GC (System) [PSYoungGen: 256K-0K(298688K)] [PSOldGen:
 0K-180K(682688K)] 256K-180K(981376K) [PSPermGen: 3002K-3002K(21248K)],
 0.0055170 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
 
 so far my tomcat55 file is configurate like that :
 JAVA_OPTS=-Xms1000m -Xmx4000m -XX:+HeapDumpOnOutOfMemoryError
 -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
 
 My error:
 Dec 11 14:16:27 solr-test jsvc.exec[30653]: Dec 11, 2008 2:16:27 PM
 org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} hits=0 status=0 QTime=1  Dec 11, 2008 2:16:27
 PM org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} status=0 QTime=1  Dec 11, 2008 2:16:27 PM
 org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} hits=0 status=0 QTime=1  Dec 11, 2008 2:16:27
 PM org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} status=0 QTime=2
 Dec 11 14:16:27 solr-test jsvc.exec[30653]: java.lang.OutOfMemoryError:
 GC overhead limit exceeded Dumping heap to java_pid30655.hprof ...
 
 
 
 
 Thanks for your help
 
 
 
 Shalin Shekhar Mangar wrote:
 
 Are each of those queries unique?
 
 First time queries are slower. They are cached by Solr and the same
 query
 again will return results very quickly because it won't need to hit the
 file
 system.
 
 On Thu, Dec 11, 2008 at 4:08 PM, sunnyfr [EMAIL PROTECTED] wrote:
 

 Hi,

 I'm doing a stress test on solr.
 I've around 8,5M of doc, the size of my data's directory is 5,6G.

 I've  indexed again my data to make it faster, and applied all the last
 patch.
 My index data store just two field : id and text (which is a copy of
 three
 fiels)
 But I still think it's very long, what do you think?

 For 50request/sec during 40mn, my average  respond time : 1235msec.
 49430request.

 When I make this test with 100request second during 10mn and 10 other
 minutes with 50 request : my average respond time is 1600msec. Don't
 you
 think it's a bit long.

 Should I partition this index more ? or what should I do to make this
 work
 faster.
 I can read post with people who have just 300msec request for 300Go of
 index
 partionned ?
 My request for collecting all this book is quite complex and have lot
 of
 table linked together, maybe it would be faster if I create a csv file
 ?

 The server that I'm using for the test has 8G of memory.
 4CPU : Intel(R) Xeon(R) CPU5160  @ 3.00GHz
 Tomcat55 : -Xms2000m -Xmx4000m
 Solr 1.3.

 What can I modify to make it more performant ? memory, indexation ...?
 Does it can come from my request to the mysql database which is too
 much
 linked ?

 Thanks a lot for your help,
 Johanna


 --
 View this message in context:
 http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20953079.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20955609.html
Sent from the Solr - User mailing list archive at Nabble.com.



Building Solr from Source

2008-12-11 Thread John Martyniak

Hi,

I have downloaded Maven 2.0.9 and tried to build using mvn clean
install and mvn install; nothing works.


Can somebody tell me how to build solr from source?  I am trying to  
build the 1.3 source.


thank you very much,

-John


Re: Building Solr from Source

2008-12-11 Thread Noble Paul നോബിള്‍ नोब्ळ्
Solr uses Ant for its build.
Install Ant.
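
A rough sketch of the Ant build, assuming the stock 1.3 source layout (target
names may differ between releases):

  cd apache-solr-1.3.0
  ant clean dist      # builds the Solr war under dist/
  ant example         # optional: sets up the bundled Jetty example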

On Thu, Dec 11, 2008 at 7:13 PM, John Martyniak [EMAIL PROTECTED] wrote:
 Hi,

 I have downloaded Maven 2.0.9, and tried to build using mvn clean install
 and mvn install, nothing works.

 Can somebody tell me how to build solr from source?  I am trying to build
 the 1.3 source.

 thank you very much,

 -John




-- 
--Noble Paul


Re: Dynamic Boosting at query time with boost value as another fieldvalue

2008-12-11 Thread Shalin Shekhar Mangar
Take a look at FunctionQuery support in Solr:

http://wiki.apache.org/solr/FunctionQuery
http://wiki.apache.org/solr/SolrRelevancyFAQ#head-b1b1cdedcb9cd9bfd9c994709b4d7e540359b1fd
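
The second link describes boosting newer documents with a reciprocal function;
a hedged sketch of how that could be wired into a dismax request (the handler
name, the field name creationDate from the question, and the constants are
illustrative):

  http://localhost:8983/solr/select?qt=dismax&q=foo&qf=text&bf=recip(rord(creationDate),1,1000,1000)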

On Thu, Dec 11, 2008 at 7:01 PM, Pooja Verlani [EMAIL PROTECTED]wrote:

 Hi all,

 I have a specific requirement for query time boosting.
 I have to boost a field on the basis of the value returned from one of the
 fields of the document.

 Basically, I have the creationDate for a document and in order to introduce
 recency factor in the search, i need to give a boost to the creation field,
 where the boost value is something like a log(1/x) function and x is the
 (presentDate - creationDate).
 Till now what I have seen is we can give only a static boost to the
 documents.

 In case you can provide a solution to my problem.. please do reply :)

 Thanks a lot,
 Regards.
 Pooja




-- 
Regards,
Shalin Shekhar Mangar.


Re: Make it more performant - solr 1.3 - 1200msec respond time.

2008-12-11 Thread Shalin Shekhar Mangar
On Thu, Dec 11, 2008 at 5:56 PM, sunnyfr [EMAIL PROTECTED] wrote:


 So according to you and everything explained in my post, I did my best to
 optimize it ?
 Yes it's unique queries. I will try it again and activate cache.


If you run unique queries then it is not a very realistic test. Turn on
caching and try running queries from an old access log.



 What you mean by hit the file system?


Solr/Lucene has to access the file system to load the results. To avoid this
access and processing, there are many caches in Solr. This makes Solr
faster.
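
Those caches are configured in solrconfig.xml; a minimal sketch (sizes are
illustrative, not recommendations):

  <filterCache      class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>
  <documentCache    class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>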

-- 
Regards,
Shalin Shekhar Mangar.


Re: Make it more performant - solr 1.3 - 1200msec respond time.

2008-12-11 Thread sunnyfr

Hi,

At around 50 threads/sec the requests bring back "No read Solr server
available"; the GC seems to be quite busy, but I didn't get an OOM error.
I would love some advice.

Thanks a lot 

Details :
8G of memory
4CPU : Intel(R) Xeon(R) CPU5160  @ 3.00GHz 
Solr 1.3
# Arguments to pass to the Java virtual machine (JVM).
JAVA_OPTS="-Xms1000m -Xmx4000m -XX:+UseParallelGC
-XX:+HeapDumpOnOutOfMemoryError -Xloggc:gc.log -XX:+PrintGCDetails
-XX:+PrintGCTimeStamps"
5.3G of data with 8,2M of documents.

Thanks a lot for the help



sunnyfr wrote:
 
 Actually I just notices, lot of request didnt bring back correct answer,
 but  No read Solr server available so my jmeter didn't take that for an
 error. Obviously out of memory, and a file gc.log is created with :
 0.054: [GC [PSYoungGen: 5121K-256K(298688K)] 5121K-256K(981376K),
 0.0020630 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
 0.056: [Full GC (System) [PSYoungGen: 256K-0K(298688K)] [PSOldGen:
 0K-180K(682688K)] 256K-180K(981376K) [PSPermGen: 3002K-3002K(21248K)],
 0.0055170 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
 
 so far my tomcat55 file is configurate like that :
 JAVA_OPTS=-Xms1000m -Xmx4000m -XX:+HeapDumpOnOutOfMemoryError
 -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
 
 My error:
 Dec 11 14:16:27 solr-test jsvc.exec[30653]: Dec 11, 2008 2:16:27 PM
 org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} hits=0 status=0 QTime=1  Dec 11, 2008 2:16:27
 PM org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} status=0 QTime=1  Dec 11, 2008 2:16:27 PM
 org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} hits=0 status=0 QTime=1  Dec 11, 2008 2:16:27
 PM org.apache.solr.core.SolrCore execute INFO: [video] webapp=/solr
 path=/admin/ping params={} status=0 QTime=2
 Dec 11 14:16:27 solr-test jsvc.exec[30653]: java.lang.OutOfMemoryError: GC
 overhead limit exceeded Dumping heap to java_pid30655.hprof ...
 
 
 
 
 Thanks for your help
 
 
 
 Shalin Shekhar Mangar wrote:
 
 Are each of those queries unique?
 
 First time queries are slower. They are cached by Solr and the same query
 again will return results very quickly because it won't need to hit the
 file
 system.
 
 On Thu, Dec 11, 2008 at 4:08 PM, sunnyfr [EMAIL PROTECTED] wrote:
 

 Hi,

 I'm doing a stress test on solr.
 I've around 8,5M of doc, the size of my data's directory is 5,6G.

 I've  indexed again my data to make it faster, and applied all the last
 patch.
 My index data store just two field : id and text (which is a copy of
 three
 fiels)
 But I still think it's very long, what do you think?

 For 50request/sec during 40mn, my average  respond time : 1235msec.
 49430request.

 When I make this test with 100request second during 10mn and 10 other
 minutes with 50 request : my average respond time is 1600msec. Don't you
 think it's a bit long.

 Should I partition this index more ? or what should I do to make this
 work
 faster.
 I can read post with people who have just 300msec request for 300Go of
 index
 partionned ?
 My request for collecting all this book is quite complex and have lot of
 table linked together, maybe it would be faster if I create a csv file ?

 The server that I'm using for the test has 8G of memory.
 4CPU : Intel(R) Xeon(R) CPU5160  @ 3.00GHz
 Tomcat55 : -Xms2000m -Xmx4000m
 Solr 1.3.

 What can I modify to make it more performant ? memory, indexation ...?
 Does it can come from my request to the mysql database which is too much
 linked ?

 Thanks a lot for your help,
 Johanna


 --
 View this message in context:
 http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20953079.html
 Sent from the Solr - User mailing list archive at Nabble.com.


 
 
 -- 
 Regards,
 Shalin Shekhar Mangar.
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Make-it-more-performant---solr-1.3---1200msec-respond-time.-tp20953079p20955856.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Newbie Question boosting

2008-12-11 Thread Erick Erickson
I can help a bit with 2...

First, keep in mind the difference between index and query
time boosting:

From Hossman:
"...Index time field boosts are a way to express things like 'this
document's title is worth twice as much as the title of most documents'.
Query time boosts are a way to express 'I care about matches on this clause
of my query twice as much as I do about matches to other clauses of my
query'."

Boosting doesn't do anything if your search doesn't use the field you've
boosted
on, but that's only relevant for index-time boosts. You can't have a
query-time
boost that doesn't reference the field since that's how you boost at query
time by definition.

So, say you have documents that you think are more important than
run-of-the-mill
documents. An example could be an animals database. You might want
documents from scholarly sources to have a greater weight, thus be more
likely
to appear at the top of your search results. These would be fine documents
to
boost the fields for at index time. That would mean that the scholarly
documents
would tend to be higher up in your search results.

For query time, say you have a genealogy database. Books in this category
often have
titles like The Erickson family history in New England, 1700-1930. Now say
you
have separate fields for title and text. A user searches for erickson. You
want the
title to weigh much more than the text since it's much more likely that the
user
would find that document more valuable than some document about the Swanson
family that mentions somewhere in the text that Olaf Swanson married Edna
Erickson. So you'd form some query like: title:erickson^10 text:erickson
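
To make the difference concrete, a small sketch (the document content here is
made up): a document-level boost goes on the doc element at index time, a
field-level boost on the field element, and a query-time boost is attached to
the query clause itself.

  <!-- index time: whole document boosted, title field boosted further -->
  <add>
    <doc boost="2.0">
      <field name="title" boost="3.0">The Erickson family history in New England, 1700-1930</field>
      <field name="text">Some genealogy text...</field>
    </doc>
  </add>

  <!-- query time: weight the title clause over the text clause -->
  q=title:erickson^10 text:erickson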


HTH
Erick

On Thu, Dec 11, 2008 at 1:47 AM, ayyanar
[EMAIL PROTECTED]wrote:


 I read many articles on boosting still iam not so clear on boosting. Can
 anyone explain the following questions with examples?

 1) Can you given an example for field level boosting and document level
 boosting and the difference between two?

 2) If we set the boost at field level (index time), should the query
 contains the that particular field?
 For example, if we set the boost for title field, should we create the
 termquery for title field?

 Also, based on your experience, can you explain why you need the boosting.

 Thanks,
 Ayyanar. A
 --
 View this message in context:
 http://www.nabble.com/Newbie-Question-boosting-tp20950268p20950268.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: Building Solr from Source

2008-12-11 Thread John Martyniak
My mistake. I saw the Maven directories and did not see the build.xml
in the src directory, so I just assumed... my bad.


Anyway built successfully, thanks.

Now to apply the field collapsing patch.

-John

On Dec 11, 2008, at 8:46 AM, Noble Paul നോബിള്‍  
नोब्ळ् wrote:



Solr uses ant for build
install ant

On Thu, Dec 11, 2008 at 7:13 PM, John Martyniak  
[EMAIL PROTECTED] wrote:

Hi,

I have downloaded Maven 2.0.9, and tried to build using mvn clean  
install

and mvn install, nothing works.

Can somebody tell me how to build solr from source?  I am trying to  
build

the 1.3 source.

thank you very much,

-John





--
--Noble Paul




Re: SolrConfig.xml Replication

2008-12-11 Thread Jeff Newburn
Thank you for the quick response.  I will keep an eye on that to see how it
progresses.
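
For reference, the 1.4-style replication config being discussed looks roughly
like this (file names and values are illustrative); the open question in
SOLR-821 is how solrconfig.xml itself can be pushed without turning every
slave into a master:

  <!-- on the master -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="master">
      <str name="replicateAfter">commit</str>
      <!-- note: solrconfig.xml is deliberately not listed -->
      <str name="confFiles">schema.xml,stopwords.txt,synonyms.txt</str>
    </lst>
  </requestHandler>

  <!-- on a slave -->
  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">http://master-host:8983/solr/replication</str>
      <str name="pollInterval">00:00:60</str>
    </lst>
  </requestHandler>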


On 12/10/08 8:03 PM, Noble Paul നോബിള്‍ नोब्ळ् [EMAIL PROTECTED]
wrote:

 This is a known issue and I was planning to take it up soon.
 https://issues.apache.org/jira/browse/SOLR-821
 
 
 On Thu, Dec 11, 2008 at 5:30 AM, Jeff Newburn [EMAIL PROTECTED] wrote:
 I am curious as to whether there is a solution to be able to replicate
 solrconfig.xml with the 1.4 replication.  The obvious problem is that the
 master would replicate the solrconfig turning all slaves into masters with
 its config.  I have also tried on a whim to configure the master and slave
 on the master so that the slave points to the same server but that seems to
 break the replication completely.  Please let me know if anybody has any
 ideas
 
 -Jeff
 
 
 



Re: Upgrade from 1.2 to 1.3 gives 3x slowdown + script!

2008-12-11 Thread Fergus McMenemie
Yonik

Another thought I just had - do you have autocommit enabled?

No; not as far as I know!

The solrconfig.xml files from the two versions are equivalent as best I can
tell, and they are exactly as provided in the download. The only changes were
made by the attached script and should not affect committing. Finally, the
indexing command has commit=true, which I think means a single commit is done
at the end of the file?

Regards Fergus.


A lucene commit is now more expensive because it syncs the files for
safety.  If you commit frequently, this could definitely cause a
slowdown.

-Yonik

On Wed, Nov 26, 2008 at 10:54 AM, Fergus McMenemie [EMAIL PROTECTED] wrote:
 Hello Grant,

 Not much good with Java profilers (yet!) so I thought I
 would send a script!

 Details... details! Having decided to produce a script to
 replicate the 1.2 vis 1.3 speed problem. The required rigor
 revealed a lot more.

 1) The faster version I have previously referred to as 1.2,
   was actually a 1.3-dev I had downloaded as part of the
   solr bootcamp class at ApacheCon Europe 2008. The ID
   string in the CHANGES.txt document is:-
   $Id: CHANGES.txt 643465 2008-04-01 16:10:19Z gsingers $

 2) I did actually download and speed test a version of 1.2
   from the internet. It's CHANGES.txt id is:-
   $Id: CHANGES.txt 543263 2007-05-31 21:19:02Z yonik $
   Speed wise it was about the same as 1.3 at 64min. It also
   had lots of char set issues and is ignored from now on.

 3) The version I was planning to use, till I found this,
   speed issue was the latest official version:-
   $Id: CHANGES.txt 694377 2008-09-11 17:40:11Z klaas $
   I also verified the behavior with a nightly build.
   $Id: CHANGES.txt 712457 2008-11-09 01:24:11Z koji $

 Anyway, The following script indexes the content in 22min
 for the 1.3-dev version and takes 68min for the newer releases
 of 1.3. I took the conf directory from the 1.3dev (bootcamp)
 release and used it replace the conf directory from the
 official 1.3 release. The 3x slow down was still there; it is
 not a configuration issue!
 =






 #! /bin/bash

 # This script assumes a /usr/local/tomcat link to whatever version
 # of tomcat you have installed. I have apache-tomcat-5.5.20 Also
 # /usr/local/tomcat/conf/Catalina/localhost contains no solr.xml.
 # All the following was done as root.


 # I have a directory /usr/local/ts which contains four versions of solr. The
 # official 1.2 along with two 1.3 releases and a version of 1.2 or a 
 1.3beata
 # I got while attending a solr bootcamp. I indexed the same content using the
 # different versions of solr as follows:
 cd /usr/local/ts
 if [  ]
 then
   echo Starting from a-fresh
   sleep 5 # allow time for me to interrupt!
   cp -Rp apache-solr-bc/example/solr  ./solrbc  #bc = bootcamp
   cp -Rp apache-solr-nightly/example/solr ./solrnightly
   cp -Rp apache-solr-1.3.0/example/solr   ./solr13

   # the gaz is regularly updated and its name keeps changing :-) The page
   # http://earth-info.nga.mil/gns/html/namefiles.htm has a link to the latest
   # version.
   curl "http://earth-info.nga.mil/gns/html/geonames_dd_dms_date_20081118.zip" > geonames.zip
   unzip -q geonames.zip
   # delete corrupt blips!
   perl -i -n -e 'print unless
   ($. > 2128495 and $. < 2128505) or
   ($. > 5944254 and $. < 5944260)
   ;' geonames_dd_dms_date_20081118.txt
   # the following was used to detect bad short records
   #perl -a -F'\t' -n -e 'print "line $. is bad with ",scalar(@F)," args\n" if (@F != 26);' geonames_dd_dms_date_20081118.txt

   # my set of fields and copyfields for the schema.xml
   fields='
   <fields>
  <field name="UNI"          type="string" indexed="true"  stored="true" required="true" />
  <field name="CCODE"        type="string" indexed="true"  stored="true"/>
  <field name="DSG"          type="string" indexed="true"  stored="true"/>
  <field name="CC1"          type="string" indexed="true"  stored="true"/>
  <field name="LAT"          type="sfloat" indexed="true"  stored="true"/>
  <field name="LONG"         type="sfloat" indexed="true"  stored="true"/>
  <field name="MGRS"         type="string" indexed="false" stored="true"/>
  <field name="JOG"          type="string" indexed="false" stored="true"/>
  <field name="FULL_NAME"    type="string" indexed="true"  stored="true"/>
  <field name="FULL_NAME_ND" type="string" indexed="true"  stored="true"/>
  <!--field name="text"      type="text"   indexed="true"  stored="false" multiValued="true"/-->
  <!--field name="timestamp" type="date"   indexed="true"  stored="true"  default="NOW" multiValued="false"/-->
   '
   copyfields='
  </fields>
  <copyField source="FULL_NAME" dest="text"/>
  <copyField source="FULL_NAME_ND" dest="text"/>
   '

   # add in my fields and copyfields
   perl -i -p -e "print qq($fields) if s/<fields>//;"    solr*/conf/schema.xml
   perl -i -p -e "print qq($copyfields) if s[</fields>][];"  solr*/conf/schema.xml
   # change the unique key and mark the id field as not required
   perl 

Dismax Minimum Match/Stopwords Bug

2008-12-11 Thread Jeff Newburn
I have discovered some weirdness with our Minimum Match functionality.
Essentially it comes up with absolutely no results on certain queries.
Basically, searches with two words, one of them being "the", don't return any
results.  From what we can gather, the minimum-match criteria make it
such that if there are two words then both are required.  Unfortunately, the
stopwords are pulled, resulting in "the" being removed, and then Solr
requires two words when only one exists to match on.  Is there a way around
this?  I really need it to either require only non-stopwords or not filter
out stopwords.  We know stopwords are causing the issue because taking out
the stopwords fixes the problem.  Also, we can change the mm setting to 75%
and fix the problem.

Example:
Brand: The North Face
Search: the north (returns no results)

Our config is basically:
MM: <str name="mm">2&lt;-1</str>
FieldType:
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
   <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt"/>
   <filter class="solr.SynonymFilterFactory"
synonyms="synonyms.txt" ignoreCase="true" expand="true"/>

   <filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"/>
   <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
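
(The mm value above is the escaped form of 2<-1, i.e. with more than two
optional clauses one may be missing. The 75% workaround mentioned would look
roughly like this; the threshold is illustrative:)

  <str name="mm">2&lt;75%</str>  <!-- with more than 2 clauses, only 75% must match -->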





Exception running Solr in Weblogic

2008-12-11 Thread Alexander Ramos Jardim
Guys,

I keep getting this exception from time to time in my application when it
communicates with Solr. Does anyone know if Solr tries to write headers
after the response has been sent?

Dec 11, 2008 3:05:37 PM BRST Error HTTP localhost suba_c1lg1
[ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default
(self-tuning)' WLS Kernel   1229015137499 BEA-101020
[EMAIL PROTECTED] - appName:
'SolrRepo-cluster1', name: 'solr', context-path: '/solr'] Servlet failed
with Exception
java.lang.IllegalStateException: Response already committed
at
weblogic.servlet.internal.ServletResponseImpl.objectIfCommitted(ServletResponseImpl.java:1486)

at
weblogic.servlet.internal.ServletResponseImpl.sendError(ServletResponseImpl.java:603)

at
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:362)

at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:320)

at
weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)

-- 
Alexander Ramos Jardim


Re: Exception running Solr in Weblogic

2008-12-11 Thread Otis Gospodnetic
I think somebody just put a page about Solr and WebLogic up on the Solr
Wiki...


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Alexander Ramos Jardim alexander.ramos.jar...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Thursday, December 11, 2008 1:01:16 PM
 Subject: Exception running Solr in Weblogic
 
 Guys,
 
 I keep getting this exception from time to time on my application when it
 comunicates with Solr. Does anyone knows if Solr tries to write headers
 after the response has been sent?
 
 
 [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default
 (self-tuning)'1229015137499 
 [weblogic.servlet.internal.webappservletcont...@2de805b - appName:
 'SolrRepo-cluster1', name: 'solr', context-path: '/solr'] Servlet failed
 with Exception
 java.lang.IllegalStateException: Response already committed
 at
 weblogic.servlet.internal.ServletResponseImpl.objectIfCommitted(ServletResponseImpl.java:1486)
 
 at
 weblogic.servlet.internal.ServletResponseImpl.sendError(ServletResponseImpl.java:603)
 
 at
 org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:362)
 
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:320)
 
 at
 weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
 
 -- 
 Alexander Ramos Jardim



Re: Exception running Solr in Weblogic

2008-12-11 Thread Alexander Ramos Jardim
Can't find it on the wiki. Could you put the url here?

2008/12/11 Otis Gospodnetic otis_gospodne...@yahoo.com

 I think somebody just put up a page about Solr and WebLogic up on the Solr
 Wiki...


 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



 - Original Message 
  From: Alexander Ramos Jardim alexander.ramos.jar...@gmail.com
  To: solr-user@lucene.apache.org
  Sent: Thursday, December 11, 2008 1:01:16 PM
  Subject: Exception running Solr in Weblogic
 
  Guys,
 
  I keep getting this exception from time to time on my application when it
  comunicates with Solr. Does anyone knows if Solr tries to write headers
  after the response has been sent?
 
  
  [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default
  (self-tuning)'1229015137499
  [weblogic.servlet.internal.webappservletcont...@2de805b - appName:
  'SolrRepo-cluster1', name: 'solr', context-path: '/solr'] Servlet failed
  with Exception
  java.lang.IllegalStateException: Response already committed
  at
 
 weblogic.servlet.internal.ServletResponseImpl.objectIfCommitted(ServletResponseImpl.java:1486)
 
  at
 
 weblogic.servlet.internal.ServletResponseImpl.sendError(ServletResponseImpl.java:603)
 
  at
 
 org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:362)
 
  at
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:320)
 
  at
 
 weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
 
  --
  Alexander Ramos Jardim




-- 
Alexander Ramos Jardim


Re: Taxonomy Support on Solr

2008-12-11 Thread Otis Gospodnetic
This is what Hoss was hinting at yesterday (or was that on the Lucene list?).
You can do that if you encode the hierarchy in a field properly, e.g. /A /B /1
may be one doc's field and /A /B /2 may be another doc's field.  Then you
just have to figure out how to query that to get a sub-tree.
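
A minimal sketch of that approach, assuming a plain string field holding the
whole path (the field name and the prefix-query syntax are illustrative):

  <!-- schema.xml -->
  <field name="category_path" type="string" indexed="true" stored="true" multiValued="true"/>

  <!-- documents carry their full paths, e.g. /A/B/1, /A/B/2, /A/C/8 -->
  <!-- a prefix query then pulls the whole sub-tree under B: -->
  q=category_path:/A/B*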


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Jana, Kumar Raja kj...@ptc.com
 To: solr-user@lucene.apache.org
 Sent: Thursday, December 11, 2008 5:03:02 AM
 Subject: Taxonomy Support on Solr
 
 Hi,
 
 Any plans of supporting user-defined classifications on Solr? Is there
 any component which returns all the children of a node (till the leaf
 node) when I search for any node?
 
 May be this would help:
 
 Say I have a few SolrDocuments classified as:
 
 A
 B--C
 123  8--9
 
 (I.e A has 2 child nodes B and C. B has 3 child nodes 1,2,3 and C has 2
 child nodes 8,9)
 When my search criteria matches B, my results should contain B as well
 as 1,2 and 3 too.
 Search for A would return all the nodes mentioned above.
 
 -Kumar



Applying Field Collapsing Patch

2008-12-11 Thread John Martyniak

Hi,

I am trying to apply Ivan's field collapsing patch to Solr 1.3 (not a
nightly), and it continually fails.  I am using the following command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read/write permission for all
directories and files.


I get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/apache/ 
solr/search/TestDocSet.java.rej

patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/ 
solr/search/DocSet.java.rej

patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/ 
solr/search/SolrIndexSearcher.java.rej

patching file src/java/org/apache/solr/common/params/CollapseParams.java
patching file src/java/org/apache/solr/handler/component/ 
CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John


Re: Taxonomy Support on Solr

2008-12-11 Thread Alexander Ramos Jardim
I use this workaround all the time.

When I need to store the hierarchy a product belongs to, I simply arrange
all the nodes as: a ^ b ^ c ^ d

2008/12/11 Otis Gospodnetic otis_gospodne...@yahoo.com

 This is what Hoss was hinting at yesterday (or was that on the Lucene
 list?).  You can do that if you encode the hierarchy in a field properly.,
 e.g. /A /B /1  may be one doc's field. /A /B /2 may be another doc's
 field.  THen you just have to figure out how to query that to get a
 sub-tree.


 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



 - Original Message 
  From: Jana, Kumar Raja kj...@ptc.com
  To: solr-user@lucene.apache.org
  Sent: Thursday, December 11, 2008 5:03:02 AM
  Subject: Taxonomy Support on Solr
 
  Hi,
 
  Any plans of supporting user-defined classifications on Solr? Is there
  any component which returns all the children of a node (till the leaf
  node) when I search for any node?
 
  May be this would help:
 
  Say I have a few SolrDocuments classified as:
 
  A
  B--C
  123  8--9
 
  (I.e A has 2 child nodes B and C. B has 3 child nodes 1,2,3 and C has 2
  child nodes 8,9)
  When my search criteria matches B, my results should contain B as well
  as 1,2 and 3 too.
  Search for A would return all the nodes mentioned above.
 
  -Kumar




-- 
Alexander Ramos Jardim


exceeded limit of maxWarmingSearchers=4

2008-12-11 Thread chip correra


We’re using Solr as a backend indexer/search engine to support an AJAX 
based consumer application.  Basically, when users of our system create 
“Documents” in our product, we commit right away, because we want to 
immediately re-query and get counts back from Solr to update the user’s 
interface.  Occasionally, usually under load, we see the following error...

ERROR [IndexSubscription] error committing changes to solr 
java.lang.RuntimeException: org.apache.solr.common.SolrException: Error opening 
new searcher. exceeded limit of maxWarmingSearchers=4, try again later.

My understanding of Solr’s caches and warming searches is that we get nearly no 
value out of them because with many users, each only ever querying their own 
data, the caches would quickly be marked irrelevant by new unique user queries.

So, we’ve tried setting <useColdSearcher>true</useColdSearcher>
And we’ve set all the cache autowarmCounts="0", for example:
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>

Yet the same problem is periodically logged.

Is there a way to disable warming Searches?  

Should we even be trying this? 

Is there a better configuration, i.e. Do we just set Warming Searches to a 
higher number? 

I read in solr...@jira that removing this configuration altogether forces the
number of warming searchers to be infinite – that can’t be good?



Re: exceeded limit of maxWarmingSearchers=4

2008-12-11 Thread Mark Miller

chip correra wrote:

We’re using Solr as a backend indexer/search engine to support an AJAX 
based consumer application.  Basically, when users of our system create 
“Documents” in our product, we commit right away, because we want to 
immediately re-query and get counts back from Solr to update the user’s 
interface.  Occasionally, usually under load, we see the following error...

ERROR [IndexSubscription] error committing changes to solr 
java.lang.RuntimeException: org.apache.solr.common.SolrException: Error opening 
new searcher. exceeded limit of maxWarmingSearchers=4, try again later.

My understanding of Solr’s caches and warming searches is that we get nearly no 
value out of them because with many users, each only ever querying their own 
data, the caches would quickly be marked irrelevant by new unique user queries.

So, we’ve tried setting useColdSearchertrue/useColdSearcher
And we’ve set all the cache autowarmCounts=”0”, for example: documentCache class=solr.LRUCache 
size=512 initialSize=512 autowarmCount=0/

Yet the same problem is periodically logged.

Is there a way to disable warming Searches?  

Should we even be trying this? 

Is there a better configuration, i.e. Do we just set Warming Searches to a higher number? 

I read in solr...@jira that removing this configuration all together forces the number of warming Searches be infinite – that can’t be good? 



  
It likely means you're committing too often. Pretty often, if it happens
even with a cold searcher. What's your commit policy?
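
If the per-document explicit commits can be relaxed at all, the relevant knobs
live in solrconfig.xml; a sketch (values are illustrative, not recommendations):

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- batch commits instead of committing after every document -->
    <autoCommit>
      <maxDocs>100</maxDocs>
      <maxTime>1000</maxTime>  <!-- ms -->
    </autoCommit>
  </updateHandler>

  <!-- inside the <query> section: cap on concurrently warming searchers -->
  <maxWarmingSearchers>4</maxWarmingSearchers>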




Re: Applying Field Collapsing Patch

2008-12-11 Thread Stephen Weiss
Are you sure you have a clean copy of the source?  Every time I've
applied his patch I grab a fresh copy of the tarball and run the exact
same command, and it always works for me.


Now, whether the collapsing actually works is a different matter...
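
A rough sequence of what "fresh tarball + same command" looks like (the archive
and patch file names are the ones from this thread; adjust paths to your setup):

  tar xzf apache-solr-1.3.0.tgz
  cd apache-solr-1.3.0
  patch -p0 --dry-run -i collapsing-patch-to-1.3.0-ivan_2.patch
  patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch
  ant clean dist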

--
Steve

On Dec 11, 2008, at 1:29 PM, John Martyniak wrote:


Hi,

I am trying to apply Ivan's field collapsing patch to solr 1.3 (not  
a nightly), and it continously fails.  I am using the following  
command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read write for all files  
directories and files.


I am get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/apache/ 
solr/search/TestDocSet.java.rej

patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/ 
solr/search/DocSet.java.rej

patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/apache/ 
solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/ 
CollapseParams.java
patching file src/java/org/apache/solr/handler/component/ 
CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John




RE: exceeded limit of maxWarmingSearchers=4

2008-12-11 Thread chip correra

We commit immediately after each and every document submit.  I think we have to 
because we want to immediately retrieve a count on the number of documents of 
that type, including the one that we just submitted.  And my understanding is 
that if we don't commit immediately, the new document will not be part of the 
query results.

 Date: Thu, 11 Dec 2008 14:09:47 -0500
 From: markrmil...@gmail.com
 To: solr-user@lucene.apache.org
 Subject: Re: exceeded limit of maxWarmingSearchers=4
 
 chip correra wrote:
  We’re using Solr as a backend indexer/search engine to support an 
  AJAX based consumer application.  Basically, when users of our system 
  create “Documents” in our product, we commit right away, because we want to 
  immediately re-query and get counts back from Solr to update the user’s 
  interface.  Occasionally, usually under load, we see the following error...
 
  ERROR [IndexSubscription] error committing changes to solr 
  java.lang.RuntimeException: org.apache.solr.common.SolrException: Error 
  opening new searcher. exceeded limit of maxWarmingSearchers=4, try again 
  later.
 
  My understanding of Solr’s caches and warming searches is that we get 
  nearly no value out of them because with many users, each only ever 
  querying their own data, the caches would quickly be marked irrelevant by 
  new unique user queries.
 
  So, we’ve tried setting useColdSearchertrue/useColdSearcher
  And we’ve set all the cache autowarmCounts=”0”, for example: documentCache 
  class=solr.LRUCache size=512 initialSize=512 autowarmCount=0/
 
  Yet the same problem is periodically logged.
 
  Is there a way to disable warming Searches?  
 
  Should we even be trying this? 
 
  Is there a better configuration, i.e. Do we just set Warming Searches to a 
  higher number? 
 
  I read in solr...@jira that removing this configuration all together forces 
  the number of warming Searches be infinite – that can’t be good? 
 
 

 It likely means your committing too often. Pretty often if it happens 
 even with a cold searcher. Whats your commit policy?
 


Re: Exception running Solr in Weblogic

2008-12-11 Thread Shalin Shekhar Mangar
http://wiki.apache.org/solr/SolrWebSphere

On Fri, Dec 12, 2008 at 12:00 AM, Alexander Ramos Jardim 
alexander.ramos.jar...@gmail.com wrote:

 Can't find it on the wiki. Could you put the url here?

 2008/12/11 Otis Gospodnetic otis_gospodne...@yahoo.com

  I think somebody just put up a page about Solr and WebLogic up on the
 Solr
  Wiki...
 
 
  Otis
  --
  Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
  - Original Message 
   From: Alexander Ramos Jardim alexander.ramos.jar...@gmail.com
   To: solr-user@lucene.apache.org
   Sent: Thursday, December 11, 2008 1:01:16 PM
   Subject: Exception running Solr in Weblogic
  
   Guys,
  
   I keep getting this exception from time to time on my application when
 it
   comunicates with Solr. Does anyone knows if Solr tries to write headers
   after the response has been sent?
  
   
   [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default
   (self-tuning)'1229015137499
   [weblogic.servlet.internal.webappservletcont...@2de805b - appName:
   'SolrRepo-cluster1', name: 'solr', context-path: '/solr'] Servlet
 failed
   with Exception
   java.lang.IllegalStateException: Response already committed
   at
  
 
 weblogic.servlet.internal.ServletResponseImpl.objectIfCommitted(ServletResponseImpl.java:1486)
  
   at
  
 
 weblogic.servlet.internal.ServletResponseImpl.sendError(ServletResponseImpl.java:603)
  
   at
  
 
 org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:362)
  
   at
  
 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:320)
  
   at
  
 
 weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
  
   --
   Alexander Ramos Jardim
 
 


 --
 Alexander Ramos Jardim




-- 
Regards,
Shalin Shekhar Mangar.


Re: Problems with SOLR-236 (field collapsing)

2008-12-11 Thread Stephen Weiss
Thanks Doug, removing query definitely helped.  I just switched to  
Ivan's new patch (which definitely helped a lot - no SEVERE errors now  
- thanks Ivan!) but I'm still struggling with faceting myself.


Basically, I can tell that faceting is happening after the collapse -  
because the facet counts are definitely lower than they would be  
otherwise.  For example, with one search, I'd have 196 results with no  
collapsing, I get 120 results with collapsing - but the facet count is  
119???  In other searches the difference is more drastic - In another  
search, I get 61 results without collapsing, 61 with collapsing, but  
the facet count is 39.


Looking at it for a while now, I think I can guess what the problem  
might be...


The incorrect counts seem to only happen when the term in question  
does not occur evenly across all duplicates of a document.  That is,  
multiple document records may exist for the same image (it's an image  
search engine), but each document will have different terms in  
different fields depending on the audience it's targeting.  So, when  
you collapse, the counts are lower than they should be because when  
you actually execute a search with that facet's term included in the  
query, *all* the documents after collapsing will be ones that have  
that term.


Here's an illustration:

Collapse field is link_id, facet field is keyword:


Doc 1:
id: 123456,
link_id: 2,
keyword: Black, Printed, Dress

Doc 2:
id: 123457,
link_id: 2,
keyword: Black, Shoes, Patent

Doc 3:
id: 123458,
link_id: 2,
keyword: Red, Hat, Felt

Doc 4:
id: 123459,
link_id:1,
keyword: Felt, Hat, Black

So, when you collapse, only two of these documents are in the result  
set (123456, 123459), and only the keywords Black, Printed, Dress,  
Felt, and Hat are counted.  The facet count for Black is 2, the facet  
count for Felt is 1.  If you choose Black and add it to your query,  
you get 2 results (great).  However, if you add *Felt* to your query,  
you get 2 results (because a different document for link_id 2 is  
chosen in that query than is in the more general query from which the  
facets are produced).


I think what needs to happen here is that all the terms for all the  
documents that are collapsed together need to be included (just once)  
with the document that gets counted for faceting.  In this example,  
when the document for link_id 2 is counted, it would need to appear to  
the facet counter to have keywords Black, Printed, Dress, Shoes,  
Patent, Red, Hat, and Felt, as opposed to just Black, Printed, and  
Dress.


Unfortunately, not knowing Java at all really, I have absolutely no  
idea how this change would be implemented... I mean I can tweak here  
or there, but I think this is above my pay grade.   I've looked at the  
code affected by the patch and the code for faceting but I can't make  
heads or tails of it.


I think I'll go post this over on the JIRA...

Any ideas?

--
Steve



On Dec 10, 2008, at 8:52 AM, Doug Steigerwald wrote:

The first output is from the query component.  You might just need  
to make the collapse component first and remove the query component  
completely.


We perform geographic searching with localsolr first (if we need  
to), and then try to collapse those results (if collapse=true).  If  
we don't have any results yet, that's the only time we use the  
standard query component.  I'm making sure we set the  
builder.setNeedDocSet=false and then I modified the query component  
to only execute when builder.isNeedDocSet=true.


In the field collapsing patch that I'm using, I've got code to  
remove a previous 'response' from the builder.rsp so we don't have  
duplicates.


Now, if I could get field collapsing to work properly with a docSet/ 
docList from localsolr and also have faceting work, I'd be golden.


Doug

On Dec 9, 2008, at 9:37 PM, Stephen Weiss wrote:


Hi Tracy,

Well, I managed to get it working (I think) but the weird thing is,  
in the XML output it gives both recordsets (the filtered and  
unfiltered - filtered second).  In the JSON (the one I actually use  
anyway, at least) I only get the filtered results (as expected).


In my core's solrconfig.xml, I added:

<searchComponent name="collapse" class="org.apache.solr.handler.component.CollapseComponent" />


(I'm not sure if it's supposed to go anywhere in particular but for  
me it's right before StandardRequestHandler)


and then within StandardRequestHandler:

<requestHandler name="standard" class="solr.StandardRequestHandler">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <!--
    <int name="rows">10</int>
    <str name="fl">*</str>
    <str name="version">2.1</str>
    -->
  </lst>
  <arr name="components">
    <str>query</str>
    <str>facet</str>
    <str>mlt</str>
    <str>highlight</str>
    <str>debug</str>
    <str>collapse</str>
  </arr>
</requestHandler>


Which is basically all the default values plus collapse.  Not sure  
if this was needed for prior 

Re: exceeded limit of maxWarmingSearchers=4

2008-12-11 Thread Walter Underwood
It sounds like you need real-time search, where documents are
available in the next query. Solr doesn't do that.

That is a pretty rare feature and must be designed in at the start.

The usual workaround is to have a main index plus a small delta
index and search both. Deletes have to be handled separately.

The Sphinx docs have a description of how to do that with their
engine:

  http://www.sphinxsearch.com/docs/current.html#live-updates

wunder

On 12/11/08 10:44 AM, chip correra chipcorr...@hotmail.com wrote:

 
 
 We're using Solr as a backend indexer/search engine to support an AJAX
 based consumer application.  Basically, when users of our system create
 "Documents" in our product, we commit right away, because we want to
 immediately re-query and get counts back from Solr to update the user's
 interface.  Occasionally, usually under load, we see the following error...
 
 ERROR [IndexSubscription] error committing changes to solr
 java.lang.RuntimeException: org.apache.solr.common.SolrException: Error
 opening new searcher. exceeded limit of maxWarmingSearchers=4, try again
 later.
 
 My understanding of Solr's caches and warming searches is that we get nearly
 no value out of them because with many users, each only ever querying their
 own data, the caches would quickly be marked irrelevant by new unique user
 queries.
 
 So, we've tried setting <useColdSearcher>true</useColdSearcher>
 And we've set all the cache autowarmCounts="0", for example: <documentCache
 class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
 
 Yet the same problem is periodically logged.
 
 Is there a way to disable warming Searches?
 
 Should we even be trying this?
 
 Is there a better configuration, i.e. Do we just set Warming Searches to a
 higher number? 
 
 I read in solr...@jira that removing this configuration altogether forces
 the number of warming Searches to be infinite - that can't be good?
 
 
 _
 Send e-mail anywhere. No map, no compass.
 http://windowslive.com/Explore/hotmail?ocid=TXT_TAGLM_WL_hotmail_acq_anywhere_
 122008



Re: Applying Field Collapsing Patch

2008-12-11 Thread John Martyniak

thanks for the advice.

I just downloaded a completely clean version, haven't even tried to  
build it yet.


Applied the same, and I received exactly the same results.

Do you only apply the ivan patch 2?  What version of patch are you  
running?


-John

On Dec 11, 2008, at 2:10 PM, Stephen Weiss wrote:

Are you sure you have a clean copy of the source?  Every time I've  
applied his patch I grab a fresh copy of the tarball and run the  
exact same command, it always works for me.


Now, whether the collapsing actually works is a different matter...

--
Steve

On Dec 11, 2008, at 1:29 PM, John Martyniak wrote:


Hi,

I am trying to apply Ivan's field collapsing patch to solr 1.3 (not
a nightly), and it continuously fails.  I am using the following
command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read/write permissions for
all directories and files.


I am getting the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/ 
apache/solr/search/TestDocSet.java.rej

patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/ 
apache/solr/search/DocSet.java.rej

patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/ 
apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/ 
CollapseParams.java
patching file src/java/org/apache/solr/handler/component/ 
CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John






Re: exceeded limit of maxWarmingSearchers=4

2008-12-11 Thread Mark Miller
Also, if you are using solr 1.3: solr 1.4 will reopen readers rather
than open them again. This means only changed segments have to be
reloaded. If you turn off all the caches and use a somewhat higher merge
factor, maybe a low max merge docs, you can probably get things a lot
quicker. There will still be instances where it can't keep up, I'm sure,
though. In that case, just ignore the warning. You might even set the limit
lower. Some stale views will go out, but it should be faster than
what you are getting now (and you're getting stale views now anyway; as
Walter says, Solr doesn't do realtime indexing).
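
A rough sketch of the relevant solrconfig.xml settings (the values here are
only illustrative guesses, not something tested against this index):

<!-- in the <mainIndex> / <indexDefaults> section -->
<mergeFactor>25</mergeFactor>
<maxMergeDocs>10000</maxMergeDocs>

<!-- in the <query> section: no autowarming, fewer concurrent warmups -->
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
<maxWarmingSearchers>2</maxWarmingSearchers>
<useColdSearcher>true</useColdSearcher>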


Walter Underwood wrote:

It sounds like you need real-time search, where documents are
available in the next query. Solr doesn't do that.

That is a pretty rare feature and must be designed in at the start.

The usual workaround is to have a main index plus a small delta
index and search both. Deletes have to be handled separately.

The Sphinx docs have a description of how to do that with their
engine:

  http://www.sphinxsearch.com/docs/current.html#live-updates

wunder

On 12/11/08 10:44 AM, chip correra chipcorr...@hotmail.com wrote:

  

We're using Solr as a backend indexer/search engine to support an AJAX
based consumer application.  Basically, when users of our system create
"Documents" in our product, we commit right away, because we want to
immediately re-query and get counts back from Solr to update the user's
interface.  Occasionally, usually under load, we see the following error...

ERROR [IndexSubscription] error committing changes to solr
java.lang.RuntimeException: org.apache.solr.common.SolrException: Error
opening new searcher. exceeded limit of maxWarmingSearchers=4, try again
later.

My understanding of Solr's caches and warming searches is that we get nearly
no value out of them because with many users, each only ever querying their
own data, the caches would quickly be marked irrelevant by new unique user
queries.

So, we've tried setting <useColdSearcher>true</useColdSearcher>
And we've set all the cache autowarmCounts="0", for example: <documentCache
class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>

Yet the same problem is periodically logged.

Is there a way to disable warming Searches?

Should we even be trying this?

Is there a better configuration, i.e. Do we just set Warming Searches to a
higher number? 


I read in solr...@jira that removing this configuration altogether forces
the number of warming Searches to be infinite - that can't be good?


_
Send e-mail anywhere. No map, no compass.
http://windowslive.com/Explore/hotmail?ocid=TXT_TAGLM_WL_hotmail_acq_anywhere_
122008



  




move /solr directory from /tomcat/bin/

2008-12-11 Thread Marc Sturlese

Hey there,
I would like to change the default directory where solr looks for the config
files and index.
Let's say I would like to put:
/opt/tomcat/bin/solr/data/index in /var/searchengine_data/index
and
/opt/tomcat/bin/solr/conf in /usr/home/searchengine_files/conf

Is there any way to do it via configuration, or should I modify the
SolrResourceLoader?

Thanks in advance
-- 
View this message in context: 
http://www.nabble.com/move--solr-directory-from--tomcat-bin--tp20963811p20963811.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: move /solr directory from /tomcat/bin/

2008-12-11 Thread Feak, Todd
You can set the home directory in your Tomcat context snippet/file.

http://wiki.apache.org/solr/SolrTomcat#head-7036378fa48b79c0797cc8230a8a
a0965412fb2e

This controls where Solr looks for solrconfig.xml and schema.xml. The
solrconfig.xml in turn specifies where to find the data directory.
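
A sketch using the paths from the original question (the docBase value is just
a placeholder for wherever your solr.war lives; see the wiki page above for the
full recipe). The context fragment sets solr/home, and dataDir in solrconfig.xml
relocates the index:

<!-- e.g. $CATALINA_HOME/conf/Catalina/localhost/solr.xml -->
<Context docBase="/path/to/solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="/usr/home/searchengine_files" override="true"/>
</Context>

<!-- in /usr/home/searchengine_files/conf/solrconfig.xml -->
<dataDir>/var/searchengine_data</dataDir>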

-Original Message-
From: Marc Sturlese [mailto:marc.sturl...@gmail.com] 
Sent: Thursday, December 11, 2008 12:20 PM
To: solr-user@lucene.apache.org
Subject: move /solr directory from /tomcat/bin/


Hey there,
I would like to change the default directory where solr looks for the
config
files and index.
Let's say I would like to put:
/opt/tomcat/bin/solr/data/index in /var/searchengine_data/index
and
/opt/tomcat/bin/solr/conf in /usr/home/searchengine_files/conf

Is there any way to do it via configuration or I should modify the
SolrResourceLoader?

Thanks in advance
-- 
View this message in context:
http://www.nabble.com/move--solr-directory-from--tomcat-bin--tp20963811p
20963811.html
Sent from the Solr - User mailing list archive at Nabble.com.




Re: Applying Field Collapsing Patch

2008-12-11 Thread John Martyniak
It was a completely clean install.  I downloaded it from one of the
mirrors right before applying the patch to it.


Very troubling.  Any other suggestions or ideas?

I am running it on Mac OS. Maybe I will try looking for some answers
around that.


-John

On Dec 11, 2008, at 3:05 PM, Stephen Weiss swe...@stylesight.com  
wrote:


Yes, only ivan patch 2 (and before, only ivan patch 1), my sense was  
these patches were meant to be used in isolation (there were no  
notes saying to apply any other patches first).


Are you using patches for any other purpose (non-SOLR-236)?  Maybe  
you need to apply this one first, then those patches.  For me using  
any patch makes me nervous (we have a pretty strict policy about  
using beta code anywhere), I'm only doing it this once because it's  
absolutely necessary to provide the functionality desired.


--
Steve

On Dec 11, 2008, at 2:53 PM, John Martyniak wrote:


thanks for the advice.

I just downloaded a completely clean version, haven't even tried to  
build it yet.


Applied the same, and I received exactly the same results.

Do you only apply the ivan patch 2?  What version of patch are you  
running?


-John

On Dec 11, 2008, at 2:10 PM, Stephen Weiss wrote:

Are you sure you have a clean copy of the source?  Every time I've  
applied his patch I grab a fresh copy of the tarball and run the  
exact same command, it always works for me.


Now, whether the collapsing actually works is a different matter...

--
Steve

On Dec 11, 2008, at 1:29 PM, John Martyniak wrote:


Hi,

I am trying to apply Ivan's field collapsing patch to solr 1.3  
(not a nightly), and it continously fails.  I am using the  
following command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read write for all  
files directories and files.


I am get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/ 
apache/solr/search/TestDocSet.java.rej

patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/ 
apache/solr/search/DocSet.java.rej

patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/ 
SolrIndexSearcher.java

Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/ 
apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/ 
CollapseParams.java
patching file src/java/org/apache/solr/handler/component/ 
CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John








Request for paid help...

2008-12-11 Thread cstadler18
We are currently using Apache, SOLR and Java to populate the lucene/solr index.

We need someone to upgrade our Solr to the newest version, review our Solr
schema and solrconfig to performance-tune them, and make suggestions. The
schema and solrconfig also most likely still contain default example settings
we don't need.

This is actually on a very powerful Windows2k3 dedicated server.

Current version : 
Apache 2.2.8
Mysql 5x
Solr Specification Version: 1.2.0
Solr Implementation Version: 1.2.0 - Yonik - 2007-06-02 17:35:12
Lucene Specification Version: 2007-05-20_00-04-53
Lucene Implementation Version: build 2007-05-20

I am willing to pay up to $250.
-Craig
cstadle...@hotmail.com


Re: Exception running Solr in Weblogic

2008-12-11 Thread Otis Gospodnetic
http://wiki.apache.org/solr/SolrWebSphere

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Alexander Ramos Jardim alexander.ramos.jar...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Thursday, December 11, 2008 1:30:03 PM
 Subject: Re: Exception running Solr in Weblogic
 
 Can't find it on the wiki. Could you put the url here?
 
 2008/12/11 Otis Gospodnetic 
 
  I think somebody just put up a page about Solr and WebLogic up on the Solr
  Wiki...
 
 
  Otis
  --
  Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
  - Original Message 
   From: Alexander Ramos Jardim 
   To: solr-user@lucene.apache.org
   Sent: Thursday, December 11, 2008 1:01:16 PM
   Subject: Exception running Solr in Weblogic
  
   Guys,
  
   I keep getting this exception from time to time on my application when it
   comunicates with Solr. Does anyone knows if Solr tries to write headers
   after the response has been sent?
  
   
   [ACTIVE] ExecuteThread: '12' for queue: 'weblogic.kernel.Default
   (self-tuning)'1229015137499
   [weblogic.servlet.internal.webappservletcont...@2de805b - appName:
   'SolrRepo-cluster1', name: 'solr', context-path: '/solr'] Servlet failed
   with Exception
   java.lang.IllegalStateException: Response already committed
   at
  
  
 weblogic.servlet.internal.ServletResponseImpl.objectIfCommitted(ServletResponseImpl.java:1486)
  
   at
  
  
 weblogic.servlet.internal.ServletResponseImpl.sendError(ServletResponseImpl.java:603)
  
   at
  
  
 org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:362)
  
   at
  
  
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:320)
  
   at
  
  weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:42)
  
   --
   Alexander Ramos Jardim
 
 
 
 
 -- 
 Alexander Ramos Jardim



Re: ExtractingRequestHandler and XmlUpdateHandler

2008-12-11 Thread Grant Ingersoll


On Dec 10, 2008, at 10:21 PM, Jacob Singh wrote:


Hey folks,

I'm looking at implementing ExtractingRequestHandler in the  
Apache_Solr_PHP

library, and I'm wondering what we can do about adding meta-data.

I saw the docs, which suggest using different post headers to pass field
values along with ext.literal.  Is there any way to use the XmlUpdateHandler
instead, along with a document?  I'm not sure how this would work; perhaps it
would require 2 trips, perhaps the XML would be in the post content and
the file in something else?  The thing is, we would need to refactor the
class pretty heavily in this case when indexing RichDocs, and we were hoping
to avoid it.



I'm not sure I follow how the XmlUpdateHandler plays in, can you  
explain a little more?  My PHP is weak, but maybe some code will help...
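
For what it's worth, at the HTTP level the literal fields are just extra
request parameters sent alongside the uploaded file. A rough curl sketch (the
/update/extract path, the field names and the file are assumptions borrowed
from the wiki example, not from your setup; other ext.* parameters may be
needed depending on your mapping):

curl "http://localhost:8983/solr/update/extract?ext.literal.id=doc123&ext.literal.category=reports" \
     -F "myfile=@sample.doc"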




Thanks,
Jacob
--

+1 510 277-0891 (o)
+91  33 7458 (m)

web: http://pajamadesign.com

Skype: pajamadesign
Yahoo: jacobsingh
AIM: jacobsingh
gTalk: jacobsi...@gmail.com





Format of highlighted fields in query response

2008-12-11 Thread Mark Ferguson
Hello,

I am making a query to my Solr server in which I would like to have a number
of fields returned, with highlighting if available. I've noticed that in the
query response, I get back both the original field name and then in a
different section, the highlighted snippet. I am wondering if there is a
parameter which will allow me to collapse this data, returning only the
highlighted snippet in the doc itself, when available. For example, I am
currently receiving the following data:

<doc>
  <float name="score">0.2963915</float>
  <str name="page_title">Chinese Visa Information</str>
  <str name="url">http://www.danwei.org/china_information/chinese_visa_information.php</str>
  <str name="urlmd5">01598a6e06190bd8b05c8b03f51233a1</str>
</doc>
... and farther down ...
<lst name="highlighting">
  <lst name="01598a6e06190bd8b05c8b03f51233a1">
    <arr name="page_title">
      <str>TITLE: <span class='highlight'>Chinese</span> Visa Information</str>
    </arr>
  </lst>
</lst>

I would like it to just look like this:

<doc>
  <float name="score">0.2963915</float>
  <str name="page_title"><span class='highlight'>Chinese</span> Visa Information</str>
  <str name="url">http://www.danwei.org/china_information/chinese_visa_information.php</str>
  <str name="urlmd5">01598a6e06190bd8b05c8b03f51233a1</str>
</doc>

The reason I would prefer this second response format is because I don't
need the first field, and it greatly simplifies my call to
QueryResponse.getBeans() in SolrJ, as it will fill in everything I need in
one call.
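
In the meantime the two sections can be stitched together on the client with
SolrJ; a rough sketch, assuming server is a CommonsHttpSolrServer and Page is a
placeholder bean whose urlmd5 property is the unique key (none of those names
come from the original mail):

SolrQuery q = new SolrQuery("chinese visa");
q.setHighlight(true).addHighlightField("page_title");
QueryResponse rsp = server.query(q);

List<Page> pages = rsp.getBeans(Page.class);
Map<String, Map<String, List<String>>> hl = rsp.getHighlighting();
for (Page p : pages) {
    // highlighting results are keyed by the uniqueKey field
    Map<String, List<String>> docHl = hl.get(p.getUrlmd5());
    if (docHl != null && docHl.get("page_title") != null) {
        // overwrite the stored title with the highlighted snippet
        p.setPageTitle(docHl.get("page_title").get(0));
    }
}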

Thanks very much,

Mark Ferguson


Re: Applying Field Collapsing Patch

2008-12-11 Thread Stephen Weiss
I am also a Mac user.  10.5.5.   I generally compile on OS X then  
upload to Debian (Debian's java just isn't as friendly to me).


Perhaps try this old Mac adage - have you repaired permissions? :-)

If you want I can send you a tar ball of the patched code off-list so  
you can move on.


--
Steve

On Dec 11, 2008, at 3:50 PM, John Martyniak wrote:

It was a completely clean install.  I downloaded it from one of  
mirrors right before applying the patch to it.


Very troubling.  Any other suggestions or ideas?

I am running it on Mac OS Maybe I will try looking for some answers  
around that.


-John

On Dec 11, 2008, at 3:05 PM, Stephen Weiss swe...@stylesight.com  
wrote:


Yes, only ivan patch 2 (and before, only ivan patch 1), my sense  
was these patches were meant to be used in isolation (there were no  
notes saying to apply any other patches first).


Are you using patches for any other purpose (non-SOLR-236)?  Maybe  
you need to apply this one first, then those patches.  For me using  
any patch makes me nervous (we have a pretty strict policy about  
using beta code anywhere), I'm only doing it this once because it's  
absolutely necessary to provide the functionality desired.


--
Steve

On Dec 11, 2008, at 2:53 PM, John Martyniak wrote:


thanks for the advice.

I just downloaded a completely clean version, haven't even tried  
to build it yet.


Applied the same, and I received exactly the same results.

Do you only apply the ivan patch 2?  What version of patch are you  
running?


-John

On Dec 11, 2008, at 2:10 PM, Stephen Weiss wrote:

Are you sure you have a clean copy of the source?  Every time  
I've applied his patch I grab a fresh copy of the tarball and run  
the exact same command, it always works for me.


Now, whether the collapsing actually works is a different matter...

--
Steve

On Dec 11, 2008, at 1:29 PM, John Martyniak wrote:


Hi,

I am trying to apply Ivan's field collapsing patch to solr 1.3  
(not a nightly), and it continously fails.  I am using the  
following command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read write for all  
files directories and files.


I am get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/ 
apache/solr/search/TestDocSet.java.rej

patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/ 
apache/solr/search/DocSet.java.rej

patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/ 
SolrIndexSearcher.java

Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/ 
apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/ 
CollapseParams.java
patching file src/java/org/apache/solr/handler/component/ 
CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John










Re: Query Performance while updating the index

2008-12-11 Thread Otis Gospodnetic
Oleg,

The reliable formula is situation-specific, I think.  One sure way to decrease 
the warm time is to minimize the number of items to copy from old caches to new 
caches on warmup.
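
As a concrete (purely illustrative) example, autowarmCount is just the number
of entries copied forward into the new searcher's caches, so keeping it a
small fraction of the cache size bounds the warm time:

<filterCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="256"/>
<queryResultCache class="solr.LRUCache" size="16384" initialSize="4096" autowarmCount="64"/>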

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: oleg_gnatovskiy oleg_gnatovs...@citysearch.com
 To: solr-user@lucene.apache.org
 Sent: Thursday, December 11, 2008 8:43:26 PM
 Subject: RE: Query Performance while updating the index
 
 
 We are still having this problem. I am wondering if it can be fixed with
 autowarm settings. Is there a reliable formula for determining the autowarm
 settings?
 -- 
 View this message in context: 
 http://www.nabble.com/Query-Performance-while-updating-the-index-tp20452835p20968516.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Applying Field Collapsing Patch

2008-12-11 Thread Doug Steigerwald
Have you tried just checking out (or exporting) the source from SVN  
and applying the patch?  Works fine for me that way.


$ svn co http://svn.apache.org/repos/asf/lucene/solr/tags/ 
release-1.3.0 solr-1.3.0
$ cd solr-1.3.0 ; patch -p0  ~/Downloads/collapsing-patch-to-1.3.0- 
ivan_2.patch


Doug

On Dec 11, 2008, at 3:50 PM, John Martyniak wrote:

It was a completely clean install.  I downloaded it from one of  
mirrors right before applying the patch to it.


Very troubling.  Any other suggestions or ideas?

I am running it on Mac OS Maybe I will try looking for some answers  
around that.


-John

On Dec 11, 2008, at 3:05 PM, Stephen Weiss swe...@stylesight.com  
wrote:


Yes, only ivan patch 2 (and before, only ivan patch 1), my sense  
was these patches were meant to be used in isolation (there were no  
notes saying to apply any other patches first).


Are you using patches for any other purpose (non-SOLR-236)?  Maybe  
you need to apply this one first, then those patches.  For me using  
any patch makes me nervous (we have a pretty strict policy about  
using beta code anywhere), I'm only doing it this once because it's  
absolutely necessary to provide the functionality desired.


--
Steve

On Dec 11, 2008, at 2:53 PM, John Martyniak wrote:


thanks for the advice.

I just downloaded a completely clean version, haven't even tried  
to build it yet.


Applied the same, and I received exactly the same results.

Do you only apply the ivan patch 2?  What version of patch are you  
running?


-John

On Dec 11, 2008, at 2:10 PM, Stephen Weiss wrote:

Are you sure you have a clean copy of the source?  Every time  
I've applied his patch I grab a fresh copy of the tarball and run  
the exact same command, it always works for me.


Now, whether the collapsing actually works is a different matter...

--
Steve

On Dec 11, 2008, at 1:29 PM, John Martyniak wrote:


Hi,

I am trying to apply Ivan's field collapsing patch to solr 1.3  
(not a nightly), and it continously fails.  I am using the  
following command:

patch -p0 -i collapsing-patch-to-1.3.0-ivan_2.patch --dry-run

I am in the apache-solr directory, and have read write for all  
files directories and files.


I am get the following results:

patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 FAILED at 88.
1 out of 1 hunk FAILED -- saving rejects to file src/test/org/ 
apache/solr/search/TestDocSet.java.rej

patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/DocSet.java
Hunk #1 FAILED at 195.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/ 
apache/solr/search/DocSet.java.rej

patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/ 
SolrIndexSearcher.java

Hunk #1 FAILED at 1357.
1 out of 1 hunk FAILED -- saving rejects to file src/java/org/ 
apache/solr/search/SolrIndexSearcher.java.rej
patching file src/java/org/apache/solr/common/params/ 
CollapseParams.java
patching file src/java/org/apache/solr/handler/component/ 
CollapseComponent.java



Also the '.rej' files are not created.

Does anybody have any ideas?

thanks in advance for the help.

-John










RE: exceeded limit of maxWarmingSearchers=4

2008-12-11 Thread Lance Norskog
How big is your index? There is a variant of the Lucene disk accessors in
the Lucene contrib area. It stores all of the index data directly in POJOs
(Java objects) and does not marshal them into a disk-saveable format. The
indexes are understandably larger, but all data added is automatically
committed; there is no actual commit operation.

-Original Message-
From: Walter Underwood [mailto:wunderw...@netflix.com] 
Sent: Thursday, December 11, 2008 11:45 AM
To: solr-user@lucene.apache.org
Subject: Re: exceeded limit of maxWarmingSearchers=4

It sounds like you need real-time search, where documents are available in
the next query. Solr doesn't do that.

That is a pretty rare feature and must be designed in at the start.

The usual workaround is to have a main index plus a small delta index and
search both. Deletes have to be handled separately.

The Sphinx docs have a description of how to do that with their
engine:

  http://www.sphinxsearch.com/docs/current.html#live-updates

wunder

On 12/11/08 10:44 AM, chip correra chipcorr...@hotmail.com wrote:

 
 
 We're using Solr as a backend indexer/search engine to support
 an AJAX based consumer application.  Basically, when users of our
 system create "Documents" in our product, we commit right away,
 because we want to immediately re-query and get counts back from Solr
 to update the user's interface.  Occasionally, usually under load, we see
the following error...
 
 ERROR [IndexSubscription] error committing changes to solr
 java.lang.RuntimeException: org.apache.solr.common.SolrException: 
 Error opening new searcher. exceeded limit of maxWarmingSearchers=4, 
 try again later.
 
 My understanding of Solr's caches and warming searches is that we get
 nearly no value out of them because with many users, each only ever 
 querying their own data, the caches would quickly be marked irrelevant 
 by new unique user queries.
 
 So, we've tried setting <useColdSearcher>true</useColdSearcher>
 And we've set all the cache autowarmCounts="0", for example:
 <documentCache class="solr.LRUCache" size="512" initialSize="512"
 autowarmCount="0"/>
 
 Yet the same problem is periodically logged.
 
 Is there a way to disable warming Searches?
 
 Should we even be trying this?
 
 Is there a better configuration, i.e. Do we just set Warming Searches 
 to a higher number?
 
 I read in solr...@jira that removing this configuration altogether
 forces the number of warming Searches to be infinite - that can't be good?
 
 
 _
 Send e-mail anywhere. No map, no compass.
 http://windowslive.com/Explore/hotmail?ocid=TXT_TAGLM_WL_hotmail_acq_a
 nywhere_
 122008




importing csv into Solr

2008-12-11 Thread phil cryer
I can't import CSV files into Solr - I've gone through the wiki and
all the examples online, but I've hit the same error - what am I
doing wrong?

curl 'http://localhost:8080/solr/update/csv?commit=true' --data-binary
@green.csv -H 'Content-type:text/plain; charset=utf-8' -s -u
solrAdmin:solrAdmin

HTTP Status 400 - undefined field color
type: Status report
message: undefined field color
description: The request sent by the client was syntactically incorrect
(undefined field color).
Apache Tomcat/5.5.26

I've also tried to stream the csv in:

curl http://localhost:8080/solr/update/csv?stream.file=green.csv -s -u
solrAdmin:solrAdmin

HTTP Status 400 - undefined field color
type: Status report
message: undefined field color
description: The request sent by the client was syntactically incorrect
(undefined field color).
Apache Tomcat/5.5.26

Thanks

P


Re: importing csv into Solr

2008-12-11 Thread Yonik Seeley
The error message is saying undefined field color
Is that field defined in your schema?  If not, you need to define it,
or map the color field to another field during import.
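
The two options look roughly like this (assuming "color" really is a column you
want to keep; the column list below is invented, since I don't know green.csv's
layout). Either declare the field in schema.xml:

<field name="color" type="string" indexed="true" stored="true"/>

or keep the header line but override it with fieldnames so the column lands in
a field your schema already has (the header/fieldnames parameters are described
on the UpdateCSV wiki page):

curl 'http://localhost:8080/solr/update/csv?commit=true&header=true&fieldnames=id,name,color_s' \
  --data-binary @green.csv -H 'Content-type:text/plain; charset=utf-8' -s -u solrAdmin:solrAdmin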

-Yonik


On Thu, Dec 11, 2008 at 11:37 PM, phil cryer p...@cryer.us wrote:
 I can't import csv files into Solr - I've gone through the wiki and
 all the examples online, but I've hit the same error - what am I'm
 doing wrong?

 curl 'http://localhost:8080/solr/update/csv?commit=true' --data-binary
 @green.csv -H 'Content-type:text/plain; charset=utf-8' -s -u
 solrAdmin:solrAdmin

 HTTP Status 400 - undefined field color
 type: Status report
 message: undefined field color
 description: The request sent by the client was syntactically incorrect
 (undefined field color).
 Apache Tomcat/5.5.26

 I've also tried to stream the csv in:

 curl http://localhost:8080/solr/update/csv?stream.file=green.csv -s -u
 solrAdmin:solrAdmin

 HTTP Status 400 - undefined field color
 type: Status report
 message: undefined field color
 description: The request sent by the client was syntactically incorrect
 (undefined field color).
 Apache Tomcat/5.5.26

 Thanks

 P



Re: SolrConfig.xml Replication

2008-12-11 Thread Noble Paul നോബിള്‍ नोब्ळ्
The moment the current patch is tested it will be checked in.
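
For reference, the direction SOLR-821 is heading (not yet committed, so treat
this strictly as a sketch of the proposed syntax) is to let confFiles ship a
differently-named file from the master that is written as solrconfig.xml on
the slave:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">solrconfig_slave.xml:solrconfig.xml,schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>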



On Thu, Dec 11, 2008 at 8:33 PM, Jeff Newburn jnewb...@zappos.com wrote:
 Thank you for the quick response.  I will keep an eye on that to see how it
 progresses.


 On 12/10/08 8:03 PM, Noble Paul നോബിള്‍ नोब्ळ् noble.p...@gmail.com
 wrote:

 This is a known issue and I was planning to take it up soon.
 https://issues.apache.org/jira/browse/SOLR-821


 On Thu, Dec 11, 2008 at 5:30 AM, Jeff Newburn jnewb...@zappos.com wrote:
 I am curious as to whether there is a solution to be able to replicate
 solrconfig.xml with the 1.4 replication.  The obvious problem is that the
 master would replicate the solrconfig turning all slaves into masters with
 its config.  I have also tried on a whim to configure the master and slave
 on the master so that the slave points to the same server but that seems to
 break the replication completely.  Please let me know if anybody has any
 ideas

 -Jeff








-- 
--Noble Paul


Re: Error, when i update the rich text documents such as .doc, .ppt files.

2008-12-11 Thread Chris Harris
I don't have time to verify this now, but the RichDocumentHandler does
not have a separate contrib directory and I don't think the
RichDocumentHandler patch makes a jar particular to the handler;
instead, the java files get dumped in the main solr tree
(java/org/apache/solr), and therefore they get compiled into the main
solr war when you do ant example.

I too would suggest looking at ExtractingRequestHandler rather than
RichDocumentHandler, though. I could be wrong, but I don't think
anyone necessarily has plans to maintain RichDocumentHandler going
forward. Plus ExtractingRequestHandler has more features.

On Wed, Dec 10, 2008 at 9:04 AM, Grant Ingersoll gsing...@apache.org wrote:
 Hi Raghav,

 Recently, integration with Tika was completed for SOLR-284 and it is now
 committed on the trunk (but does not use the old RichDocumentHandler
 approach).  See http://wiki.apache.org/solr/ExtractingRequestHandler for how
 to use and configure.

 Otherwise, it looks to me like the jar file for the RichDocHandler is not in
 your WAR or in the Solr Home lib directory.

 HTH,
 Grant

 On Dec 10, 2008, at 7:09 AM, RaghavPrabhu wrote:


 Hi all,

 I want to index the rich text documents like .doc, .xls, .ppt files. I had
 done the patch for updating the rich documents by followed the
 instructions
 in this below url. http://wiki.apache.org/solr/UpdateRichDocuments

 When i indexing the doc file, im getting this following error in the
 browser.