Re: Unable to move index file error during replication

2009-03-27 Thread sunnyfr

Sorry, but which one should I take?
And where exactly?


Noble Paul നോബിള്‍  नोब्ळ् wrote:
 
 this fix is there in the trunk;
 you may not need to apply the patch
 
 On Fri, Mar 27, 2009 at 6:02 AM, sunnyfr johanna...@gmail.com wrote:

 Hi,

 It doesn't seem to work for me. I also changed the part below; is it
 OK?
 -    List<String> copiedfiles = new ArrayList<String>();
 +    Set<String> filesToCopy = new HashSet<String>();

 http://www.nabble.com/file/p22734005/ReplicationHandler.java
 ReplicationHandler.java

 Thanks a lot,





 Noble Paul നോബിള്‍  नोब्ळ् wrote:

 Thanks, James.

 If this is true, the place to fix it is in
 ReplicationHandler#getFileList(). A patch is attached.


 On Wed, Dec 24, 2008 at 4:04 PM, James Grant james.gr...@semantico.com
 wrote:
 I had the same problem. It turned out that the list of files from the
 master included duplicates. When the slave completes the download and
 tries to move the files into the index, it comes across a file that does
 not exist because it has already been moved, so it backs out the whole
 operation.

 My solution for now was to patch the copyIndexFiles method of
 org.apache.solr.handler.SnapPuller so that it normalises the list before
 moving the files. This isn't the best solution, since it will still
 download the file twice, but it was the easiest and smallest change to
 make. The patch is below.

 Regards

 James

 --- src/java/org/apache/solr/handler/SnapPuller.java    (revision 727347)
 +++ src/java/org/apache/solr/handler/SnapPuller.java    (working copy)
 @@ -470,7 +470,7 @@
     */
    private boolean copyIndexFiles(File snapDir, File indexDir) {
      String segmentsFile = null;
 -    List<String> copiedfiles = new ArrayList<String>();
 +    Set<String> filesToCopy = new HashSet<String>();
      for (Map<String, Object> f : filesDownloaded) {
        String fname = (String) f.get(NAME);
        // the segments file must be copied last
 @@ -482,6 +482,10 @@
          segmentsFile = fname;
          continue;
        }
 +      filesToCopy.add(fname);
 +    }
 +    List<String> copiedfiles = new ArrayList<String>();
 +    for (String fname : filesToCopy) {
        if (!copyAFile(snapDir, indexDir, fname, copiedfiles)) return false;
        copiedfiles.add(fname);
      }


 Jaco wrote:

 Hello,

 While testing out the new replication features, I'm running into some
 strange problem. On the slave, I keep getting an error like this after
 all files have been copied from the master to the temporary
 index.x directory:

 SEVERE: Unable to move index file from:
 D:\Data\solr\Slave\data\index.20081224110855\_21e.tvx to:
 D:\Data\Solr\Slave\data\index\_21e.tvx

 The replication then stops and the index remains in its original state,
 so the updates are not available at the slave.

 This is my replication config at the master:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
        <lst name="master">
            <!-- Replicate on 'optimize'; it can also be 'commit' -->
            <str name="replicateAfter">commit</str>
            <str name="confFiles">schema.xml</str>
        </lst>
    </requestHandler>

 This is the replication config at the slave:

    <requestHandler name="/replication" class="solr.ReplicationHandler">
        <lst name="slave">
            <str name="masterUrl">http://hostnamemaster:8080/solr/Master/replication</str>
            <str name="pollInterval">00:10:00</str>
            <str name="zip">true</str>
        </lst>
    </requestHandler>

 I'm running a Solr nightly build of 21.12.2008 in Tomcat 6 on Windows
 2003. Initially I thought there was some problem with disk space, but
 this is not the case. Replication ran fine for the initial version of the
 index, but after that, at some point, it didn't work anymore. Any ideas
 what could be wrong here?

 Thanks very much in advance, bye,

 Jaco.







 --
 --Noble Paul

 Index: src/java/org/apache/solr/handler/ReplicationHandler.java
 ===================================================================
 --- src/java/org/apache/solr/handler/ReplicationHandler.java  (revision 729282)
 +++ src/java/org/apache/solr/handler/ReplicationHandler.java  (working copy)
 @@ -268,7 +268,7 @@
      List<Map<String, Object>> result = new ArrayList<Map<String, Object>>();
      try {
        // get all the files in the commit
 -      Collection<String> files = commit.getFileNames();
 +      Collection<String> files = new HashSet<String>(commit.getFileNames());
        for (String fileName : files) {
          File file = new File(core.getIndexDir(), fileName);
          Map<String, Object> fileMeta = getFileInfo(file);





 
 
 
 -- 
 --Noble Paul
 
 




Re: Unable to move index file error during replication

2009-03-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
The latest nightly should do fine.

On Fri, Mar 27, 2009 at 1:59 PM, sunnyfr johanna...@gmail.com wrote:

 Sorry, but which one should I take?
 And where exactly?


 Noble Paul നോബിള്‍  नोब्ळ् wrote:

 this fix is there in the trunk;
 you may not need to apply the patch

 [rest of quoted thread snipped; see the original messages above]

--
--Noble Paul

Re: Incorrect sort with with function query in query parameters

2009-03-27 Thread Otis Gospodnetic

Asif,

Could it have something to do with the deleted documents in your unoptimized 
index?  These documents are only marked as deleted; when you run optimize you 
really remove them completely.  It could be that they are getting counted by 
something, and that messes up the scoring/order.
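
(A quick way to test that theory is to force an optimize and re-run the
query.  A minimal SolrJ sketch - the URL matches the core in your query
below, but is otherwise illustrative:)

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class ForceOptimize {
  public static void main(String[] args) throws Exception {
    SolrServer server =
        new CommonsHttpSolrServer("http://localhost:8080/solr/core-01");
    // optimize() merges segments and physically removes deleted documents.
    server.optimize();
  }
}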


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Asif Rahman a...@newscred.com
 To: solr-user@lucene.apache.org
 Sent: Thursday, March 26, 2009 10:24:19 PM
 Subject: Incorrect sort with with function query in query parameters
 
 
 Hi all,
 
 I'm having an issue with the order of my results when attempting to sort by
 a function in my query.  Looking at the debug output of the query, the score
 returned within the result section for any given document does not match
 the score in the debug output.  It turns out that if I optimize the index,
 then the results are sorted correctly.  The scores in the debug output are
 the correct scores.  This behavior only occurs using a recent nightly build
 of Solr.  It works correctly in Solr 1.3.
 
 An example query is:
 
 http://localhost:8080/solr/core-01/select?qt=standard&fl=*,score&rows=10&q=*:*%20_val_:recip(rord(article_published_at),1,1000,1000)^1&debugQuery=on
 
 I've attached the result to this email.
 
 Can anybody shed any light on this problem? 
 
 Thanks,
 
 Asif
 http://www.nabble.com/file/p22735009/result.xml result.xml 



Re: optimization advice?

2009-03-27 Thread Otis Gospodnetic

Steve,

Maybe you can tell us about:
- your hardware
- query rate
- document cache and query cache settings (see the solrconfig.xml sketch after this list)
- your current response times
- any pain points, any slow query patterns
- etc.
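
For reference, those cache settings live in solrconfig.xml; a minimal sketch
with the stock example values (the sizes are illustrative, not tuned
recommendations):

    <filterCache class="solr.LRUCache" size="512"
                 initialSize="512" autowarmCount="256"/>
    <queryResultCache class="solr.LRUCache" size="512"
                      initialSize="512" autowarmCount="256"/>
    <documentCache class="solr.LRUCache" size="512"
                   initialSize="512" autowarmCount="0"/>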


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Steve Conover scono...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Friday, March 27, 2009 1:50:48 AM
 Subject: optimization advice?
 
 Hi,
 
 I've looked over the public Solr perf docs and done some searching on
 this mailing list.  Still, I'd like to seek some advice based on my
 specific situation:
 
 - 2-3 million documents / 5GB index
 - each document has 40+ indexed fields, and many multivalue fields
 - only primary keys are stored
 - very low write frequency
 - queries can be sorted by any combination of fields, and are always
 sorted by at least one field
 - query criteria vary from very simple to very complex
   (the point about queries being that they're not very amenable to being 
 cached)
 
 So far I've set my mergefactor very low. I haven't paid much
 attention to caching except for basic query result caching - I don't
 think many of the cache features really apply well to my problem.
 Increasing the amount of RAM available to Java (by 1GB) has no effect
 I can detect.
 
 Ideally I'd like to get response times down to near-instantaneous /
 < 50ms (which is where they were when the index was ~1 million
 documents).  I'd love to hear suggestions - in particular, are there
 obvious optimization options I've missed?
 
 Regards,
 Steve



Re: Initial query performance poor after update / delete

2009-03-27 Thread Otis Gospodnetic

Hi Tom,

 
 Thanks Otis. After some further testing - I've noticed that initial searches
 are only slow if I include the qt=geo parameter. Searches without this
 parameter appear to show no slow down whatsoever after updates - so I'm
 wondering if the problem is actually a localsolr one.
 
 Can you tell me where I can specify the configuration to set up the
 parameters for swapping the searchers? Is this within solrconfig.xml? Any
 light you could shed on this would be really appreciated.

In a single server environment searchers should be swapped whenever you issue a 
commit.
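
If the slow first queries are warm-up related, solrconfig.xml can also run
warming queries before a new searcher is exposed; a sketch only (the query
string and qt value are placeholders, not tested config):

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst><str name="q">some warming query</str><str name="qt">geo</str></lst>
      </arr>
    </listener>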

 Thanks again,
 Tom
 
 PS. If you wrote a SOLR in Action - I would buy it today!

Careful what you wish for! ;)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


Re: Initial query performance poor after update / delete

2009-03-27 Thread TomWilliamson

Thanks Otis. After some further testing - I've noticed that initial searches
are only slow if I include the qt=geo parameter. Searches without this
parameter appear to show no slow down whatsoever after updates - so I'm
wondering if the problem is actually a localsolr one.

Can you tell me where I can specify the configuration to set up the
parameters for swapping the searchers? Is this within solrconfig.xml? Any
light you could shed on this would be really appreciated.

Thanks again,
Tom

PS. If you wrote a SOLR in Action - I would buy it today!



Solrj exception posting XML docs

2009-03-27 Thread Giovanni De Stefano
Hello all,

I am currently using Solr 1.3 and its Solrj.

I am trying to post XML docs directly through Solrj but I get the following
exception:

13:12:09,119 ERROR [STDERR] Mar 27, 2009 1:12:09 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
 at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:194)
 at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:235)
 at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:190)
 at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:92)
 at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.process(SecurityContextEstablishmentValve.java:126)
 at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.invoke(SecurityContextEstablishmentValve.java:70)
 at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:158)
 at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:330)
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:829)
 at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:601)
 at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
 at java.lang.Thread.run(Thread.java:595)
13:12:09,120 ERROR [STDERR] Mar 27, 2009 1:12:09 PM org.apache.solr.core.SolrCore execute
INFO: [downloadable] webapp=/solr path=/update params={wt=javabin&version=2.2} status=500 QTime=2
13:12:09,121 ERROR [STDERR] Mar 27, 2009 1:12:09 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.NullPointerException
 [identical stack trace snipped; truncated in the archive]
Re: Search transparently with Solr with multiple cores, different indexes, common response type

2009-03-27 Thread Giovanni De Stefano
Hello Hoss, Steve,

thank you very much for your feedback; it has been very helpful and makes
me feel more confident now about this architecture.

In fact I decided to go for a single shared schema, but keeping multiple
indexes (multicore) because those two indexes are very different: one is
huge and updated not very often (once a day delta, once a week full) and the
other one is not that big and it is updated frequently (once per hour, once
per day, once per week full).

My boss is happy...thus I am happy too :-)

Now I am struggling a bit with Solrj...but that is already in another post
of mine :-)

Cheers,
Giovanni


On 3/26/09, Stephen Weiss swe...@stylesight.com wrote:


 I have a very similar setup and that's precisely what we do - except with
 JSON.

 1) Request comes into PHP
 2) PHP runs the search against several different cores (in a multicore
 setup) - ours are a little more than slightly different
 3) PHP constructs a new object with the responseHeader and response objects
 joined together (basically add the record counts together in the header and
 then concatenate the arrays of documents)
 4) PHP encodes the combined data into JSON and returns it

  It sounds clunky but it all manages to happen very quickly (< 200 ms round
  trip).  The only problem you might hit is with paging, but from the way you
  describe your situation it doesn't sound like that will be a problem.  It's
  more of an issue if you're trying to make them seamlessly flow into each
  other, but it sounds like you plan on presenting them separately (as we do).
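
 A rough SolrJ equivalent of steps 2-3, for anyone doing the merge on the
 Java side (core names and URLs here are made up; this is a sketch, not our
 production code):

 import org.apache.solr.client.solrj.SolrQuery;
 import org.apache.solr.client.solrj.SolrServer;
 import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
 import org.apache.solr.common.SolrDocumentList;

 public class MultiCoreSearch {
   public static SolrDocumentList searchBothCores(String q) throws Exception {
     SolrServer coreA = new CommonsHttpSolrServer("http://localhost:8983/solr/coreA");
     SolrServer coreB = new CommonsHttpSolrServer("http://localhost:8983/solr/coreB");
     SolrQuery query = new SolrQuery(q);
     SolrDocumentList resultsA = coreA.query(query).getResults();
     SolrDocumentList resultsB = coreB.query(query).getResults();
     // Concatenate the documents and add the record counts together (step 3).
     SolrDocumentList merged = new SolrDocumentList();
     merged.addAll(resultsA);
     merged.addAll(resultsB);
     merged.setNumFound(resultsA.getNumFound() + resultsB.getNumFound());
     return merged;
   }
 }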

 --
 Steve


  it could be a custom request handler, but it doesn't have to be -- you
  could implement it in whatever way is easiest for you (there's no reason
  why it has to run in the same JVM or on the same physical machine as Solr
  ... it could be a PHP script on another server if you want)




 -Hoss





Re: Incorrect sort with with function query in query parameters

2009-03-27 Thread Asif Rahman

Hi Otis,

Any documents marked deleted in this index are just the result of updates to
those documents.  There are no purely deleted documents.  Furthermore, the
field that I am ordering by in my function query remains untouched over the
updates.

I've read in other posts that the logic used by the debug component to
calculate the score is different from what the query component uses.  The
score shown in the debug output is correct.  It seems like the two
components are getting two different values for the rord function.

I'm particularly concerned by the fact that this only happens in the nightly
build.  Any ideas on how to correct this?  Unfortunately, it's not feasible
for me to only perform searches on optimized indices because we are doing
constant updates.

Thanks,

Asif


Otis Gospodnetic wrote:
 
 
 Asif,
 
  Could it have something to do with the deleted documents in your
  unoptimized index?  These documents are only marked as deleted; when you
  run optimize you really remove them completely.  It could be that they are
  getting counted by something, and that messes up the scoring/order.
 
 




Re: Solrj exception posting XML docs

2009-03-27 Thread Giovanni De Stefano
Hello all,

the NullPointerException was caused by malformed XML...

Basically my doc was something like this:

<doc>
...
...
</doc>

but it had to be wrapped with an <add>, as follows:

<add>
  <doc>
...
  </doc>
</add>

A more useful error message would have been nice to have; I had to look at
the source code to understand that the <add> command was missing...

Anyway I posted my own resolution for future reference :-)
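
For anyone hitting the same thing: posting raw XML through Solrj itself can
be done with DirectXmlRequest. A minimal sketch (the core name is taken from
my logs above; the URL is otherwise illustrative):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.DirectXmlRequest;

public class PostRawXml {
  public static void main(String[] args) throws Exception {
    SolrServer server =
        new CommonsHttpSolrServer("http://localhost:8080/solr/downloadable");
    // Note the <add> wrapper around the <doc> - this was the missing piece.
    String xml = "<add><doc><field name=\"id\">1</field></doc></add>";
    DirectXmlRequest req = new DirectXmlRequest("/update", xml);
    server.request(req);
    server.commit();
  }
}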

Cheers,
Giovanni


On 3/27/09, Giovanni De Stefano giovanni.destef...@gmail.com wrote:

 Hello all,

 I am currently using Solr 1.3 and its Solrj.

 I am trying to post XML docs directly through Solrj but I get the following
 exception:

  [quoted stack trace snipped; see the original message above]

Clarifying use of lst name=appends within a requestHandler

2009-03-27 Thread fergus mcmenemie
Hello,

Due to limitations with the way my content is organised and DIH, I have
to add "-imgCaption:[* TO *]" to some of my queries. I discovered the
name="appends" functionality tucked away inside solrconfig.xml. This
looks like a very useful feature, and I created a new requestHandler to
deal with my problem queries. I tried adding the following to my
alternate requestHandler:

 <lst name="appends"><str name="q">-imgCaption:[* TO *]</str></lst>

which did not work; however,

 <lst name="appends"><str name="fq">-imgCaption:[* TO *]</str></lst>

worked fine and is also more efficient. I guess I was caught out by the
"identify values which should be appended to the list of multi-val
params from the query" portion of the comment within solrconfig.xml.
I am now wondering: how do I know which query params are multi-valued
and which are not? Is this documented anywhere?
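
For context, the whole alternate handler looks something like this sketch
(the handler name is invented):

    <requestHandler name="/nocaption" class="solr.SearchHandler">
      <lst name="appends">
        <str name="fq">-imgCaption:[* TO *]</str>
      </lst>
    </requestHandler>

The fq version is more efficient because the excluded set is cached in the
filterCache instead of being scored as part of the main query.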

Regards Fergus.





Faceting question

2009-03-27 Thread rayandev

I am using the faceting feature and it works - I get back the facet counts -
but I need to know which facet.method (enum or fc) is used. Is there a way
to turn on the debug info for faceting?


Here's my setup
Solr 1.3
EmbeddedSolrServer
SolrJ
Facet fields are indexed as multivalued solr.StrField

Thanks
Rayan



Solr date parsing issue

2009-03-27 Thread Giovanni De Stefano
Hello,

I am having a problem indexing a date field.

In my schema the date field is defined the standard way:

<fieldType name="date" class="solr.DateField" sortMissingLast="true"
omitNorms="true"/>

I know the Solr format is 1995-12-31T23:59:59Z, but the dates coming from my
sources are in the format 2009-04-10T02:02:55+0200.

How can I make the conversion?

Do I have to extend DateField or is there any cleaner way to do it?
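
(One client-side option, sketched here on the assumption that you can
reformat the value before posting it - this is plain JDK code, not part of
Solr:)

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class SolrDateConverter {
  // Converts "2009-04-10T02:02:55+0200" to Solr's "2009-04-10T00:02:55Z".
  public static String toSolrFormat(String src) throws ParseException {
    // "Z" parses an RFC 822 numeric offset such as +0200.
    SimpleDateFormat in = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");
    Date d = in.parse(src);
    // Print the same instant in UTC, in Solr's canonical form.
    SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
    out.setTimeZone(TimeZone.getTimeZone("UTC"));
    return out.format(d);
  }
}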

Thanks in advance!

Giovanni


Encoding problem

2009-03-27 Thread Rui Pereira
I'm having problems with encoding in responses from search queries. The
encoding problem only occurs in the topologyname field, if a instancename
has accents it is returned correctly. In all my configurations I have UTF-8.

?xml version=1.0 encoding=UTF-8?
dataConfig
document name=topologies
entity query=SELECT DISTINCT '3141-' || Sub0.SUBID as id, 'Inventário' as
topologyname, 3141 as topologyid, Sub0.SUBID as instancekey, Sub0.NAME as
instancename FROM ...
  field column=INSTANCEKEY name=instancekey/
  field column=ID name=id/
  field column=TOPOLOGYID name=topologyid/
  field column=INSTANCENAME name=instancename/
  field column=TOPOLOGYNAME name=topologyname/...


As an example, I can have in the response the following result:

doc
long name=instancekey285/long
str name=instancenameInformática/str
long name=topologyid3141/long
str name=topologynameInventário/str
/doc


Thanks in advance,
   Rui Pereira


Re: Faceting question

2009-03-27 Thread Yonik Seeley
It would be the enum method... Solr 1.3 doesn't have the fc method
for multi-valued fields... that's a 1.4 feature.
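
(In 1.4 you can also choose explicitly per request with the facet.method
parameter, e.g. &facet=true&facet.field=yourField&facet.method=fc - the
field name is just an example.)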

-Yonik
http://www.lucidimagination.com

On Fri, Mar 27, 2009 at 10:44 AM, rayandev rayanm...@gmail.com wrote:

 I am using the faceting feature and it works - I get back the facet counts -
 but I need to know which facet.method (enum or fc) is used. Is there a way
 to turn on the debug info for faceting?


 Here's my setup
 Solr 1.3
 EmbeddedSolrServer
 SolrJ
 Facet fields are indexed as multivalued solr.StrField

 Thanks
 Rayan


Solr Search Error

2009-03-27 Thread Narayanan, Karthikeyan
Hi All,
   I am intermittently getting this exception when I do a search.
What could be the reason?

Caused by: org.apache.solr.common.SolrException: 11938
java.lang.ArrayIndexOutOfBoundsException: 11938
 at org.apache.lucene.search.TermScorer.score(TermScorer.java:74)
 at org.apache.lucene.search.TermScorer.score(TermScorer.java:61)
 at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:137)
 at org.apache.lucene.search.Searcher.search(Searcher.java:126)
 at org.apache.lucene.search.Searcher.search(Searcher.java:105)
 at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:966)
 at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838)
 at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:269)
 at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:160)
 at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:169)
 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
 at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
 at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
 at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
 at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
 at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
 at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:433)
 at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
 at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
 at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
 at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:870)
 at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
 at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
 at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
 at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
 at java.lang.Thread.run(Thread.java:619)


Thanks.
  
Karthik




Best way to unit test solr integration

2009-03-27 Thread Joe Pollard
Hello,

On our project, we have quite a bit of code used to generate Solr queries, and 
I need to create some unit tests to ensure that these continue to work.  In 
addition, I need to generate some unit tests that will test indexing and 
retrieval of certain documents, based on our current schema and the application 
logic that generates the indexable documents as well as generates the Solr 
queries.

My question is - what's the best way for me to unit test our Solr integration?

I'd like to be able to spin up an embedded/in-memory solr, or failing that just 
start one up as part of my test case setup, fill it with interesting documents, 
and do some queries, comparing the results to expected results.

Are there wiki pages or other documented examples of doing this?  It seems 
rather straight-forward, but who knows, it may be dead simple with some unknown 
feature.

Thanks!
-Joe


Re: Encoding problem

2009-03-27 Thread aerox7

Hi,
I had the same problem with the DataImportHandler: I have a UTF-8 MySQL
database, but it seems that DIH imported the data as Latin-1. So I just used
a Transformer to (re)encode my strings in UTF-8.
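
A minimal sketch of such a transformer (the column name and charset pair are
assumptions about my setup; DIH calls transformRow reflectively on the class
named in the entity's transformer attribute):

import java.io.UnsupportedEncodingException;
import java.util.Map;

public class Utf8FixTransformer {
  public Object transformRow(Map<String, Object> row) {
    Object v = row.get("TOPOLOGYNAME");
    if (v instanceof String) {
      try {
        // Re-read the mis-decoded bytes as UTF-8.
        byte[] raw = ((String) v).getBytes("ISO-8859-1");
        row.put("TOPOLOGYNAME", new String(raw, "UTF-8"));
      } catch (UnsupportedEncodingException e) {
        // Both charsets are required by the JVM spec; this cannot happen.
      }
    }
    return row;
  }
}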


Rui Pereira-2 wrote:
 
 I'm having problems with encoding in responses from search queries. The
 encoding problem only occurs in the topologyname field [...]
 [rest of quoted message snipped; see the original message above]




Re: Best way to unit test solr integration

2009-03-27 Thread Eric Pugh
So my first thought is that unit test + solr integration is an
oxymoron, in the sense that a unit test implies the smallest functional
unit, while solr integration implies multiple units working together.


It sounds like you have two different tasks.  For the code that generates
queries, you can test that without Solr.  If you need to parse some
sort of solr document to generate a query based on it, then mock up
the query.  A lot of folks will just use Solr to build a result set,
save that on the filesystem as my_big_result1.xml, and then read
it in and feed it to your code.


On the other hand, for your code testing indexing and retrieval, again,
see if you can use the same approach to decouple what Solr does from your
code.  Unless you've patched Solr, you shouldn't need to unit test
Solr; Solr has very nice unit testing built in.


On the other hand, if you are doing integration testing, where you
want a more end-to-end view of your application, then you probably
already have a test Solr setup in your environment somewhere that
you can rely on.


Spinning up and shutting down Solr for tests can be done, and I can
think of use cases for why you might want to do it, but it does incur
a penalty of being more work.  And you still need to validate that
your embedded/unit-test Solr works the same as your integration/test
environment Solr.


Eric



On Mar 27, 2009, at 11:59 AM, Joe Pollard wrote:


Hello,

On our project, we have quite a bit of code used to generate Solr  
queries, and I need to create some unit tests to ensure that these  
continue to work.  In addition, I need to generate some unit tests  
that will test indexing and retrieval of certain documents, based on  
our current schema and the application logic that generates the  
indexable documents as well as generates the Solr queries.


[rest of quoted message snipped; see the original question above]


-
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | 
http://www.opensourceconnections.com
Free/Busy: http://tinyurl.com/eric-cal






RE: Best way to unit test solr integration

2009-03-27 Thread Joe Pollard
Thanks for the tips, I like the suggestion of testing the document and query 
generation without having solr involved.  That seems like a more bite-sized 
unit; I think I'll do that.

However, here's the test case that I'm considering where I'd like to have a 
live solr instance:

During an exercise of optimizing our schema, I'm going to be making wholesale 
changes that I'd like to ensure don't break some portion of our app.  It seems 
like a good method for this would be to write a test with the following steps: 
(arguably not a unit test, but a very valuable test indeed in our application)
* take some defined model object generated at test time, store it in db
* run it through our document creation code
* submit it into solr
* generate a query using our custom criteria-based generation code
* ensure that the query returns the results as expected
* flesh out the new model objects from the db using only the id fields returned 
from Solr
* In the end, it would be expected to have model objects retrieved from the db 
that match model objects at the beginning of the test.

These building blocks could be stacked in numerous ways to test almost all the 
different scenarios in which we use Solr.

Also, when/if we start making solr config changes, I can ensure that they 
change nothing from my app's functional point of view (with the exception of 
ridding us of dreaded OOMs).

Thanks,
-Joe

-Original Message-
From: Eric Pugh [mailto:ep...@opensourceconnections.com]
Sent: Friday, March 27, 2009 11:27 AM
To: solr-user@lucene.apache.org
Subject: Re: Best way to unit test solr integration

[quoted messages snipped; see the earlier messages in this thread]






Re: Solr Search Error

2009-03-27 Thread Otis Gospodnetic

Hi Karthik,

First thing I'd do is get the latest Solr nightly build.
If that doesn't fix things, I'd grab the latest Lucene nightly build and use it 
to replace the Lucene jars that are in your version of Solr.
If that doesn't work, I'd email the ML with a bit more info about the type of 
search that causes this (e.g. do all searches cause this or only some?  What do 
those that trigger this error look like or have in common?)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Narayanan, Karthikeyan karthikeyan.naraya...@gs.com
 To: solr-user@lucene.apache.org
 Sent: Friday, March 27, 2009 11:42:12 AM
 Subject: Solr Search Error
 
 Hi All,
    I am intermittently getting this exception when I do a search.
 What could be the reason?
 
 [quoted stack trace snipped; see the original message above]
 
 
 Thanks.
   
 Karthik



Re: Best way to unit test solr integration

2009-03-27 Thread Otis Gospodnetic

Joe,

Have a look at Solr's own unit tests; I believe they have pieces of what you 
need - the ability to start a Solr instance, index docs, run a query, and test 
whether the results contain what you expect to see.  You can get to Solr's 
unit tests by checking out Solr from svn, or by browsing the svn repository via 
the Web.
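
The embedded route looks roughly like this (a sketch along the lines of the
Solrj wiki; the solr home path is illustrative):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class EmbeddedSolrSmokeTest {
  public static void main(String[] args) throws Exception {
    // Point solr.solr.home at a directory containing conf/solrconfig.xml etc.
    System.setProperty("solr.solr.home", "src/test/solr-home");
    CoreContainer.Initializer initializer = new CoreContainer.Initializer();
    CoreContainer container = initializer.initialize();
    EmbeddedSolrServer server = new EmbeddedSolrServer(container, "");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");
    server.add(doc);
    server.commit();

    // Query it back and check the hit count.
    long hits = server.query(new SolrQuery("id:1")).getResults().getNumFound();
    System.out.println("hits = " + hits);
    container.shutdown();
  }
}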

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Joe Pollard joe.poll...@bazaarvoice.com
 To: solr-user@lucene.apache.org solr-user@lucene.apache.org
 Sent: Friday, March 27, 2009 12:50:31 PM
 Subject: RE: Best way to unit test solr integration
 
 Thanks for the tips, I like the suggestion of testing the document and query 
 generation without having solr involved.  That seems like a more bite-sized 
 unit; I think I'll do that.
 
 [rest of quoted thread snipped]



RE: Solr Search Error

2009-03-27 Thread Narayanan, Karthikeyan
Hi Otis,
  Thanks for the recommendation; I will try the latest
nightly build. I did a couple of full data imports and got this error a
few times while searching.


Thanks.
  
Karthik


-Original Message-
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] 
Sent: Friday, March 27, 2009 12:57 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Search Error


Hi Karthik,

First thing I'd do is get the latest Solr nightly build.
If that doesn't fix things, I'd grab the latest Lucene nightly build and
use it to replace the Lucene jars that are in your version of Solr.
If that doesn't work, I'd email the ML with a bit more info about the
type of search that causes this (e.g. do all searches cause this or only
some?  What do those that trigger this error look like or have in
common?)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Narayanan, Karthikeyan karthikeyan.naraya...@gs.com
 To: solr-user@lucene.apache.org
 Sent: Friday, March 27, 2009 11:42:12 AM
 Subject: Solr Search Error
 
 Hi All,
    I am intermittently getting this exception when I do a search.
 What could be the reason?
 
 [quoted stack trace snipped; see the original message above]
 
 
 Thanks.
   
 Karthik


Re: Best way to unit test solr integration

2009-03-27 Thread Eric Pugh
So in the building block story you talked about, that sounds like an
integration (functional? user acceptance?) test.  And I would treat
Solr the same way you treat the database that you are storing model
objects in.


If in your tests you bring up a fresh version of the db, populate it
with tables etc., and put in sample data, then you should do the same with
Solr.  My guess is that you have a test database running, and
therefore you need a live supported test Solr.  And the same
processes you use so that two functional tests don't step on each
other's data in the database can be applied to Solr!


You can think of tweaking Solr config as similar to tweaking
indexes in your db.  Both require configuration management to track
those changes, ensure they are deployed, and make sure nothing regresses.


Let us know how you get on!

Eric


On Mar 27, 2009, at 12:50 PM, Joe Pollard wrote:

Thanks for the tips, I like the suggestion of testing the document
and query generation without having Solr involved.  That seems like
a more bite-sized unit; I think I'll do that.


However, here's the test case that I'm considering where I'd like to
have a live Solr instance:


During an exercise of optimizing our schema, I'm going to be making  
wholesale changes that I'd like to ensure don't break some portion  
of our app.  It seems like a good method for this would be to write  
a test with the following steps: (arguably not a unit test, but a  
very valuable test indeed in our application)
* take some defined model object generated at test time, store it in  
db

* run it through our document creation code
* submit it into solr
* generate a query using our custom criteria-based generation code
* ensure that the query returns the results as expected
* flesh out the new model objects from the db using only the id  
fields returned from Solr
* In the end, it would be expected to have model objects retrieved  
from the db that match model objects at the beginning of the test.


These building blocks could be stacked in numerous ways to test  
almost all the different scenarios in which we use Solr.
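
(A minimal sketch of such a round-trip test, assuming the SolrJ
EmbeddedSolrServer API from the Solr 1.3 line; the solr home path, field
names, and query below are illustrative, not from any real codebase:)

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.core.CoreContainer;

public class SolrRoundTripTest {
    public static void main(String[] args) throws Exception {
        // Point solr.solr.home at a test config (schema.xml, solrconfig.xml)
        System.setProperty("solr.solr.home", "src/test/resources/solr");
        CoreContainer container = new CoreContainer.Initializer().initialize();
        SolrServer server = new EmbeddedSolrServer(container, "");

        // Build a document from a model object and submit it
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "42");
        doc.addField("name", "test widget");
        server.add(doc);
        server.commit();

        // Query it back and check the expected id returns
        QueryResponse rsp = server.query(new SolrQuery("name:widget"));
        if (rsp.getResults().getNumFound() != 1
                || !"42".equals(rsp.getResults().get(0).getFieldValue("id"))) {
            throw new AssertionError("round trip failed");
        }

        container.shutdown();
    }
}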


Also, when/if we start making solr config changes, I can ensure that  
they change nothing from my app's functional point of view (with the  
exception of ridding us of dreaded OOMs).


Thanks,
-Joe

-Original Message-
From: Eric Pugh [mailto:ep...@opensourceconnections.com]
Sent: Friday, March 27, 2009 11:27 AM
To: solr-user@lucene.apache.org
Subject: Re: Best way to unit test solr integration

So my first thought is that unit test + Solr integration is an
oxymoron, in the sense that a unit test implies the smallest functional
unit, while Solr integration implies multiple units working together.

It sounds like you have two different tasks.  For the code that generates
queries, you can test that without Solr.  If you need to parse some
sort of Solr document to generate a query based on it, then mock up
the query.  A lot of folks will just use Solr to build a result set,
save that on the filesystem (e.g. my_big_result1.xml), then read
it in and feed it to your code.

On the other hand, for your code that tests indexing and retrieval:
again, use the same approach to decouple what Solr does from your
code.  Unless you've patched Solr, you shouldn't need to unit test
Solr itself; Solr has very nice unit testing built in.

On the other hand, if you are doing integration testing, where you
want a more end-to-end view of your application, then you probably
already have a test Solr setup in your environment somewhere that
you can rely on.

Spinning up and shutting down Solr for tests can be done, and I can
think of use cases for why you might want to do it, but it does incur
a penalty of being more work.  And you still need to validate that
your embedded/unit test solr works the same as your integration/test
environment Solr.

Eric



On Mar 27, 2009, at 11:59 AM, Joe Pollard wrote:


Hello,

On our project, we have quite a bit of code used to generate Solr
queries, and I need to create some unit tests to ensure that these
continue to work.  In addition, I need to generate some unit tests
that will test indexing and retrieval of certain documents, based on
our current schema and the application logic that generates the
indexable documents as well as generates the Solr queries.

My question is - what's the best way for me to unit test our Solr
integration?

I'd like to be able to spin up an embedded/in-memory Solr, or,
failing that, just start one up as part of my test case setup, fill it
with interesting documents, and run some queries, comparing the
results to expected results.

Are there wiki pages or other documented examples of doing this?  It
seems rather straightforward, but who knows, it may be dead simple
with some unknown feature.

Thanks!
-Joe


-
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | 

Re: Faceting question

2009-03-27 Thread rayan dev
   Thanks Yonik.

If it is using the enum method, then it should also be caching a facet query
for every indexed value of the facet fields.

1) Do I need to add filterCache and hashDocSet entries to solrconfig.xml
for this caching to happen?
I did not find any noticeable difference in query time whether I added them or
not.
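
(For reference, the entries in question look roughly like this in
solrconfig.xml; these sizes are illustrative only, not a recommendation:)

<filterCache
  class="solr.LRUCache"
  size="16384"
  initialSize="4096"
  autowarmCount="4096"/>

<hashDocSet maxSize="3000" loadFactor="0.75"/>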

2) Will the performance be the same with more facet values and a bigger index?
 The current implementation has 10 different facet values and there are
300,000 documents indexed.
I will be adding more multivalued facet fields; the combined facet values
could go up to 60 and the index could grow to 1 million documents.

Thanks
Rayan


On Fri, Mar 27, 2009 at 11:18 AM, Yonik Seeley
yo...@lucidimagination.com wrote:

 It would be the enum method... Solr 1.3 doesn't have the fc method
 for multi-valued fields... that's a 1.4 feature.

 -Yonik
 http://www.lucidimagination.com

 On Fri, Mar 27, 2009 at 10:44 AM, rayandev rayanm...@gmail.com wrote:
 
  I am using the faceting feature and it works - I get back the facet counts -
  but I need to know which facet.method (enum or fc) is used. Is there a way
  to turn on debug info for faceting?
 
 
  Here's my setup
  Solr 1.3
  EmbeddedSolrServer
  SolrJ
  Facet fields are indexed as multivalued solr.StrField
 
  Thanks
  Rayan
  --
  View this message in context:
 http://www.nabble.com/Faceting-question-tp22743106p22743106.html
  Sent from the Solr - User mailing list archive at Nabble.com.



Re: optimization advice?

2009-03-27 Thread Steve Conover
 Steve,

 Maybe you can tell us about:

sure

 - your hardware

2.5GB RAM, pretty modern virtual servers

 - query rate

Let's say a few queries per second max...  4

And in general the challenge is to get latency on any given query down
to something very low - we don't have to worry about a huge amount of
load at the moment.

 - document cache and query cache settings

<queryResultCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="256"/>

<documentCache
  class="solr.LRUCache"
  size="512"
  initialSize="512"
  autowarmCount="0"/>

 - your current response times

This depends on the query.  For queries that involve a total record
count of < 1 million, we often see < 10ms response times, up to
400-500ms in the worst case.  When we do a page-one, sorted query on our
full record set of 2 million+ records, response times can get up into
2+ seconds.

 - any pain points, any slow query patterns

Something that can't be emphasized enough is that we can't predict
what records people will want.  Almost every query is aimed at a
different set of records.

-Steve


Test

2009-03-27 Thread Wesley Small
Sorry, I am having trouble sending a message to this Distribution list. This
is a test.



Question about Solr memory usage.

2009-03-27 Thread Jim Adams
I'm running an old version of Solr -- it's 1.2, and I'm about to upgrade to
1.3.  But I have a question about Solr 1.2 memory usage.

I am occasionally seeing out of memory errors in my Solr log.

Doesn't Solr release memory after a document has been indexed?  I would
not think it is right for the memory usage to climb to the max specified in
the Java options and then give out-of-memory errors...

Any thoughts you have are appreciated.

Thanks.


use external index for spellcheck component

2009-03-27 Thread Marc Sturlese

Hey there,
I have a question about the spellcheck component...
If I tell the spellcheck component to load the dictionary from a field of my
main Solr index there's no problem, but... does anyone know how to tell the
spellcheck component to load the dictionary from a field of an external
index?
What I do is:

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">text</str>
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">word_spell</str>
      <str name="spellcheckIndexDir">./spellchecker1</str>
    </lst>
  </searchComponent>

word_spell is the field which contains the dictionary in my secondary Solr
index, which I have placed in /spellchecker1.
I don't know if it has something to do with the field called "name".
I am missing something but don't know what...
Thanks in advance.
-- 
View this message in context: 
http://www.nabble.com/use-extrernal-index-for-spellcheck-component-tp22745638p22745638.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: use external index for spellcheck component

2009-03-27 Thread Shalin Shekhar Mangar
On Sat, Mar 28, 2009 at 12:16 AM, Marc Sturlese marc.sturl...@gmail.com wrote:


 Hey there,
 I have a question about the spellcheck component...
 If I tell the spellcheck component to load the dictionary from a field of
 my
 main Solr index there's no problem, but... does anyone know how to tell the
 spellcheck component to load the dictionary from a field of an external
 index?


You need to specify sourceLocation, which is the location of the external
index.
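
(A rough sketch of what that might look like in the spellchecker block; the
/path/to/secondary/index value is a placeholder for wherever the external
Lucene index lives:)

<lst name="spellchecker">
  <str name="name">default</str>
  <str name="field">word_spell</str>
  <!-- read terms from an external index instead of the main one -->
  <str name="sourceLocation">/path/to/secondary/index</str>
  <str name="spellcheckIndexDir">./spellchecker1</str>
</lst>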



 What I do is:

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">text</str>
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">word_spell</str>
      <str name="spellcheckIndexDir">./spellchecker1</str>
    </lst>
  </searchComponent>

 word_spell is the field which contains the dictionary in my secondary
 Solr
 index, which I have placed in /spellchecker1.


The spellcheckIndexDir is the location where this spellcheck index will be
created.

I guess the wiki documentation is lacking sourceLocation completely. I'll
add more documentation.

http://wiki.apache.org/solr/SpellCheckComponent

-- 
Regards,
Shalin Shekhar Mangar.


Re: Solr date parsing issue

2009-03-27 Thread Shalin Shekhar Mangar
On Fri, Mar 27, 2009 at 8:17 PM, Giovanni De Stefano 
giovanni.destef...@gmail.com wrote:

 Hello,

 I am having a problem indexing a date field.

 In my schema the date field is defined the standard way:

  <fieldType name="date" class="solr.DateField" sortMissingLast="true"
  omitNorms="true"/>

 I know the Solr format is 1995-12-31T23:59:59Z, but the dates coming from
 my
 sources are in the format 2009-04-10T02:02:55+0200

 How can I make the conversion?


If you are using Solrj then parse it into a Date object and add it. Solrj
will take care of writing it out in the correct format.

If you are using DataImportHandler then use the DateFormatTransformer.
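
(For the Solrj route, a minimal sketch; the field name "mydate" is just an
example:)

import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.solr.common.SolrInputDocument;

public class DateFieldExample {
    public static void main(String[] args) throws Exception {
        // "Z" in the pattern matches numeric offsets like +0200
        SimpleDateFormat src = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");
        Date d = src.parse("2009-04-10T02:02:55+0200");

        SolrInputDocument doc = new SolrInputDocument();
        // Solrj serializes the Date in Solr's format, e.g. 2009-04-10T00:02:55Z
        doc.addField("mydate", d);
    }
}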
-- 
Regards,
Shalin Shekhar Mangar.


Re: Clarifying use of lst name=appends within a requestHandler

2009-03-27 Thread Shalin Shekhar Mangar
On Fri, Mar 27, 2009 at 8:00 PM, fergus mcmenemie fer...@twig.me.uk wrote:

 Hello,

 Due to limitations with the way my content is organised and DIH, I have
 to add “-imgCaption:[* TO *]” to some of my queries. I discovered the
 name=”appends” functionality tucked away inside solrconfig.xml. This
 looks like a very useful feature, and I created a new requestHandler to deal
 with my problem queries. I tried adding the following to my alternate
 requestHandler:

 <lst name="appends"><str name="q">-imgCaption:[* TO *]</str></lst>

 which did not work; however

 <lst name="appends"><str name="fq">-imgCaption:[* TO *]</str></lst>


appends parameters are appended to the request parameters. An existing q
parameter might be overriding it. Also, I'm not sure if pure negative
queries are supported in the q parameter. You might need to do "*:* AND
-imgCaption:[* TO *]" instead. I do remember that negative queries in fq
work.
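
(A sketch of a complete handler using that appended fq; the handler name
here is made up:)

<requestHandler name="/noimgcaption" class="solr.SearchHandler">
  <!-- fq is multi-valued, so this filter is added to every request -->
  <lst name="appends">
    <str name="fq">-imgCaption:[* TO *]</str>
  </lst>
</requestHandler>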



 worked fine and is also more efficient. I guess I was caught out by the
 “identify values which should be appended to the list of ***multi-val
 params from the query” portion of the comment within solrconfig.xml.
 I am now wondering: how do I know which query params are multi-valued or
 not? Is this documented anywhere?


For example, fq, facet.field etc. are multi-valued, since multiple such
params can be specified in the same request; q is single-valued. You can
look through the Input Parameters section on the wiki front page for more
details.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Solr date parsing issue

2009-03-27 Thread Giovanni De Stefano
Hello,

the problem is that I use both Solrj and DIH, but I would like to perform
such a change in only one place.

Is there any way to do it? Otherwise I will stick with the other approach...

Cheers,
Giovanni


On 3/27/09, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

 On Fri, Mar 27, 2009 at 8:17 PM, Giovanni De Stefano 
 giovanni.destef...@gmail.com wrote:

  Hello,
 
  I am having a problem indexing a date field.
 
  In my schema the date field is defined the standard way:
 
   <fieldType name="date" class="solr.DateField" sortMissingLast="true"
   omitNorms="true"/>
 
  I know the Solr format is 1995-12-31T23:59:59Z, but the dates coming from
  my
  sources are in the format 2009-04-10T02:02:55+0200
 
  How can I make the conversion?
 

 If you are using Solrj then parse it into a Date object and add it. Solrj
 will take care of writing it out in the correct format.

 If you are using DataImportHandler then use the DateFormatTransformer.
 --
 Regards,
 Shalin Shekhar Mangar.



Re: Encoding problem

2009-03-27 Thread Shalin Shekhar Mangar
On Fri, Mar 27, 2009 at 8:41 PM, Rui Pereira ruipereira...@gmail.com wrote:

 I'm having problems with encoding in responses to search queries. The
 encoding problem only occurs in the topologyname field; if an instancename
 has accents, it is returned correctly. In all my configurations I have
 UTF-8.

 <?xml version="1.0" encoding="UTF-8"?>
 <dataConfig>
   <document name="topologies">
     <entity query="SELECT DISTINCT '3141-' || Sub0.SUBID as id,
         'Inventário' as topologyname, 3141 as topologyid,
         Sub0.SUBID as instancekey, Sub0.NAME as instancename FROM ...">
       <field column="INSTANCEKEY" name="instancekey"/>
       <field column="ID" name="id"/>
       <field column="TOPOLOGYID" name="topologyid"/>
       <field column="INSTANCENAME" name="instancename"/>
       <field column="TOPOLOGYNAME" name="topologyname"/>...


 As an example, I can have in the response the following result:

 <doc>
   <long name="instancekey">285</long>
   <str name="instancename">Informática</str>
   <long name="topologyid">3141</long>
   <str name="topologyname">Inventário</str>
 </doc>


I see that you are specifying the topologyname's value in the query itself.
It might be a bug in DataImportHandler because it reads the data-config as a
string from an InputStream. If your default platform encoding is not UTF-8,
this may be the cause.

Can you try running Solr's (or your servlet container's) Java process
with -Dfile.encoding=UTF-8 and see if that fixes the problem?

-- 
Regards,
Shalin Shekhar Mangar.


Re: Solr date parsing issue

2009-03-27 Thread Shalin Shekhar Mangar
On Sat, Mar 28, 2009 at 12:46 AM, Giovanni De Stefano 
giovanni.destef...@gmail.com wrote:

 Hello,

 the problem is that I use both Solrj and DIH but I would like to perform
 such a change only in 1 place.

 Is there any way to do it? Otherwise I will stick with the other
 approach...


Which of them are you using for adding documents? Both?

-- 
Regards,
Shalin Shekhar Mangar.


Re: Question about Solr memory usage.

2009-03-27 Thread Shalin Shekhar Mangar
On Sat, Mar 28, 2009 at 12:13 AM, Jim Adams jasolru...@gmail.com wrote:

 I'm running an old version of Solr -- it's 1.2, and I'm about to upgrade to
 1.3.  But I have a question about Solr 1.2 memory usage.

 I am occasionally seeing out of memory errors in my Solr log.

 Doesn't Solr release memory after a document has been indexed ?   I would
 not think it is right for the memory usage to climb to its max specified in
 java options then give out of memory errors...


It does. But then there are the caches and the auto-warming (after commits). A lot
of that stuff can be tweaked, though.
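
(For example, if memory is tight you can shrink the caches and turn off
autowarming in solrconfig.xml; the numbers here are purely illustrative:)

<filterCache class="solr.LRUCache" size="256" initialSize="256" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="256" initialSize="256" autowarmCount="0"/>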

There are a lot of old mail threads on memory usage and optimization which
you may find useful. Use a mailing list search engine like
lucidimagination.com, markmail or nabble.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Solr date parsing issue

2009-03-27 Thread Giovanni De Stefano
Hello,

yes, I use both: I have a multicore architecture, multiple indexes but I
have been able to manage a common schema.

Giovanni


On 3/27/09, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

 On Sat, Mar 28, 2009 at 12:46 AM, Giovanni De Stefano 
 giovanni.destef...@gmail.com wrote:

  Hello,
 
  the problem is that I use both Solrj and DIH but I would like to perform
  such a change only in 1 place.
 
  Is there any way to do it? Otherwise I will stick with the other
  approach...
 

 Which of them are you using for adding documents? Both?

 --
 Regards,
 Shalin Shekhar Mangar.



How to optimize Index Process?

2009-03-27 Thread vivek sar
Hi,

  We have a distributed Solr system (2-3 boxes, each running 2
instances of Solr, and each Solr instance can write to multiple cores).
Our use case is high index volume - we can get up to 100 million
records (1 record = 500 bytes) per day - but very low query traffic
(only administrators may need to search for data, once an hour or
so). So we need very fast index times. Here are the things I'm trying
to find out in order to optimize our index process:

1) What's the optimum index size? I've noticed that as the index size grows,
the indexing time starts increasing. In our tests, under a 10G index
size we could index over 2K docs/sec, but as it grows past 20G the index
rate drops to 1,400/sec and keeps dropping as the index grows. I'm
trying to see whether we can partition (create a new SolrCore) after
10G.
 - related question: is there a way to find a SolrCore's size (any
web service for that?) - based on that information I could create a new
core and freeze the one which has reached 10G.

2) In our tests, we noticed that after a few hours (after 8 hours of
indexing) there is a period (3-4 hours) where the indexing is
very slow (around 500 records/sec), after which indexing
returns to the normal rate (1,500/sec). Does Solr run any optimize
command on its own? How can we find that out? I'm not issuing any
optimize command - should I be doing that after a certain time?

3) Every time I add new documents (10K at once) to the index, I see the
searcher closing and then re-opening/re-warming (in catalina.out)
after the commit is done. I'm not sure if this is an expensive operation.
Since our search volume is very low, can I configure Solr not to do
this? Would it make indexing any faster?

Mar 26, 2009 11:59:45 PM org.apache.solr.search.SolrIndexSearcher close
INFO: Closing Searcher@33d9337c main
Mar 26, 2009 11:59:52 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
Mar 26, 2009 11:59:52 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@46ba6905 main
Mar 26, 2009 11:59:52 PM org.apache.solr.search.SolrIndexSearcher warm
INFO: autowarming Searcher@46ba6905 main from Searcher@5c5ffecd main

4) Anything else (any other configuration in Solr - I'm currently
using all default settings in the solrconfig.xml and default handlers)
that could help optimize my indexing process?

Thanks,
-vivek


RE: large index vs multicore

2009-03-27 Thread Manepalli, Kalyan
Thanks for the reply.
Yes, in most of the use cases the data would come from both indices.
It's like a parent-child relation. The use case requires that the data from the child
be displayed along with the parent product information.


Thanks,
Kalyan Manepalli

-Original Message-
From: Ryan McKinley [mailto:ryan...@gmail.com] 
Sent: Wednesday, March 25, 2009 8:54 PM
To: solr-user@lucene.apache.org
Subject: Re: large index vs multicore



 My question is - from a design and query-speed point of view - should I add
 a new core to handle the additional data, or should I add the data to
 the existing core?

Do you ever need to get results from both sets of data in the same
query?  If so, putting them in the same index will be faster.  If
every query is always limited to results within one set or the other --
and the doc count is not huge -- then the choice of single core vs. multi
core is more about what you are more comfortable managing than it is
about query speeds.

Advantages of multicore:
  - the distinct data is in different indexes, so you can maintain them
independently
(perhaps one data set never changes and the other changes often)

Advantages of a single core (with multiple data sets):
  - everything is in one place
  - replicate / load balance a single index rather than multiple.


ryan


Re: Encoding problem

2009-03-27 Thread Shalin Shekhar Mangar
On Sat, Mar 28, 2009 at 12:51 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:


 I see that you are specifying the topologyname's value in the query itself.
 It might be a bug in DataImportHandler because it reads the data-config as a
 string from an InputStream. If your default platform encoding is not UTF-8,
 this may be the cause.


I've opened SOLR-1090 to fix this issue.

https://issues.apache.org/jira/browse/SOLR-1090

-- 
Regards,
Shalin Shekhar Mangar.


Apachecon 2009 Europe

2009-03-27 Thread Olivier Dobberkau

Hi all,

I came back with a head full of impressions from ApacheCon Europe.
Thanks a lot for the great speeches and the inspiring personal talks.

I strongly believe that Solr will have a great future.

Olivier

--
Olivier Dobberkau
d.k.d Internet Service GmbH
fon:  +49 (0)69 - 43 05 61-70 fax:  +49 (0)69 - 43 05 61-90
mail: olivier.dobber...@dkd.de home: http://www.dkd.de


using multisearcher

2009-03-27 Thread Brent Palmer

Hi everybody,
I'm interested in using Solr to search multiple indexes at once.  We
currently use our own search application, which uses Lucene's
MultiSearcher.  Has anyone attempted to, or successfully, replaced
SolrIndexSearcher with some kind of multisearcher?  I have looked at
DistributedSearch on the wiki and I'm pretty sure this isn't what we
want.  Also, does anyone have any comments about trying to replace the
SolrIndexSearcher with a SolrMultiSearcher - reasons why we shouldn't do
this, pitfalls, suggestions about how to go about it, etc.?  Also, it
should be noted that we would only be adding documents to one of the
indexes.  I can give more info about the context of this application if
necessary.


Thank you for any suggestions!

--
Brent Palmer
Widernet.org
University of Iowa
319-335-2200



OOM at MultiSegmentReader.norms

2009-03-27 Thread vivek sar
Hi,

   I have an index of size 50G (around 100 million documents) and growing -
around 2,000 records (1 rec = 500 bytes) are being written every second,
continuously. If I run any search on this index I get an OOM. I'm using
the default cache settings (512,512,256) in solrconfig.xml. The search
is via the admin interface (returning 10 rows) with no sorting,
faceting or highlighting. Max heap size is 1024m.

Mar 27, 2009 9:13:41 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.OutOfMemoryError: Java heap space
    at org.apache.lucene.index.MultiSegmentReader.norms(MultiSegmentReader.java:335)
    at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:69)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
    at org.apache.lucene.search.Searcher.search(Searcher.java:126)
    at org.apache.lucene.search.Searcher.search(Searcher.java:105)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:966)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:269)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:160)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:169)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)

What could be the problem?

Thanks,
-vivek


Re: optimization advice?

2009-03-27 Thread Otis Gospodnetic

OK, we are a step closer.  Sorting makes things slower.  What field(s) do you 
sort on, what are their types, and if there is a date in there, are the dates 
very granular, and if they are, do you really need them to be that precise?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Steve Conover scono...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Friday, March 27, 2009 1:51:14 PM
 Subject: Re: optimization advice?
 
  Steve,
 
  Maybe you can tell us about:
 
 sure
 
  - your hardware
 
 2.5GB RAM, pretty modern virtual servers
 
  - query rate
 
 Let's say a few queries per second max...  4
 
 And in general the challenge is to get latency on any given query down
 to something very low - we don't have to worry about a huge amount of
 load at the moment.
 
  - document cache and query cache settings
 
 
 <queryResultCache
   class="solr.LRUCache"
   size="512"
   initialSize="512"
   autowarmCount="256"/>
 
 <documentCache
   class="solr.LRUCache"
   size="512"
   initialSize="512"
   autowarmCount="0"/>
 
  - your current response times
 
 This depends on the query.  For queries that involve a total record
 count of < 1 million, we often see < 10ms response times, up to
 400-500ms in the worst case.  When we do a page-one, sorted query on our
 full record set of 2 million+ records, response times can get up into
 2+ seconds.
 
  - any pain points, any slow query patterns
 
 Something that can't be emphasized enough is that we can't predict
 what records people will want.  Almost every query is aimed at a
 different set of records.
 
 -Steve



Re: How to optimize Index Process?

2009-03-27 Thread Otis Gospodnetic

Hi,

Answers inlined.

 
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message 
   We have a distributed Solr system (2-3 boxes with each running 2
 instances of Solr and each Solr instance can write to multiple cores).

Is this really optimal?  How many CPU cores do your boxes have vs. the number 
of Solr cores?

 Our use case is high index volume - we can get up to 100 million
 records (1 record = 500 bytes) per day, but very low query traffic
 (only administrators may need to search for data - once an hour our
 so). So, we need very fast index time. Here are the things I'm trying
 to find out in order to optimize our index process,

It's starting to sound like you might be able to batch your data and use
http://wiki.apache.org/solr/UpdateCSV -- it's the fastest indexing method, I
believe.

 1) What's the optimum index size? I've noticed as the index size grows
 the indexing time starts increasing. In our test less than 10G index
 size we could index over 2K/sec, but as it grows over 20G the index
 rate drops to 1400/sec and keeps dropping as index size grows. I'm
 trying to see whether we can partition (create new SolrCore) after
 10G.

That's likely due to Lucene's segment merging. You can make mergeFactor bigger
to make segment merging less frequent, but don't make it too high or you'll run
into open-file-descriptor limits (which you could raise, of course).

  - related question, is there a way to find the SolrCore size (any
 web service for that?) - based on that information I can create a new
 core and freeze the one which has reached 10G.

You can see the number of docs in an index via the Admin Statistics page (the
response is actually XML - look at the source).

 2) In our test, we noticed that after few hours (after 8 hours of
 indexing) there is a period (3-4 hours period) where the indexing is
 very-very slow (like 500 records/sec) and after that period indexing
 returns back to normal rate (1500/sec). Does Solr run any optimize
 command on its own? How can we find that out?  I'm not issuing any
 optimize command - should I be doing that after certain time?

No, it doesn't run optimize on its own.  It could be running auto-commit, but
you should comment that out anyway.  Try doing a thread dump to see what's
going on, and watch the system with top and vmstat.
No, you shouldn't optimize until you are completely done indexing.

 3) Every time I add new documents (10K at once) to the index I see
 searcher closing and then re-opening/re-warming (in Catalina.out)
 after commit is done. I'm not sure if this is an expensive operation.
 Since, our search volume is very low can I configure Solr to not do
 this? Would it make indexing any faster?

Are you running the commit command after every 10K docs?  No need to do that if 
you don't need your searcher to see the changes immediately.

 Mar 26, 2009 11:59:45 PM org.apache.solr.search.SolrIndexSearcher close
 INFO: Closing Searcher@33d9337c main
 Mar 26, 2009 11:59:52 PM org.apache.solr.update.DirectUpdateHandler2 commit
 INFO: start commit(optimize=false,waitFlush=false,waitSearcher=true)
 Mar 26, 2009 11:59:52 PM org.apache.solr.search.SolrIndexSearcher <init>
 INFO: Opening Searcher@46ba6905 main
 Mar 26, 2009 11:59:52 PM org.apache.solr.search.SolrIndexSearcher warm
 INFO: autowarming Searcher@46ba6905 main from Searcher@5c5ffecd main
 
 4) Anything else (any other configuration in Solr - I'm currently
 using all default settings in the solrconfig.xml and default handlers)
 that could help optimize my indexing process?

Increase ramBufferSizeMB as much as you can afford.
Comment out maxBufferedDocs, it's deprecated.
Increase mergeFactor slightly.
Consider the CSV approach.
Index with multiple threads (match the number of CPU cores).
If you are using Solrj, use the Streaming version of SolrServer.
Give the JVM more memory (you'll need it if you increase ramBufferSizeMB)
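
(The first three of those live in the <indexDefaults> section of
solrconfig.xml; a sketch with purely illustrative values:)

<indexDefaults>
  <!-- bigger RAM buffer = fewer flushes; size it to your heap -->
  <ramBufferSizeMB>256</ramBufferSizeMB>
  <!-- higher mergeFactor = less frequent merges, but more open files -->
  <mergeFactor>20</mergeFactor>
  <!-- deprecated; leave it commented out -->
  <!-- <maxBufferedDocs>1000</maxBufferedDocs> -->
</indexDefaults>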

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch 



Re: OOM at MultiSegmentReader.norms

2009-03-27 Thread Otis Gospodnetic

That's a tiny heap.  Part of it is used for indexing, too.  And the fact that 
your heap is so small shows you are not really making use of that nice 
ramBufferSizeMB setting. :)

Also, use omitNorms=true for fields that don't need norms (if their types 
don't already do that).
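
(In schema.xml that's just an attribute on the field definition; "somefield"
here is a placeholder:)

<field name="somefield" type="string" indexed="true" stored="true" omitNorms="true"/>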

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: vivek sar vivex...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Friday, March 27, 2009 6:15:59 PM
 Subject: OOM at MultiSegmentReader.norms
 
 Hi,
 
    I have an index of size 50G (around 100 million documents) and growing -
 around 2,000 records (1 rec = 500 bytes) are being written every second,
 continuously. If I run any search on this index I get an OOM. I'm using
 the default cache settings (512,512,256) in solrconfig.xml. The search
 is via the admin interface (returning 10 rows) with no sorting,
 faceting or highlighting. Max heap size is 1024m.
 
 Mar 27, 2009 9:13:41 PM org.apache.solr.common.SolrException log
 SEVERE: java.lang.OutOfMemoryError: Java heap space
     at org.apache.lucene.index.MultiSegmentReader.norms(MultiSegmentReader.java:335)
     at org.apache.lucene.search.TermQuery$TermWeight.scorer(TermQuery.java:69)
     at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:132)
     at org.apache.lucene.search.Searcher.search(Searcher.java:126)
     at org.apache.lucene.search.Searcher.search(Searcher.java:105)
     at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:966)
     at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838)
     at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:269)
     at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:160)
     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:169)
     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 
 What could be the problem?
 
 Thanks,
 -vivek



solr date parsing issue

2009-03-27 Thread Suryasnat Das
Hi,

I am implementing a project using Solr in which we need to do a search based
on a date range. I am passing the date in Solr date format. During formation
of the Solr query I am encoding the date string using UTF-8 encoding. After
forming the whole query string I am posting the search request to Solr by
using Apache's HttpClient class. But during posting it says invalid query.
How can I resolve this?
Please, I need a resolution on an immediate basis. Thanks in advance.

Regards
Suryasnat Das
Infosys


More Robust Search Timeouts (to Kill Zombie Queries)?

2009-03-27 Thread Chris Harris
I've noticed that some of my queries take so long (5 min+) that by the
time they return, there is no longer any plausible use for the search
results. I've started calling these zombie queries because, well, they
should be dead, but they just won't die. Instead, they stick around,
wasting my Solr box's CPU, RAM, and I/O resources, and potentially
causing more legitimate queries to stack up. (Regarding stacking up,
see SOLR-138.)

I may be able to prevent some of this by optimizing my index settings
and by disallowing certain things, such as prefix wildcard queries
(e.g. *ar). However, right now I'm most interested in figuring out how
to get more robust server-side search timeouts in Solr. This would
seem to provide a good balance between these goals:

1) I would like to allow users to attempt to run potentially expensive
queries, such as queries with lots of wildcards or ranges
2) I would like to make sure that potentially expensive queries don't
turn into zombies -- especially long-lasting zombies

For example, I think some of my users might be willing to wait a
minute or two for certain classes of search to complete. But after
that point, I'd really like to say enough is enough.

[Background]

While my load is pretty low (it's not a public-facing site), some of
my queries are monsters that can take, say, over 5 minutes. (I don't
know how much longer than 5 minutes they might take. Some of them
might take hours, for all I know, if allowed to run to completion!)

The biggest culprit queries currently seem to be wildcard queries. This
is made worse by the fact that I've allowed prefix wildcard searches on an index
with a large # of terms. (This is made worse yet by doing word bigram
indexing.)

I've implemented the timeAllowed search-timeout support feature
introduced in SOLR-502, and this does catch some searches that would
have become zombies. (Some proximity searches, for example.) But the
timeAllowed mechanism does not catch everything. And, as I understand
it, it's powerless to do anything about, say, wildcard expansions that
are taking forever.

The question is how to proceed.

[Option 1: Wait for someone to bring timeAllowed support to more parts
of Solr search]

This might be nice. I sort of assume it will happen eventually. I kind
of want a more immediate solution, though. Any thoughts on how hard it
would be to add the timeout to, say, wildcard expansion? I haven't
figured out if I know enough about Solr yet to work on this myself.

[Option 2: Add gross timeout support to StandardRequestHandler?]

What if I modified StandardRequestHandler so that, when it was invoked,
the following would happen:

* spawn a new thread t to do the stuff that StandardRequestHandler
would normally do
* start thread t
* sleep, waiting either for thread t to finish or for a timer to go off
* after waking up, check whether the timer went off; if so,
terminate thread t

This would kill any runaway zombie queries. But maybe it would also
have horrible side effects. Is it wishful thinking to believe that
this might not screw up reference counting, or create deadlocks, or
anything else?
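
(For what it's worth, a rough sketch of that idea using java.util.concurrent
instead of a hand-rolled sleep loop; handleRequest stands in for whatever
the request handler would normally do, and note that cancel(true) only
*interrupts* the worker - code that never checks its interrupt status will
keep running, which is exactly the zombie concern above:)

import java.util.concurrent.*;

public class TimeoutHandlerSketch {
    private static final ExecutorService pool = Executors.newCachedThreadPool();

    public static void handleWithTimeout(Runnable handleRequest,
                                         long timeoutSeconds) throws Exception {
        Future<?> f = pool.submit(handleRequest);
        try {
            f.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            f.cancel(true); // best effort; may leave a zombie behind anyway
            throw new RuntimeException(
                "Search timed out after " + timeoutSeconds + "s");
        }
    }
}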

[Option 3: Servlet container-level Solutions?]

I thought Jetty and friends might have an option along the lines of
"if a request is taking longer than x seconds, then abort the thread
handling it." This seems troublesome in practice, though:

1) I can't find a servlet container with documentation clearly stating
that this is possible.
2) I played with Jetty, and maxIdleTime sounded like it *might* cause
this behavior, but experiments suggest otherwise.
3) This behavior sounds dangerous anyway, unless you can convince
the servlet container to abort only index-reading threads, while
leaving index-writing threads alone.

Thanks for any advice,
Chris


Re: optimization advice?

2009-03-27 Thread Steve Conover
We sort by default on name, which varies quite a bit (we're never
going to make sorting by field go away).

The thing is, Solr has been pretty amazing across 1 million records.
Now that we've doubled the size of the dataset, things are definitely
slower in a nonlinear way... I'm wondering what factors are involved
here.

-Steve

On Fri, Mar 27, 2009 at 6:58 PM, Otis Gospodnetic
otis_gospodne...@yahoo.com wrote:

 OK, we are a step closer.  Sorting makes things slower.  What field(s) do you 
 sort on, what are their types, and if there is a date in there, are the dates 
 very granular, and if they are, do you really need them to be that precise?


 Otis
 --
 Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



 - Original Message 
 From: Steve Conover scono...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Friday, March 27, 2009 1:51:14 PM
 Subject: Re: optimization advice?

  Steve,
 
  Maybe you can tell us about:

 sure

  - your hardware

 2.5GB RAM, pretty modern virtual servers

  - query rate

 Let's say a few queries per second max...  4

 And in general the challenge is to get latency on any given query down
 to something very low - we don't have to worry about a huge amount of
 load at the moment.

  - document cache and query cache settings


         <queryResultCache
           class="solr.LRUCache"
           size="512"
           initialSize="512"
           autowarmCount="256"/>

         <documentCache
           class="solr.LRUCache"
           size="512"
           initialSize="512"
           autowarmCount="0"/>

  - your current response times

 This depends on the query.  For queries that involve a total record
 count of < 1 million, we often see < 10ms response times, up to
 400-500ms in the worst case.  When we do a page-one, sorted query on our
 full record set of 2 million+ records, response times can get up into
 2+ seconds.

  - any pain points, any slow query patterns

 Something that can't be emphasized enough is that we can't predict
 what records people will want.  Almost every query is aimed at a
 different set of records.

 -Steve




Re: solr date parsing issue

2009-03-27 Thread Kurt Nordstrom

Mr. Das,

Can you provide a few more details here?

Helpful information would be:

- The query string you're using.

- The fieldtype you're using for indexing the value in question.

- The exact error message you're getting from Solr.


Suryasnat Das wrote:
 
 Hi,
 
 I am implementing a project using Solr in which we need to do a search
 based
 on a date range. I am passing the date in Solr date format. During formation
 of the Solr query I am encoding the date string using UTF-8 encoding.
 After
 forming the whole query string I am posting the search request to Solr by
 using Apache's HttpClient class. But during posting it says invalid query.
 How can I resolve this?
 Please, I need a resolution on an immediate basis. Thanks in advance.
 
 Regards
 Suryasnat Das
 Infosys
 
 

-- 
View this message in context: 
http://www.nabble.com/solr-date-parsing-issue-tp22753196p22753613.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: optimization advice?

2009-03-27 Thread Otis Gospodnetic

Steve,

A field named "name" sounds like a free-text field.  What is its type, string
or text?  Fields you sort by should not be tokenized but should be indexed.  I
have a hunch your name field is tokenized.
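
(The usual fix, sketched here with illustrative field names, is to keep the
tokenized field for searching and sort on an untokenized string copy:)

<field name="name" type="text" indexed="true" stored="true"/>
<field name="name_sort" type="string" indexed="true" stored="false"/>
<copyField source="name" dest="name_sort"/>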


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
 From: Steve Conover scono...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Friday, March 27, 2009 11:59:52 PM
 Subject: Re: optimization advice?
 
 We sort by default on name, which varies quite a bit (we're never
 going to make sorting by field go away).
 
 The thing is solr has been pretty amazing across 1 million records.
 Now that we've doubled the size of the dataset things are definitely
 slower in a nonlinear way...I'm wondering what factors are involved
 here.
 
 -Steve
 
 On Fri, Mar 27, 2009 at 6:58 PM, Otis Gospodnetic
 wrote:
 
  OK, we are a step closer.  Sorting makes things slower.  What field(s) do 
  you 
 sort on, what are their types, and if there is a date in there, are the dates 
 very granular, and if they are, do you really need them to be that precise?
 
 
  Otis
  --
  Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
 
 
 
  - Original Message 
  From: Steve Conover 
  To: solr-user@lucene.apache.org
  Sent: Friday, March 27, 2009 1:51:14 PM
  Subject: Re: optimization advice?
 
   Steve,
  
   Maybe you can tell us about:
 
  sure
 
   - your hardware
 
  2.5GB RAM, pretty modern virtual servers
 
   - query rate
 
  Let's say a few queries per second max...  4
 
  And in general the challenge is to get latency on any given query down
  to something very low - we don't have to worry about a huge amount of
  load at the moment.
 
   - document cache and query cache settings
 
 
  <queryResultCache
    class="solr.LRUCache"
    size="512"
    initialSize="512"
    autowarmCount="256"/>
 
  <documentCache
    class="solr.LRUCache"
    size="512"
    initialSize="512"
    autowarmCount="0"/>
 
   - your current response times
 
  This depends on the query.  For queries that involve a total record
  count of < 1 million, we often see < 10ms response times, up to
  400-500ms in the worst case.  When we do a page-one, sorted query on our
  full record set of 2 million+ records, response times can get up into
  2+ seconds.
 
   - any pain points, any slow query patterns
 
  Something that can't be emphasized enough is that we can't predict
  what records people will want.  Almost every query is aimed at a
  different set of records.
 
  -Steve
 
 



Re: solr date parsing issue

2009-03-27 Thread Shalin Shekhar Mangar
On Sat, Mar 28, 2009 at 8:17 AM, Suryasnat Das suryaatw...@gmail.com wrote:

 Hi,

 I am implementing a project using Solr in which we need to do a search
 based
 on a date range. I am passing the date in Solr date format. During formation
 of the Solr query I am encoding the date string using UTF-8 encoding. After
 forming the whole query string I am posting the search request to Solr by
 using Apache's HttpClient class. But during posting it says invalid query.
 How can I resolve this?
 Please, I need a resolution on an immediate basis. Thanks in advance.


Why don't you use Solrj client?

http://wiki.apache.org/solr/Solrj

-- 
Regards,
Shalin Shekhar Mangar.