Re: SOLR developer

2007-08-31 Thread Tim Archambault
Thanks. I didn't mean to send that to the list-serv :}

On 8/31/07, Bertrand Delacretaz [EMAIL PROTECTED] wrote:

 On 8/31/07, Tim Archambault [EMAIL PROTECTED] wrote:
  ...I'm thinking of sending a similar
  list-serv item out, but I noticed this is a solr-user list, not necessarily
  a developers list so I thought I'd ask

 Note that there's also [EMAIL PROTECTED] for such purposes, see
 http://www.apachenews.org/archives/000465.html

 But AFAIK, project-related job offers are ok on ASF lists, preferably
 with a [JOB] marker in the subject line.

 -Bertrand (*not* available for consulting ATM, and currently inactive
 on Solr anyway)



Re: multiple solr home directories

2007-08-31 Thread Chris Hostetter



Just to make sure.  you mean we can create a directory containing the shared
jars, and each solr home/lib will symlink to the jar files in that
directory. Right?


correct.


-Hoss
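
A minimal sketch of the symlink layout confirmed above, written in Python. The function name and the directory arguments are hypothetical; the only thing taken from the thread is the idea that each solr home's lib directory holds symlinks to jars kept in one shared directory.

```python
import os
from pathlib import Path


def link_shared_jars(shared_dir, solr_homes):
    """Symlink every .jar in shared_dir into <home>/lib for each Solr home.

    Returns the list of symlink paths that were created.
    """
    created = []
    shared = Path(shared_dir).resolve()
    for home in solr_homes:
        lib_dir = Path(home) / "lib"
        lib_dir.mkdir(parents=True, exist_ok=True)
        for jar in sorted(shared.glob("*.jar")):
            link = lib_dir / jar.name
            if link.is_symlink() or link.exists():
                continue  # leave anything that is already there untouched
            os.symlink(jar, link)
            created.append(link)
    return created
```

Running it a second time creates nothing new, so it can be re-run whenever a jar is added to the shared directory.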


Re: minimum occurances of term in document

2007-08-31 Thread Jed Reynolds

Mike Klaas wrote:


On 30-Aug-07, at 4:01 PM, Chris Hostetter wrote:



You could accomplish the goal without any coding by using phrase
queries: "calico calico calico"~1 will match only documents
that have at least three occurrences of calico.  If this is
performant enough, you are done. Otherwise, you'll have to do some
custom coding.


I'll be searching article content so literals like "cat cat cat" are
improbable.


i think you misunderstood Mike's point ... the query string...
 foo:"cat cat cat"~1

...will only match documents containing three instances of the term
cat in the field foo where those instances are all within 1
term positions of each other ... the idea being that as long as the
slop (number) used is bigger than the largest document you expect
to deal with, this will essentially give you what you want.


Note too that by default solr only indexes the first 10k tokens, so 
this should work for all documents in the index.


-Mike




Whoa! When I first read the original suggestion, I was thinking ^1
because I happened to be googling "solr filter by score" (another topic
I learned is hardly worth pursuing).


Yeah, I'm going to try that right now

Jed
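
For reference, the query string discussed in this thread can be assembled as in the sketch below. The helper name is hypothetical, and the slop value in the usage line is an assumption chosen to exceed the 10k-token default Mike mentions; whether the match behaves as hoped depends on how Lucene scores sloppy phrases with repeated terms, so treat it as something to test rather than a guarantee.

```python
def min_occurrences_query(field, term, count, slop):
    """Build a sloppy phrase query that repeats `term` `count` times.

    Example shape: foo:"cat cat cat"~10000
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    if any(ch.isspace() or ch == '"' for ch in term):
        raise ValueError("term must be a single token without quotes")
    phrase = " ".join([term] * count)
    return f'{field}:"{phrase}"~{slop}'


# Slop of 10000 is an assumed value, picked to be at least the indexed token limit.
query = min_occurrences_query("foo", "cat", 3, 10000)
```

The resulting string still needs URL encoding before it is placed in the q parameter.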


Re: multiple solr home directories

2007-08-31 Thread Ozgur Yilmazel
I have a related question on this topic. I have a web application
in which I would like to create indexes for individual users on the fly.
Is it possible to do the JNDI configuration without restarting Tomcat?
Here is some more detail on what I am trying to do:
Our search application has a web-based administration page in which
administrators can select a set of documents and make them available for
search on different URLs or with different user privileges. I know we
could use the same index and filter results based on an indexname
field, but having separate indexes would make it easier for us to
migrate an index to a different machine.

Thank you for your help.

Ozgur




On 8/31/07, Chris Hostetter [EMAIL PROTECTED] wrote:

  Just to make sure.  you mean we can create a directory containing the shared
  jars, and each solr home/lib will symlink to the jar files in that
  directory. Right?

 correct.


 -Hoss



RE: performance questions

2007-08-31 Thread Jonathan Woods
Only if you think the rest of Solr would be better written in JRuby too! 

 -Original Message-
 From: Erik Hatcher [mailto:[EMAIL PROTECTED] 
 Sent: 31 August 2007 02:57
 To: solr-user@lucene.apache.org
 Subject: Re: performance questions
 
 
 On Aug 30, 2007, at 6:31 PM, Mike Klaas wrote:
  Another reason why people use stored procs is to prevent multiple
  round-trips in a multi-stage query operation.  This is exactly what
  complex RequestHandlers do (and the equivalent to a custom stored proc
  would be writing your own handler).
 
 And we should be writing those handlers in JRuby ;)   Who's with me?
 
   Erik
 
 
 
 



Solrsharp now supports debugQuery

2007-08-31 Thread Jeff Rodenburg
Solrsharp now supports query debugging.  This is enabled through the
debugQuery and explainOther parameters.

A DebugResults object is referenced by a SearchResults instance and provides
all the debugging information that is available through these parameters,
such as:

   - QueryString and ParsedQuery string values
   - Array of ExplanationRecord objects
   - OtherQuery value (if provided)
   - Array of ExplanationRecord objects supporting the OtherQuery value

The ExplanationRecord object provides the details of the debug results,
specifically including the ExplainInfo string (the debug analysis payload)
and a reference to the UniqueRecordKey of the evaluated record.  The
UniqueRecordKey, though returned as a string, could then be cast
appropriately to reference the matching SearchRecord referenced by the same
SearchResults instance.

The example program with the source code has been updated to show how to
make use of these properties.  If any issues are found, please log them to
JIRA and associate them with the C# client component.
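
For readers who want to see the underlying request, the two Solr parameters named above can be added to a select URL as in this Python sketch. It does not use Solrsharp; the helper name, base URL and example queries are placeholders.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, query, explain_other=None):
    """Return a select URL with debugQuery enabled.

    explain_other, if given, is a second query whose matching documents
    Solr should also explain against the main query.
    """
    params = [("q", query), ("debugQuery", "on")]
    if explain_other is not None:
        params.append(("explainOther", explain_other))
    return f"{base_url}?{urlencode(params)}"


# Placeholder host and queries, for illustration only.
url = debug_query_url(
    "http://localhost:8983/solr/select", "name:sdram", "id:VS1GB400C3"
)
```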

cheers,
jeff r.


Distribution Information?

2007-08-31 Thread Matthew Runo

Hello!

/solr/admin/distributiondump.jsp

This server is set up as a master server, and other servers use the  
replication scripts to pull updates from it every few minutes. My  
distribution information screen is blank.. and I couldn't find any  
information on fixing this in the wiki.


Any chance someone would be able to explain how to get this page  
working, or what I'm doing wrong?


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++




RE: Re: multiple solr home directories

2007-08-31 Thread Stu Hood
You can use a combination of the Tomcat Manager app: 
http://tomcat.apache.org/tomcat-6.0-doc/manager-howto.html and this patch: 
https://issues.apache.org/jira/browse/SOLR-336 to create instances on the fly.

My three types of instances have separate home directories, but each running 
instance uses a different data directory.

Thanks,
Stu





Replication broken.. no helpful errors?

2007-08-31 Thread Matthew Runo

Hello!

On a somewhat related note, our replication seems very much broken.  
I've added -v to all my cron jobs, and I think I've seen the error  
(below).


As you can see, it's rsyncing an updated index, but then doesn't seem  
to know to install it. I'm not sure why though.. no errors are  
reported anywhere via the -v. Any help would be most appreciated, I'm  
sure I'm just missing something.   You can see the cronjob command in  
the subject of the forwarded message.


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


Begin forwarded message:


From: [EMAIL PROTECTED] (Cron Daemon)
Date: August 31, 2007 1:02:36 PM PDT
To: [EMAIL PROTECTED]
Subject: Cron [EMAIL PROTECTED] /opt/solr/bin/snappuller -M search1 -P 18080 -D /opt/solr/data -S /opt/solr/logs -d /opt/solr/data -v;/opt/solr/bin/snapinstaller -M search1 -S /opt/solr/logs -d /opt/solr/data -v


started by tomcat5
command: /opt/solr/bin/snappuller -M search1 -P 18080 -D /opt/solr/data -S /opt/solr/logs -d /opt/solr/data -v

pulling snapshot snapshot.20070831130005
receiving file list ... done
deleting segments_19zm
deleting _10bg.tis
deleting _10bg.tii
deleting _10bg.prx
deleting _10bg.nrm
deleting _10bg.frq
deleting _10bg.fnm
deleting _10bg.fdx
deleting _10bg.fdt
./
_14ff.fdt
_14ff.fdx
_14ff.fnm
_14ff.frq
_14ff.nrm
_14ff.prx
_14ff.tii
_14ff.tis
_14ff_9.del
_14fq.fdt
_14fq.fdx
_14fq.fnm
_14fq.frq
_14fq.nrm
_14fq.prx
_14fq.tii
_14fq.tis
_14fq_3.del
_14fr.fdt
_14fr.fdx
_14fr.fnm
_14fr.frq
_14fr.nrm
_14fr.prx
_14fr.tii
_14fr.tis
_14fr_2.del
_14fs.fdt
_14fs.fdx
_14fs.fnm
_14fs.frq
_14fs.nrm
_14fs.prx
_14fs.tii
_14fs.tis
_14fs_1.del
segments.gen
segments_1fza
write.lock

sent 871 bytes  received 843185604 bytes  24440187.68 bytes/sec
total size is 843080453  speedup is 1.00
started by tomcat5
command: /opt/solr/bin/snapinstaller -M search1 -S /opt/solr/logs -d /opt/solr/data -v
latest snapshot /opt/solr/data/temp-snapshot.20070816120113 already installed






Re: Replication broken.. no helpful errors?

2007-08-31 Thread Bill Au
 latest snapshot /opt/solr/data/temp-snapshot.20070816120113 already
 installed

It looks like you have a directory named temp-snapshot.20070816120113
in your data directory.  You should remove it.  One of the other
scripts might have left that behind somehow.

I will update the snapinstaller script to ignore non-snapshot directories
when looking for the latest snapshot to install.

Bill
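
The selection rule described above can be illustrated with a short Python sketch. This is not the snapinstaller script (which is a shell script); it only assumes that real snapshots are named snapshot.<14-digit timestamp>, as in the log output, and that anything else in the data directory should be skipped.

```python
import re

# Assumed naming convention: "snapshot." followed by a 14-digit timestamp.
SNAPSHOT_NAME = re.compile(r"^snapshot\.(\d{14})$")


def latest_snapshot(names):
    """Return the newest snapshot.<timestamp> name, or None if there is none.

    Names such as temp-snapshot.<timestamp> are ignored. Because the
    timestamps have a fixed width, the lexicographic maximum is also the
    most recent snapshot.
    """
    snapshots = [n for n in names if SNAPSHOT_NAME.match(n)]
    return max(snapshots, default=None)


entries = [
    "index",
    "snapshot.20070816120113",
    "snapshot.20070831130005",
    "temp-snapshot.20070816120113",
]
newest = latest_snapshot(entries)
```

A looser substring filter on "snapshot." followed by a reverse sort would pick the temp- directory here, since "t" sorts after "s", which is consistent with the "already installed" message in the log.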

 




Re: Distribution Information?

2007-08-31 Thread Bill Au
Are there any error messages in your appserver log files?

Bill






Re: Sort basics

2007-08-31 Thread mel2k

Yes, when I upgraded to version 1.2 of Solr, sort works fine. Thank you for
your reply and help.


Yonik Seeley wrote:
 
 The separate sort parameter for the standard handler is relatively new
 (as of Solr 1.2)
 Is that the version of Solr you are using?  If so, can you also supply
 the output Solr gives you as the result of your query?
 
 -Yonik
 
 On 8/23/07, mel2k [EMAIL PROTECTED] wrote:

 Hello,

 I am new to Solr and trying to understand how the sort functionality is
 working. Thanks in advance for your help on the following questions.

 I have taken the default download, started Solr and posted the mem.xml. I
 updated the mem.xml by copying each of the items and changing ONLY the id
 and price fields. The xml file is shown below. Now when I sort by price it
 does not seem to work. It simply shows me the documents in the order I
 inserted them. I was expecting the results that matched the term to be
 sorted by price regardless of the 'score'.

 1. What am I doing wrong?
 2. Does the 'score' override any sort parameter? Or how do I get the list
 of documents that match 'sdram' in the name field sorted by price?


 Query:
 http://localhost:8983/solr/select?q=name%3A%28sdram%29&sort=price%20asc&version=2.1&start=0&rows=100&fl=name+price+score&qt=standard
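
 A select URL with a separate sort parameter (available as of Solr 1.2, per Yonik's reply) can be built as in the Python sketch below. The helper name and its defaults are hypothetical; the parameter names q, sort, start, rows and fl are the ones that appear in the query above.

```python
from urllib.parse import urlencode


def select_url(base_url, query, sort=None, fields=None, start=0, rows=10):
    """Build a Solr select URL, URL-encoding each parameter value."""
    params = [("q", query)]
    if sort:
        params.append(("sort", sort))  # e.g. "price asc"
    params.extend([("start", start), ("rows", rows)])
    if fields:
        params.append(("fl", " ".join(fields)))
    return f"{base_url}?{urlencode(params)}"


url = select_url(
    "http://localhost:8983/solr/select",
    "name:(sdram)",
    sort="price asc",
    fields=["name", "price", "score"],
    rows=100,
)
```

 Using urlencode keeps the ampersands between parameters and escapes the colon and parentheses in the query.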

 Results:

 <response>
 <responseHeader>
 <status>0</status>
 <QTime>0</QTime>
 <lst name="params">
 <str name="sort">price asc</str>
 <str name="fl">name price score</str>
 <str name="start">0</str>
 <str name="q">name:(sdram)</str>
 <str name="qt">standard</str>
 <str name="version">2.1</str>
 <str name="rows">100</str>
 </lst>
 </responseHeader>
 <result name="response" numFound="6" start="0" maxScore="0.30217415">
 <doc>
 <float name="score">0.30217415</float>
 <str name="name">CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail</str>
 <float name="price">374.99</float>
 </doc>
 <doc>
 <float name="score">0.30217415</float>
 <str name="name">A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM</str>
 </doc>
 <doc>
 <float name="score">0.30217415</float>
 <str name="name">CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail</str>
 <float name="price">274.99</float>
 </doc>
 <doc>
 <float name="score">0.30217415</float>
 <str name="name">A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM</str>
 </doc>
 <doc>
 <float name="score">0.2590064</float>
 <str name="name">CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail</str>
 <float name="price">3185.0</float>
 </doc>
 <doc>
 <float name="score">0.2590064</float>
 <str name="name">CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail</str>
 <float name="price">2185.0</float>
 </doc>
 </result>
 </response>

 Data file posted:

 <?xml version="1.0" ?>
 <add>
 <doc>
   <field name="id">TWINX2048-3200PRO</field>
   <field name="name">CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail</field>
   <field name="manu">Corsair Microsystems Inc.</field>
   <field name="cat">electronics</field>
   <field name="cat">memory</field>
   <field name="features">CAS latency 2, 2-3-3-6 timing, 2.75v, unbuffered, heat-spreader</field>
   <field name="price">3185</field>
   <field name="popularity">5</field>
   <field name="inStock">true</field>
 </doc>

 <doc>
   <field name="id">VS1GB400C3</field>
   <field name="name">CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail</field>
   <field name="manu">Corsair Microsystems Inc.</field>
   <field name="cat">electronics</field>
   <field name="cat">memory</field>
   <field name="price">374.99</field>
   <field name="popularity">7</field>
   <field name="inStock">true</field>
 </doc>

 <doc>
   <field name="id">VDBDB1A16</field>
   <field name="name">A-DATA V-Series 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - OEM</field>
   <field name="manu">A-DATA Technology Inc.</field>
   <field name="cat">electronics</field>
   <field name="cat">memory</field>
   <field name="features">CAS latency 3,  2.7v</field>
   <!-- note: price is missing on this one -->
   <field name="popularity">5</field>
   <field name="inStock">true</field>

 </doc>
 <doc>
   <field name="id">2TWINX2048-3200PRO</field>
   <field name="name">CORSAIR XMS 2GB (2 x 1GB) 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) Dual Channel Kit System Memory - Retail</field>
   <field name="manu">2Corsair Microsystems Inc.</field>
   <field name="cat">electronics</field>
   <field name="cat">memory</field>
   <field name="features">CAS latency 2, 2-3-3-6 timing, 2.75v, unbuffered, heat-spreader</field>
   <field name="price">2185</field>
   <field name="popularity">5</field>
   <field name="inStock">true</field>
 </doc>

 <doc>
   <field name="id">2VS1GB400C3</field>
   <field name="name">CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail</field>
   <field name="manu">2Corsair Microsystems Inc.</field>
   <field name="cat">electronics</field>
   field