Re: Multiple Cores Vs. Single Core for the following use case

2010-01-27 Thread Marc Sturlese

In case you are going to use a core per user, take a look at this patch:
http://wiki.apache.org/solr/LotsOfCores

Trey-13 wrote:
 
 Hi Matt,
 
 In most cases you are going to be better off going with the userid method
 unless you have a very small number of users and a very large number of
 docs/user. The userid method will likely be much easier to manage, as you
 won't have to spin up a new core every time you add a new user.  I would
 start here and see if the performance is good enough for your requirements
 before you start worrying about it not being efficient.
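Trey's suggestion above can be sketched concretely: with a single core, the per-user restriction goes into a filter query (`fq`), which Solr caches in its filterCache, so repeated searches for the same user stay cheap. A minimal sketch (class name, host, and field name `userId` are hypothetical):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UserQuery {
    // Build a per-user Solr select URL: the user's search terms go in q,
    // and the user restriction goes in fq, which does not affect scoring
    // and is cached by Solr's filterCache.
    static String buildUrl(String host, String query, String userId)
            throws UnsupportedEncodingException {
        return host + "/solr/select?q=" + URLEncoder.encode(query, "UTF-8")
                + "&fq=" + URLEncoder.encode("userId:" + userId, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildUrl("http://localhost:8983", "new york", "42"));
    }
}
```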
 
 That being said, I really don't have any idea what your data looks like.
 How many users do you have?  How many documents per user?  Are any
 documents
 shared by multiple users?
 
 -Trey
 
 
 
 On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
 matthieu_lab...@yahoo.comwrote:
 
 Hi



 Shall I set up Multiple Core or Single core for the following use case:



 I have X number of users.



 When I do a search, I always know for which user I am doing a search



 Shall I set up X cores, 1 for each user ? Or shall I set up 1 core and
 add
 a userId field to each document?



 If I choose the 1 core solution then I am concerned with performance.
 Let's say I search for NewYork ... If lucene returns all New York
 matches for all users and then filters based on the userId, then this
 is going to be less efficient than if I have sharded per user and send
 the request for New York to the user's core



 Thank you for your help



 matt







 
 

-- 
View this message in context: 
http://old.nabble.com/Multiple-Cores-Vs.-Single-Core-for-the-following-use-case-tp27332288p27335403.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Fastest way to use solrj

2010-01-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
how many fields are there in each doc? the binary format just reduces
overhead. it does not touch/compress the payload

2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I have 3 million documents, each having 5000 chars. The xml file is
 about 15GB. The binary file is also about 15GB.

 I was a bit surprised about this. It doesn't bother me much though. At
 least it performs better.

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 if you write only a few docs you may not observe much difference in
 size. if you write large no:of docs you may observe a big difference.

 2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I got the binary format to work perfectly now. Performance is better
 than with xml. Thanks!

 Although, it doesn't look like a binary file is smaller in size than
 an xml file?

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/21 Tim Terlegård tim.terleg...@gmail.com:
 Yes, it worked! Thank you very much. But do I need to use curl or can
 I use CommonsHttpSolrServer or StreamingUpdateSolrServer? If I can't
 use BinaryWriter then I don't know how to do this.
 if your data is serialized using JavaBinUpdateRequestCodec, you may
 POST it using curl.
 If you are writing directly , use CommonsHttpSolrServer

 /Tim

 2010/1/20 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/20 Tim Terlegård tim.terleg...@gmail.com:
 BinaryRequestWriter does not read from a file and post it

 Is there any other way or is this use case not supported? I tried 
 this:

 $ curl host/solr/update/javabin -F stream.file=/tmp/data.bin
 $ curl host/solr/update -F stream.body='<commit/>'

 Solr did read the file, because solr complained when the file wasn't
 in the format the JavaBinUpdateRequestCodec expected. But no data is
 added to the index for some reason.

 how did you create the file /tmp/data.bin ? what is the format?

 I wrote this in the first email. It's in the javabin format (I think).
 I did like this (groovy code):

    fieldId = new NamedList()
    fieldId.add("name", "id")
    fieldId.add("val", "9-0")
    fieldId.add("boost", null)
    fieldText = new NamedList()
    fieldText.add("name", "text")
    fieldText.add("val", "Some text")
    fieldText.add("boost", null)
    fieldNull = new NamedList()
    fieldNull.add("boost", null)
    doc = [fieldNull, fieldId, fieldText]
    docs = [doc]
    root = new NamedList()
    root.add("docs", docs)
    fos = new FileOutputStream("data.bin")
    new JavaBinCodec().marshal(root, fos)

 /Tim

 JavaBin is a format.
 use this method: JavaBinUpdateRequestCodec#marshal(UpdateRequest
 updateRequest, OutputStream os)

 The output of this can be posted to solr and it should work



 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Re: Fastest way to use solrj

2010-01-27 Thread Tim Terlegård
I have 6 fields. The text field is the biggest, it contains almost all
of the 5000 chars.

/Tim

2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 how many fields are there in each doc? the binary format just reduces
 overhead. it does not touch/compress the payload

 2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I have 3 million documents, each having 5000 chars. The xml file is
 about 15GB. The binary file is also about 15GB.

 I was a bit surprised about this. It doesn't bother me much though. At
 least it performs better.

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 if you write only a few docs you may not observe much difference in
 size. if you write large no:of docs you may observe a big difference.

 2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I got the binary format to work perfectly now. Performance is better
 than with xml. Thanks!

 Although, it doesn't look like a binary file is smaller in size than
 an xml file?

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/21 Tim Terlegård tim.terleg...@gmail.com:
 Yes, it worked! Thank you very much. But do I need to use curl or can
 I use CommonsHttpSolrServer or StreamingUpdateSolrServer? If I can't
 use BinaryWriter then I don't know how to do this.
 if your data is serialized using JavaBinUpdateRequestCodec, you may
 POST it using curl.
 If you are writing directly , use CommonsHttpSolrServer

 /Tim

 2010/1/20 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/20 Tim Terlegård tim.terleg...@gmail.com:
 BinaryRequestWriter does not read from a file and post it

 Is there any other way or is this use case not supported? I tried 
 this:

 $ curl host/solr/update/javabin -F stream.file=/tmp/data.bin
 $ curl host/solr/update -F stream.body='<commit/>'

 Solr did read the file, because solr complained when the file wasn't
 in the format the JavaBinUpdateRequestCodec expected. But no data is
 added to the index for some reason.

 how did you create the file /tmp/data.bin ? what is the format?

 I wrote this in the first email. It's in the javabin format (I think).
 I did like this (groovy code):

    fieldId = new NamedList()
    fieldId.add("name", "id")
    fieldId.add("val", "9-0")
    fieldId.add("boost", null)
    fieldText = new NamedList()
    fieldText.add("name", "text")
    fieldText.add("val", "Some text")
    fieldText.add("boost", null)
    fieldNull = new NamedList()
    fieldNull.add("boost", null)
    doc = [fieldNull, fieldId, fieldText]
    docs = [doc]
    root = new NamedList()
    root.add("docs", docs)
    fos = new FileOutputStream("data.bin")
    new JavaBinCodec().marshal(root, fos)

 /Tim

 JavaBin is a format.
 use this method: JavaBinUpdateRequestCodec#marshal(UpdateRequest
 updateRequest, OutputStream os)

 The output of this can be posted to solr and it should work



 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com



Re: solr1.5

2010-01-27 Thread David MARTIN
Good question indeed: I'm waiting, as many others I guess, for the SOLR-236
patch (the field collapsing thing :) ).

David

On Tue, Jan 26, 2010 at 4:24 PM, Matthieu Labour matth...@strateer.comwrote:

 Hi
 quick question:
 Is there any release date scheduled for solr 1.5 with all the wonderful
 patches (StreamingUpdateSolrServer etc ...)?
 Thank you !



scenario with FQ parameter

2010-01-27 Thread Ravi Gidwani
HI all:
  I am trying to figure out a way to do the following:

   qf=field1^10 field2^20 field3^100&fq=*:9+OR+(field1:xyz)

Expected Results:
The above should return me documents where 9 appears in any of the fields
(field1, field2 or field3) OR field1 matches xyz.

I know I can use copy field (say 'text') to copy all the fields and then
use:

qf=field1^10 field2^20 field3^100&fq=text:9+OR+(field1:xyz^100.0)

but doing so, the boost weights specified in the 'qf' parameter have no effect
on the score.

I am using solr 1.4 and the searchHandler is dismax.

Is there any way I can achieve the above expected results but still affect
the score with qf parameter ?

Thanks,
~Ravi Gidwani.
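One avenue for Ravi's question, sketched under the assumption that the boosted clause is moved out of `fq`: in Solr, `fq` only filters and never contributes to the score, so a clause that should influence ranking belongs in `q` (scored against the `qf` boosts) or in dismax's `bq` boost query. A minimal sketch building such a request (host and field names are from the question above; the exact parameter split is an assumption):

```java
import java.net.URLEncoder;

public class DismaxRequest {
    // Sketch: the "9" term is scored by dismax against the qf boosts,
    // while the field1:xyz preference is expressed as a bq boost query
    // instead of an fq filter (fq clauses never affect the score).
    static String build(String host) throws Exception {
        String q  = URLEncoder.encode("9", "UTF-8");
        String qf = URLEncoder.encode("field1^10 field2^20 field3^100", "UTF-8");
        String bq = URLEncoder.encode("field1:xyz^100.0", "UTF-8");
        return host + "/solr/select?defType=dismax&q=" + q
                + "&qf=" + qf + "&bq=" + bq;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build("http://localhost:8983"));
    }
}
```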


Re: Wildcard Search and Filter in Solr

2010-01-27 Thread Ravi Gidwani
Ashok:
  May be this will help:
http://gravi2.blogspot.com/2009/05/solr-wildcards-and-omitnorms.html

~Ravi

On Tue, Jan 26, 2010 at 9:56 PM, ashokcz ashokkumar.gane...@tcs.com wrote:


 Hi just looked at the analysis.jsp and found out what it does during index
 /
 query

 Index Analyzer
 Intel
 intel
 intel
 intel
 intel
 intel

 Query Analyzer
 Inte*
 Inte*
 inte*
 inte
 inte
 inte
 int

 I think somewhere my configuration or my definition of the type text is
 wrong.
 This is my configuration .

 <fieldType class="solr.TextField" name="text">
  <analyzer type="index">
   <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter catenateAll="0" catenateNumbers="0" catenateWords="0"
    class="solr.WordDelimiterFilterFactory" generateNumberParts="1"
    generateWordParts="1"/>
   <filter class="solr.StopFilterFactory"/>
   <filter class="solr.TrimFilterFactory"/>
   <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>

  <analyzer type="query">
   <tokenizer class="solr.WhitespaceTokenizerFactory"/>
   <filter class="solr.SynonymFilterFactory" expand="true"
    ignoreCase="true"
    synonyms="synonyms.txt"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter catenateAll="0" catenateNumbers="0" catenateWords="0"
    class="solr.WordDelimiterFilterFactory" generateNumberParts="1"
    generateWordParts="1"/>
   <filter class="solr.StopFilterFactory"/>
   <filter class="solr.TrimFilterFactory"/>
   <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
 </fieldType>

 I think i am missing some basic configuration for doing wildcard searches .
 but could not figure it out .
 can someone help please


 Ahmet Arslan wrote:
 
 
  Hi ,
  I m trying to use wildcard keywords in my search term and
  filter term . but
  i didnt get any results.
  Searched a lot but could not find any lead .
  Can someone help me in this.
  i m using solr 1.2.0 and have few records indexed with
  vendorName value as
  Intel
 
  In solr admin interface i m trying to do the search like
  this
 
 
 http://localhost:8983/solr/select?indent=on&version=2.2&q=intel&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=
 
  and i m getting the result properly
 
  but when i use q=inte* no records are returned.
 
  the same is the case for Filter Query on using
  fq=VendorName:Intel i get
  my results.
 
  but on using fq=VendorName:Inte* no results are
  returned.
 
  I can guess i doing mistake in few obvious things , but
  could not figure it
  out ..
  Can someone pls help me out :) :)
 
  If q=intel returns documents while q=inte* does not, it means that
  fieldType of your defaultSearchField is reducing the token intel into
  something.
 
  Can you find out, using /admin/analysis.jsp, what happens to the word
  Intel at index and query time?
 
  What is your defaultSearchField? Is it VendorName?
 
  It is expected that fq=VendorName:Intel returns results while
  fq=VendorName:Inte* does not. Because prefix queries are not analyzed.
 
 
  But it is strange that q=inte* does not return anything. Maybe your index
  analyzer is reducing Intel into int or ıntel?
 
  I am not 100% sure but solr 1.2.0  may use default locale in lowercase
  operation. What is your default locale?
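The default-locale lowercasing caveat raised above is easy to demonstrate with the JDK alone: in a Turkish locale, the dotted capital 'I' lowercases to the dotless 'ı', so an index built under that locale would never match a query for "intel". A minimal illustration:

```java
import java.util.Locale;

public class LocaleLowercase {
    public static void main(String[] args) {
        // Turkish locale: 'I' (U+0049) lowercases to dotless 'ı' (U+0131),
        // which is exactly the "Intel" -> "ıntel" trap described above.
        String turkish = "Intel".toLowerCase(new Locale("tr", "TR"));
        String english = "Intel".toLowerCase(Locale.ENGLISH);
        System.out.println(turkish); // ıntel
        System.out.println(english); // intel
    }
}
```

This is why analyzers that must behave identically everywhere (like Lucene's LowerCaseFilter) avoid the platform default locale.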
 
  It is better to see what happens word Intel using analysis.jsp page.
 
 
 
 
 

 --
 View this message in context:
 http://old.nabble.com/Wildcard-Search-and-Filter-in-Solr-tp27306734p27334486.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Plurals in solr indexing

2010-01-27 Thread murali k

Hi, 
I am having trouble with indexing plurals, 

I have the schema with following fields
gender (field) - string (field type) (eg. data Boys)
all (field) - text (field type)  - solr.WhitespaceTokenizerFactory,
solr.SynonymFilterFactory, solr.WordDelimiterFilterFactory,
solr.LowerCaseFilterFactory, SnowballPorterFilterFactory

i am using copyField from gender to all

and searching on all field

When i search for Boy, I get the results, If i search for Boys i dont get
results, 
I have tried things like boys bikes - no results
boy bikes - works

kid and kids are synonymns for boy and boys, so i tried adding 
kid,kids,boy,boys in synonyms hoping it will work, it doesnt work that way

I also have other content fields which are copied to all , and it contains
words like kids, boys etc...
any idea?





-- 
View this message in context: 
http://old.nabble.com/Plurals-in-solr-indexing-tp27335639p27335639.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Plurals in solr indexing

2010-01-27 Thread murali k

I have found that my synonyms.txt file had

kids,boys,girls,childrens,children,boys & girls,kid,boy,girl

I ran the analyzer; somehow it is matching with "girl". I am not sure what's
happening yet, so I removed the ampersand entry:
Kids,boys,girls,childrens,children,boy,girl,kid

I guessed that when I add them comma-separated they will act as one group, and
when any one of the words is queried, matches will be returned.

it is working now... after i made that change in synonyms.txt file
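What changed can be sketched with plain string handling: a comma-separated line in synonyms.txt forms one equivalence group, but an entry like "boys & girls" is a single multi-token synonym, which tokenizes into several tokens at analysis time and behaves differently from the single-word entries around it (a simplified model of the file format, not Solr's actual parser):

```java
import java.util.Arrays;
import java.util.List;

public class SynonymLine {
    public static void main(String[] args) {
        // The fixed line: every entry is one token, one equivalence group.
        String fixed = "kids,boys,girls,childrens,children,boy,girl,kid";
        List<String> group = Arrays.asList(fixed.split(","));
        System.out.println(group.contains("boys"));   // true

        // The old line contained a multi-word entry in the same group.
        String old = "kids,boys,girls,childrens,children,boys & girls,kid,boy,girl";
        List<String> oldGroup = Arrays.asList(old.split(","));
        System.out.println(oldGroup.contains("boys & girls")); // true
        // That entry splits into three tokens under whitespace tokenization,
        // so it is matched as a token sequence, not as a single word.
        System.out.println("boys & girls".split("\\s+").length); // 3
    }
}
```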






murali k wrote:
 
 Hi, 
 I am having trouble with indexing plurals, 
 
 I have the schema with following fields
 gender (field) - string (field type) (eg. data Boys)
 all (field) - text (field type)  - solr.WhitespaceTokenizerFactory,
 solr.SynonymFilterFactory, solr.WordDelimiterFilterFactory,
 solr.LowerCaseFilterFactory, SnowballPorterFilterFactory
 
 i am using copyField from gender to all
 
 and searching on all field
 
 When i search for Boy, I get the results, If i search for Boys i dont get
 results, 
 I have tried things like boys bikes - no results
 boy bikes - works
 
 kid and kids are synonymns for boy and boys, so i tried adding 
 kid,kids,boy,boys in synonyms hoping it will work, it doesnt work that way
 
 I also have other content fields which are copied to all , and it
 contains words like kids, boys etc...
 any idea?
 
 
 
 
 
 

-- 
View this message in context: 
http://old.nabble.com/Plurals-in-solr-indexing-tp27335639p27336508.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Fastest way to use solrj

2010-01-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
The binary format just reduces overhead. in your case , all the data
is in the big text field which is not compressed. But overall, the
parsing is a lot faster for the binary format. So you see a perf boost
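Noble's point about overhead versus payload can be illustrated with a toy length-prefixed encoding (not JavaBin itself, just an analogy): a binary format saves only the markup framing, while the large text payload is written verbatim, so the two files end up nearly the same size.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;

public class OverheadDemo {
    public static void main(String[] args) throws Exception {
        // A 5000-char field value, as in Tim's documents.
        String payload = "x".repeat(5000);

        // Toy binary framing: 4-byte length prefix + raw payload bytes.
        ByteArrayOutputStream bin = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bin);
        out.writeInt(payload.length());   // 4 bytes of overhead
        out.writeBytes(payload);          // 5000 bytes of payload

        // XML framing: tag markup + the same raw payload.
        String xml = "<field name=\"text\">" + payload + "</field>";

        System.out.println(bin.size());   // 5004
        System.out.println(xml.length()); // 5027
    }
}
```

With the payload dominating, the ~23-byte difference per field is negligible, which matches the observation that the 15 GB files are about the same size even though binary parsing is faster.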

2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I have 6 fields. The text field is the biggest, it contains almost all
 of the 5000 chars.

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 how many fields are there in each doc? the binary format just reduces
 overhead. it does not touch/compress the payload

 2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I have 3 million documents, each having 5000 chars. The xml file is
 about 15GB. The binary file is also about 15GB.

 I was a bit surprised about this. It doesn't bother me much though. At
 least it performs better.

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 if you write only a few docs you may not observe much difference in
 size. if you write large no:of docs you may observe a big difference.

 2010/1/27 Tim Terlegård tim.terleg...@gmail.com:
 I got the binary format to work perfectly now. Performance is better
 than with xml. Thanks!

 Although, it doesn't look like a binary file is smaller in size than
 an xml file?

 /Tim

 2010/1/27 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/21 Tim Terlegård tim.terleg...@gmail.com:
 Yes, it worked! Thank you very much. But do I need to use curl or can
 I use CommonsHttpSolrServer or StreamingUpdateSolrServer? If I can't
 use BinaryWriter then I don't know how to do this.
 if your data is serialized using JavaBinUpdateRequestCodec, you may
 POST it using curl.
 If you are writing directly , use CommonsHttpSolrServer

 /Tim

 2010/1/20 Noble Paul നോബിള്‍  नोब्ळ् noble.p...@corp.aol.com:
 2010/1/20 Tim Terlegård tim.terleg...@gmail.com:
 BinaryRequestWriter does not read from a file and post it

 Is there any other way or is this use case not supported? I tried 
 this:

 $ curl host/solr/update/javabin -F stream.file=/tmp/data.bin
 $ curl host/solr/update -F stream.body='<commit/>'

 Solr did read the file, because solr complained when the file wasn't
 in the format the JavaBinUpdateRequestCodec expected. But no data is
 added to the index for some reason.

 how did you create the file /tmp/data.bin ? what is the format?

 I wrote this in the first email. It's in the javabin format (I think).
 I did like this (groovy code):

    fieldId = new NamedList()
    fieldId.add("name", "id")
    fieldId.add("val", "9-0")
    fieldId.add("boost", null)
    fieldText = new NamedList()
    fieldText.add("name", "text")
    fieldText.add("val", "Some text")
    fieldText.add("boost", null)
    fieldNull = new NamedList()
    fieldNull.add("boost", null)
    doc = [fieldNull, fieldId, fieldText]
    docs = [doc]
    root = new NamedList()
    root.add("docs", docs)
    fos = new FileOutputStream("data.bin")
    new JavaBinCodec().marshal(root, fos)

 /Tim

 JavaBin is a format.
 use this method: JavaBinUpdateRequestCodec#marshal(UpdateRequest
 updateRequest, OutputStream os)

 The output of this can be posted to solr and it should work



 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





 --
 -
 Noble Paul | Systems Architect| AOL | http://aol.com





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


Help using CachedSqlEntityProcessor

2010-01-27 Thread KirstyS

Hi, I have looked on the wiki. Using the CachedSqlEntityProcessor looks like
it was simple. But I am getting no speed benefit and am not sure if I have
even got the syntax correct. 
I have a main root entity called 'article'.

And then I have a number of sub entities. One such entity is as such :

<entity name="LinkedCategory" pk="LinkedCatAricleId"
  query="SELECT LinkedCategoryBC, CmsArticleId as LinkedCatAricleId
         FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
         WHERE convert(varchar(50), CmsArticleId) =
               convert(varchar(50), '${article.CmsArticleId}')"
  processor="CachedSqlEntityProcessor"
  WHERE="LinkedCatArticleId = article.CmsArticleId"
  deltaQuery="SELECT LinkedCategoryBC
              FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
              WHERE convert(varchar(50), CmsArticleId) =
                    convert(varchar(50), '${article.CmsArticleId}')
              AND (convert(varchar(50), LastUpdateDate) > '${dataimporter.article.last_index_time}'
              OR convert(varchar(50), PublishDate) > '${dataimporter.article.last_index_time}')"
  parentDeltaQuery="SELECT * from vArticleSummaryDetail_SolrSearch (nolock)
                    WHERE convert(varchar(50), CmsArticleId) =
                          convert(varchar(50), '${article.CmsArticleId}')">
  <field column="LinkedCategoryBC" name="LinkedCategoryBreadCrumb"/>
</entity>


As you can see I have added (for the main query - not worrying about the
delta queries yet!!) the processor and the 'where' but not sure if it's
correct.
Can anyone point me in the right direction???
Thanks
Kirsty
-- 
View this message in context: 
http://old.nabble.com/Help-using-CachedSqlEntityProcessor-tp27337635p27337635.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: solr with tomcat in cluster mode

2010-01-27 Thread ZAROGKIKAS,GIORGOS
Hi again 
I finally set up my Solr cluster with Tomcat 6.
The configuration I use is two Tomcat servers on the same machine on
different ports (e.g. localhost:8180/solr and
localhost:8280/solr, for testing purposes), with a separate index on
each server and index replication through Solr's replication handler. It is
working fine for me and very quick.

Now I want to load balance these two Tomcat servers, but without using
the Apache HTTP server.
Is there any solution for that?






-Original Message-
From: Matt Mitchell [mailto:goodie...@gmail.com] 
Sent: Friday, January 22, 2010 9:33 PM
To: solr-user@lucene.apache.org
Subject: Re: solr with tomcat in cluster mode

Hey Otis,

We're indexing on a separate machine because we want to keep our production
nodes away from processes like indexing. The indexing server also has a ton
of resources available, more so than the production nodes. We set it up as
an indexing server at one point and have decided to stick with it.

We're not indexing the same index as the search indexes because we want to
be able to step back a day or two if needed. So we do the SWAP when things
are done and OK.

So that last part you mentioned about the searchers needing to re-open will
happen with a SWAP right? Is your concern that there will be a lag time,
making it so the slaves will be out of sync for some small period of time?

Would it be simpler/better to move to using Solrs native slave/master
feature?

I'd love to hear any suggestions you might have.

Thanks,

Matt

On Fri, Jan 22, 2010 at 1:58 PM, Otis Gospodnetic 
otis_gospodne...@yahoo.com wrote:

 This should work fine.
 But why are you indexing to a separate index/core?  Why not index in the
 very same index you are searching?
 Slaves won't see changes until their searchers re-open.

 Otis
 --
 Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



 - Original Message 
  From: Matt Mitchell goodie...@gmail.com
  To: solr-user@lucene.apache.org
  Sent: Fri, January 22, 2010 9:44:03 AM
  Subject: Re: solr with tomcat in cluster mode
 
  We have a similar setup and I'd be curious to see how folks are doing
 this
  as well.
 
  Our setup: A few servers and an F5 load balancer. Each Solr instance
 points
  to a shared index. We use a separate server for indexing. When the index
 is
  complete, we do some juggling using the Core Admin SWAP function and
 update
  the shared index. I've wondered about having a shared index across
 multiple
  instances of (read-only) Solr -- any problems there?
 
  Matt
 
  On Fri, Jan 22, 2010 at 9:35 AM, ZAROGKIKAS,GIORGOS 
  g.zarogki...@multirama.gr wrote:
 
   Hi
  I'm using solr 1.4 with tomcat in a single pc
  and I want to turn it in cluster mode with 2 nodes and load
   balancing
  But I can't find info how to do
  Is there any manual or a recorded procedure on the internet  to
   do that
  Or is there anyone to help me ?
  
   Thanks in advance
  
  
   Ps : I use windows server 2008 for OS
  
  
  
  
  




Starting Jetty Server using JettySolrRunner

2010-01-27 Thread Rakhi Khatwani
Hi,
  I am trying to run a Solr server using JettySolrRunner, however I
keep getting the following exception:
Can't find resource 'solrconfig.xml' in classpath or 'solr/conf/',
cwd=/home/ithurs/shellworkspace/SolrPOC
 at
org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:260)
 at
org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:228)
 at org.apache.solr.core.Config.<init>(Config.java:101)
 at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:130)
 at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:134)
 at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
 at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
 at
org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
 at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
 at
org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
 at
org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
 at org.mortbay.jetty.Server.doStart(Server.java:210)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
 at
org.apache.solr.client.solrj.embedded.JettySolrRunner.start(JettySolrRunner.java:99)
 at
org.apache.solr.client.solrj.embedded.JettySolrRunner.start(JettySolrRunner.java:93)
 at com.germinait.solr.jetty.StartStopJetty.main(StartStopJetty.java:9)
Jan 27, 2010 4:48:56 PM org.apache.solr.core.CoreContainer finalize
SEVERE: CoreContainer was not shutdown prior to finalize(), indicates a bug
-- POSSIBLE RESOURCE LEAK!!!
Jan 27, 2010 4:48:56 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in
classpath or 'solr/conf/', cwd=/home/ithurs/shellworkspace/SolrPOC
 at
org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:260)
 at
org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:228)
 at org.apache.solr.core.Config.<init>(Config.java:101)
 at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:130)
 at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:134)
 at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
 at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
 at
org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
 at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
 at
org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
 at
org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
 at org.mortbay.jetty.Server.doStart(Server.java:210)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
 at
org.apache.solr.client.solrj.embedded.JettySolrRunner.start(JettySolrRunner.java:99)
 at
org.apache.solr.client.solrj.embedded.JettySolrRunner.start(JettySolrRunner.java:93)
 at com.germinait.solr.jetty.StartStopJetty.main(StartStopJetty.java:9)

Jan 27, 2010 4:48:56 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init() done
Jan 27, 2010 4:48:56 PM sun.reflect.NativeMethodAccessorImpl invoke0
WARNING: failed SocketConnector @ 0.0.0.0:8983
java.net.BindException: Address already in use
 at java.net.PlainSocketImpl.socketBind(Native Method)
 at java.net.PlainSocketImpl.bind(PlainSocketImpl.java:359)
 at java.net.ServerSocket.bind(ServerSocket.java:319)
 at java.net.ServerSocket.<init>(ServerSocket.java:185)
 at java.net.ServerSocket.<init>(ServerSocket.java:141)
 at
org.mortbay.jetty.bio.SocketConnector.newServerSocket(SocketConnector.java:78)
 at org.mortbay.jetty.bio.SocketConnector.open(SocketConnector.java:72)
 at org.mortbay.jetty.AbstractConnector.doStart(AbstractConnector.java:252)
 at org.mortbay.jetty.bio.SocketConnector.doStart(SocketConnector.java:145)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
 at org.mortbay.jetty.Server.doStart(Server.java:221)
 at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
 at
org.apache.solr.client.solrj.embedded.JettySolrRunner.start(JettySolrRunner.java:99)
 at
org.apache.solr.client.solrj.embedded.JettySolrRunner.start(JettySolrRunner.java:93)
 at com.germinait.solr.jetty.StartStopJetty.main(StartStopJetty.java:9)
Jan 27, 2010 4:48:56 PM sun.reflect.NativeMethodAccessorImpl invoke0

is there any way to specify the current working directory?
And what if we have a multicore setup with several cores, each core having a
solrconfig.xml in its conf folder; how would we start a Jetty server from
the API in that case?
Regards,
Raakhi Khatwani
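One way around the cwd dependence in the trace above: Solr's resource loader consults the `solr.solr.home` system property before falling back to the classpath and `./solr/conf`, so setting it before starting Jetty lets the process run from any directory (including pointing it at a multicore solr home whose solr.xml lists the cores). A minimal sketch; the path is hypothetical, and the commented JettySolrRunner call assumes solr-core on the classpath:

```java
public class SolrHome {
    public static void main(String[] args) {
        // Point Solr at a home directory containing conf/solrconfig.xml
        // (or solr.xml for a multicore setup) regardless of the cwd.
        System.setProperty("solr.solr.home",
                "/home/ithurs/shellworkspace/SolrPOC/solr");

        // Then start Jetty (requires solr-core on the classpath; pick a
        // free port, since 8983 was already bound in the log above):
        // JettySolrRunner jetty = new JettySolrRunner("/solr", 8984);
        // jetty.start();

        System.out.println(System.getProperty("solr.solr.home"));
    }
}
```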


Re: Plurals in solr indexing

2010-01-27 Thread Erick Erickson
It would be more informative for you to actually post your
schema definitions for the fields in question, along
with your copyfield. The summary in your first
post leaves a lot of questions unanswered...

But a couple of things.
1> beware the SOLR "string" type. It does NOT tokenize
 the input. The "text" type is usually what people want
 unless they are doing something special
 purpose.
2> WordDelimiterFilterFactory is often a source of
 misunderstanding, take a close look at
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
3> I'd strongly advise either really getting to know the admin
 page in SOLR and/or getting a copy of Luke to examine
 your index and see if what you *think* is in there actually is.
4> Try running your queries with &debugQuery=on and see
 what that shows.


HTH
Erick

On Wed, Jan 27, 2010 at 6:09 AM, murali k ilar...@gmail.com wrote:


 I have found that my synonyms.txt file had

 kids,boys,girls,childrens,children,boys & girls,kid,boy,girl

 I ran analyzer, somehow it is matching with girl ,, i am not sure whats
 happening yet, so i removed ampersand
 Kids,boys,girls,childrens,children,boy,girl,kid

 I guessed when i add them comma separated it will do as a group and when
 any
 one the words are queried matches will be returned.

 it is working now... after i made that change in synonyms.txt file






 murali k wrote:
 
  Hi,
  I am having trouble with indexing plurals,
 
  I have the schema with following fields
  gender (field) - string (field type) (eg. data Boys)
  all (field) - text (field type)  - solr.WhitespaceTokenizerFactory,
  solr.SynonymFilterFactory, solr.WordDelimiterFilterFactory,
  solr.LowerCaseFilterFactory, SnowballPorterFilterFactory
 
  i am using copyField from gender to all
 
  and searching on all field
 
  When i search for Boy, I get the results, If i search for Boys i dont get
  results,
  I have tried things like boys bikes - no results
  boy bikes - works
 
  kid and kids are synonymns for boy and boys, so i tried adding
  kid,kids,boy,boys in synonyms hoping it will work, it doesnt work that
 way
 
  I also have other content fields which are copied to all , and it
  contains words like kids, boys etc...
  any idea?
 
 
 
 
 
 

 --
 View this message in context:
 http://old.nabble.com/Plurals-in-solr-indexing-tp27335639p27336508.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: Help using CachedSqlEntityProcessor

2010-01-27 Thread Rolf Johansson
I recently had issues with CachedSqlEntityProcessor too, figuring out how to
use the syntax. After a while, I managed to get it working with cacheKey and
cacheLookup. I think this is 1.4 specific though.

It seems you have double WHERE clauses, one in the query and one in the
where attribute.

Try using cacheKey and cacheLookup instead in something like this:

<entity name="LinkedCategory" pk="LinkedCatArticleId"
  query="SELECT LinkedCategoryBC, CmsArticleId as LinkedCatAricleId
         FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)"
  processor="CachedSqlEntityProcessor"
  cacheKey="LINKEDCATARTICLEID"
  cacheLookup="article.CMSARTICLEID"
  deltaQuery="SELECT LinkedCategoryBC
              FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
              WHERE convert(varchar(50), LastUpdateDate) > '${dataimporter.article.last_index_time}'
              OR convert(varchar(50), PublishDate) > '${dataimporter.article.last_index_time}'"
  parentDeltaQuery="SELECT * from vArticleSummaryDetail_SolrSearch (nolock)">
  <field column="LinkedCategoryBC" name="LinkedCategoryBreadCrumb"/>
</entity>

/Rolf


Den 2010-01-27 12.36, skrev KirstyS kirst...@gmail.com:

 
 Hi, I have looked on the wiki. Using the CachedSqlEntityProcessor looked
 simple, but I am getting no speed benefit and am not sure if I have
 even got the syntax correct.
 I have a main root entity called 'article'.
 
 And then I have a number of sub entities. One such entity is as such :
 
 <entity name="LinkedCategory" pk="LinkedCatAricleId"
         query="SELECT LinkedCategoryBC, CmsArticleId as LinkedCatAricleId
                FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
                WHERE convert(varchar(50), CmsArticleId) = convert(varchar(50), '${article.CmsArticleId}')"
         processor="CachedSqlEntityProcessor"
         where="LinkedCatArticleId = article.CmsArticleId"
         deltaQuery="SELECT LinkedCategoryBC
                     FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
                     WHERE convert(varchar(50), CmsArticleId) = convert(varchar(50), '${article.CmsArticleId}')
                     AND (convert(varchar(50), LastUpdateDate) > '${dataimporter.article.last_index_time}'
                     OR convert(varchar(50), PublishDate) > '${dataimporter.article.last_index_time}')"
         parentDeltaQuery="SELECT * from vArticleSummaryDetail_SolrSearch (nolock)
                           WHERE convert(varchar(50), CmsArticleId) = convert(varchar(50), '${article.CmsArticleId}')">
   <field column="LinkedCategoryBC" name="LinkedCategoryBreadCrumb"/>
 </entity>
 
 
 As you can see I have added (for the main query - not worrying about the
 delta queries yet!!) the processor and the 'where' but not sure if it's
 correct.
 Can anyone point me in the right direction???
 Thanks
 Kirsty



Re: Lock problems: Lock obtain timed out

2010-01-27 Thread Ian Connor
Can anyone think of a reason why these locks would hang around for more than
2 hours?

I have been monitoring them and they look like they are very short lived.

On Tue, Jan 26, 2010 at 10:15 AM, Ian Connor ian.con...@gmail.com wrote:

 We traced one of the lock files, and it had been around for 3 hours. A
 restart removed it - but is 3 hours normal for one of these locks?

 Ian.


 On Mon, Jan 25, 2010 at 4:14 PM, mike anderson saidthero...@gmail.comwrote:

 I am getting this exception as well, but disk space is not my problem.
 What
 else can I do to debug this? The solr log doesn't appear to lend any other
 clues..

 Jan 25, 2010 4:02:22 PM org.apache.solr.core.SolrCore execute
 INFO: [] webapp=/solr path=/update params={} status=500 QTime=1990
 Jan 25, 2010 4:02:22 PM org.apache.solr.common.SolrException log
 SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain
 timed
 out: NativeFSLock@
 /solr8984/index/lucene-98c1cb272eb9e828b1357f68112231e0-write.lock
 at org.apache.lucene.store.Lock.obtain(Lock.java:85)
 at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1545)
 at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1402)
 at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:190)
 at

 org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
 at

 org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
 at

 org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:220)
 at

 org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
 at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
 at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
 at

 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
 at

 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
 at

 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
 at

 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
 at

 org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
 at
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
 at

 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
 at
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
 at
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
 at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
 at

 org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
 at

 org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
 at
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
 at org.mortbay.jetty.Server.handle(Server.java:285)
 at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
 at

 org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
 at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641)
 at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
 at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
 at

 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
 at

 org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)


 Should I consider changing the lock timeout settings (currently set to
 defaults)? If so, I'm not sure what to base these values on.

 Thanks in advance,
 mike


 On Wed, Nov 4, 2009 at 8:27 PM, Lance Norskog goks...@gmail.com wrote:

  This will not ever work reliably. You should have 2x total disk space
  for the index. Optimize, for one, requires this.
 
  On Wed, Nov 4, 2009 at 6:37 AM, Jérôme Etévé jerome.et...@gmail.com
  wrote:
   Hi,
  
   It seems this situation is caused by some No space left on device
  exeptions:
   SEVERE: java.io.IOException: No space left on device
  at java.io.RandomAccessFile.writeBytes(Native Method)
  at java.io.RandomAccessFile.write(RandomAccessFile.java:466)
  at
 
 org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexOutput.flushBuffer(SimpleFSDirectory.java:192)
  at
 
 org.apache.lucene.store.BufferedIndexOutput.flushBuffer(BufferedIndexOutput.java:96)
  
  
   I'd better try to set my maxMergeDocs and mergeFactor to more
   adequate values for my app (I'm indexing ~15 GB of data on a 20 GB
   device), so I guess there's a problem when Solr tries to merge the index
   segments being built.
  
   At the moment, they are set to <mergeFactor>100</mergeFactor> and
   <maxMergeDocs>2147483647</maxMergeDocs>
  
   Jerome.
  
   --
   Jerome Eteve.
   http://www.eteve.net
   jer...@eteve.net
  
 
 
 
  --
  Lance Norskog
  goks...@gmail.com
 




-- 
Regards,

Ian Connor
1 Leighton St 
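For reference on mike's question about lock timeout settings: in Solr 1.4 the write-lock behaviour is configured in solrconfig.xml under indexDefaults. A minimal sketch follows; the values shown are the stock example defaults, quoted here only as an assumption of a typical setup, not as recommended tuning:

```
<indexDefaults>
  <!-- How long (in ms) Solr waits to acquire the index write lock
       before throwing LockObtainFailedException. -->
  <writeLockTimeout>1000</writeLockTimeout>
  <!-- Which locking implementation to use (native = OS-level locks,
       which the NativeFSLock in the trace above corresponds to). -->
  <lockType>native</lockType>
  <!-- If true, a stale lock left behind by a crashed process is
       removed at startup instead of requiring a manual delete/restart. -->
  <unlockOnStartup>false</unlockOnStartup>
</indexDefaults>
```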

Re: Multiple Cores Vs. Single Core for the following use case

2010-01-27 Thread Matthieu Labour
@Marc: Thank you Marc. This is logic we had to implement in the client
application. Will look into applying the patch to replace our own home-grown logic.

@Trey: I have 1000 users per machine, 1 core per user. Each core is 35000
documents. Documents are small; each core goes from 100MB to 1.3GB at most.
There are 7 types of documents.
What I am trying to understand is the search/filter algorithm. If I have 1 core
with all documents and I search for "Paris" for userId=123, is Lucene going
to first search for all "Paris" documents and then apply a filter on the userId?
If this is the case, then I am better off having a specific index for the
user=123 because this will be faster.





--- On Wed, 1/27/10, Marc Sturlese marc.sturl...@gmail.com wrote:

From: Marc Sturlese marc.sturl...@gmail.com
Subject: Re: Multiple Cores Vs. Single Core for the following use case
To: solr-user@lucene.apache.org
Date: Wednesday, January 27, 2010, 2:22 AM


In case you are going to use core per user take a look to this patch:
http://wiki.apache.org/solr/LotsOfCores

Trey-13 wrote:
 
 Hi Matt,
 
 In most cases you are going to be better off going with the userid method
 unless you have a very small number of users and a very large number of
 docs/user. The userid method will likely be much easier to manage, as you
 won't have to spin up a new core every time you add a new user.  I would
 start here and see if the performance is good enough for your requirements
 before you start worrying about it not being efficient.
 
 That being said, I really don't have any idea what your data looks like.
 How many users do you have?  How many documents per user?  Are any
 documents
 shared by multiple users?
 
 -Trey
 
 
 
 On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
 matthieu_lab...@yahoo.comwrote:
 
 Hi



 Shall I set up Multiple Core or Single core for the following use case:



 I have X number of users.



 When I do a search, I always know for which user I am doing a search



 Shall I set up X cores, 1 for each user ? Or shall I set up 1 core and
 add
 a userId field to each document?



 If I choose the 1 core solution then I am concerned with performance.
 Let's say I search for NewYork ... If lucene returns all New York
 matches for all users and then filters based on the userId, then this
 is going to be less efficient than if I have sharded per user and send
 the request for New York to the user's core



 Thank you for your help



 matt







 
 

-- 
View this message in context: 
http://old.nabble.com/Multiple-Cores-Vs.-Single-Core-for-the-following-use-case-tp27332288p27335403.html
Sent from the Solr - User mailing list archive at Nabble.com.




  

Re: Multiple Cores Vs. Single Core for the following use case

2010-01-27 Thread didier deshommes
On Wed, Jan 27, 2010 at 9:48 AM, Matthieu Labour
matthieu_lab...@yahoo.com wrote:
 What I am trying to understand is the search/filter algorithm. If I have 1 
 core with all documents and I  search for Paris for userId=123, is lucene 
 going to first search for all Paris documents and then apply a filter on the 
 userId ? If this is the case, then I am better off having a specific index 
 for the user=123 because this will be faster

If you want to apply the filter to userid first, use filter queries
(http://wiki.apache.org/solr/CommonQueryParameters#fq). This will
filter by userid first then search for Paris.

didier
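To make the suggestion above concrete, here is a minimal sketch (plain JDK, hypothetical host and field names) of how such a request URL is assembled; the fq value is URL-encoded separately from q:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class FilterQueryUrl {
    // Build a /select URL with a main query plus a userId filter query.
    static String buildUrl(String base, String q, String fq) {
        try {
            return base + "/select?q=" + URLEncoder.encode(q, "UTF-8")
                        + "&fq=" + URLEncoder.encode(fq, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // prints .../select?q=Paris&fq=userId%3A123
        System.out.println(buildUrl("http://localhost:8983/solr", "Paris", "userId:123"));
    }
}
```

The filter is sent as its own parameter rather than being folded into q, which is what lets Solr cache and reuse it across queries.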






 --- On Wed, 1/27/10, Marc Sturlese marc.sturl...@gmail.com wrote:

 From: Marc Sturlese marc.sturl...@gmail.com
 Subject: Re: Multiple Cores Vs. Single Core for the following use case
 To: solr-user@lucene.apache.org
 Date: Wednesday, January 27, 2010, 2:22 AM


 In case you are going to use core per user take a look to this patch:
 http://wiki.apache.org/solr/LotsOfCores

 Trey-13 wrote:

 Hi Matt,

 In most cases you are going to be better off going with the userid method
 unless you have a very small number of users and a very large number of
 docs/user. The userid method will likely be much easier to manage, as you
 won't have to spin up a new core every time you add a new user.  I would
 start here and see if the performance is good enough for your requirements
 before you start worrying about it not being efficient.

 That being said, I really don't have any idea what your data looks like.
 How many users do you have?  How many documents per user?  Are any
 documents
 shared by multiple users?

 -Trey



 On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
 matthieu_lab...@yahoo.comwrote:

 Hi



 Shall I set up Multiple Core or Single core for the following use case:



 I have X number of users.



 When I do a search, I always know for which user I am doing a search



 Shall I set up X cores, 1 for each user ? Or shall I set up 1 core and
 add
 a userId field to each document?



 If I choose the 1 core solution then I am concerned with performance.
 Let's say I search for NewYork ... If lucene returns all New York
 matches for all users and then filters based on the userId, then this
 is going to be less efficient than if I have sharded per user and send
 the request for New York to the user's core



 Thank you for your help



 matt










 --
 View this message in context: 
 http://old.nabble.com/Multiple-Cores-Vs.-Single-Core-for-the-following-use-case-tp27332288p27335403.html
 Sent from the Solr - User mailing list archive at Nabble.com.







update doc success, but could not find the new value

2010-01-27 Thread Jennifer Luo
I am using
http://localhost:8983/solr/update?commit=true&overwrite=true&commitWithin=10 to update a document. The responseHeader's status is 0.

But when I search the new value, it couldn't be found.
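For anyone reproducing this: the update parameters must be joined with '&', and the posted body is an <add> XML document. A minimal sketch (plain JDK, with placeholder id/body values) of the URL and body such a request carries:

```java
public class UpdateRequestSketch {
    // URL for /update: each parameter needs its own '&' separator;
    // commitWithin is specified in milliseconds.
    static String updateUrl(String host) {
        return host + "/solr/update?commit=true&overwrite=true&commitWithin=10";
    }

    // The <add> XML body posted to that URL (field values are placeholders).
    static String addXml(String id, String body) {
        return "<add><doc>"
             + "<field name=\"id\">" + id + "</field>"
             + "<field name=\"body\">" + body + "</field>"
             + "</doc></add>";
    }

    public static void main(String[] args) {
        System.out.println(updateUrl("http://localhost:8983"));
        System.out.println(addXml("id1", "test body"));
    }
}
```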


Re: Multiple Cores Vs. Single Core for the following use case

2010-01-27 Thread Toby Cole
I've not looked at the filtering for quite a while, but if you're
getting lots of similar queries, the filter's caching can play a huge
part in speeding up queries, so even if the first query for "paris"
was slow, subsequent queries from different users for the same terms
will be sped up considerably (especially if you're using the
FastLRUCache).


If filtering is slow for your queries, why not try simply using a
boolean query (i.e., for the example below: "paris AND userId:123")?
This would remove the cross-user usefulness of the caches, if I
understand them correctly, but may speed up uncached searches.


Toby.


On 27 Jan 2010, at 15:48, Matthieu Labour wrote:

@Marc: Thank you marc. This is a logic we had to implement in the  
client application. Will look into applying the patch to replace our  
own grown logic


@Trey: I have 1000 users per machine. 1 core / user. Each core is  
35000 documents. Documents are small...each core goes from 100MB to  
1.3GB at most. There are 7 types of documents.
What I am trying to understand is the search/filter algorithm. If I  
have 1 core with all documents and I  search for Paris for  
userId=123, is lucene going to first search for all Paris  
documents and then apply a filter on the userId ? If this is the  
case, then I am better off having a specific index for the  
user=123 because this will be faster






--- On Wed, 1/27/10, Marc Sturlese marc.sturl...@gmail.com wrote:

From: Marc Sturlese marc.sturl...@gmail.com
Subject: Re: Multiple Cores Vs. Single Core for the following use case
To: solr-user@lucene.apache.org
Date: Wednesday, January 27, 2010, 2:22 AM


In case you are going to use core per user take a look to this patch:
http://wiki.apache.org/solr/LotsOfCores

Trey-13 wrote:


Hi Matt,

In most cases you are going to be better off going with the userid  
method
unless you have a very small number of users and a very large  
number of
docs/user. The userid method will likely be much easier to manage,  
as you
won't have to spin up a new core every time you add a new user.  I  
would
start here and see if the performance is good enough for your  
requirements

before you start worrying about it not being efficient.

That being said, I really don't have any idea what your data looks  
like.

How many users do you have?  How many documents per user?  Are any
documents
shared by multiple users?

-Trey



On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
matthieu_lab...@yahoo.comwrote:


Hi



Shall I set up Multiple Core or Single core for the following use  
case:




I have X number of users.



When I do a search, I always know for which user I am doing a search



Shall I set up X cores, 1 for each user ? Or shall I set up 1 core  
and

add
a userId field to each document?



If I choose the 1 core solution then I am concerned with  
performance.
Let's say I search for NewYork ... If lucene returns all New  
York
matches for all users and then filters based on the userId, then  
this
is going to be less efficient than if I have sharded per user and  
send

the request for New York to the user's core



Thank you for your help



matt












--
View this message in context: 
http://old.nabble.com/Multiple-Cores-Vs.-Single-Core-for-the-following-use-case-tp27332288p27335403.html
Sent from the Solr - User mailing list archive at Nabble.com.







--
Toby Cole
Senior Software Engineer, Semantico Limited
Registered in England and Wales no. 03841410, VAT no. GB-744614334.
Registered office Lees House, 21-23 Dyke Road, Brighton BN1 3FE, UK.

Check out all our latest news and thinking on the Discovery blog
http://blogs.semantico.com/discovery-blog/



Re: Multiple Cores Vs. Single Core for the following use case

2010-01-27 Thread Matthieu Labour
Thanks Didier for your response.
And in your opinion, this should be as fast as calling getCore(userId) --
provided that the core is already open -- and then searching for "Paris"?
matt

--- On Wed, 1/27/10, didier deshommes dfdes...@gmail.com wrote:

From: didier deshommes dfdes...@gmail.com
Subject: Re: Multiple Cores Vs. Single Core for the following use case
To: solr-user@lucene.apache.org
Date: Wednesday, January 27, 2010, 10:52 AM

On Wed, Jan 27, 2010 at 9:48 AM, Matthieu Labour
matthieu_lab...@yahoo.com wrote:
 What I am trying to understand is the search/filter algorithm. If I have 1 
 core with all documents and I  search for Paris for userId=123, is lucene 
 going to first search for all Paris documents and then apply a filter on the 
 userId ? If this is the case, then I am better off having a specific index 
 for the user=123 because this will be faster

If you want to apply the filter to userid first, use filter queries
(http://wiki.apache.org/solr/CommonQueryParameters#fq). This will
filter by userid first then search for Paris.

didier






 --- On Wed, 1/27/10, Marc Sturlese marc.sturl...@gmail.com wrote:

 From: Marc Sturlese marc.sturl...@gmail.com
 Subject: Re: Multiple Cores Vs. Single Core for the following use case
 To: solr-user@lucene.apache.org
 Date: Wednesday, January 27, 2010, 2:22 AM


 In case you are going to use core per user take a look to this patch:
 http://wiki.apache.org/solr/LotsOfCores

 Trey-13 wrote:

 Hi Matt,

 In most cases you are going to be better off going with the userid method
 unless you have a very small number of users and a very large number of
 docs/user. The userid method will likely be much easier to manage, as you
 won't have to spin up a new core every time you add a new user.  I would
 start here and see if the performance is good enough for your requirements
 before you start worrying about it not being efficient.

 That being said, I really don't have any idea what your data looks like.
 How many users do you have?  How many documents per user?  Are any
 documents
 shared by multiple users?

 -Trey



 On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
 matthieu_lab...@yahoo.comwrote:

 Hi



 Shall I set up Multiple Core or Single core for the following use case:



 I have X number of users.



 When I do a search, I always know for which user I am doing a search



 Shall I set up X cores, 1 for each user ? Or shall I set up 1 core and
 add
 a userId field to each document?



 If I choose the 1 core solution then I am concerned with performance.
 Let's say I search for NewYork ... If lucene returns all New York
 matches for all users and then filters based on the userId, then this
 is going to be less efficient than if I have sharded per user and send
 the request for New York to the user's core



 Thank you for your help



 matt










 --
 View this message in context: 
 http://old.nabble.com/Multiple-Cores-Vs.-Single-Core-for-the-following-use-case-tp27332288p27335403.html
 Sent from the Solr - User mailing list archive at Nabble.com.








  

How to Implement SpanQuery in Solr . . ?

2010-01-27 Thread Christopher Ball
I am about to attempt to implement SpanQuery in Solr 1.4.

 

I noticed there is a JIRA to add it in 1.5:  

 

*   https://issues.apache.org/jira/browse/SOLR-1337 

 

I also noticed a couple of email threads from Grant and Yonik about trying
to implement it such as:

 

*   http://old.nabble.com/SpanQuery-support-td15246477.html 

 

So . . . 

 

*   Question: Has anyone started working on SOLR-1337 for Solr 1.5?

 

And if not . . . 

 

*   Question: Is the best way to go about it to follow the following
recipe?

 

1.   Configure

a.   Specify a new parser plugin in solrconfig.xml:

b.   <queryParser name="mySpanQueryParser"
class="SpanQueryParserPlugin"/>

2.   Implement

a.   Use the FooQParserPlugin as a starting template
(https://svn.apache.org/repos/asf/lucene/solr/trunk/src/test/org/apache/solr
/core/SOLR749Test.java)

3.   Access 

a.   Access the current query type via 'q=mySpanQueryParser '

 

Most grateful for any thoughts,

 

Christopher



Re: How to Implement SpanQuery in Solr . . ?

2010-01-27 Thread Yonik Seeley
As always, I'd try starting with what the user interface (in this
case, syntax) should look like.
It makes sense to add elementary spans first.

{!spannear a=query1 b=query2 slop=10}

Thinking about implementation... what would really magnify the
usefulness of the basic API above is to convert non-span queries to
span queries automatically.  This is useful because the sub-queries of
a span query must be span queries, and most query parsers generate
non-span queries.  I think there is code in the highlighter that uses
spans that can do this conversion.

-Yonik
http://www.lucidimagination.com


On Wed, Jan 27, 2010 at 12:24 PM, Christopher Ball
christopher.b...@metaheuristica.com wrote:
 I am about to attempt to implementing the SpanQuery in Solr 1.4.



 I noticed there is a JIRA to add it in 1.5:



 *       https://issues.apache.org/jira/browse/SOLR-1337



 I also noticed a couple of email threads from Grant and Yonik about trying
 to implement it such as:



 *       http://old.nabble.com/SpanQuery-support-td15246477.html



 So . . .



 *       Question: Has anyone started working on SOLR-1337 for Solr 1.5?



 And if not . . .



 *       Question: Is the best way to go about it is to follow the following
 recipe?



 1.       Configure

 a.       Specify a new parser plugin in solrconfig.xml:

 b.       queryParser name=mySpanQueryParser
 class=SpanQueryParserPlugin/

 2.       Implement

 a.       Use the FooQParserPlugin as a starting template
 (https://svn.apache.org/repos/asf/lucene/solr/trunk/src/test/org/apache/solr
 /core/SOLR749Test.java)

 3.       Access

 a.       Access the current query type via 'q=mySpanQueryParser '



 Most grateful for any thoughts,



 Christopher




filter query error

2010-01-27 Thread jxkmailbox-01
Newbie using Solr 1.4.

I am trying to use a filter query that filters on more than one value for a
given field, i.e. filters on field equals value1 or value2.

If I enter the following 2 urls in a browser I get back the correct results I 
am looking for:

http://localhost:8080/apache-solr-1.4.0/select/?q=help&fl=*,score&fq=+searchScope:SRM+searchScope:SMN&indent=on
or
http://localHost:8080/apache-solr-1.4.0/select/?q=help&fl=*,searchScope,score&fq=searchScope:(SRM+OR+SMN)&indent=on

But when I try to do it programmatically I get an error.  It only works when I
am filtering on 1 value; when I try more than one value it fails.
See code snippet and error message below.  When I use filter2 or filter3 it
fails, but filter1 gives me no errors.

Not sure what I am doing wrong.  Any help would be greatly appreciated.

----- Begin code snippet -----
String query = "help";

//String filter1 = "searchScope:SRM";
//String filter2 = "+searchScope:SRM+searchScope:SMN";
String filter3 = "searchScope:(SRM+OR+SMN)";

SolrQuery solrQuery = new SolrQuery(query);
solrQuery.addFilterQuery(filter3);
QueryResponse response = solr.query(solrQuery);
----- End code snippet -----

I have tried using 
SolrQuery solrQuery = new SolrQuery(ClientUtils.escapeQueryChars(query));
solrQuery.addFilterQuery(ClientUtils.escapeQueryChars(filter));

But that returns no results


Also note that if I cut and paste the url from the error message below, it 
fails when I paste it in a browser,
but I can get it to work only if I remove the wt=javabin parameter.

Error Message

Exception in thread main org.apache.solr.client.solrj.SolrServerException: 
Error executing query
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
at com.xyz.search.SolrSearch.performSearch(SolrSearch.java:126)
at com.xyz.search.SearchMain.main(SearchMain.java:23)
Caused by: org.apache.solr.common.SolrException:
org.apache.lucene.queryParser.ParseException: Cannot parse ' searchScope:SRM
searchScope:SMN': Encountered ":" at line 1, column 28.  Was expecting
one of: <EOF> <AND> ... <OR> ... <NOT> ...
"-" ... "(" ... "*" ... "^" ... <QUOTED> ... <TERM>
... <FUZZY_SLOP> ... <PREFIXTERM> ... <WILDTERM> ... "["
... "{" ... <NUMBER> ...

org.apache.lucene.queryParser.ParseException: Cannot parse ' searchScope:SRM
searchScope:SMN': Encountered ":" at line 1, column 28.  Was expecting
one of: <EOF> <AND> ... <OR> ... <NOT> ...
"-" ... "(" ... "*" ... "^" ... <QUOTED> ... <TERM>
... <FUZZY_SLOP> ... <PREFIXTERM> ... <WILDTERM> ... "["
... "{" ... <NUMBER> ...

request: 
http://localhost:8080/apache-solr-1.4.0/select?q=help&fq=+searchScope:SRM+searchScope:SMN&hl=true&rows=15&wt=javabin&version=1
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
... 3 more

Re: update doc success, but could not find the new value

2010-01-27 Thread Erick Erickson
Ummm, you have to provide a *lot* more detail before anyone can help.

Have you used Luke or the admin page to examine your index and determine
that the update did, indeed, work?

Have you tried firing your query with debugQuery=on to see if the fields
searched are the ones you expect?

etc.

Erick

On Wed, Jan 27, 2010 at 11:54 AM, Jennifer Luo jenni...@talenttech.comwrote:

 I am using
 http://localhost:8983/solr/update?commit=true&overwrite=true&commitWithin=10 to update a document. The responseHeader's status is 0.

 But when I search the new value, it couldn't be found.



doc with missing highlight info

2010-01-27 Thread Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS]
Hi,
I have a query where the query matches the document but no highlighting info is 
returned.  Why?  Normally, highlighting returns correctly.  This query is 
different from others in that it uses a phrase like CR1428-Occ1

Field:
<field name="destSpan" type="text" indexed="true"
       stored="true" termVectors="true" termPositions="true"
       termOffsets="true" />

query:
http://localhost:8080/solr/select?q=destSpan%3A%28%22CR1428-Occ2%22%29&fl=destSpan&hl=true&hl.fl=destSpan

results:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">0</int>
  <lst name="params">
    <str name="fl">destSpan</str>
    <str name="q">destSpan:("CR1428-Occ2")</str>
    <str name="hl.fl">destSpan</str>
    <str name="hl">true</str>
  </lst>
</lst>
<result name="response" numFound="1" start="0">
  <doc>
    <str name="destSpan">CR1428-Occ2 abcCR1428 ...</str>
  </doc>
</result>
<lst name="highlighting">
  <lst name="6de31965cda3612c0932a4ea51aba23f8c666c7f"/>
</lst>
</response>

Tim Harsch
Sr. Software Engineer
Dell Perot Systems

<?xml version="1.0" encoding="UTF-8"?>
<response>
	<lst name="responseHeader">
		<int name="status">0</int>
		<int name="QTime">0</int>
		<lst name="params">
			<str name="fl">destSpan</str>
			<str name="q">destSpan:("CR1428-Occ2")</str>
			<str name="hl.fl">destSpan</str>
			<str name="hl">true</str>
		</lst>
	</lst>
	<result name="response" numFound="1" start="0">
		<doc>
			<str name="destSpan">CR1428-Occ2 abcCR1428 is a token for searching with SPAN testuser System of Registries 2010-01-22T23:01:00.000Z 2010-01-22T23:01:00.000Z testuser System of Registries</str>
		</doc>
	</result>
	<lst name="highlighting">
		<lst name="6de31965cda3612c0932a4ea51aba23f8c666c7f"/>
	</lst>
</response>


RE: update doc success, but could not find the new value

2010-01-27 Thread Jennifer Luo
I am using example, only with two fields, id and body. Id is string
field, body is text field.

I use another program to do an HTTP POST to update the document. The URL is
http://localhost:8983/solr/update?commit=true&overwrite=true&commitWithin=10 and the data is:
<add>
  <doc>
    <field name="id">id1</field>
    <field name="body">test body</field>
  </doc>
</add>

I get the responseHeader back, the status is 0.

Then I go to the admin page and do a search with query body:test. The result is
numFound=0.

I think the reason must be that the index is not updated with the updated
document.

What should I do? What is missing?
Jennifer Luo

 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: Wednesday, January 27, 2010 1:39 PM
 To: solr-user@lucene.apache.org
 Subject: Re: update doc success, but could not find the new value
 
 Ummm, you have to provide a *lot* more detail before anyone can help.
 
 Have you used Luke or the admin page to examine your index and
determine
 that the update did, indeed, work?
 
 Have you tried firing your query with debugQuery=on to see if the
fields
 searched are the ones you expect?
 
 etc.
 
 Erick
 
 On Wed, Jan 27, 2010 at 11:54 AM, Jennifer Luo
 jenni...@talenttech.comwrote:
 
  I am using
 
http://localhost:8983/solr/update?commit=true&overwrite=true&commitWithin=10 to update a document. The responseHeader's status is 0.
 
  But when I search the new value, it couldn't be found.
 


Re: filter query error

2010-01-27 Thread Ahmet Arslan
 I am trying to use a filter query that filters on more than
 one value for a given filter  ie. filters on field
 equals value1 or value2
 
 If I enter the following 2 urls in a browser I get back the
 correct results I am looking for:
 
 http://localhost:8080/apache-solr-1.4.0/select/?q=help&fl=*,score&fq=+searchScope:SRM+searchScope:SMN&indent=on
 or
 http://localHost:8080/apache-solr-1.4.0/select/?q=help&fl=*,searchScope,score&fq=searchScope:(SRM+OR+SMN)&indent=on
 
 But when I try to do it programitically I get an
 error.  It only works when I am filtering on 1 value,
 but when I try more than one value it fails.
 See code snippet  and error message below.  When
 I use filter2 or filter3 it fails, but filter1 gives me no
 Errors
 
 Not sure what I am doing wrong.  Any help would be
 greatly appreciated.
 
 -Begin Code snippet ---
 String query = "help";
 
 //String filter1 = "searchScope:SRM";
 //String filter2 = "+searchScope:SRM+searchScope:SMN";
 String filter3 = "searchScope:(SRM+OR+SMN)";
 

You need to replace '+' with a space:
String filter3 = "searchScope:(SRM OR SMN)"; should work.
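A short sketch (plain JDK) of why the '+' inside the Java string fails: '+' only stands for a space at the URL-encoding layer, so inside a Java literal it reaches the query parser verbatim, while a real space is what gets encoded to '+' on the wire:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class PlusVsSpace {
    // URL-encode a filter query string; a '+' in the input is data, not a space.
    static String encode(String fq) {
        try {
            return URLEncoder.encode(fq, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException(e); // UTF-8 is always present
        }
    }

    public static void main(String[] args) {
        // literal '+' chars are escaped to %2B and hit the parser as '+':
        System.out.println(encode("searchScope:(SRM+OR+SMN)")); // searchScope%3A%28SRM%2BOR%2BSMN%29
        // real spaces become '+' on the wire, which Solr decodes back to spaces:
        System.out.println(encode("searchScope:(SRM OR SMN)")); // searchScope%3A%28SRM+OR+SMN%29
    }
}
```

This is why the URL typed directly into a browser worked: there, the '+' characters really were URL-level escapes for spaces.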





Re: Wildcard Search and Filter in Solr

2010-01-27 Thread Ahmet Arslan

 Hi just looked at the analysis.jsp and found out what it
 does during index /
 query
 
 Index Analyzer
 Intel 
 intel 
 intel 
 intel 
 intel 
 intel 

If the resultant token is "intel", then q=inte* should return documents.
What does it say when you add debugQuery=on to your search URL?
And why are you using an old version of Solr?


  


Re: Wildcard Search and Filter in Solr

2010-01-27 Thread Erik Hatcher
Note that the query analyzer output is NOT doing query _parsing_, but
rather taking the string you passed and running it through the query
analyzer only.  When using the default query parser, "Inte*" will be a
search for terms that begin with "inte".  It is odd that you're not
finding it.  But you're using a pretty old version of Solr, and quite
likely something here has been fixed since.


Give Solr 1.4 a try.

Erik
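A toy sketch (plain Java, not actual Lucene code) of the point above: the index analyzer lower-cases and stems tokens, but a prefix term is not run through the analyzer, so a prefix that keeps its original case can never match the lower-cased indexed token:

```java
public class PrefixCaseSketch {
    // Simulate a prefix query against an analyzed (lower-cased) indexed token.
    static boolean prefixMatch(String indexedToken, String rawPrefix) {
        return indexedToken.startsWith(rawPrefix);
    }

    public static void main(String[] args) {
        String indexed = "Intel".toLowerCase(); // index analyzer output: "intel"
        System.out.println(prefixMatch(indexed, "Inte")); // false: prefix is not analyzed, keeps its capital
        System.out.println(prefixMatch(indexed, "inte")); // true: lower-cased by hand it matches
    }
}
```

This explains the expected failure of fq=VendorName:Inte*; the failure of the already-lower-case q=inte* is the separate oddity Erik mentions.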


On Jan 27, 2010, at 12:56 AM, ashokcz wrote:



Hi just looked at the analysis.jsp and found out what it does during  
index /

query

Index Analyzer
Intel
intel
intel
intel
intel
intel

Query Analyzer
Inte*
Inte*
inte*
inte
inte
inte
int

I think somewhere my configuration or my definition of the type "text" is
wrong.
This is my configuration:

<fieldType class="solr.TextField" name="text">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter catenateAll="0" catenateNumbers="0" catenateWords="0"
            class="solr.WordDelimiterFilterFactory" generateNumberParts="1"
            generateWordParts="1"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>

  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true"
            synonyms="synonyms.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter catenateAll="0" catenateNumbers="0" catenateWords="0"
            class="solr.WordDelimiterFilterFactory" generateNumberParts="1"
            generateWordParts="1"/>
    <filter class="solr.StopFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

I think i am missing some basic configuration for doing wildcard  
searches .

but could not figure it out .
can someone help please


Ahmet Arslan wrote:




Hi ,
I m trying to use wildcard keywords in my search term and
filter term . but
i didnt get any results.
Searched a lot but could not find any lead .
Can someone help me in this.
i m using solr 1.2.0 and have few records indexed with
vendorName value as
Intel

In solr admin interface i m trying to do the search like
this

http://localhost:8983/solr/select?indent=on&version=2.2&q=intel&start=0&rows=10&fl=*%2Cscore&qt=standard&wt=standard&explainOther=&hl.fl=

and i m getting the result properly

but when i use q=inte* no records are returned.

the same is the case for Filter Query on using
fq=VendorName:Intel i get
my results.

but on using fq=VendorName:Inte* no results are
returned.

I can guess i doing mistake in few obvious things , but
could not figure it
out ..
Can someone pls help me out :) :)


If q=intel returns documents while q=inte* does not, it means that
fieldType of your defaultSearchField is reducing the token intel into
something.

Can you find out it by using /admin/anaysis.jsp what happens to  
Intel

intel at index and query time?

What is your defaultSearchField? Is it VendorName?

It is expected that fq=VendorName:Intel returns results while
fq=VendorName:Inte* does not. Because prefix queries are not  
analyzed.



But it is strange that q=inte* does not return anything. Maybe your  
index

analyzer is reducing Intel into int or ıntel?

I am not 100% sure but solr 1.2.0  may use default locale in  
lowercase

operation. What is your default locale?

It is better to see what happens to the word Intel using the analysis.jsp page.
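
The locale-sensitive lowercasing Ahmet mentions can be demonstrated directly in plain Java; the Turkish locale below is just an illustrative worst case, not something known about this poster's setup:

```java
import java.util.Locale;

public class LocaleLowercase {
    public static void main(String[] args) {
        String term = "Intel";

        // Locale-insensitive lowercasing, which is what an analyzer should do:
        System.out.println(term.toLowerCase(Locale.ROOT));       // intel

        // Under a Turkish locale, uppercase 'I' lowercases to dotless 'ı',
        // so the indexed token would no longer match the prefix query inte*:
        System.out.println(term.toLowerCase(new Locale("tr")));  // ıntel
    }
}
```

If the JVM's default locale were Turkish and the lowercase filter used it, this would produce exactly the "ıntel" token described above.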







--
View this message in context: 
http://old.nabble.com/Wildcard-Search-and-Filter-in-Solr-tp27306734p27334486.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: Help using CachedSqlEntityProcessor

2010-01-27 Thread KirstyS

Thanks. I am on 1.4..so maybe that is the problem.
Will try when I get back to work tomorrow. 
Thanks


Rolf Johansson-2 wrote:
 
 I recently had issues with CachedSqlEntityProcessor too, figuring out how
 to
 use the syntax. After a while, I managed to get it working with cacheKey
 and
 cacheLookup. I think this is 1.4 specific though.
 
 It seems you have double WHERE clauses, one in the query and one in the
 where attribute.
 
 Try using cacheKey and cacheLookup instead in something like this:
 
 <entity name="LinkedCategory" pk="LinkedCatArticleId"
     query="SELECT LinkedCategoryBC, CmsArticleId as LinkedCatAricleId
            FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)"
     processor="CachedSqlEntityProcessor"
     cacheKey="LINKEDCATARTICLEID"
     cacheLookup="article.CMSARTICLEID"
     deltaQuery="SELECT LinkedCategoryBC
            FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
            WHERE convert(varchar(50), LastUpdateDate) > '${dataimporter.article.last_index_time}'
            OR convert(varchar(50), PublishDate) > '${dataimporter.article.last_index_time}'"
     parentDeltaQuery="SELECT * from vArticleSummaryDetail_SolrSearch (nolock)">
   <field column="LinkedCategoryBC" name="LinkedCategoryBreadCrumb"/>
 </entity>
 
 /Rolf
 
 
 Den 2010-01-27 12.36, skrev KirstyS kirst...@gmail.com:
 
 
 Hi, I have looked on the wiki. Using the CachedSqlEntityProcessor looks
 like
 it was simple. But I am getting no speed benefit and am not sure if I
 have
 even got the syntax correct.
 I have a main root entity called 'article'.
 
 And then I have a number of sub entities. One such entity is as such :
 
 <entity name="LinkedCategory" pk="LinkedCatAricleId"
     query="SELECT LinkedCategoryBC, CmsArticleId as LinkedCatAricleId
            FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
            WHERE convert(varchar(50), CmsArticleId) =
            convert(varchar(50), '${article.CmsArticleId}')"
     processor="CachedSqlEntityProcessor"
     WHERE="LinkedCatArticleId = article.CmsArticleId"
     deltaQuery="SELECT LinkedCategoryBC
            FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
            WHERE convert(varchar(50), CmsArticleId) =
            convert(varchar(50), '${article.CmsArticleId}')
            AND (convert(varchar(50), LastUpdateDate) > '${dataimporter.article.last_index_time}'
            OR convert(varchar(50), PublishDate) > '${dataimporter.article.last_index_time}')"
     parentDeltaQuery="SELECT * from vArticleSummaryDetail_SolrSearch (nolock)
            WHERE convert(varchar(50), CmsArticleId) =
            convert(varchar(50), '${article.CmsArticleId}')">
   <field column="LinkedCategoryBC" name="LinkedCategoryBreadCrumb"/>
 </entity>
 
 
 As you can see I have added (for the main query - not worrying about the
 delta queries yet!!) the processor and the 'where' but not sure if it's
 correct.
 Can anyone point me in the right direction???
 Thanks
 Kirsty
 
 
 

-- 
View this message in context: 
http://old.nabble.com/Help-using-CachedSqlEntityProcessor-tp27337635p27345412.html
Sent from the Solr - User mailing list archive at Nabble.com.



RE: doc with missing highlight info (bug found?!?)

2010-01-27 Thread Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS]
The more I play with values, the more I realize highlighting seems to have a 
bug.  It seems to have to do with tokenizing.

WILL match and highlight:
Query: TOKEN  Data: token
Query: SEARCH Data: searching
Query: abcCR  Data: abcCR1428 (highlights abcCR)

WILL match and NOT highlight:
Query: abcCR1428   Data: abcCR1428
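
A plausible reading of the table above is that WordDelimiterFilterFactory splits the indexed token on case changes and letter/digit boundaries. The regex below is my rough approximation of that default behavior, not Solr's actual implementation:

```java
import java.util.Arrays;

public class WordDelimiterSketch {
    // Approximate WordDelimiterFilterFactory's default splitting: break on
    // lower-to-upper case changes and on letter/digit transitions.
    static String[] split(String token) {
        return token.split(
            "(?<=[a-z])(?=[A-Z])|(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])");
    }

    public static void main(String[] args) {
        // "abcCR1428" is indexed as three sub-tokens:
        System.out.println(Arrays.toString(split("abcCR1428")));
        // [abc, CR, 1428]
    }
}
```

That is consistent with the query abcCR matching; why the full string abcCR1428 matches but fails to highlight would need the actual analysis chain (e.g. catenate settings) to pin down.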

-Original Message-
From: Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS] 
[mailto:timothy.j.har...@nasa.gov] 
Sent: Wednesday, January 27, 2010 10:42 AM
To: solr-user@lucene.apache.org
Subject: doc with missing highlight info

Hi,
I have a query where the query matches the document but no highlighting info is 
returned.  Why?  Normally, highlighting returns correctly.  This query is 
different from others in that it uses a phrase like "CR1428-Occ1"

Field:
<field name="destSpan" type="text" indexed="true"
    stored="true" termVectors="true" termPositions="true" 
    termOffsets="true" />

query:
http://localhost:8080/solr/select?q=destSpan%3A%28%22CR1428-Occ2%22%29&fl=destSpan&hl=true&hl.fl=destSpan

results:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
<lst name="params">
<str name="fl">destSpan</str>
<str name="q">destSpan:("CR1428-Occ2")</str>
<str name="hl.fl">destSpan</str>
<str name="hl">true</str>
</lst>
</lst>
<result name="response" numFound="1" start="0">
<doc>
<str name="destSpan"> CR1428-Occ2 abcCR1428 ...</str>
</doc>
</result>
<lst name="highlighting">
<lst name="6de31965cda3612c0932a4ea51aba23f8c666c7f"/>
</lst>
</response>

Tim Harsch
Sr. Software Engineer
Dell Perot Systems



Re: Multiple Cores Vs. Single Core for the following use case

2010-01-27 Thread Amit Nithian
It sounds to me that multiple cores won't scale.. wouldn't you have to
create multiple configurations per each core and does the ranking function
change per user?

I would imagine that the filter method would work better.. the caching is
there and as mentioned earlier would be fast for multiple searches. If you
have searches for the same user, then add that to your warming queries list
so that on server startup, the cache will be warm for certain users that you
know tend to do a lot of searches. This can be known empirically or by log
mining.

I haven't used multiple cores but I suspect that having that many
configuration files parsed and loaded in memory can't be good for memory
usage over filter caching.

Just my 2 cents
Amit

On Wed, Jan 27, 2010 at 8:58 AM, Matthieu Labour
matthieu_lab...@yahoo.comwrote:

 Thanks Didier for your response
 And in your opinion, this should be as fast as if I would getCore(userId)
 -- provided that the core is already open -- and then search for Paris ?
 matt

 --- On Wed, 1/27/10, didier deshommes dfdes...@gmail.com wrote:

 From: didier deshommes dfdes...@gmail.com
 Subject: Re: Multiple Cores Vs. Single Core for the following use case
 To: solr-user@lucene.apache.org
 Date: Wednesday, January 27, 2010, 10:52 AM

 On Wed, Jan 27, 2010 at 9:48 AM, Matthieu Labour
 matthieu_lab...@yahoo.com wrote:
  What I am trying to understand is the search/filter algorithm. If I have
 1 core with all documents and I  search for Paris for userId=123, is
 lucene going to first search for all Paris documents and then apply a filter
 on the userId ? If this is the case, then I am better off having a specific
 index for the user=123 because this will be faster

 If you want to apply the filter to userid first, use filter queries
 (http://wiki.apache.org/solr/CommonQueryParameters#fq). This will
 filter by userid first then search for Paris.

 didier
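
For concreteness, the single-core request described here would look like the URL below; the host, port, and field names are illustrative, not taken from the poster's setup:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class FilterQueryUrl {
    public static void main(String[] args) {
        // The main query (q) scores documents; the filter query (fq) restricts
        // the result set and is cached separately, so repeated searches for
        // the same user hit the filter cache.
        String q  = URLEncoder.encode("Paris", StandardCharsets.UTF_8);
        String fq = URLEncoder.encode("userId:123", StandardCharsets.UTF_8);
        String url = "http://localhost:8983/solr/select?q=" + q + "&fq=" + fq;
        System.out.println(url);
        // http://localhost:8983/solr/select?q=Paris&fq=userId%3A123
    }
}
```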

 
 
 
 
 
  --- On Wed, 1/27/10, Marc Sturlese marc.sturl...@gmail.com wrote:
 
  From: Marc Sturlese marc.sturl...@gmail.com
  Subject: Re: Multiple Cores Vs. Single Core for the following use case
  To: solr-user@lucene.apache.org
  Date: Wednesday, January 27, 2010, 2:22 AM
 
 
  In case you are going to use core per user take a look to this patch:
  http://wiki.apache.org/solr/LotsOfCores
 
  Trey-13 wrote:
 
  Hi Matt,
 
  In most cases you are going to be better off going with the userid
 method
  unless you have a very small number of users and a very large number of
  docs/user. The userid method will likely be much easier to manage, as
 you
  won't have to spin up a new core every time you add a new user.  I would
  start here and see if the performance is good enough for your
 requirements
  before you start worrying about it not being efficient.
 
  That being said, I really don't have any idea what your data looks like.
  How many users do you have?  How many documents per user?  Are any
  documents
  shared by multiple users?
 
  -Trey
 
 
 
  On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
  matthieu_lab...@yahoo.comwrote:
 
  Hi
 
 
 
  Shall I set up Multiple Core or Single core for the following use case:
 
 
 
  I have X number of users.
 
 
 
  When I do a search, I always know for which user I am doing a search
 
 
 
  Shall I set up X cores, 1 for each user ? Or shall I set up 1 core and
  add
  a userId field to each document?
 
 
 
  If I choose the 1 core solution then I am concerned with performance.
  Let's say I search for NewYork ... If lucene returns all New York
  matches for all users and then filters based on the userId, then this
  is going to be less efficient than if I have sharded per user and send
  the request for New York to the user's core
 
 
 
  Thank you for your help
 
 
 
  matt
 
 
 
 
 
 
 
 
 
 
  --
  View this message in context:
 http://old.nabble.com/Multiple-Cores-Vs.-Single-Core-for-the-following-use-case-tp27332288p27335403.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 
 
 
 
 







RE: update doc success, but could not find the new value

2010-01-27 Thread Jennifer Luo
It works. I made some mistake in my code.

Jennifer Luo

 -Original Message-
 From: Jennifer Luo [mailto:jenni...@talenttech.com]
 Sent: Wednesday, January 27, 2010 1:57 PM
 To: solr-user@lucene.apache.org
 Subject: RE: update doc success, but could not find the new value
 
 I am using example, only with two fields, id and body. Id is string
 field, body is text field.
 
 I use another program to do a http post to update the document, url is

 http://localhost:8983/solr/update?commit=true&overwrite=true&commitWithin=10 , the data is
 <add>
   <doc>
     <field name="id">id1</field>
     <field name="body">test body</field>
   </doc>
 </add>
 
 I get the responseHeader back, the status is 0.
 
 Then I go to admin page, do search, query is body:test.  The result
 numFound = 0.
 
 I think the reason should be the index is not updated with the updated
 document.
 
 What should I do? What's is missing?
 Jennifer Luo
 
  -Original Message-
  From: Erick Erickson [mailto:erickerick...@gmail.com]
  Sent: Wednesday, January 27, 2010 1:39 PM
  To: solr-user@lucene.apache.org
  Subject: Re: update doc success, but could not find the new value
 
  Ummm, you have to provide a *lot* more detail before anyone can
help.
 
  Have you used Luke or the admin page to examine your index and
 determine
  that the update did, indeed, work?
 
  Have you tried firing your query with debugQuery=on to see if the
 fields
  searched are the ones you expect?
 
  etc.
 
  Erick
 
  On Wed, Jan 27, 2010 at 11:54 AM, Jennifer Luo
  jenni...@talenttech.comwrote:
 
   I am using
  

   http://localhost:8983/solr/update?commit=true&overwrite=true&commitWithin=10 to update a document. The responseHeader's status is 0.
  
   But when I search the new value, it couldn't be found.
  


RE: update doc success, but could not find the new value

2010-01-27 Thread Markus Jelsma
Check out Jetty's output or Tomcat's logs. The logging is very verbose and
you can get a clearer picture.
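
Printing the exact payload before posting can also help; below is a minimal sketch of building the add/doc XML from this thread in plain Java (not SolrJ — the field names id/body follow the example schema mentioned here):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AddPayload {
    // Build a minimal Solr <add> payload for one document, escaping the
    // characters that are special in XML text content.
    static String addDoc(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add>\n  <doc>\n");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("    <field name=\"").append(e.getKey()).append("\">")
              .append(escape(e.getValue())).append("</field>\n");
        }
        return sb.append("  </doc>\n</add>").toString();
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "id1");
        doc.put("body", "test body");
        System.out.println(addDoc(doc));
    }
}
```

If the printed XML looks right, the problem is more likely elsewhere (headers on the HTTP post, or commit timing), which is what the server logs should reveal.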


Jennifer Luo said:
 I am using example, only with two fields, id and body. Id is string
 field, body is text field.

 I use another program to do a http post to update the document, url is
  http://localhost:8983/solr/update?commit=true&overwrite=true&commitWithin=10 , the data is
  <add>
    <doc>
      <field name="id">id1</field>
      <field name="body">test body</field>
    </doc>
  </add>

 I get the responseHeader back, the status is 0.

 Then I go to admin page, do search, query is body:test.  The result
 numFound = 0.

 I think the reason should be the index is not updated with the updated
 document.

 What should I do? What's is missing?
 Jennifer Luo

 -Original Message-
 From: Erick Erickson [mailto:erickerick...@gmail.com]
 Sent: Wednesday, January 27, 2010 1:39 PM
 To: solr-user@lucene.apache.org
 Subject: Re: update doc success, but could not find the new value

 Ummm, you have to provide a *lot* more detail before anyone can help.

 Have you used Luke or the admin page to examine your index and
 determine
 that the update did, indeed, work?

 Have you tried firing your query with debugQuery=on to see if the
 fields
 searched are the ones you expect?

 etc.

 Erick

 On Wed, Jan 27, 2010 at 11:54 AM, Jennifer Luo
 jenni...@talenttech.comwrote:

  I am using
 
  http://localhost:8983/solr/update?commit=true&overwrite=true&commitWithin=10 to update a document. The responseHeader's status is 0.
 
  But when I search the new value, it couldn't be found.
 





Re: Multiple Cores Vs. Single Core for the following use case

2010-01-27 Thread Tom Hill
Hi -

I'd probably go with a single core on this one, just for ease of operations.

But here are some thoughts:

One advantage I can see to multiple cores, though, would be better idf
calculations. With individual cores, each user only sees the idf for his own
documents. With a single core, the idf will be across all documents. In
theory, better relevance.
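
The idf point can be made concrete with Lucene's classic idf formula, idf(t) = 1 + ln(N / (df + 1)); the document counts below are made up purely for illustration:

```java
public class IdfSketch {
    // Classic Lucene (DefaultSimilarity) inverse document frequency.
    static double idf(int numDocs, int docFreq) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        // Single shared core: a term's rarity is measured across all users.
        System.out.printf("global idf:   %.3f%n", idf(1_000_000, 10_000));
        // Per-user core: the same term's rarity within one user's documents.
        System.out.printf("per-user idf: %.3f%n", idf(200, 5));
    }
}
```

The two values differ, so a term that is common corpus-wide but rare for one user (or vice versa) would rank differently under the two deployments.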

Multi-core will use more ram to start with, and I would expect it to
use more disk (a term dictionary per core). Filters would add to the memory
footprint of the multiple core setup.

However, if you only end up sorting/faceting on some of the cores, your
memory use with multiple cores may actually be less. With multiple cores,
each field cache only covers one user's docs. With single core, you have one
field cache entry per doc in the whole corpus. Depending on usage patterns,
index sizes, etc, this could be a significant amount of memory.

Tom


On Wed, Jan 27, 2010 at 11:38 AM, Amit Nithian anith...@gmail.com wrote:

 It sounds to me that multiple cores won't scale.. wouldn't you have to
 create multiple configurations per each core and does the ranking function
 change per user?

 I would imagine that the filter method would work better.. the caching is
 there and as mentioned earlier would be fast for multiple searches. If you
 have searches for the same user, then add that to your warming queries list
 so that on server startup, the cache will be warm for certain users that
 you
 know tend to do a lot of searches. This can be known empirically or by log
 mining.

 I haven't used multiple cores but I suspect that having that many
 configuration files parsed and loaded in memory can't be good for memory
 usage over filter caching.

 Just my 2 cents
 Amit

 On Wed, Jan 27, 2010 at 8:58 AM, Matthieu Labour
 matthieu_lab...@yahoo.comwrote:

  Thanks Didier for your response
  And in your opinion, this should be as fast as if I would getCore(userId)
  -- provided that the core is already open -- and then search for Paris
 ?
  matt
 
  --- On Wed, 1/27/10, didier deshommes dfdes...@gmail.com wrote:
 
  From: didier deshommes dfdes...@gmail.com
  Subject: Re: Multiple Cores Vs. Single Core for the following use case
  To: solr-user@lucene.apache.org
  Date: Wednesday, January 27, 2010, 10:52 AM
 
  On Wed, Jan 27, 2010 at 9:48 AM, Matthieu Labour
  matthieu_lab...@yahoo.com wrote:
   What I am trying to understand is the search/filter algorithm. If I
 have
  1 core with all documents and I  search for Paris for userId=123, is
  lucene going to first search for all Paris documents and then apply a
 filter
  on the userId ? If this is the case, then I am better off having a
 specific
  index for the user=123 because this will be faster
 
  If you want to apply the filter to userid first, use filter queries
  (http://wiki.apache.org/solr/CommonQueryParameters#fq). This will
  filter by userid first then search for Paris.
 
  didier
 
  
  
  
  
  
   --- On Wed, 1/27/10, Marc Sturlese marc.sturl...@gmail.com wrote:
  
   From: Marc Sturlese marc.sturl...@gmail.com
   Subject: Re: Multiple Cores Vs. Single Core for the following use case
   To: solr-user@lucene.apache.org
   Date: Wednesday, January 27, 2010, 2:22 AM
  
  
   In case you are going to use core per user take a look to this patch:
   http://wiki.apache.org/solr/LotsOfCores
  
   Trey-13 wrote:
  
   Hi Matt,
  
   In most cases you are going to be better off going with the userid
  method
   unless you have a very small number of users and a very large number
 of
   docs/user. The userid method will likely be much easier to manage, as
  you
   won't have to spin up a new core every time you add a new user.  I
 would
   start here and see if the performance is good enough for your
  requirements
   before you start worrying about it not being efficient.
  
   That being said, I really don't have any idea what your data looks
 like.
   How many users do you have?  How many documents per user?  Are any
   documents
   shared by multiple users?
  
   -Trey
  
  
  
   On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour
   matthieu_lab...@yahoo.comwrote:
  
   Hi
  
  
  
   Shall I set up Multiple Core or Single core for the following use
 case:
  
  
  
   I have X number of users.
  
  
  
   When I do a search, I always know for which user I am doing a search
  
  
  
   Shall I set up X cores, 1 for each user ? Or shall I set up 1 core
 and
   add
   a userId field to each document?
  
  
  
   If I choose the 1 core solution then I am concerned with performance.
   Let's say I search for NewYork ... If lucene returns all New York
   matches for all users and then filters based on the userId, then this
   is going to be less efficient than if I have sharded per user and
 send
   the request for New York to the user's core
  
  
  
   Thank you for your help
  
  
  
   matt
  
  
  
  
  
  
  
  
  
  
   --
   View this message in context:
 
 

Re: Plurals in solr indexing

2010-01-27 Thread Tom Hill
I recommend getting familiar with the analysis tool included with solr. From
Solr's main admin screen, click on analysis, Check verbose, and enter your
text, and you can see the changes that happen during analysis.

It's really helpful, especially when getting started.

Tom


On Wed, Jan 27, 2010 at 2:41 AM, murali k ilar...@gmail.com wrote:


 Hi,
 I am having trouble with indexing plurals,

 I have the schema with following fields
 gender (field) - string (field type) (eg. data Boys)
 all (field) - text (field type)  - solr.WhitespaceTokenizerFactory,
 solr.SynonymFilterFactory, solr.WordDelimiterFilterFactory,
 solr.LowerCaseFilterFactory, SnowballPorterFilterFactory

 i am using copyField from gender to all

 and searching on all field

 When i search for Boy, I get the results, If i search for Boys i dont get
 results,
 I have tried things like boys bikes - no results
 boy bikes - works

 kid and kids are synonymns for boy and boys, so i tried adding
 kid,kids,boy,boys in synonyms hoping it will work, it doesnt work that way

 I also have other content fields which are copied to all , and it
 contains
 words like kids, boys etc...
 any idea?





 --
 View this message in context:
 http://old.nabble.com/Plurals-in-solr-indexing-tp27335639p27335639.html
 Sent from the Solr - User mailing list archive at Nabble.com.




Re: filter query error

2010-01-27 Thread jxkmailbox-01
thanks! that worked.





From: jxkmailbox...@yahoo.com jxkmailbox...@yahoo.com
To: solr-user@lucene.apache.org
Sent: Wed, January 27, 2010 1:28:07 PM
Subject: filter query error


NewBie Using Solr1.4

I am trying to use a filter query that filters on more than one value for a 
given filter  ie. filters on field equals value1 or value2

If I enter the following 2 urls in a browser I get back the correct results I 
am looking for:

http://localhost:8080/apache-solr-1.4.0/select/?q=help&fl=*,score&fq=+searchScope:SRM+searchScope:SMN&indent=on
or
http://localHost:8080/apache-solr-1.4.0/select/?q=help&fl=*,searchScope,score&fq=searchScope:%28SRM+OR+SMN%29&indent=on

But when I try to do it programmatically I get an error.  It only works when I 
am filtering on 1 value, but when I try more than one value it fails.
See code snippet and error message below.  When I use filter2 or filter3 it 
fails, but filter1 gives me no errors.

Not sure what I am doing wrong.  Any help would be greatly appreciated.

-Begin Code snippet ---
String query = "help";

//String filter1 = "searchScope:SRM";
//String filter2 = "+searchScope:SRM+searchScope:SMN";
String filter3 = "searchScope:(SRM+OR+SMN)";

SolrQuery solrQuery = new SolrQuery(query);
solrQuery.addFilterQuery(filter3);
QueryResponse response = solr.query(solrQuery);
-End Code snippet ---

I have tried using 
SolrQuery solrQuery = new SolrQuery(ClientUtils.escapeQueryChars(query));
solrQuery.addFilterQuery(ClientUtils.escapeQueryChars(filter));

But that returns no results


Also note that if I cut and paste the url from the error message below, it 
fails when I paste it in a browser,
but I can get it to work only if I remove the wt=javabin parameter.

Error Message

Exception in thread main org.apache.solr.client.solrj.SolrServerException: 
Error executing query
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
at com.xyz.search.SolrSearch.performSearch(SolrSearch.java:126)
at com.xyz.search.SearchMain.main(SearchMain.java:23)
Caused by: org.apache.solr.common.SolrException: 
org.apache.lucene.queryParser.ParseException: Cannot parse ' searchScope:SRM 
searchScope:SMN': Encountered ":" at line 1, column 28.  Was expecting 
one of: <EOF> <AND> ... <OR> ... <NOT> ... "+" ... "-" ... "(" ... "*" ... 
"^" ... <QUOTED> ... <TERM> ... <FUZZY_SLOP> ... <PREFIXTERM> ... 
<WILDTERM> ... "[" ... "{" ... <NUMBER> ...

org.apache.lucene.queryParser.ParseException: Cannot parse ' searchScope:SRM 
searchScope:SMN': Encountered ":" at line 1, column 28.  Was expecting 
one of: <EOF> <AND> ... <OR> ... <NOT> ... "+" ... "-" ... "(" ... "*" ... 
"^" ... <QUOTED> ... <TERM> ... <FUZZY_SLOP> ... <PREFIXTERM> ... 
<WILDTERM> ... "[" ... "{" ... <NUMBER> ...

request: 
http://localhost:8080/apache-solr-1.4.0/select?q=help&fq=+searchScope:SRM+searchScope:SMN&hl=true&rows=15&wt=javabin&version=1
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:424)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:243)
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
... 3 more

Re: filter query error

2010-01-27 Thread jxkmailbox-01


thanks! that worked.




From: Ahmet Arslan iori...@yahoo.com
To: solr-user@lucene.apache.org
Sent: Wed, January 27, 2010 2:00:32 PM
Subject: Re: filter query error

 I am trying to use a filter query that filters on more than
 one value for a given filter  ie. filters on field
 equals value1 or value2
 
 If I enter the following 2 urls in a browser I get back the
 correct results I am looking for:
 
 http://localhost:8080/apache-solr-1.4.0/select/?q=help&fl=*,score&fq=+searchScope:SRM+searchScope:SMN&indent=on
 or
 http://localHost:8080/apache-solr-1.4.0/select/?q=help&fl=*,searchScope,score&fq=searchScope:(SRM+OR+SMN)&indent=on
 
 But when I try to do it programitically I get an
 error.  It only works when I am filtering on 1 value,
 but when I try more than one value it fails.
 See code snippet  and error message below.  When
 I use filter2 or filter3 it fails, but filter1 gives me no
 Errors
 
 Not sure what I am doing wrong.  Any help would be
 greatly appreciated.
 
 -Begin Code snippet ---
 String query = "help";
 
 //String filter1 = "searchScope:SRM";
 //String filter2 = "+searchScope:SRM+searchScope:SMN";
 String filter3 = "searchScope:(SRM+OR+SMN)";
 

You need to replace + with space: 
String filter3 = "searchScope:(SRM OR SMN)"; should work.
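
The reason the browser URL worked while the same literal string failed in SolrJ: in a URL query string, + is an encoded space, and the servlet container decodes it before Solr parses the query. SolrJ encodes your string for you, so a literal + reaches the query parser unchanged. A quick demonstration in plain Java (Java 10+ for the Charset overload):

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class PlusDecoding {
    public static void main(String[] args) {
        // What the servlet container hands Solr for the browser URL's fq value:
        String decoded = URLDecoder.decode("searchScope:(SRM+OR+SMN)",
                StandardCharsets.UTF_8);
        System.out.println(decoded); // searchScope:(SRM OR SMN)

        // In SolrJ, pass the already-decoded form:
        // solrQuery.addFilterQuery("searchScope:(SRM OR SMN)");
    }
}
```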

Can Solr be forced to return all field tags for a document even if the field is empty?

2010-01-27 Thread Turner, Robbin J
I have a field Title and Summary.  I've currently not set a default value for 
the Summary in my schema, it's just a text field with indexed=true and 
stored=true, but not required.  When the data is indexed sometimes the 
documents don't have a summary so then Solr doesn't index that field.

When a query is sent and we get the results for those documents returned, if 
they did not have a summary then there is no tag in the xml for that field.

Is there a way to have the xml always return the field tags for each document 
in the result set even if the field has no data?

I apologize ahead of time if this has been answered, but after doing a bit of 
search have not been able to find the answer elsewhere.

Thanks
Robbin




Re: doc with missing highlight info

2010-01-27 Thread Koji Sekiguchi

Harsch, Timothy J. (ARC-TI)[PEROT SYSTEMS] wrote:

Hi,
I have a query where the query matches the document but no highlighting info is returned. 
 Why?  Normally, highlighting returns correctly.  This query is different from others in 
that it uses a phrase like CR1428-Occ1

Field:
<field name="destSpan" type="text" indexed="true"
    stored="true" termVectors="true" termPositions="true" termOffsets="true"
/>

query:
http://localhost:8080/solr/select?q=destSpan%3A%28%22CR1428-Occ2%22%29&fl=destSpan&hl=true&hl.fl=destSpan

results:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
<lst name="params">
<str name="fl">destSpan</str>
<str name="q">destSpan:("CR1428-Occ2")</str>
<str name="hl.fl">destSpan</str>
<str name="hl">true</str>
</lst>
</lst>
<result name="response" numFound="1" start="0">
<doc>
<str name="destSpan"> CR1428-Occ2 abcCR1428 ...</str>
</doc>
</result>
<lst name="highlighting">
<lst name="6de31965cda3612c0932a4ea51aba23f8c666c7f"/>
</lst>
</response>

Tim Harsch
Sr. Software Engineer
Dell Perot Systems

  
Which Solr version are you using? If trunk, you are using 
FastVectorHighlighter,
because destSpan's termVectors/termPositions/termOffsets are on. If so,
you can use the (traditional) Highlighter explicitly by specifying 
hl.useHighlighter=true:


http://wiki.apache.org/solr/HighlightingParameters#hl.useHighlighter

If you are using FVH, can you give me info of <fieldType name="text"/>?

Thanks,

Koji

--
http://www.rondhuit.com/en/



Re: solr with tomcat in cluster mode

2010-01-27 Thread Lance Norskog
Linux includes a load-balancer program 'balance'. You set it up at a
third port and configure it to use 'localhost:8180' and
'localhost:8280'.

On Wed, Jan 27, 2010 at 4:06 AM, ZAROGKIKAS,GIORGOS
g.zarogki...@multirama.gr wrote:
 Hi again
        I finally set up my solr Cluster with tomcat6
        The configuration I use is two tomcat servers on the same machine on 
 different ports (e.g. localhost:8180/solr and
         localhost:8280/solr for testing purposes) with different indexes on 
 each server and index replication through the replication handler of solr, and 
 it's working fine for me and very quick

 Now I want to use load balance for these two tomcat servers but without using 
 apache http server
 Is there any solution for that ???






 -Original Message-
 From: Matt Mitchell [mailto:goodie...@gmail.com]
 Sent: Friday, January 22, 2010 9:33 PM
 To: solr-user@lucene.apache.org
 Subject: Re: solr with tomcat in cluster mode

 Hey Otis,

 We're indexing on a separate machine because we want to keep our production
 nodes away from processes like indexing. The indexing server also has a ton
 of resources available, more so than the production nodes. We set it up as
 an indexing server at one point and have decided to stick with it.

 We're not indexing the same index as the search indexes because we want to
 be able to step back a day or two if needed. So we do the SWAP when things
 are done and OK.

 So that last part you mentioned about the searchers needing to re-open will
 happen with a SWAP right? Is your concern that there will be a lag time,
 making it so the slaves will be out of sync for some small period of time?

 Would it be simpler/better to move to using Solrs native slave/master
 feature?

 I'd love to hear any suggestions you might have.

 Thanks,

 Matt

 On Fri, Jan 22, 2010 at 1:58 PM, Otis Gospodnetic 
 otis_gospodne...@yahoo.com wrote:

 This should work fine.
 But why are you indexing to a separate index/core?  Why not index in the
 very same index you are searching?
 Slaves won't see changes until their searchers re-open.

 Otis
 --
 Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



 - Original Message 
  From: Matt Mitchell goodie...@gmail.com
  To: solr-user@lucene.apache.org
  Sent: Fri, January 22, 2010 9:44:03 AM
  Subject: Re: solr with tomcat in cluster mode
 
  We have a similar setup and I'd be curious to see how folks are doing
 this
  as well.
 
  Our setup: A few servers and an F5 load balancer. Each Solr instance
 points
  to a shared index. We use a separate server for indexing. When the index
 is
  complete, we do some juggling using the Core Admin SWAP function and
 update
  the shared index. I've wondered about having a shared index across
 multiple
  instances of (read-only) Solr -- any problems there?
 
  Matt
 
  On Fri, Jan 22, 2010 at 9:35 AM, ZAROGKIKAS,GIORGOS 
  g.zarogki...@multirama.gr wrote:
 
   Hi
          I'm using solr 1.4 with tomcat in a single pc
          and I want to turn it in cluster mode with 2 nodes and load
   balancing
          But I can't find info how to do
          Is there any manual or a recorded procedure on the internet  to
   do that
          Or is there anyone to help me ?
  
   Thanks in advance
  
  
   Ps : I use windows server 2008 for OS
  
  
  
  
  






-- 
Lance Norskog
goks...@gmail.com


RE: Solr wiki link broken

2010-01-27 Thread Teruhiko Kurosaka
Why don't we change the links to have FrontPage explicitly?
Wouldn't it be the easiest fix unless there are numerous
other pages that reference the default page w/o FrontPage?

-kuro  

 -Original Message-
 From: Chris Hostetter [mailto:hossman_luc...@fucit.org] 
 Sent: Tuesday, January 26, 2010 4:41 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Solr wiki link broken
 
 
 : You are right. The wiki can't be read if the preferred 
 language is not English.
 : The wiki system seems to implement or be configured to use 
 a wrong way of choosing its locale.
 : Erik, let me know if I can help solving this.
 
 Interesting.  
 
 When accessing "http://wiki.apache.org/solr/", MoinMoin 
 evidently picks a translated version of the page to show 
 each user based on the Accept-Language header sent by the 
 browser.  If it's "en" or unset, you get the same thing as 
 http://wiki.apache.org/solr/FrontPage -- but if you have some 
 other preferred language configured in your browser, then you 
 get a different page; for example "de" causes 
 http://wiki.apache.org/solr/StartSeite to be loaded instead.
 
 (this behavior can be forced in spite of the Accept-Language 
 header sent by the browser if you are logged into the wiki 
 and change the "Preferred language" setting from "Browser 
 setting" to something else ... but i don't recommend it 
 since i was stuck with German for about 10 minutes and got 
 500 errors every time i tried to change my preferences
 back)
 
 This is presumably designed to make it easy to support a 
 multilanguage wiki, with users getting language-specific 
 homepages that can then link out to language-specific 
 versions of pages -- but that doesn't really help us much 
 since we don't have any meaningful content on those 
 language-specific homepages.
 
 According to this...
 http://wiki.apache.org/solr/HelpOnLanguages
 
 ...we should be deleting all those unused pages, or have 
 INFRA change our wiki config so that something other than 
 FrontPage is our default (which now explains why Lucene-Java 
 has FrontPageEN as the default)
 
 Any volunteers to help purge the wiki of (effectively) blank 
 translation pages? ... it looks like they all (probably) 
 have the comment "##master-page:FrontPage" at the top, so they 
 should be easy to identify even if you don't speak the 
 language ... but they aren't very easy to search for since 
 those comments don't appear in the generated page.
 
 
 -Hoss
 
 

Re: Can Solr be forced to return all field tags for a document even if the field is empty?

2010-01-27 Thread Erick Erickson
This is kind of an unusual request; what higher-level
problem are you trying to solve here? Because the
field just *isn't there* in the underlying Lucene index
for that document.

I suppose you could index a "not there" token and just
throw those values out from the response...
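
The placeholder workaround suggested above can be sketched on the client
side in a few lines; the placeholder string and the document shape here are
illustrative assumptions, not anything from the original thread:

```python
NOT_THERE = "__not_there__"  # assumed placeholder indexed for empty fields

def strip_placeholders(docs):
    """Replace the placeholder token with an empty string so every doc
    still carries the field, but without the fake value leaking out."""
    return [
        {k: ("" if v == NOT_THERE else v) for k, v in doc.items()}
        for doc in docs
    ]

docs = [{"title": "A", "summary": NOT_THERE},
        {"title": "B", "summary": "real text"}]
cleaned = strip_placeholders(docs)
print(cleaned)
```

The trade-off is that the placeholder value is searchable unless it is also
excluded at query time.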

Erick

On Wed, Jan 27, 2010 at 6:19 PM, Turner, Robbin J 
robbin.j.tur...@boeing.com wrote:

 I have fields Title and Summary.  I've currently not set a default value
 for Summary in my schema; it's just a text field with indexed=true and
 stored=true, but not required.  When the data is indexed, sometimes the
 documents don't have a summary, so Solr doesn't index that field.

 When a query is sent and we get the results for those documents returned,
 if they did not have a summary then there is no tag in the xml for that
 field.

 Is there a way to have the xml always return the field tags for each
 document in the result set even if the field has no data?

 I apologize ahead of time if this has been answered, but after a bit of
 searching I have not been able to find the answer elsewhere.

 Thanks
 Robbin





Re: Help using CachedSqlEntityProcessor

2010-01-27 Thread Noble Paul നോബിള്‍ नोब्ळ्
cacheKey and cacheLookup are required attributes .

On Thu, Jan 28, 2010 at 12:51 AM, KirstyS kirst...@gmail.com wrote:

 Thanks. I am on 1.4..so maybe that is the problem.
 Will try when I get back to work tomorrow.
 Thanks


 Rolf Johansson-2 wrote:

 I recently had issues with CachedSqlEntityProcessor too, figuring out how
 to
 use the syntax. After a while, I managed to get it working with cacheKey
 and
 cacheLookup. I think this is 1.4 specific though.

 It seems you have double WHERE clauses, one in the query and one in the
 where attribute.

 Try using cacheKey and cacheLookup instead in something like this:

 entity name=LinkedCategory pk=LinkedCatArticleId
         query=SELECT LinkedCategoryBC, CmsArticleId as LinkedCatAricleId
                FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
         processor=CachedSqlEntityProcessor
         cacheKey=LINKEDCATARTICLEID
         cacheLookup=article.CMSARTICLEID
         deltaQuery=SELECT LinkedCategoryBC
                     FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
                     WHERE convert(varchar(50), LastUpdateDate) >
                     '${dataimporter.article.last_index_time}'
                     OR convert(varchar(50), PublishDate) >
                     '${dataimporter.article.last_index_time}'
         parentDeltaQuery=SELECT * from vArticleSummaryDetail_SolrSearch
                          (nolock)
     field column=LinkedCategoryBC name=LinkedCategoryBreadCrumb/
 /entity

 /Rolf


 Den 2010-01-27 12.36, skrev KirstyS kirst...@gmail.com:


 Hi, I have looked on the wiki. Using the CachedSqlEntityProcessor looked
 simple, but I am getting no speed benefit and am not sure if I have
 even got the syntax correct.
 I have a main root entity called 'article'.

 And then I have a number of sub entities. One such entity is as such :

     entity name=LinkedCategory pk=LinkedCatAricleId
               query=SELECT LinkedCategoryBC, CmsArticleId as
 LinkedCatAricleId
                      FROM LinkedCategoryBreadCrumb_SolrSearch (nolock)
                      WHERE convert(varchar(50), CmsArticleId) =
 convert(varchar(50), '${article.CmsArticleId}') 
                 processor=CachedSqlEntityProcessor
                 WHERE=LinkedCatArticleId = article.CmsArticleId
                 deltaQuery=SELECT LinkedCategoryBC
                             FROM LinkedCategoryBreadCrumb_SolrSearch
 (nolock)
                             WHERE convert(varchar(50), CmsArticleId) =
 convert(varchar(50), '${article.CmsArticleId}')
                             AND (convert(varchar(50), LastUpdateDate) >
 '${dataimporter.article.last_index_time}'
                             OR   convert(varchar(50), PublishDate) >
 '${dataimporter.article.last_index_time}')
                 parentDeltaQuery=SELECT * from
 vArticleSummaryDetail_SolrSearch (nolock)
                                  WHERE convert(varchar(50), CmsArticleId)
 =
 convert(varchar(50), '${article.CmsArticleId}')
         field column=LinkedCategoryBC
 name=LinkedCategoryBreadCrumb/
       /entity


 As you can see I have added (for the main query - not worrying about the
 delta queries yet!!) the processor and the 'where' but not sure if it's
 correct.
 Can anyone point me in the right direction???
 Thanks
 Kirsty




 --
 View this message in context: 
 http://old.nabble.com/Help-using-CachedSqlEntityProcessor-tp27337635p27345412.html
 Sent from the Solr - User mailing list archive at Nabble.com.





-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com


transformer or filter...which is better

2010-01-27 Thread Abin Mathew
Hi

When the same thing can be done using either a transformer or a filter,
which one is better, and why? Please help.


Re: Can Solr be forced to return all field tags for a document even if the field is empty?

2010-01-27 Thread Andrzej Bialecki

On 2010-01-28 03:21, Erick Erickson wrote:

This is kind of an unusual request, what higher-level
problem are you trying to solve here? Because the
field just *isn't there* in the underlying Lucene index
for that document.

I suppose you could index a not there token and just
throw those values out from the response...


You can also implement a SearchComponent that post-processes results and 
based on the schema if a field is missing then it adds an empty node to 
the result.
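
A server-side SearchComponent is one option; the same fill-in can also be
sketched client-side over the parsed response. The schema field list and
document shape below are assumptions for illustration:

```python
SCHEMA_FIELDS = ["title", "summary"]  # assumed stored fields from the schema

def add_empty_fields(docs, fields=SCHEMA_FIELDS):
    """Ensure every returned document carries every schema field,
    adding an empty value where Solr omitted the field entirely."""
    return [{f: doc.get(f, "") for f in fields} for doc in docs]

results = [{"title": "Has no summary"}]  # Solr dropped the missing field
filled = add_empty_fields(results)
print(filled)
```

Doing it in a SearchComponent keeps the behavior in one place for all
clients; doing it client-side avoids custom Solr plugin code.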



--
Best regards,
Andrzej Bialecki 
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Solr + MySQL newbie question

2010-01-27 Thread Manish Gulati
I am planning to use Solr to power search on the site. Our db is mysql and we 
need to index some tables in the schema into Solr. Based on my initial research 
it appears that I need to write a java program that will create xml documents 
(say mydocs.xml) with an add command and then index them in Solr with 
java -jar post.jar mydocs.xml. 

Kindly let me know if this is fine or if some other sophisticated solution 
exists for mysql syncing. 
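
The add-document workflow described above can be sketched with nothing but
the standard library; the field names and row values here are assumptions:

```python
import xml.etree.ElementTree as ET

def rows_to_add_xml(rows):
    """Wrap database rows (dicts) in Solr's <add><doc><field .../></doc></add>
    update format, ready to be posted to the update handler."""
    add = ET.Element("add")
    for row in rows:
        doc = ET.SubElement(add, "doc")
        for name, value in row.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")

xml = rows_to_add_xml([{"id": 1, "title": "Hello"}])
print(xml)
```

(Solr's DataImportHandler, discussed in the CachedSqlEntityProcessor thread
above, is the usual alternative for pulling rows directly from SQL.)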

--
Manish


How to disable wildcard search

2010-01-27 Thread Ranveer Kumar
Hi all,

How do I remove/disable wildcard search in Solr?
I have no requirement for wildcards.
Is there any configuration to disable wildcard search in Solr?

I am using solrj for searching.

thanks
With regards
Ranveer K Kumar