Re: edismax, inconsistencies with implicit/explicit AND when used with explicit OR
Hi Mark, I suspect the issue you are facing is https://issues.apache.org/jira/browse/SOLR-2649. You can verify this by toggling the default operator between 'AND' and 'OR'.

--- On Wed, 8/10/11, Mark juszczec mark.juszc...@gmail.com wrote:

From: Mark juszczec mark.juszc...@gmail.com
Subject: edismax, inconsistencies with implicit/explicit AND when used with explicit OR
To: solr-user@lucene.apache.org
Date: Wednesday, August 10, 2011, 12:27 AM

Hello all

We've just switched from the default parser to the edismax parser, and a user has noticed some inconsistencies when using implicit/explicit ANDs, ORs and grouping search terms in parentheses. First, the default query operator is AND; I switched it from OR today.

The query:

http://cn-nyc1-ad-dev1.cnet.com:8983/solr/customersJoin/select?indent=on&version=3.3&q=CUSTOMER_NM:*IBM*%20CUSTOMER_NM:*Software*%20OR%20CUSTOMER_NM:*something*&fq=&start=0&rows=10&fl=*%2Cscore&defType=edismax&wt=&explainOther=&hl.fl=

returns 1053 results. Some have only IBM in CUSTOMER_NM, some have only Software in the name, some have both.

However, when I explicitly specify an AND between CUSTOMER_NM:*IBM* and CUSTOMER_NM:*Software*:

http://cn-nyc1-ad-dev1.cnet.com:8983/solr/customersJoin/select?indent=on&version=3.3&q=CUSTOMER_NM:*IBM*%20AND%20CUSTOMER_NM:*Software*%20OR%20CUSTOMER_NM:*something*&fq=&start=0&rows=10&fl=*%2Cscore&defType=edismax&wt=&explainOther=&hl.fl=

I only get 3 results, and all of them contain both IBM and Software.

I found this reference to inconsistencies with edismax, but I'm not sure it explains this situation 100%. http://lucene.472066.n3.nabble.com/edismax-inconsistency-AND-OR-td2131795.html

Have I found a bug or am I doing something terribly wrong?

Mark
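For anyone wanting to script the suggested verification, here is a minimal SolrJ sketch that runs Mark's query under each default operator and compares hit counts. The query and core name come from the thread; the host, class name and everything else are illustrative assumptions:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class EdismaxOpCheck {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr/customersJoin");
        SolrQuery query = new SolrQuery("CUSTOMER_NM:*IBM* CUSTOMER_NM:*Software* OR CUSTOMER_NM:*something*");
        query.set("defType", "edismax");
        // Run the same query under each default operator; if the hit counts
        // differ the way Mark describes, SOLR-2649 is the likely culprit.
        for (String op : new String[] {"AND", "OR"}) {
            query.set("q.op", op);
            QueryResponse rsp = server.query(query);
            System.out.println("q.op=" + op + " -> " + rsp.getResults().getNumFound() + " hits");
        }
    }
}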
Re: Indexing tweet and searching @keyword OR #keyword
I tried tweaking WordDelimiterFilterFactory, but it won't accept # or @ symbols; they are ignored totally. I need a solution, please suggest.

On 4 August 2011 21:08, Jonathan Rochkind rochk...@jhu.edu wrote:

It's the WordDelimiterFilterFactory in your filter chain that's removing the punctuation entirely from your index, I think. Read up on what the WordDelimiter filter does and what its settings are; decide how you want things to be tokenized in your index to get the behavior you want; either get WordDelimiter to do it that way by passing it different arguments, or stop using WordDelimiter; come back with any questions after trying that!

On 8/4/2011 11:22 AM, Mohammad Shariq wrote:

I have indexed around 1 million tweets (using the text dataType). When I search the tweets with # or @ I don't get the exact result. E.g. when I search for #ipad or @ipad I get results where ipad is mentioned, skipping the # and @. Please suggest how to tune, or which filter factories to use, to get the desired result. I am indexing the tweets as text; below is the text fieldType from my schema.xml.

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
  </analyzer>
</fieldType>

--
Thanks and Regards
Mohammad Shariq
Re: Is optimize needed on slaves if it replicates from optimized master?
From what I see on my slaves, yes. After replication has finished, the new index is in place, and a new reader has started, I always have a write.lock file in my index directory on the slaves, even though the index on the master is optimized.

Regards
Bernd

On 10.08.2011 09:12, Pranav Prakash wrote:

Do slaves need a separate optimize command if they replicate from an optimized master?

*Pranav Prakash*
temet nosce
Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com | Google http://www.google.com/profiles/pranny
Re: Is optimize needed on slaves if it replicates from optimized master?
On Wed, Aug 10, 2011 at 1:11 PM, Bernd Fehling bernd.fehl...@uni-bielefeld.de wrote: From what I see on my slaves, yes. After replication has finished and new index is in place and new reader has started I have always a write.lock file in my index directory on slaves, even though the index on master is optimized. That is not true. Replication is roughly a copy of the diff between the master and the slave's index. An optimized index is a merged and re-written index so replication from an optimized master will give an optimized copy on the slave. The write lock is due to the fact that an IndexWriter is always open in Solr even on the slaves. -- Regards, Shalin Shekhar Mangar.
Re: Is optimize needed on slaves if it replicates from optimized master?
Sure, no optimizing is actually needed on the slave, but after calling optimize on the slave the write.lock is removed. So why doesn't the replication process do this?

Regards
Bernd

On 10.08.2011 10:57, Shalin Shekhar Mangar wrote:

On Wed, Aug 10, 2011 at 1:11 PM, Bernd Fehling bernd.fehl...@uni-bielefeld.de wrote:

From what I see on my slaves, yes. After replication has finished, the new index is in place, and a new reader has started, I always have a write.lock file in my index directory on the slaves, even though the index on the master is optimized.

That is not true. Replication is roughly a copy of the diff between the master and the slave's index. An optimized index is a merged and re-written index, so replication from an optimized master will give an optimized copy on the slave. The write lock is due to the fact that an IndexWriter is always open in Solr, even on the slaves.
document indexing
Hello,

First of all, I am a beginner and I am trying to develop a sample application using SolrNet. I am struggling with the schema definition I need to use to meet my needs. In the database, I have Books(bookId, name) and Pages(pageId, bookId, text) tables. They have a master-detail relationship. I want to be able to search the Text field of Pages but list the Books. Should I use a schema for Pages (with pageId as unique key) or for Books (with bookId as unique key) in this scenario?

Thanks.
Re: Is optimize needed on slaves if it replicates from optimized master?
That is not true. Replication is roughly a copy of the diff between the master and the slave's index.

In my case, during replication the entire index is copied from master to slave, during which the size of the index goes a little over double. Then it shrinks back to its original size. Am I doing something wrong? How can I get the master to serve only a delta index instead of the whole index, with the slaves merging the new and old index?

*Pranav Prakash*
Re: document indexing
It really does depend on what you want to do in your app, but from the info given I'd go for denormalizing by repeating the least number of values. So in your case that would be one document per book page: PageID+BookID (uniqueKey), pageID, PageVal1..PageValn, BookID, BookName.

On 10 August 2011 09:46, directorscott dgul...@gmail.com wrote:

Hello, First of all, I am a beginner and I am trying to develop a sample application using SolrNet. I am struggling with the schema definition I need to use to meet my needs. In the database, I have Books(bookId, name) and Pages(pageId, bookId, text) tables. They have a master-detail relationship. I want to be able to search the Text field of Pages but list the Books. Should I use a schema for Pages (with pageId as unique key) or for Books (with bookId as unique key) in this scenario? Thanks.
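To make the denormalized shape concrete, here is a minimal SolrJ sketch of indexing one page-level document (SolrNet would look much the same). The field names follow Lee's suggestion; the server URL, class name and sample values are illustrative:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class PageIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        // One Solr document per page, carrying its book's fields along.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1-1");      // pageId + "-" + bookId as the uniqueKey
        doc.addField("pageId", 1);
        doc.addField("bookId", 1);
        doc.addField("text", "some text");
        doc.addField("bookName", "A Sample Book");
        server.add(doc);
        server.commit();
    }
}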
frange not working in query
Hi All, I am trying to sort the results on a unix timestamp using this query:

http://url.com:8983/solr/db/select/?indent=on&version=2.1&q={!frange%20l=0.25}query($qq)&qq=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

When I run this query, it says 'no field name specified in query and no defaultSearchField defined in schema.xml'. As soon as I remove the frange query and run this, it starts working fine:

http://url.com:8983/solr/db/select/?indent=on&version=2.1&q=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

Any pointers?

Thanks,
Amit
RE: Trying to index pdf docs - lazy loading error - ClassNotFoundException: solr.extraction.ExtractingRequestHandler
I have had a mistake with the config files. From the example directory all works correctly. Thanks to all.

---
Rode González
Libnova, SL
Paseo de la Castellana, 153 - Madrid
[t]91 449 08 94 [f]91 141 21 21
www.libnova.es

-----Original Message-----
From: Rode González [mailto:r...@libnova.es]
Sent: Tuesday, 9 August 2011 13:04
To: solr-user@lucene.apache.org
CC: Leo
Subject: Trying to index pdf docs - lazy loading error - ClassNotFoundException: solr.extraction.ExtractingRequestHandler

Hi all.

I've tried to index pdf documents using the libraries included in the example distribution of Solr 3.3.0. I've copied all the jars included in the /dist and /contrib directories into a common /lib directory and I've added this path to the solrconfig.xml file.

The request handler for binary docs has no changes from the example:

<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- All the main content goes into "text"... if you need to return
         the extracted text or do highlighting, use a stored field. -->
    <str name="fmap.content">text</str>
    <!-- <str name="lowernames">true</str> -->
    <!-- <str name="uprefix">ignored_</str> -->
    <!-- capture link hrefs but ignore div attributes -->
    <!-- <str name="captureAttr">true</str> -->
    <!-- <str name="fmap.a">links</str> -->
    <!-- <str name="fmap.div">ignored_</str> -->
  </lst>
</requestHandler>

I've commented out all subnodes except fmap.content because I don't use the rest of them. ...BUT... :) When I try:

curl "http://myserver:8080/solr/update/extract/?literal.id=1000&commit=true" -F myfile=@myfile_.pdf

I get: Status HTTP 500 - lazy loading error
org.apache.solr.common.SolrException: lazy loading error
...
Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.extraction.ExtractingRequestHandler'
...

I've moved contrib/extraction/lib/* to my lib/*. Restarted the server, and I can see in the log that apache-solr-cell-3.3.0.jar was added to the classloader. But I get the same result :( ... lazy loading error, error loading class.

What am I forgetting? What am I missing?

Thanks
---
Rode González
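For reference, the same extract request can be sent from SolrJ instead of curl. A minimal sketch, using the file name and literal.id from Rode's message; the server URL and class name are illustrative:

import java.io.File;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

public class PdfIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://myserver:8080/solr");
        // Equivalent of: curl ".../update/extract/?literal.id=1000&commit=true" -F myfile=@myfile_.pdf
        ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
        req.addFile(new File("myfile_.pdf"));
        req.setParam("literal.id", "1000");
        req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
        server.request(req);
    }
}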
Re: Possible bug in FastVectorHighlighter
Worked fine. Thanks a lot!

Massimo

On 09/08/2011 11:58, Jayendra Patil wrote:

Try using -

<str name="hl.tag.pre"><![CDATA[<b>]]></str>
<str name="hl.tag.post"><![CDATA[</b>]]></str>

Regards,
Jayendra

On Tue, Aug 9, 2011 at 4:46 AM, Massimo Schiavon mschia...@volunia.com wrote:

In my Solr (3.3) configuration I specified these two params:

<str name="hl.simple.pre"><![CDATA[<b>]]></str>
<str name="hl.simple.post"><![CDATA[</b>]]></str>

When I do a simple search I correctly obtain highlighted results where matches are enclosed in the correct tag. If I do the same request with hl.useFastVectorHighlighter=true in the HTTP query string (or specifying the same parameter in the config file), the matches are enclosed in the <em> tag (the default value). Has anyone encountered the same issue?
Re: document indexing
Could you please tell me the schema.xml fields tag content for such a case? Currently the index data is something like this:

PageID  BookID  Text
1       1       some text
2       1       some text
3       1       some text
4       1       some text
5       2       some text
6       2       some text
7       2       some text
8       2       some text

When I make a simple query for the word "some" on the Text field, I will have all 8 rows returned, but I want to list only 2 items (Books with IDs 1 and 2). I am also considering concatenating the Text columns and having the index like this:

BookID  PageTexts
1       some text some text some text
2       some text some text some text

I wonder which index structure is better.

lee carroll wrote:

It really does depend on what you want to do in your app, but from the info given I'd go for denormalizing by repeating the least number of values. So in your case that would be one document per book page: PageID+BookID (uniqueKey), pageID, PageVal1..PageValn, BookID, BookName.

[original question quoted in full above; snipped]
Date faceting per last hour, three days and last week
Hi, I'm trying date faceting for the last 24 hours, three days and last week, but I don't know how to do it. I have a DateField and I want to set different ranges; is it possible? I understand the example from the Solr wiki http://wiki.apache.org/solr/SimpleFacetParameters#Date_Faceting:_per_day_for_the_past_5_days but I want to do more gaps with the same field_date. How do I do this? Thanks, Joan
paging size in SOLR
hi,
i want to retrieve all the data from solr (say 10,000 ids) and my page size is 1000. how do i get back the data (pages) one after the other? do i have to increment the start value each time by the page size, starting from 0, and iterate? in that case am i querying the index 10 times instead of once, or after the first query will the results be cached somewhere for the subsequent pages?

JAME VAALET
How come this query string starts with wildcard?
While going through my Solr error logs, I found that a user had fired a query - jawapan ujian bulanan thn 4 (bahasa melayu). This was converted to the following for autosuggest purposes - jawapan?ujian?bulanan?thn?4?(bahasa?melayu)* - by the javascript code. Solr threw the exception: Cannot parse 'jawapan?ujian?bulanan?thn?4?(bahasa?melayu)*': '*' or '?' not allowed as first character in WildcardQuery. How come this query string begins with a wildcard character? When I changed the query to remove the brackets, everything went smoothly. There were no results, probably because my search index didn't have any.

*Pranav Prakash*
temet nosce
Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com | Google http://www.google.com/profiles/pranny
Re: Date faceting per last hour, three days and last week
I would use facet queries:

facet.query=date:[NOW-1DAY TO NOW]
facet.query=date:[NOW-3DAY TO NOW]
facet.query=date:[NOW-7DAY TO NOW]
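The same three windows expressed through SolrJ, as a minimal sketch. The field name date and the ranges come from the reply above; the server URL and class name are illustrative:

import java.util.Map;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class DateRangeFacets {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery("*:*");
        query.setFacet(true);
        // One facet.query per window; counts come back keyed by the query string.
        query.addFacetQuery("date:[NOW-1DAY TO NOW]");
        query.addFacetQuery("date:[NOW-3DAY TO NOW]");
        query.addFacetQuery("date:[NOW-7DAY TO NOW]");
        Map<String, Integer> counts = server.query(query).getFacetQuery();
        System.out.println(counts);
    }
}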
RE: How come this query string starts with wildcard?
I think this is because ')' is treated as a token delimiter. So '(foo)bar' is treated the same as '(foo) bar' (that is, 'bar' is treated as a separate word). So '(foo)*' is really parsed as '(foo) *', and thus the '*' is treated as the start of a new word. -Michael
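If the goal is to stop user-typed punctuation like '(' from reaching the query parser at all, SolrJ ships an escaping helper. A small sketch; the input string is taken from Pranav's log, and escaping before appending the trailing wildcard is an assumption about where it would fit in the autosuggest code:

import org.apache.solr.client.solrj.util.ClientUtils;

public class SuggestEscape {
    public static void main(String[] args) {
        String userInput = "jawapan ujian bulanan thn 4 (bahasa melayu)";
        // Backslash-escapes Lucene query syntax characters such as ( ) * ? :
        String escaped = ClientUtils.escapeQueryChars(userInput);
        System.out.println(escaped + "*");
    }
}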
[Help Wanted] Graphics and other help for new Lucene/Solr website
Hi,

We are in the process of putting up a new Lucene/Solr/PyLucene/OpenRelevance website. You can see a preview at http://lucene.staging.apache.org/lucene/. It is more or less a look and feel copy of the Mahout and Open For Biz websites. This new site, IMO, both looks better than the old one and will be a lot easier for us committers to maintain/update and for others to contribute to.

So, how can you help?

0. All of the code is at https://svn.apache.org/repos/asf/lucene/cms/trunk. Check it out the usual way using SVN. If you want to build locally, see https://issues.apache.org/jira/browse/LUCENE-2748 and the links to the ASF CMS guide.

1. If you have any graphic design skills:
- I'd love to have some mantle/slide images along the lines of http://lucene.staging.apache.org/lucene/images/mantle-lucene-solr.png. These are used in the slideshow at the top of the Lucene, Core and Solr pages and should be interesting, inviting, etc. and should give people warm fuzzy feelings about all of our software and the great community we have. (Think Marketing!)
- Help us coordinate the color selection on the various pages, especially in the slides and especially on the Solr page, as I'm not sure I like the green and black background contrasted with the orange of the Solr logo.

2. In a few more days or maybe a week or so, patches to fix content errors, etc. will be welcome. For now, we are still porting things, so I don't want to duplicate effort.

3. New, useful documentation is also, of course, always welcome.

4. Test with your favorite browser. In particular, I don't have IE handy. I've checked the site in Chrome, Firefox and Safari.

If you come up w/ images (I won't guarantee they will be accepted, but I am appreciative of the help) or other style fixes, etc., please submit all content/patches to https://issues.apache.org/jira/browse/LUCENE-2748 and please make sure to check the donation box when attaching the file.

-Grant
Re: unique terms and multi-valued fields
Well, it depends (tm). If you're talking about *indexed* terms, then the value is stored only once in both of the cases you mention below. There's really very little difference between a non-multi-valued field and a multi-valued field in terms of how it's stored in the searchable portion of the index, except for some position information. So, having an XML doc with a single-valued field

<field name="category">computers laptops</field>

is almost identical (except for position info such as positionIncrementGap) to

<field name="category">computers</field>
<field name="category">laptops</field>

multiValued refers to the *input*, not whether more than one word is allowed in that field.

Now, about *stored* fields. If you store the data, verbatim copies are kept in the storage-specific files in each segment, and the values will be on disk for each document. But you probably don't care much, because this data is only referenced when you assemble a document for return to the client; it's irrelevant for searching.

Best
Erick

On Tue, Aug 9, 2011 at 8:02 PM, Kevin Osborn osbo...@yahoo.com wrote:

Please verify my understanding. I have a field called category and it has a value computers. If I use this same field and value for all of my documents, it is really only stored on disk once, because category:computers is a unique term. Is this correct? But what about multi-valued fields? I have a field called category. For 100 documents, it has the values computers and laptops. For 100 other documents, it has the values computers and tablets. Is this stored as category:computers, category:laptops, category:tablets, meaning 3 unique terms? Or is it stored as category:computers,laptops and category:computers,tablets? I believe it is the first case (hopefully), but I am not sure. Thanks.
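In SolrJ terms, the input-side distinction Erick describes looks like this. A sketch; the field name and values are from the thread:

import org.apache.solr.common.SolrInputDocument;

public class MultiValuedExample {
    public static void main(String[] args) {
        SolrInputDocument doc = new SolrInputDocument();
        // Two addField calls on the same field name is multiValued input;
        // after analysis the indexed terms are essentially the same as for
        // the single value "computers laptops", apart from position gaps.
        doc.addField("category", "computers");
        doc.addField("category", "laptops");
        System.out.println(doc);
    }
}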
Re: document indexing
With the first option you can be page specific in your search results and searches. Field collapsing/grouping will help with your normalisation issue. (What you have listed is different from what I listed; you don't have a unique key.) Option 2 means you lose any ability to reference pages, but as you note, your documents are at the level you wish your search results to be returned. If you are not interested in pages, then option 2.

On 10 August 2011 12:22, directorscott dgul...@gmail.com wrote:

[message quoted in full above; snipped]
RE: Problem with DIH: How to map key value pair stored in 1-N relation from a JDBC Source?
Thanks for this quick and enlightening answer! I didn't consider that a Transformer can create new columns. In combination with dynamic fields it is exactly what I was looking for. Thanks James ^^

-----Original Message-----
From: Dyer, James [mailto:james.d...@ingrambook.com]
Sent: Tuesday, 9 August 2011 16:03
To: solr-user@lucene.apache.org
Subject: RE: Problem with DIH: How to map key value pair stored in 1-N relation from a JDBC Source?

Christian,

It looks like you should probably write a Transformer for your DIH script. I assume you have a child entity set up for PriceTable. Add a Transformer to this entity that will look at the value of currency and price, remove these from the row, then add them back in with the currency as the field name and the price as the field value.

By the way, it would likely be better if, instead of field names like EUR and CHF, you created a dynamic field entry in schema.xml like this:

<dynamicField name="CURRENCY_*" type="tfloat" indexed="true" stored="false"/>

Then have your DIH Transformer prepend CURRENCY_ in front of the field name. This way, should your company ever add a new currency, you wouldn't need to change your schema.

For more information on writing a DIH Transformer, see http://wiki.apache.org/solr/DIHCustomTransformer

If you would rather use a scripting language such as javascript instead of writing your Transformer in java, see http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311
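A minimal Java Transformer along the lines James describes might look like the following. This is a sketch: the column names currency and price and the CURRENCY_ prefix are from the thread, while the class name is made up:

import java.util.Map;
import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

public class CurrencyTransformer extends Transformer {
    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
        // Turn the key/value pair (currency, price) into a dynamic field,
        // e.g. currency=EUR, price=9.99 becomes CURRENCY_EUR=9.99.
        Object currency = row.remove("currency");
        Object price = row.remove("price");
        if (currency != null && price != null) {
            row.put("CURRENCY_" + currency, price);
        }
        return row;
    }
}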
Re: Indexing tweet and searching @keyword OR #keyword
Please look more carefully at the documentation for WDFF, specifically: "split on intra-word delimiters (all non alpha-numeric characters)". WordDelimiterFilterFactory will always throw away non-alphanumeric characters; you can't tell it to do otherwise. Try some of the other tokenizers/analyzers to get what you want, and also look at the admin/analysis page to see the exact effects of your fieldType definitions. Here's a great place to start: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

You probably want something like WhitespaceTokenizerFactory followed by LowerCaseFilterFactory or some such... But I really question whether this is what you want either. Do you really want a search on ipad to *fail* to match input of #ipad? Or vice-versa? KeywordTokenizerFactory is probably not the place you want to start; the tokenization process doesn't break anything up. You happen to be getting separate tokens because of WDFF, which, as you see, can't process things the way you want.

Best
Erick

On Wed, Aug 10, 2011 at 3:09 AM, Mohammad Shariq shariqn...@gmail.com wrote:

[message and schema quoted in full above; snipped]
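To see the tokenization difference Erick describes, a quick Lucene 3.x sketch (illustrative only) that runs the same tweet through a whitespace analyzer and the standard analyzer:

import java.io.StringReader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class TweetTokens {
    public static void main(String[] args) throws Exception {
        String tweet = "Loving my #ipad thanks @apple";
        // WhitespaceAnalyzer keeps "#ipad" and "@apple" intact;
        // StandardAnalyzer strips the # and @ during tokenization.
        for (Analyzer a : new Analyzer[] {
                new WhitespaceAnalyzer(Version.LUCENE_33),
                new StandardAnalyzer(Version.LUCENE_33) }) {
            TokenStream ts = a.tokenStream("text", new StringReader(tweet));
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            System.out.print(a.getClass().getSimpleName() + ":");
            while (ts.incrementToken()) {
                System.out.print(" [" + term.toString() + "]");
            }
            System.out.println();
        }
    }
}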
Re: frange not working in query
Could you tell us what you're trying to achieve with the range query? It's not clear.

-Simon

On Wed, Aug 10, 2011 at 5:57 AM, Amit Sawhney sawhney.a...@gmail.com wrote:

Hi All, I am trying to sort the results on a unix timestamp using this query:

http://url.com:8983/solr/db/select/?indent=on&version=2.1&q={!frange%20l=0.25}query($qq)&qq=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

When I run this query, it says 'no field name specified in query and no defaultSearchField defined in schema.xml'. As soon as I remove the frange query and run this, it starts working fine:

http://url.com:8983/solr/db/select/?indent=on&version=2.1&q=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

Any pointers?

Thanks,
Amit
Re: frange not working in query
I meant the frange query, of course.

On Wed, Aug 10, 2011 at 10:21 AM, simon mtnes...@gmail.com wrote:

Could you tell us what you're trying to achieve with the range query? It's not clear.

-Simon

[earlier message quoted in full above; snipped]
Re: Is optimize needed on slaves if it replicates from optimized master?
This is expected behavior. You might be optimizing your index on the master after every set of changes, in which case the entire index is copied. During this period, the space on disk will at least double, there's no way around that. If you do NOT optimize, then the slave will only copy changed segments instead of the entire index. Optimizing isn't usually necessary except periodically (daily, perhaps weekly, perhaps never actually). All that said, depending on how merging happens, you will always have the possibility of the entire index being copied sometimes because you'll happen to hit a merge that merges all segments into one. There are some advanced options that can control some parts of merging, but you need to get to the bottom of why the whole index is getting copied every time before you go there. I'd bet you're issuing an optimize. Best Erick On Wed, Aug 10, 2011 at 5:30 AM, Pranav Prakash pra...@gmail.com wrote: That is not true. Replication is roughly a copy of the diff between the master and the slave's index. In my case, during replication entire index is copied from master to slave, during which the size of index goes a little over double. Then it shrinks to its original size. Am I doing something wrong? How can I get the master to serve only delta index instead of serving whole index and the slaves merging the new and old index? *Pranav Prakash*
Re: paging size in SOLR
Well, if you really want to, you can specify start=0 and rows=10000 and get them all back at once. You can do page-by-page by incrementing the start parameter as you indicated. You can keep from re-executing the search by setting your queryResultCache appropriately, but this affects all searches so might be an issue.

Best
Erick

On Wed, Aug 10, 2011 at 9:09 AM, jame vaalet jamevaa...@gmail.com wrote:

hi, i want to retrieve all the data from solr (say 10,000 ids) and my page size is 1000. how do i get back the data (pages) one after the other? do i have to increment the start value each time by the page size, starting from 0, and iterate? in that case am i querying the index 10 times instead of once, or after the first query will the results be cached somewhere for the subsequent pages?

JAME VAALET
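A minimal SolrJ sketch of the page-by-page loop. The 1000 page size is from the question; the server URL, class name and field name are illustrative:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;

public class PageThroughResults {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        int pageSize = 1000;
        SolrQuery query = new SolrQuery("*:*");
        query.setRows(pageSize);
        for (int start = 0; ; start += pageSize) {
            query.setStart(start);
            SolrDocumentList page = server.query(query).getResults();
            for (SolrDocument doc : page) {
                System.out.println(doc.getFieldValue("id"));
            }
            if (start + pageSize >= page.getNumFound()) break; // past the last page
        }
    }
}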
Re: paging size in SOLR
Worth remembering there are some performance penalties with deep paging if you use the page-by-page approach. May not be too much of a problem if you really are only looking to retrieve 10K docs.

-Simon

On Wed, Aug 10, 2011 at 10:32 AM, Erick Erickson erickerick...@gmail.com wrote:

Well, if you really want to, you can specify start=0 and rows=10000 and get them all back at once. You can do page-by-page by incrementing the start parameter as you indicated. You can keep from re-executing the search by setting your queryResultCache appropriately, but this affects all searches so might be an issue.

Best
Erick

[earlier message snipped]
RE: paging size in SOLR
I would imagine the performance penalties with deep paging will ALSO be there if you just ask for 10000 rows all at once, though, instead of in, say, 100-row paged batches. Yes? No?

-----Original Message-----
From: simon [mailto:mtnes...@gmail.com]
Sent: Wednesday, August 10, 2011 10:44 AM
To: solr-user@lucene.apache.org
Subject: Re: paging size in SOLR

Worth remembering there are some performance penalties with deep paging if you use the page-by-page approach. May not be too much of a problem if you really are only looking to retrieve 10K docs.

-Simon

[earlier messages snipped]
Building a facet query in SolrJ
Hi - I'm trying to do a (I think) simple facet query, but I'm not getting the results I expect. I have a field, MyField, and I want to get facets for specific values of that field. That is, I want a FacetField if MyField is ABC, DEF, etc. (a specific list of values), but not if MyField is any other value. If I build my query like this:

SolrQuery query = new SolrQuery( luceneQueryStr );
query.setStart( request.getStartIndex() );
query.setRows( request.getMaxResults() );
query.setFacet(true);
query.setFacetMinCount(1);
query.addFacetField(MYFIELD);
for (String fieldValue : desiredFieldValues) {
    query.addFacetQuery(MYFIELD + ":" + fieldValue);
}

queryResponse.getFacetFields returns facets for ALL values of MyField. I figured that was because setting the facet field with addFacetField caused Solr to examine all values. But if I take out that line, then getFacetFields returns an empty list. I'm sure I'm doing something simple wrong, but I'm out of ideas right now.

-Rich
Re: paging size in SOLR
When you say queryResultCache, does it cache results for just the last query, or for more than one query?

On 10 August 2011 20:14, simon mtnes...@gmail.com wrote:

Worth remembering there are some performance penalties with deep paging if you use the page-by-page approach. May not be too much of a problem if you really are only looking to retrieve 10K docs.

-Simon

[earlier messages snipped]

--
-JAME
Re: Solr 3.3 crashes after ~18 hours?
Okay, with this command it hangs. Also: I managed to get a thread dump (attached).

regards

On 05.08.2011 15:08, Yonik Seeley wrote:

On Fri, Aug 5, 2011 at 7:33 AM, alexander sulz a.s...@digiconcept.net wrote: Usually you get an XML response when doing commits or optimize; in this case I get nothing in return, but the site ( http://[...]/solr/update?optimize=true ) DOESN'T load forever or anything. It doesn't hang! I just get a blank page / empty response.

Sounds like you are doing it from a browser? Can you try it from the command line? It should give back some sort of response (or hang waiting for a response).

curl "http://localhost:8983/solr/update?commit=true"

-Yonik
http://www.lucidimagination.com

I use the stuff in the example folder; the only changes I made were enabling logging and changing the port to 8985. I'll try getting a thread dump if it happens again! So far it's looking good with more memory allocated to it.

On 04.08.2011 16:08, Yonik Seeley wrote:

On Thu, Aug 4, 2011 at 8:09 AM, alexander sulz a.s...@digiconcept.net wrote: Thank you for the many replies! Like I said, I couldn't find anything in the logs created by Solr. I just had a look at /var/log/messages and there wasn't anything either. What I mean by crash is that the process is still there and HTTP GET pings would return 200, but when I try visiting /solr/admin, I get a blank page! The server ignores any incoming updates or commits,

"ignores" means what? The request hangs? If so, could you get a thread dump? Do queries work (like /solr/select?q=*:*)?

though throwing no errors, no 503's.. It's like the server has a blackout and stares blankly into space.

Are you using a different servlet container than what is shipped with Solr? If you did start with the Solr example server, what jetty configuration changes have you made?

-Yonik
http://www.lucidimagination.com

Full thread dump Java HotSpot(TM) Server VM (19.1-b02 mixed mode):

"DestroyJavaVM" prio=10 tid=0x6e32e800 nid=0x5aeb waiting on condition [0x]
   java.lang.Thread.State: RUNNABLE

"Timer-2" daemon prio=10 tid=0x6e3ff800 nid=0x5b0b in Object.wait() [0x6e6e5000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xb0260108> (a java.util.TaskQueue)
        at java.util.TimerThread.mainLoop(Unknown Source)
        - locked <0xb0260108> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Unknown Source)

"pool-1-thread-1" prio=10 tid=0x6e32dc00 nid=0x5b0a waiting on condition [0x6dae]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0xb02680e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(Unknown Source)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown Source)
        at java.util.concurrent.LinkedBlockingQueue.take(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

"Timer-1" daemon prio=10 tid=0x0874e000 nid=0x5b07 in Object.wait() [0x6eb6d000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xb02601c0> (a java.util.TaskQueue)
        at java.util.TimerThread.mainLoop(Unknown Source)
        - locked <0xb02601c0> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Unknown Source)

"8106640@qtp-25094328-9 - Acceptor0 SocketConnector@0.0.0.0:8985" prio=10 tid=0x0832dc00 nid=0x5b06 runnable [0x6ecc7000]
   java.lang.Thread.State: RUNNABLE
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.PlainSocketImpl.accept(Unknown Source)
        - locked <0xb0260288> (a java.net.SocksSocketImpl)
        at java.net.ServerSocket.implAccept(Unknown Source)
        at java.net.ServerSocket.accept(Unknown Source)
        at org.mortbay.jetty.bio.SocketConnector.accept(SocketConnector.java:99)
        at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

"9097070@qtp-25094328-8" prio=10 tid=0x0832c400 nid=0x5b05 in Object.wait() [0x6ed18000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xb0264018> (a org.mortbay.thread.QueuedThreadPool$PoolThread)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
        - locked <0xb0264018> (a org.mortbay.thread.QueuedThreadPool$PoolThread)

"4098499@qtp-25094328-7" prio=10 tid=0x0832ac00 nid=0x5b04 in Object.wait() [0x6ed69000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at
Error loading a custom request handler in Solr 4.0
Hi,

Apologies if this is really basic. I'm trying to learn how to create a custom request handler, so I wrote the minimal class (attached), compiled and jar'd it, and placed it in example/lib. I added this to solrconfig.xml:

<requestHandler name="/flaxtest" class="FlaxTestHandler"/>

When I started Solr with java -jar start.jar, I got this:

...
SEVERE: java.lang.NoClassDefFoundError: org/apache/solr/handler/RequestHandlerBase
at java.lang.ClassLoader.defineClass1(Native Method)
...

So I copied all the dist/*.jar files into lib and tried again. This time it seemed to start ok, but browsing to http://localhost:8983/solr/ displayed this:

org.apache.solr.common.SolrException: Error Instantiating Request Handler, FlaxTestHandler is not a org.apache.solr.request.SolrRequestHandler
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:410)
...

Any ideas?

thanks,
Tom
RE: Building a facet query in SolrJ
Oops. I think I found it. My desiredFieldValues list has the wrong info. Knew there was something simple wrong.

From: Simon, Richard T
Sent: Wednesday, August 10, 2011 10:55 AM
To: solr-user@lucene.apache.org
Cc: Simon, Richard T
Subject: Building a facet query in SolrJ

[message quoted in full above; snipped]
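One more note for anyone who lands on this thread later: counts for queries registered with addFacetQuery are read back through getFacetQuery(), not getFacetFields(). A sketch, reusing the queryResponse from Rich's code:

// Facet query counts come back keyed by the literal query string.
java.util.Map<String, Integer> counts = queryResponse.getFacetQuery();
for (java.util.Map.Entry<String, Integer> e : counts.entrySet()) {
    System.out.println(e.getKey() + " -> " + e.getValue());
}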
Re: Solr 3.3 crashes after ~18 hours?
On Wed, Aug 10, 2011 at 11:00 AM, alexander sulz a.s...@digiconcept.net wrote:

Okay, with this command it hangs.

It doesn't look like a hang from this thread dump. It doesn't look like any Solr requests are executing at the time the dump was taken. Did you do this from the command line?

curl "http://localhost:8983/solr/update?commit=true"

Are you saying that the curl command just hung and never returned?

-Yonik
http://www.lucidimagination.com

[earlier messages quoted in full above; snipped]
Re: Cache replication
Consider putting a cache (memcached, redis, etc) *in front* of your solr slaves. Just make sure to update it when replication occurs. didier On Tue, Aug 9, 2011 at 6:07 PM, arian487 akarb...@tagged.com wrote: I'm wondering if the caches on all the slaves are replicated across (such as queryResultCache). That is to say, if I hit one of my slaves and cache a result, and I make a search later and that search happens to hit a different slave, will that first cached result be available for use? This is pretty important because I'm going to have a lot of slaves and if this isn't done, then I'd have a high chance of running a lot uncached queries. Thanks :) -- View this message in context: http://lucene.472066.n3.nabble.com/Cache-replication-tp3240708p3240708.html Sent from the Solr - User mailing list archive at Nabble.com.
Dates off by 1 day?
Hi all-

I apologize in advance if this turns out to be a problem between the keyboard and the chair, but I'm confused about why my date field is correct in the index but wrong in SolrJ. I have a field defined as a date in the index:

<field name="FILE_DATE" type="date" indexed="true" stored="true"/>

And if I use the admin site to query the data, I get the right date:

<date name="FILE_DATE">2002-05-13T00:00:00Z</date>

But in my SolrJ code:

Iterator<SolrDocument> iter = queryResponse.getResults().iterator();
while (iter.hasNext()) {
    SolrDocument resultDoc = iter.next();
    System.out.println("-- " + resultDoc.getFieldValue("FILE_DATE"));
}

I get:

-- Sun May 12 19:00:00 CDT 2002

I've been searching around through the wiki and other places, but can't seem to find anything that either mentions this problem or talks about date handling in Solr/SolrJ that might refer to something like this.

Thanks for any info,
Ron
Re: Dates off by 1 day?
The date difference is coming from different time zones. In Solr the date is stored in the Zulu (UTC) time zone, and SolrJ is returning the date in the CDT timezone (the JVM picks up the system time zone).

<date name="FILE_DATE">2002-05-13T00:00:00Z</date>

I get:

-- Sun May 12 19:00:00 CDT 2002

You can convert the Date to different time zones using the java.util date classes if required. Hope it helps!

-param

On 8/10/11 11:20 AM, "Olson, Ron" rol...@lbpc.com wrote:

[message quoted in full above; snipped]
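If the printed value should match what the admin UI shows, format the returned Date in UTC explicitly. A minimal sketch, reusing the resultDoc from Ron's loop; the format string is an assumption:

// Format the java.util.Date from SolrJ in UTC rather than the JVM default zone.
java.text.SimpleDateFormat utc = new java.text.SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
utc.setTimeZone(java.util.TimeZone.getTimeZone("UTC"));
java.util.Date fileDate = (java.util.Date) resultDoc.getFieldValue("FILE_DATE");
System.out.println("-- " + utc.format(fileDate)); // prints 2002-05-13T00:00:00Z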
RE: Dates off by 1 day?
Ah, great! I knew the problem was between the keyboard and the chair. Thanks!

-----Original Message-----
From: Sethi, Parampreet [mailto:parampreet.se...@teamaol.com]
Sent: Wednesday, August 10, 2011 10:25 AM
To: solr-user@lucene.apache.org
Subject: Re: Dates off by 1 day?

[reply quoted in full above; snipped]
Re: Error loading a custom request handler in Solr 4.0
The attachment isn't showing up (in gmail, at least). Can you inline the relevant bits of code?

On Wed, Aug 10, 2011 at 11:05 AM, Tom Mortimer t...@flax.co.uk wrote:

[message quoted in full above; snipped]
Re: Error loading a custom request handler in Solr 4.0
Sure -

import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.handler.RequestHandlerBase;

public class FlaxTestHandler extends RequestHandlerBase {

    public FlaxTestHandler() { }

    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
            throws Exception {
        rsp.add("FlaxTest", "Hello!");
    }

    public String getDescription() { return "Flax"; }
    public String getSourceId() { return "Flax"; }
    public String getSource() { return "Flax"; }
    public String getVersion() { return "Flax"; }
}

On 10 August 2011 16:43, simon mtnes...@gmail.com wrote:

The attachment isn't showing up (in gmail, at least). Can you inline the relevant bits of code?

[earlier message snipped]
Re: how to ignore case in solr search field?
You can use solr.LowerCaseFilterFactory in an analyser chain, for both indexing and queries. The schema.xml supplied with the example has several field types using this (including text_general).

Tom

On 10 August 2011 16:42, nagarjuna nagarjuna.avul...@gmail.com wrote:

Hi, please help me: how do I ignore case while searching in Solr? E.g. I need the same results for the keywords abc, ABC, aBc, AbC and all other casings. Thank you in advance.
Re: Is optimize needed on slaves if it replicates from optimized master?
Very well explained. Thanks. Yes, we do optimize the index before replication. I am not particularly worried about disk space usage; I was more curious about that behavior. *Pranav Prakash* temet nosce Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com | Google http://www.google.com/profiles/pranny On Wed, Aug 10, 2011 at 19:55, Erick Erickson erickerick...@gmail.com wrote: This is expected behavior. You might be optimizing your index on the master after every set of changes, in which case the entire index is copied. During this period, the space on disk will at least double; there's no way around that. If you do NOT optimize, then the slave will only copy changed segments instead of the entire index. Optimizing isn't usually necessary except periodically (daily, perhaps weekly, perhaps never actually). All that said, depending on how merging happens, you will always have the possibility of the entire index being copied sometimes, because you'll happen to hit a merge that merges all segments into one. There are some advanced options that can control some parts of merging, but you need to get to the bottom of why the whole index is getting copied every time before you go there. I'd bet you're issuing an optimize. Best Erick On Wed, Aug 10, 2011 at 5:30 AM, Pranav Prakash pra...@gmail.com wrote: That is not true. Replication is roughly a copy of the diff between the master and the slave's index. In my case, during replication the entire index is copied from master to slave, during which the size of the index goes a little over double. Then it shrinks to its original size. Am I doing something wrong? How can I get the master to serve only a delta of the index instead of the whole index, with the slaves merging the new and old index? *Pranav Prakash*
RE: [Help Wanted] Graphics and other help for new Lucene/Solr website
The site looks great. And thank you for including the ManifoldCF link. ;-) Karl -Original Message- From: ext Grant Ingersoll [mailto:gsing...@apache.org] Sent: Wednesday, August 10, 2011 10:09 AM To: solr-user@lucene.apache.org; java-u...@lucene.apache.org Subject: [Help Wanted] Graphics and other help for new Lucene/Solr website Hi, We are in the process of putting up a new Lucene/Solr/PyLucene/OpenRelevance website. You can see a preview at http://lucene.staging.apache.org/lucene/. It is more or less a look and feel copy of Mahout and Open For Biz websites. This new site, IMO, both looks better than the old one and will be a lot easier for us committers to maintain/update and for others to contribute to. So, how can you help? 0. All of the code is at https://svn.apache.org/repos/asf/lucene/cms/trunk. Check it out the usual way using SVN. If you want to build locally, see https://issues.apache.org/jira/browse/LUCENE-2748 and the links to the ASF CMS guide. 1. If you have any graphic design skills: - I'd love to have some mantle/slide images along the lines of http://lucene.staging.apache.org/lucene/images/mantle-lucene-solr.png. These are used in the slideshow at the top of the Lucene, Core and Solr pages and should be interesting, inviting, etc. and should give people warm fuzzy feelings about all of our software and the great community we have. (Think Marketing!) - Help us coordinate the color selection on the various pages, especially in the slides and especially on the Solr page, as I'm not sure I like the green and black background contrasted with the orange of the Solr logo. 2. In a few more days or maybe a week or so, patches to fix content errors, etc. will be welcome. For now, we are still porting things, so I don't want to duplicate effort. 3. New, useful documentation is also, of course, always welcome. 4. Test with your favorite browser. In particular, I don't have IE handy. I've checked the site in Chrome, Firefox and Safari. If you come up w/ images (I won't guarantee they will be accepted, but I am appreciative of the help) or other style fixes, etc., please submit all content/patches to https://issues.apache.org/jira/browse/LUCENE-2748 and please make sure to check the donation box when attaching the file. -Grant
Re: [Help Wanted] Graphics and other help for new Lucene/Solr website
Looks nice! Font seems too light to read with comfort though.
Re: Error loading a custom request handler in Solr 4.0
It's working for me. Compiled, inserted in solr/lib, added the config line to solrconfig. When I send a /flaxtest request I get this response: <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">16</int> </lst> <str name="FlaxTest">Hello!</str> </response> I was doing this within a core defined in solr.xml -Simon On Wed, Aug 10, 2011 at 11:46 AM, Tom Mortimer t...@flax.co.uk wrote: Sure - import org.apache.solr.request.SolrQueryRequest; import org.apache.solr.response.SolrQueryResponse; import org.apache.solr.handler.RequestHandlerBase; public class FlaxTestHandler extends RequestHandlerBase { public FlaxTestHandler() { } public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception { rsp.add("FlaxTest", "Hello!"); } public String getDescription() { return "Flax"; } public String getSourceId() { return "Flax"; } public String getSource() { return "Flax"; } public String getVersion() { return "Flax"; } }
query time problem
Hi, I've noticed poor performance for my solr queries in the past few days. Queries of this type: http://server:5000/solr/select?q=story_search_field_en:(water boston) OR story_search_field_fr:(water boston)&rows=350&start=0&sort=r_modify_date desc&shards=shard1:5001/solr,shard2:5002/solr&fq=type:(cch_story OR cch_published_story) are slow (more than 10 seconds). I would like to know how I could investigate the problem. I tried to specify the parameters debugQuery=on&explainOther=on but this doesn't help much. I also monitored the shard logs; sometimes there are broken pipe errors in them. Also, is there a way I could monitor the cache statistics? For your information, every shard's master and slave machines have enough RAM and disk space. Charles-André Martin
Re: Error loading a custom request handler in Solr 4.0
Interesting.. is this in trunk (4.0)? Maybe I've broken mine somehow! What classpath did you use for compiling? And did you copy anything other than the new jar into lib/? thanks, Tom On 10 August 2011 18:07, simon mtnes...@gmail.com wrote: It's working for me. Compiled, inserted in solr/lib, added the config line to solrconfig. When I send a /flaxtest request I get this response: <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">16</int> </lst> <str name="FlaxTest">Hello!</str> </response> I was doing this within a core defined in solr.xml -Simon
RE: Building a facet query in SolrJ
I take it back. I didn't find it. I corrected my values and the facet queries still don't find what I want. The values I'm looking for are URIs, so they look like: http://place.org/abc/def I add the facet query like so: query.addFacetQuery(MyField + ":" + "\"" + uri + "\""); I print the query, just to see what it is: Facet Query: MyField:"http://place.org/abc/def" But when I examine queryResponse.getFacetFields, it's an empty list if I do not set the facet field. If I set the facet field to MyField, then I get facets for ALL the values of MyField, not just the ones in the facet queries. Can anyone help here? Thanks. From: Simon, Richard T Sent: Wednesday, August 10, 2011 11:07 AM To: Simon, Richard T; solr-user@lucene.apache.org Subject: RE: Building a facet query in SolrJ Oops. I think I found it. My desiredFieldValues list has the wrong info. Knew there was something simple wrong. From: Simon, Richard T Sent: Wednesday, August 10, 2011 10:55 AM To: solr-user@lucene.apache.org Cc: Simon, Richard T Subject: Building a facet query in SolrJ Hi - I'm trying to do a (I think) simple facet query, but I'm not getting the results I expect. I have a field, MyField, and I want to get facets for specific values of that field. That is, I want a FacetField if MyField is ABC, DEF, etc. (a specific list of values), but not if MyField is any other value. If I build my query like this: SolrQuery query = new SolrQuery( luceneQueryStr ); query.setStart( request.getStartIndex() ); query.setRows( request.getMaxResults() ); query.setFacet(true); query.setFacetMinCount(1); query.addFacetField(MYFIELD); for (String fieldValue : desiredFieldValues) { query.addFacetQuery(MYFIELD + ":" + fieldValue); } then queryResponse.getFacetFields returns facets for ALL values of MyField. I figured that was because setting the facet field with addFacetField caused Solr to examine all values. But if I take out that line, then getFacetFields returns an empty list. I'm sure I'm doing something simple wrong, but I'm out of ideas right now. -Rich
Re: Error loading a custom request handler in Solr 4.0
This is in trunk (up to date). The compiler is 1.6.0_26. The classpath was dist/apache-solr-solrj-4.0-SNAPSHOT.jar:dist/apache-solr-core-4.0-SNAPSHOT.jar, built from trunk just prior by 'ant dist'. I'd try again with a clean trunk. -Simon On Wed, Aug 10, 2011 at 1:20 PM, Tom Mortimer t...@flax.co.uk wrote: Interesting.. is this in trunk (4.0)? Maybe I've broken mine somehow! What classpath did you use for compiling? And did you copy anything other than the new jar into lib/? thanks, Tom
Re: query time problem
Off the top of my head ... Can you tell if GC is happening more frequently than usual/expected? Is the index optimized - if not, how many segments? It's possible that one of the shards is behind a flaky network connection. Is the 10s performance just for the Solr query, or wallclock time at the browser? You can monitor cache statistics from the admin console 'statistics' page. Are you seeing anything untoward in the solr logs? -Simon On Wed, Aug 10, 2011 at 1:11 PM, Charles-Andre Martin charles-andre.mar...@sunmedia.ca wrote: Hi, I've noticed poor performance for my solr queries in the past few days. Queries of this type: http://server:5000/solr/select?q=story_search_field_en:(water boston) OR story_search_field_fr:(water boston)&rows=350&start=0&sort=r_modify_date desc&shards=shard1:5001/solr,shard2:5002/solr&fq=type:(cch_story OR cch_published_story) are slow (more than 10 seconds). I would like to know how I could investigate the problem. I tried to specify the parameters debugQuery=on&explainOther=on but this doesn't help much. I also monitored the shard logs; sometimes there are broken pipe errors in them. Also, is there a way I could monitor the cache statistics? For your information, every shard's master and slave machines have enough RAM and disk space. Charles-André Martin
How to start troubleshooting a content extraction issue
Hello So, I'm a newbie to Solr and Tika and whatnot, so please use simple words for me :P I am running Solr on Tomcat 7 on Windows Server 2008 r2, running as the search engine for a Drupal web site. Up until recently, everything has been fine - searching works, faceting works, etc. Recently a user uploaded a 5mb xltm file, which seems to be causing Tomcat to spike in CPU usage, and eventually error out. When the documents are submitted to be indexed, the tomcat process spikes up to use 100% of 1 available CPU, with the eventual error in Drupal of Exception occurred sending *sites/default/files/nodefiles/533/June 30, 2011.xltm* to Solr 0 Status: Communication Error. I am looking for some help in figuring out where to troubleshoot this. I assume it's this file, but I'd like to be sure - so how can I submit this file for content extraction manually to see what happens? Thanks, Tim
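One way to test the file in isolation, assuming the ExtractingRequestHandler is mapped at /update/extract as in the example solrconfig.xml (the host, port, and local file name are illustrative; copy the file to a simple local name first): with extractOnly=true, Solr runs Tika over the upload and returns the extracted content in the response without indexing anything, so you can watch what happens to just this one document.

curl "http://localhost:8983/solr/update/extract?extractOnly=true" -F "myfile=@problem-file.xltm"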
Re: Error loading a custom request handler in Solr 4.0
Thanks Simon. I'll try again tomorrow. Tom On 10 August 2011 18:46, simon mtnes...@gmail.com wrote: This is in trunk (up to date). The compiler is 1.6.0_26. The classpath was dist/apache-solr-solrj-4.0-SNAPSHOT.jar:dist/apache-solr-core-4.0-SNAPSHOT.jar, built from trunk just prior by 'ant dist'. I'd try again with a clean trunk. -Simon
Re: Building a facet query in SolrJ
Try making your queries manually, to see this closer in action... q=MyField:uri and see what you get. In this case, because your URI contains characters that make the default query parser unhappy, do this sort of query instead: {!term f=MyField}uri That way the query is parsed properly into a single term query. I am a little confused below, since you're faceting on MyField entirely (addFacetField), where you'd get the values of each URI facet query in that list anyway. Erik On Aug 10, 2011, at 13:42 , Simon, Richard T wrote: I take it back. I didn't find it. I corrected my values and the facet queries still don't find what I want. The values I'm looking for are URIs, so they look like: http://place.org/abc/def I add the facet query like so: query.addFacetQuery(MyField + ":" + "\"" + uri + "\""); I print the query, just to see what it is: Facet Query: MyField:"http://place.org/abc/def" But when I examine queryResponse.getFacetFields, it's an empty list if I do not set the facet field. If I set the facet field to MyField, then I get facets for ALL the values of MyField, not just the ones in the facet queries. Can anyone help here? Thanks.
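In SolrJ terms, Erik's suggestion looks something like this (a sketch using the field name and value list from the thread):

// The {!term} parser treats the whole value as one raw term, so the ':' and
// '/' characters inside the URI need no escaping for the query parser.
for (String uri : desiredFieldValues) {
    query.addFacetQuery("{!term f=MyField}" + uri);
}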
RE: Building a facet query in SolrJ
Hi -- I do get facets for all the values of MyField when I specify the facet field, but that's not what I want. I just want facets for a subset of the values of MyField. That's why I'm trying to use the facet queries, to just get facets for those values. -Rich -Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: Wednesday, August 10, 2011 2:04 PM To: solr-user@lucene.apache.org Subject: Re: Building a facet query in SolrJ Try making your queries manually, to see this closer in action... q=MyField:uri and see what you get. In this case, because your URI contains characters that make the default query parser unhappy, do this sort of query instead: {!term f=MyField}uri That way the query is parsed properly into a single term query. I am a little confused below, since you're faceting on MyField entirely (addFacetField), where you'd get the values of each URI facet query in that list anyway. Erik
RE: query time problem
Thanks Simon for these tracks. Here's my answers : Can you tell if GC is happening more frequently than usual/expected ? GC is OK. Is the index optimized - if not, how many segments ? According to the statistics page from the admin : One shard (master/slave) has 10 segments The other shard (master/slave) has 13 segments Is this ok ? The optimize job is running each day during the night. It's possible that one of the shards is behind a flaky network connection. Will check ... Is the 10s performance just for the Solr query or wallclock time at the browser ? Both You can monitor cache statistics from the admin console 'statistics' page Thanks Are you seeing anything untoward in the solr logs ? I see stacktrace : Aug 10, 2011 1:49:13 PM org.apache.solr.common.SolrException log SEVERE: ClientAbortException: java.net.SocketException: Broken pipe at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:358) at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:325) at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:381) at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:370) at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:89) at org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:183) at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:89) at org.apache.solr.request.BinaryResponseWriter.write(BinaryResponseWriter.java:48) at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:322) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447) at java.lang.Thread.run(Thread.java:619) Caused by: java.net.SocketException: Broken pipe at java.net.SocketOutputStream.socketWrite0(Native Method) at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92) at java.net.SocketOutputStream.write(SocketOutputStream.java:136) at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:740) at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:434) at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:349) at org.apache.coyote.http11.InternalOutputBuffer$OutputStreamOutputBuffer.doWrite(InternalOutputBuffer.java:764) at org.apache.coyote.http11.filters.IdentityOutputFilter.doWrite(IdentityOutputFilter.java:127) at org.apache.coyote.http11.InternalOutputBuffer.doWrite(InternalOutputBuffer.java:573) at org.apache.coyote.Response.doWrite(Response.java:560) at 
org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:353) ... 21 more Charles-André Martin 800 Square Victoria Montréal (Québec) H4Z 0A3 Tél : (514) 504-2703 -Message d'origine- De : simon [mailto:mtnes...@gmail.com] Envoyé : August-10-11 1:52 PM À : solr-user@lucene.apache.org Objet : Re: query time problem Off the top of my head ... Can you tell if GC is happening more frequently than usual/expected ? Is the index optimized - if not, how many segments ? It's possible that one of the shards is behind a flaky network connection. Is the 10s performance just for the Solr query or wallclock time at the browser ? You can monitor cache statistics from the admin console 'statistics' page Are you seeing anything untoward in the solr logs ? -Simon On Wed, Aug 10, 2011 at 1:11 PM, Charles-Andre Martin charles-andre.mar...@sunmedia.ca wrote: Hi, I've noticed poor performance for my solr queries in the past few days. Queries of that type : http://server:5000/solr/select?q=story_search_field_en:(water boston) OR story_search_field_fr:(water boston)rows=350start=0sort=r_modify_date
Can't mix Synonyms with Shingles?
I would like to combine the ShingleFilterFactory with a SynonymFilterFactory in a field type. I've looked at something like this using the analysis.jsp tool: <fieldType name="TestTerm" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" stemEnglishPosessive="1"/> <filter class="solr.ShingleFilterFactory" tokenSeparator=""/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.BusinessNames.txt" ignoreCase="true" expand="true"/> ... </analyzer> <analyzer type="query"> ... </analyzer> </fieldType> However, when a ShingleFilterFactory is applied first, the SynonymFilterFactory appears to do nothing. I haven't found any documentation or other warnings against this combination, and I don't want to apply shingles after synonyms (this works) because multi-word synonyms then cause severe term expansion. I don't really mind if the synonyms fail to match shingles, (although I'd prefer they succeed) but I'd at least expect that synonyms would continue to match on the original tokens, as they do if I remove the ShingleFilterFactory. I'm using Solr 3.3, any clarification would be appreciated. Thanks, -Jeff Wartes
Re: Error loading a custom request handler in Solr 4.0
: custom request handler, so I wrote the minimal class (attached), compiled : and jar'd it, and placed it in example/lib. I added this to solrconfig.xml: that's the crux of the issue. example/lib is where the jetty libraries live -- not solr plugins. you should either put your custom jars in the lib dir of your solr home (ie: example/solr/lib) or put them in a directory of your choice that you refer to from your solrconfig.xml file using a <lib/> directive. : So I copied all the dist/*.jar files into lib and tried again. This time it ouch ... make sure you remove *all* of those, or you will have no end of random obscure classpath issues at random times, as jars are sometimes loaded from the war and sometimes loaded from that directory. -Hoss
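For reference, a sketch of the <lib/> directive form Hoss mentions (the directory path is illustrative):

<!-- in solrconfig.xml: adds every jar in the named directory to Solr's plugin classpath -->
<lib dir="/path/to/my/plugins" regex=".*\.jar" />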
RE: Can't mix Synonyms with Shingles?
Hi Jeff, You have configured ShingleFilterFactory with a token separator of "" (the empty string), so e.g. International Corporation will output the shingle InternationalCorporation. If this is the form you want to use for synonym matching, it must exist in your synonym file. Does it? Steve -Original Message- From: Jeff Wartes [mailto:jwar...@whitepages.com] Sent: Wednesday, August 10, 2011 3:43 PM To: solr-user@lucene.apache.org Subject: Can't mix Synonyms with Shingles? I would like to combine the ShingleFilterFactory with a SynonymFilterFactory in a field type. I've looked at something like this using the analysis.jsp tool: <fieldType name="TestTerm" class="solr.TextField" positionIncrementGap="100"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" stemEnglishPosessive="1"/> <filter class="solr.ShingleFilterFactory" tokenSeparator=""/> <filter class="solr.SynonymFilterFactory" synonyms="synonyms.BusinessNames.txt" ignoreCase="true" expand="true"/> ... </analyzer> <analyzer type="query"> ... </analyzer> </fieldType> However, when a ShingleFilterFactory is applied first, the SynonymFilterFactory appears to do nothing. I haven't found any documentation or other warnings against this combination, and I don't want to apply shingles after synonyms (this works) because multi-word synonyms then cause severe term expansion. I don't really mind if the synonyms fail to match shingles, (although I'd prefer they succeed) but I'd at least expect that synonyms would continue to match on the original tokens, as they do if I remove the ShingleFilterFactory. I'm using Solr 3.3, any clarification would be appreciated. Thanks, -Jeff Wartes
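For instance, an illustrative synonyms.BusinessNames.txt entry in that shingled form (not from the thread):

# must match the single token the shingle filter produces, not the two-word phrase
IntlCorp, InternationalCorporation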
RE: Building a facet query in SolrJ
: query.addFacetQuery(MyField + ":" + "\"" + uri + "\""); ... : But when I examine queryResponse.getFacetFields, it's an empty list, if facet.query constraints+counts do not come back in the facet.field section of the response. they come back in the facet.query section of the response (look at the XML in your browser and you'll see what i mean)... https://lucene.apache.org/solr/api/org/apache/solr/client/solrj/response/QueryResponse.html#getFacetQuery%28%29 -Hoss
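In SolrJ that section is read with getFacetQuery(), e.g. (a sketch against the queryResponse from the thread):

import java.util.Map;

// facet.query results come back as a map from each facet query string to its
// count; this is a separate section from getFacetFields().
Map<String, Integer> counts = queryResponse.getFacetQuery();
for (Map.Entry<String, Integer> e : counts.entrySet()) {
    System.out.println(e.getKey() + " => " + e.getValue());
}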
Re: Example Solr Config on EC2
If I were to build a master with multiple slaves, is it possible to promote a slave to be the new master if the original master fails? Will all the slaves pick up right where they left off, or any time the master fails will we need to completely regenerate all the data? If this is possible, are there any examples of this being automated? Especially on Win2k3. Matthew Shields Owner BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation, Managed Services www.beantownhost.com www.sysadminvalley.com www.jeeprally.com On Mon, Aug 8, 2011 at 5:34 PM, mboh...@yahoo.com wrote: Matthew, Here's another resource: http://www.lucidimagination.com/blog/2010/02/01/solr-shines-through-the-cloud-lucidworks-solr-on-ec2/ Michael Bohlig Lucid Imagination - Original Message From: Matt Shields m...@mattshields.org To: solr-user@lucene.apache.org Sent: Mon, August 8, 2011 2:03:20 PM Subject: Example Solr Config on EC2 I'm looking for some examples of how to setup Solr on EC2. The configuration I'm looking for would have multiple nodes for redundancy. I've tested in-house with a single master and slave with replication running in Tomcat on Windows Server 2003, but even if I have multiple slaves the single master is a single point of failure. Any suggestions or example configurations? The project I'm working on is a .NET setup, so ideally I'd like to keep this search cluster on Windows Server, even though I prefer Linux. Matthew Shields Owner BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation, Managed Services www.beantownhost.com www.sysadminvalley.com www.jeeprally.com
Problem with xinclude in solrconfig.xml
Hi, Guys, Based on the document below, I should be able to include a file under the same directory by specifying a relative path via XInclude in solrconfig.xml: http://wiki.apache.org/solr/SolrConfigXml However I am getting the following error when I use a relative path (an absolute path works fine though): SEVERE: org.xml.sax.SAXParseException: Error attempting to parse XML file Any ideas? Thanks, YH
Re: Problem with xinclude in solrconfig.xml
Sorry for the spam. I just figured it out. Thanks. On Wed, Aug 10, 2011 at 2:17 PM, Way Cool way1.wayc...@gmail.com wrote: Hi, Guys, Based on the document below, I should be able to include a file under the same directory by specifying relative path via xinclude in solrconfig.xml: http://wiki.apache.org/solr/SolrConfigXml However I am getting the following error when I use relative path (absolute path works fine though): SEVERE: org.xml.sax.SAXParseException: Error attempting to parse XML file Any ideas? Thanks, YH
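The thread never says what the fix was; for reference, the XInclude syntax from the wiki page above looks like the following (handlers.xml is a hypothetical file name). When a relative href fails, it is worth checking what base URI the parser resolves it against, since relative XInclude hrefs are resolved relative to the including document's location.

<xi:include href="handlers.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>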
RE: Can't mix Synonyms with Shingles?
Hi Steven, The token separator was certainly a deliberate choice; are you saying that after applying shingles, synonyms can only match shingled terms? The term analysis suggests the original tokens still exist. You've made me realize that only certain synonyms seem to have problems, though, so it's not a blanket failure. Take this synonym definition: wamu, washington mutual bank, washington mutual Indexing wamu looks like it'll work fine - there are no shingles, and all three synonym expansions appear to get indexed. (expand=true) However, indexing washington mutual applies the shingles correctly (adds washingtonmutual to position 1), but the synonym expansion does not happen. I would still expect the synonym definition to match the original terms and index 'wamu' along with the other stuff. Thanks. -Original Message- From: Steven A Rowe [mailto:sar...@syr.edu] Sent: Wednesday, August 10, 2011 12:54 PM To: solr-user@lucene.apache.org Subject: RE: Can't mix Synonyms with Shingles? Hi Jeff, You have configured ShingleFilterFactory with a token separator of "" (the empty string), so e.g. International Corporation will output the shingle InternationalCorporation. If this is the form you want to use for synonym matching, it must exist in your synonym file. Does it? Steve
Solr 3.3: DIH configuration for Oracle
Hello, all! I want to create a good DIH configuration for my Oracle database with delta support. Unfortunately I am not able to do it, as DIH has a strange restriction. I want to explain the problem with a simple example; in reality my database has a much more complex structure. Initial conditions: two tables with the following simple structure: Table1 - ID_RECORD (primary key) - DATA_FIELD1 - .. - DATA_FIELD2 - LAST_CHANGE_TIME Table2 - ID_RECORD (primary key) - PARENT_ID_RECORD (foreign key to Table1.ID_RECORD) - DATA_FIELD1 - .. - DATA_FIELD2 - LAST_CHANGE_TIME For performance reasons it is necessary to select from both tables with a single query (via an inner join). My db-data-config.xml file: <?xml version="1.0" encoding="UTF-8"?> <dataConfig> <dataSource jndiName="jdbc/DB1" type="JdbcDataSource" user="" password=""/> <document> <entity name="ent" pk="T1_ID_RECORD, T2_ID_RECORD" query="select * from TABLE1 t1 inner join TABLE2 t2 on t1.ID_RECORD = t2.PARENT_ID_RECORD" deltaQuery="select t1.ID_RECORD T1_ID_RECORD, t1.ID_RECORD T2_ID_RECORD from TABLE1 t1 inner join TABLE2 t2 on t1.ID_RECORD = t2.PARENT_ID_RECORD where TABLE1.LAST_CHANGE_TIME > to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS') or TABLE2.LAST_CHANGE_TIME > to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')" deltaImportQuery="select * from TABLE1 t1 inner join TABLE2 t2 on t1.ID_RECORD = t2.PARENT_ID_RECORD where t1.ID_RECORD = ${dataimporter.delta.T1_ID_RECORD} and t2.ID_RECORD = ${dataimporter.delta.T2_ID_RECORD}" /> </document> </dataConfig> As a result I get the following error: java.lang.IllegalArgumentException: deltaQuery has no column to resolve to declared primary key pk='T1_ID_RECORD, T2_ID_RECORD' I have analyzed the source code of DIH and found that the DocBuilder class's collectDelta() method treats the value of the entity's pk attribute as a simple string, but in my case it is an array of two values: T1_ID_RECORD, T2_ID_RECORD What am I doing wrong? Thanks, Eugeny
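Since DIH treats pk as a single column name, one common workaround is to expose one composite key column and use it everywhere. A sketch only, untested here: COMPOSITE_ID is an illustrative alias (the Solr uniqueKey field would have to map to it), and in practice you would list the data columns explicitly instead of t1.*, t2.* to avoid duplicate column names from the join.

<entity name="ent" pk="COMPOSITE_ID"
  query="select t1.ID_RECORD || '_' || t2.ID_RECORD COMPOSITE_ID, t1.*, t2.* from TABLE1 t1 inner join TABLE2 t2 on t1.ID_RECORD = t2.PARENT_ID_RECORD"
  deltaQuery="select t1.ID_RECORD || '_' || t2.ID_RECORD COMPOSITE_ID from TABLE1 t1 inner join TABLE2 t2 on t1.ID_RECORD = t2.PARENT_ID_RECORD where t1.LAST_CHANGE_TIME > to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS') or t2.LAST_CHANGE_TIME > to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"
  deltaImportQuery="select t1.ID_RECORD || '_' || t2.ID_RECORD COMPOSITE_ID, t1.*, t2.* from TABLE1 t1 inner join TABLE2 t2 on t1.ID_RECORD = t2.PARENT_ID_RECORD where t1.ID_RECORD || '_' || t2.ID_RECORD = '${dataimporter.delta.COMPOSITE_ID}'" />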
Increasing the highlight snippet size
Hi, I have been trying to increase the size of the highlight snippets using the hl.fragSize parameter, without much success. It seems that hl.fragSize is not making any difference at all in terms of snippet size. For example, compare the following two sets of query/results: http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29&rows=1&sort=id+asc&fl=id%2cbookCode%2cnavPointId%2csectionTitle&hl=true&hl.fl=content&hl.snippets=100&hl.fragSize=10&hl.maxAnalyzedChars=-1&version=2.2 </span><span id="w20422" class="werd">to</span><span id="w20423" class="werd"><em>write</em></span><span id="w20424" class="werd">a http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29&rows=1&sort=id+asc&fl=id%2cbookCode%2cnavPointId%2csectionTitle&hl=true&hl.fl=content&hl.snippets=100&hl.fragSize=1000&hl.maxAnalyzedChars=-1&version=2.2 </span><span id="w20422" class="werd">to</span><span id="w20423" class="werd"><em>write</em></span><span id="w20424" class="werd">a Because of our particular needs, the content has been spanified, each word with its own span id. I do apply HTMLStrip during index time. What I would like to do is to increase the size of the snippet so that the highlighted snippets contain more surrounding words. Although hl.fragSize went from 10 to 1000, the result is the same. This leads me to believe that hl.fragSize might not be the correct parameter to achieve the effect I am looking for. If so, what parameter should I use? Thanks!
Re: Example Solr Config on EC2
Yes, you can promote a slave to be the master; refer to http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BAC8-slave_in_a_node In AWS one can use an elastic IP (http://aws.amazon.com/articles/1346) to refer to the master, and this IP can be assigned to a slave as it assumes the role of master (in case of failure). All slaves will then refer to this new master and there will be no need to regenerate data. Automation of this may be possible through CloudWatch alarm-actions. I don't know of any available example automation scripts. Cheers Akshay. On Wed, Aug 10, 2011 at 9:08 PM, Matt Shields m...@mattshields.org wrote: If I were to build a master with multiple slaves, is it possible to promote a slave to be the new master if the original master fails? Will all the slaves pick up right where they left off, or any time the master fails will we need to completely regenerate all the data? If this is possible, are there any examples of this being automated? Especially on Win2k3. Matthew Shields Owner BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation, Managed Services www.beantownhost.com www.sysadminvalley.com www.jeeprally.com
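For reference, a sketch of the enable/disable configuration described on that wiki page (the masterUrl is illustrative): every node carries both sections in solrconfig.xml, and the role is chosen at startup with -Denable.master=true or -Denable.slave=true, so promotion is a property flip plus repointing slaves (or moving the elastic IP) rather than a config rewrite.

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="enable">${enable.master:false}</str>
    <str name="replicateAfter">commit</str>
  </lst>
  <lst name="slave">
    <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">http://master-elastic-ip:8983/solr/replication</str>
  </lst>
</requestHandler>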
Re: Increasing the highlight snippet size
an hl.fragsize of 1000 is problematical, as Solr parses that parameter as a 32 bit int... that's several bits more. -Simon On Wed, Aug 10, 2011 at 4:59 PM, Sang Yum sang...@gmail.com wrote: Hi, I have been trying to increase the size of the highlight snippets using the hl.fragSize parameter, without much success. It seems that hl.fragSize is not making any difference at all in terms of snippet size. For example, compare the following two sets of query/results: http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29&rows=1&sort=id+asc&fl=id%2cbookCode%2cnavPointId%2csectionTitle&hl=true&hl.fl=content&hl.snippets=100&hl.fragSize=10&hl.maxAnalyzedChars=-1&version=2.2 and http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29&rows=1&sort=id+asc&fl=id%2cbookCode%2cnavPointId%2csectionTitle&hl=true&hl.fl=content&hl.snippets=100&hl.fragSize=1000&hl.maxAnalyzedChars=-1&version=2.2 Because of our particular needs, the content has been spanified, each word with its own span id. I do apply HTMLStrip during index time. What I would like to do is to increase the size of the snippet so that the highlighted snippets contain more surrounding words. Although hl.fragSize went from 10 to 1000, the result is the same. This leads me to believe that hl.fragSize might not be the correct parameter to achieve the effect I am looking for. If so, what parameter should I use? Thanks!
Re: Increasing the highlight snippet size
I was just trying to set it to a ridiculously large number to make it work. What I am seeing is that hl.fragsize doesn't seem to make any difference in terms of highlight snippet size... I just tried the query with hl.fragsize set to 1000. Same result as 10. On Wed, Aug 10, 2011 at 2:20 PM, simon mtnes...@gmail.com wrote: an hl.fragsize of 1000 is problematical, as Solr parses that parameter as a 32 bit int... that's several bits more. -Simon
Re: Cache replication
Thanks for the advice, Paul, but post-processing is a must for me given the nature of my application. I haven't had problems yet though.
RE: Can't mix Synonyms with Shingles?
After some further playing around, I think I understand what's going on. Because the SynonymFilterFactory pays attention to term position when it inserts a multi-word synonym, I had assumed it scanned for matches in a way that respected term position as well. (ie, for a two-word synonym, I assumed it would try to find the second word in position n+1 if it found the first word in position n) This does not appear to be the case. It appears to find multi-word synonym matches by simply walking the list of terms, exhausting all the terms in position one before looking at any terms in position two. The ShingleFilter adds terms to most positions, so that throws off the 'adjacency' of the flattened list of terms. Meaning, a two-word synonym can only match if the synonym consists of the original term (position 1) followed by the added shingle (also in position 1). Perhaps a better description: if you're looking at the analysis.jsp display, it does not scan for multi-word synonym tokens across then down; it scans down then across. It doesn't look like there's a way to do what I'm trying to do (index shingles AND multi-word synonyms in one field) without writing my own filter. -Original Message- From: Jeff Wartes [mailto:jwar...@whitepages.com] Sent: Wednesday, August 10, 2011 1:27 PM To: solr-user@lucene.apache.org Subject: RE: Can't mix Synonyms with Shingles? Hi Steven, The token separator was certainly a deliberate choice; are you saying that after applying shingles, synonyms can only match shingled terms? The term analysis suggests the original tokens still exist. You've made me realize that only certain synonyms seem to have problems, though, so it's not a blanket failure. Take this synonym definition: wamu, washington mutual bank, washington mutual Indexing wamu looks like it'll work fine - there are no shingles, and all three synonym expansions appear to get indexed. (expand=true) However, indexing washington mutual applies the shingles correctly (adds washingtonmutual to position 1), but the synonym expansion does not happen. I would still expect the synonym definition to match the original terms and index 'wamu' along with the other stuff. Thanks.
Re: Increasing the highlight snippet size
Well, only after I posted this question in a public forum, I found the cause of my problem. I was using hl.fragSize instead of hl.fragsize. After correcting the case, it worked as expected. Thanks. On Wed, Aug 10, 2011 at 3:19 PM, Sang Yum sang...@gmail.com wrote: I was just trying to set it to a ridiculously large number to make it work. What I am seeing is that hl.fragsize doesn't seem to make any difference in terms of highlight snippet size... I just tried the query with hl.fragsize set to 1000. Same result as 10.
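(The parameter name is case-sensitive, so the queries earlier in the thread wanted, for example, ...&hl=true&hl.fl=content&hl.snippets=100&hl.fragsize=1000&hl.maxAnalyzedChars=-1 rather than hl.fragSize=1000.)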
Re: Can't mix Synonyms with Shingles?
On Wed, Aug 10, 2011 at 7:10 PM, Jeff Wartes jwar...@whitepages.com wrote:

After some further playing around, I think I understand what's going on. Because the SynonymFilterFactory pays attention to term position when it inserts a multi-word synonym, I had assumed it scanned for matches in a way that respected term position as well (i.e., for a two-word synonym, I assumed it would try to find the second word in position n+1 if it found the first word in position n). This does not appear to be the case. It appears to find multi-word synonym matches by simply walking the list of terms, exhausting all the terms in position one before looking at any terms in position two.

This is correct, and I think it would cause some seriously bad performance otherwise: if you have a token stream like (A B C) (D E F) (G H I) ... and are matching multi-word synonyms, it can potentially explode, at least in terms of CPU time and all the state saving/restoring/copying and such. It would need to start treating the token stream as more of a token confusion network, and it gets worse if you consider position increments > 1.

At least recently in svn, the limitation is documented:
http://svn.apache.org/repos/asf/lucene/dev/trunk/modules/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymFilter.java

--
lucidimagination.com
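Given that down-then-across scanning, one workaround along the lines of Steve's earlier suggestion is to put the shingled (separator-less) forms into the synonym file themselves, so that matching never has to look past a single position. A hand-built, untested sketch, assuming the empty token separator from Jeff's config:

# synonyms.BusinessNames.txt -- single-token shingled forms added by hand
# so the synonym match needs no positional scan (illustrative entries)
wamu, washingtonmutual, washingtonmutualbank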
Hudson build issues
Whenever I try to build this on our Hudson server, it says it can't find org.apache.lucene:lucene-xercesImpl:jar:4.0-SNAPSHOT. Is the Apache repo lacking this artifact?
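For reference, the coordinate the build is failing to resolve corresponds to a pom.xml dependency like the following (a sketch of what Hudson is looking for; whether the Apache snapshot repo actually publishes it is the open question):

<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-xercesImpl</artifactId>
  <version>4.0-SNAPSHOT</version>
</dependency>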
LockObtainFailedException
Hi,

We are doing streaming updates to Solr for multiple users, and we are getting:

Aug 10, 2011 11:56:55 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/var/lib/solr/data/index/write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:84)
        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1097)
        at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:83)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:102)
        at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:174)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:222)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:147)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)

Aug 10, 2011 12:00:16 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/var/lib/solr/data/index/write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:84)
        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1097)
        at org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:83)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:102)
        at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:174)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:222)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:147)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)
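This exception usually means a second IndexWriter tried to open the same index while another writer held the lock - for example, two Solr cores or webapps pointing at the same data directory, or a stale write.lock left by an unclean shutdown. The relevant knobs live in solrconfig.xml; a sketch with illustrative values, not taken from this thread:

<indexDefaults>
  <!-- native OS locking; produces the NativeFSLock named in the exception -->
  <lockType>native</lockType>
  <!-- how long (ms) to wait for the lock before LockObtainFailedException -->
  <writeLockTimeout>1000</writeLockTimeout>
</indexDefaults>
<mainIndex>
  <!-- clears a leftover lock at startup; only safe if no other process
       writes to this index -->
  <unlockOnStartup>false</unlockOnStartup>
</mainIndex>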
Re: Indexing tweet and searching @keyword OR #keyword
Do you really want a search on ipad to *fail* to match input of #ipad? Or vice versa?

My requirement is: I want both '#ipad' and 'ipad' to match for q='ipad', BUT for q='#ipad' I want to match ONLY '#ipad', excluding plain 'ipad'.

On 10 August 2011 19:49, Erick Erickson erickerick...@gmail.com wrote:

Please look more carefully at the documentation for WDDF, specifically: "split on intra-word delimiters (all non alpha-numeric characters)". WordDelimiterFilterFactory will always throw away non-alphanumeric characters; you can't tell it to do otherwise. Try some of the other tokenizers/analyzers to get what you want, and also look at the admin/analysis page to see the exact effects of your fieldType definitions. Here's a great place to start:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

You probably want something like WhitespaceTokenizerFactory followed by LowerCaseFilterFactory or some such... But I really question whether this is what you want either. Do you really want a search on ipad to *fail* to match input of #ipad? Or vice versa?

KeywordTokenizerFactory is probably not the place you want to start; that tokenizer doesn't break anything up - you happen to be getting separate tokens because of WDDF, which, as you see, can't process things the way you want.

Best
Erick

On Wed, Aug 10, 2011 at 3:09 AM, Mohammad Shariq shariqn...@gmail.com wrote:

I tried tweaking WordDelimiterFactory, but it won't accept # or @ symbols and ignored them totally. I need a solution, please suggest.

On 4 August 2011 21:08, Jonathan Rochkind rochk...@jhu.edu wrote:

It's the WordDelimiterFactory in your filter chain that's removing the punctuation entirely from your index, I think. Read up on what the WordDelimiter filter does and what its settings are; decide how you want things to be tokenized in your index to get the behavior you want; either get WordDelimiter to do it that way by passing it different arguments, or stop using WordDelimiter; come back with any questions after trying that!

On 8/4/2011 11:22 AM, Mohammad Shariq wrote:

I have indexed around 1 million tweets (using the text dataType). When I search the tweets with # or @, I don't get the exact result. E.g., when I search for #ipad or @ipad, I get results where ipad is mentioned, skipping the # and @. Please suggest how to tune, or which filter factories to use, to get the desired result. I am indexing the tweets as text; below is the type from my schema.xml.
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
            minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt"
            language="English"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
            minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt"
            language="English"/>
  </analyzer>
</fieldType>

--
Thanks and Regards
Mohammad Shariq
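A minimal sketch of the direction Erick suggests - whitespace tokenization plus lowercasing, which leaves # and @ attached to their tokens (the field type name is illustrative, and this is untested against the requirement above):

<fieldType name="text_tweet" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- splits on whitespace only, so #ipad and @ipad survive as single tokens -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Note that this alone does not give the asymmetric behavior asked for above (q=ipad matching both forms, q=#ipad matching only '#ipad'). One possible approach, purely a suggestion not taken from the thread, is to copyField into a second field whose analyzer strips a leading # or @ (e.g. with a PatternReplaceFilterFactory) and to query both fields: the bare term then matches either form via the stripped field, while the prefixed term is indexed verbatim only in the first field.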