Re: edismax, inconsistencies with implicit/explicit AND when used with explicit OR

2011-08-10 Thread Ahmet Arslan
Hi Mark,

I suspect the issue you are facing is 

https://issues.apache.org/jira/browse/SOLR-2649

You can verify this by toggling the default operator between 'AND' and 'OR'.
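For reference, a quick way to A/B the two behaviors is to send the same query twice with an explicit q.op override, which takes precedence over the schema's defaultOperator. A minimal sketch in Python; the hostname and core name are placeholders, not from Mark's setup:

```python
from urllib.parse import urlencode

def solr_select_url(base, q, q_op):
    """Build a Solr select URL with an explicit default-operator override."""
    return base + "/select?" + urlencode(
        {"q": q, "q.op": q_op, "defType": "edismax", "rows": 10})

# Placeholder host and core; the query string is the one from the thread.
base = "http://localhost:8983/solr/customersJoin"
q = "CUSTOMER_NM:*IBM* CUSTOMER_NM:*Software* OR CUSTOMER_NM:*something*"

url_and = solr_select_url(base, q, "AND")
url_or = solr_select_url(base, q, "OR")
```

If the hit counts differ between the two URLs in the way SOLR-2649 describes, that issue is the likely cause.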

--- On Wed, 8/10/11, Mark juszczec mark.juszc...@gmail.com wrote:

 From: Mark juszczec mark.juszc...@gmail.com
 Subject: edismax, inconsistencies with implicit/explicit AND when used with 
 explicit OR
 To: solr-user@lucene.apache.org
 Date: Wednesday, August 10, 2011, 12:27 AM
 Hello all
 
 We've just switched from the default parser to the edismax
 parser and a user
 has noticed some inconsistencies when using
 implicit/explicit ANDs, ORs and
 grouping search terms
 in parenthesis.
 
 First, the default query operator is AND.  I switched
 it from OR today.
 
 The query:
 
 http://cn-nyc1-ad-dev1.cnet.com:8983/solr/customersJoin/select?indent=on&version=3.3&q=CUSTOMER_NM:*IBM*%20CUSTOMER_NM:*Software*%20OR%20CUSTOMER_NM:*something*&fq=&start=0&rows=10&fl=*%2Cscore&defType=edismax&wt=&explainOther=&hl.fl=
 
 
 returns 1053 results.  Some have only IBM in
 CUSTOMER_NM, some have only
 Software in the name, some have both.
 
 
 However, when I explicitly specify an AND between
 CUSTOMER_NM:*IBM* and
 CUSTOMER_NM:*Software* :
 
 
 customersJoin/select?indent=on&version=3.3&q=CUSTOMER_NM:*IBM*%20AND%20CUSTOMER_NM:*Software*%20OR%20CUSTOMER_NM:*something*&fq=&start=0&rows=10&fl=*%2Cscore&defType=edismax&wt=&explainOther=&hl.fl=
 
 I only get 3 results and all of them contain both IBM and
 Software.
 
 I found this reference to inconsistencies with edismax, but
 I'm not sure it
 explains this situation 100%.
 
 http://lucene.472066.n3.nabble.com/edismax-inconsistency-AND-OR-td2131795.html
 
 Have I found a bug or am I doing something terribly wrong?
 
 Mark



Re: Indexing tweet and searching @keyword OR #keyword

2011-08-10 Thread Mohammad Shariq
I tried tweaking WordDelimiterFilterFactory, but it won't accept # or @ symbols;
they are ignored entirely.
I need a solution, please suggest one.

On 4 August 2011 21:08, Jonathan Rochkind rochk...@jhu.edu wrote:

 It's the WordDelimiterFactory in your filter chain that's removing the
 punctuation entirely from your index, I think.

 Read up on what the WordDelimiter filter does, and what it's settings are;
 decide how you want things to be tokenized in your index to get the behavior
 your want; either get WordDelimiter to do it that way by passing it
 different arguments, or stop using WordDelimiter; come back with any
 questions after trying that!



 On 8/4/2011 11:22 AM, Mohammad Shariq wrote:

 I have indexed around 1 million tweets (using the text dataType).
 When I search the tweets with # or @ I don't get the exact result.
 E.g. when I search for #ipad or @ipad I get results where ipad is
 mentioned, skipping the # and @.
 Please suggest how to tune, or which filter factories to use, to get the
 desired result.
 I am indexing the tweets as text; below is the text fieldType from my
 schema.xml.


 <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
             minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
             catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.SnowballPorterFilterFactory"
             protected="protwords.txt" language="English"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
             minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
             catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.SnowballPorterFilterFactory"
             protected="protwords.txt" language="English"/>
   </analyzer>
 </fieldType>




-- 
Thanks and Regards
Mohammad Shariq


Re: Is optimize needed on slaves if it replicates from optimized master?

2011-08-10 Thread Bernd Fehling


From what I see on my slaves, yes.
After replication has finished, the new index is in place, and a new reader
has started, I still always have a write.lock file in my index directory on
the slaves, even though the index on the master is optimized.

Regards
Bernd


Am 10.08.2011 09:12, schrieb Pranav Prakash:

Do slaves need a separate optimize command if they replicate from optimized
master?

*Pranav Prakash*

temet nosce

Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com |
Google http://www.google.com/profiles/pranny



Re: Is optimize needed on slaves if it replicates from optimized master?

2011-08-10 Thread Shalin Shekhar Mangar
On Wed, Aug 10, 2011 at 1:11 PM, Bernd Fehling 
bernd.fehl...@uni-bielefeld.de wrote:


 From what I see on my slaves, yes.
 After replication has finished and new index is in place and new reader has
 started
 I have always a write.lock file in my index directory on slaves, even
 though the index
 on master is optimized.


That is not true. Replication is roughly a copy of the diff between the
master and the slave's index. An optimized index is a merged and re-written
index so replication from an optimized master will give an optimized copy on
the slave.

The write lock is due to the fact that an IndexWriter is always open in Solr
even on the slaves.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Is optimize needed on slaves if it replicates from optimized master?

2011-08-10 Thread Bernd Fehling


Sure, no optimizing is actually needed on the slave,
but after calling optimize on the slave the write.lock file is removed.
So why doesn't the replication process do this?

Regards
Bernd


Am 10.08.2011 10:57, schrieb Shalin Shekhar Mangar:

On Wed, Aug 10, 2011 at 1:11 PM, Bernd Fehling
bernd.fehl...@uni-bielefeld.de  wrote:



 From what I see on my slaves, yes.
After replication has finished and new index is in place and new reader has
started
I have always a write.lock file in my index directory on slaves, even
though the index
on master is optimized.



That is not true. Replication is roughly a copy of the diff between the
master and the slave's index. An optimized index is a merged and re-written
index so replication from an optimized master will give an optimized copy on
the slave.

The write lock is due to the fact that an IndexWriter is always open in Solr
even on the slaves.



document indexing

2011-08-10 Thread directorscott
Hello,

First of all, I am a beginner and I am trying to develop a sample
application using SolrNet.

I am struggling with the schema definition I need to use to meet my
needs. In the database, I have Books(bookId, name) and Pages(pageId, bookId,
text) tables. They have a master-detail relationship. I want to be able to
search the Text field of Pages but list the Books. Should I use a schema for
Pages (with pageId as unique key) or for Books (with bookId as unique key)
in this scenario? 

Thanks.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/document-indexing-tp3241832p3241832.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Is optimize needed on slaves if it replicates from optimized master?

2011-08-10 Thread Pranav Prakash
That is not true. Replication is roughly a copy of the diff between the
 master and the slave's index.


In my case, during replication entire index is copied from master to slave,
during which the size of index goes a little over double. Then it shrinks to
its original size. Am I doing something wrong? How can I get the master to
serve only delta index instead of serving whole index and the slaves merging
the new and old index?

*Pranav Prakash*


Re: document indexing

2011-08-10 Thread lee carroll
It really does depend on what you want to do in your app, but from
the info given I'd go for denormalizing by repeating the least number
of values. So in your case that would be book:

PageID+BookID(uniqueKey), pageID, PageVal1, PageValn, BookID, BookName




On 10 August 2011 09:46, directorscott dgul...@gmail.com wrote:
 Hello,

 First of all, I am a beginner and i am trying to develop a sample
 application using SolrNet.

 I am struggling about schema definition i need to use to correspond my
 needs. In database, i have Books(bookId, name) and Pages(pageId, bookId,
 text) tables. They have master-detail relationship. I want to be able to
 search in Text area of Pages but list the books. Should i use a schema for
 Pages (with pageid as unique key) or for Books (with bookId as unique key)
 in this scenario?

 Thanks.



 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/document-indexing-tp3241832p3241832.html
 Sent from the Solr - User mailing list archive at Nabble.com.



frange not working in query

2011-08-10 Thread Amit Sawhney
Hi All,

I am trying to sort the results on a unix timestamp using this query. 

http://url.com:8983/solr/db/select/?indent=on&version=2.1&q={!frange%20l=0.25}query($qq)&qq=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

When I run this query, it says 'no field name specified in query and no 
defaultSearchField defined in schema.xml'

As soon as I remove the frange query and run this, it starts working fine. 

http://url.com:8983/solr/db/select/?indent=on&version=2.1&q=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

Any pointers?


Thanks,
Amit

RE: Trying to index pdf docs - lazy loading error - ClassNotFoundException: solr.extraction.ExtractingRequestHandler

2011-08-10 Thread Rode González
I had made a mistake with the config files. Starting from the example
directory, everything works correctly. Thanks to all.

---
Rode González
Libnova, SL
Paseo de la Castellana, 153-Madrid
[t]91 449 08 94  [f]91 141 21 21
www.libnova.es

 -----Original Message-----
 From: Rode González [mailto:r...@libnova.es]
 Sent: Tuesday, 09 August 2011 13:04
 To: solr-user@lucene.apache.org
 CC: Leo
 Subject: Trying to index pdf docs - lazy loading error -
 ClassNotFoundException: solr.extraction.ExtractingRequestHandler
 
 Hi all.
 
 
 
 I've tried to index pdf documents using the libraries included in the
 example distribution of solr 3.3.0.
 
 I've copied all the jars included in the /dist and /contrib directories
 into a common /lib directory, and I've added this path to the
 solrconfig.xml file.
 
 The request handler for binary docs has no changes from the example:
 
 
 
  <requestHandler name="/update/extract"
                  startup="lazy"
                  class="solr.extraction.ExtractingRequestHandler">
    <lst name="defaults">
      <!-- All the main content goes into "text"... if you need to return
           the extracted text or do highlighting, use a stored field. -->
      <str name="fmap.content">text</str>
      <!-- <str name="lowernames">true</str> -->
      <!-- <str name="uprefix">ignored_</str> -->
 
      <!-- capture link hrefs but ignore div attributes -->
      <!-- <str name="captureAttr">true</str> -->
      <!-- <str name="fmap.a">links</str> -->
      <!-- <str name="fmap.div">ignored_</str> -->
    </lst>
  </requestHandler>
 
 
 
 I've commented out all subnodes except fmap.content because I don't use
 the rest of them.
 
 
 
 
 
 ...BUT... :)
 
 
 
 When I try :
 
 
 
 curl "http://myserver:8080/solr/update/extract?literal.id=1000&commit=true" \
      -F "myfile=@myfile_.pdf"
 
 
 
 I get:
 
 
 
 Status HTTP 500 - lazy loading error
 org.apache.solr.common.SolrException:
 lazy loading error
 
 ...
 
 Caused by: org.apache.solr.common.SolrException: Error loading class
 'solr.extraction.ExtractingRequestHandler'
 
 ...
 
 
 
 
 
 I've moved contrib/extraction/lib/* to my lib/* .
 
 I restarted the server and I can see in the log that
 apache-solr-cell-3.3.0.jar was added to the classloader. But I get the
 same result :( ... lazy loading error, error loading class.
 
 
 
 
 
 
 
 What am I forgetting? What am I missing?
 
 
 
 Thanks
 
 
 
 
 
 ---
 
 Rode González
 
 
 
   _
 
No viruses were found in this message.
Checked by AVG - www.avg.com
Version: 10.0.1392 / Virus database: 1520/3822 - Release date: 08/08/11
 
 





Re: Possible bug in FastVectorHighlighter

2011-08-10 Thread Massimo Schiavon

Worked fine. Thanks a lot!

Massimo

On 09/08/2011 11:58, Jayendra Patil wrote:

Try using -

  <str name="hl.tag.pre"><![CDATA[<b>]]></str>
  <str name="hl.tag.post"><![CDATA[</b>]]></str>

Regards,
Jayendra


On Tue, Aug 9, 2011 at 4:46 AM, Massimo Schiavonmschia...@volunia.com  wrote:

In my Solr (3.3) configuration I specified these two params:

<str name="hl.simple.pre"><![CDATA[<b>]]></str>
<str name="hl.simple.post"><![CDATA[</b>]]></str>

when I do a simple search I obtain correctly highlighted results where
matches are enclosed in the correct tag.
If I do the same request with hl.useFastVectorHighlighter=true in the http
query string (or specify the same parameter in the config file), the
matches are enclosed in the <em> tag (the default value).

Has anyone encountered the same?




Re: document indexing

2011-08-10 Thread directorscott
Could you please tell me the schema.xml fields-section content for such a
case? Currently the index data is something like this:

PageID  BookID  Text
1       1       some text
2       1       some text
3       1       some text
4       1       some text
5       2       some text
6       2       some text
7       2       some text
8       2       some text

When I make a simple query for the word some on the Text field, I will have
all 8 rows returned, but I want to list only 2 items (Books with IDs 1 and
2).

I am also considering concatenating the Text columns and having the index
like this:

BookID  PageTexts
1       some text some text some text
2       some text some text some text

I wonder which index structure is better.


 

lee carroll wrote:
 
 It really does depend upon what you want to do in your app but from
 the info given I'd go for denormalizing by repeating the least number
 of values. So in your case that would be book
 
 PageID+BookID(uniqueKey), pageID, PageVal1, PageValn, BookID, BookName
 
 
 
 
 On 10 August 2011 09:46, directorscott dgul...@gmail.com wrote:
 Hello,

 First of all, I am a beginner and i am trying to develop a sample
 application using SolrNet.

 I am struggling about schema definition i need to use to correspond my
 needs. In database, i have Books(bookId, name) and Pages(pageId, bookId,
 text) tables. They have master-detail relationship. I want to be able to
 search in Text area of Pages but list the books. Should i use a schema
 for
 Pages (with pageid as unique key) or for Books (with bookId as unique
 key)
 in this scenario?

 Thanks.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/document-indexing-tp3241832p3241832.html
 Sent from the Solr - User mailing list archive at Nabble.com.

 


--
View this message in context: 
http://lucene.472066.n3.nabble.com/document-indexing-tp3241832p3242219.html
Sent from the Solr - User mailing list archive at Nabble.com.


Date faceting per last hour, three days and last week

2011-08-10 Thread Joan
Hi,

I'm trying to do date faceting for the last 24 hours, three days and last
week, but I don't know how to do it.

I have a DateField and I want to set different ranges; is it possible?

I understand the example from the solr wiki
(http://wiki.apache.org/solr/SimpleFacetParameters#Date_Faceting:_per_day_for_the_past_5_days)
but I want to do more gaps with the same field_date.

How do I do this?

Thanks,

Joan


paging size in SOLR

2011-08-10 Thread jame vaalet
hi,
i want to retrieve all the data from solr (say 10,000 ids) and my page size
is 1000.
How do I get the data back page by page? Do I have to increment the start
value each time by the page size, starting from 0, and iterate?
In that case, am I querying the index 10 times instead of once, or after the
first query will the result be cached somewhere for the subsequent pages?


JAME VAALET


How come this query string starts with wildcard?

2011-08-10 Thread Pranav Prakash
While going through my Solr error logs, I found that a user had fired the
query: jawapan ujian bulanan thn 4 (bahasa melayu). This was converted to
the following for autosuggest purposes by the javascript code:
jawapan?ujian?bulanan?thn?4?(bahasa?melayu)*. Solr threw the exception

Cannot parse 'jawapan?ujian?bulanan?thn?4?(bahasa?melayu)*': '*' or
'?' not allowed as first character in WildcardQuery

How come this query string begins with a wildcard character?

When I changed the query to remove the brackets, everything went smoothly.
There were no results, probably because my search index didn't have any.


*Pranav Prakash*

temet nosce

Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com |
Google http://www.google.com/profiles/pranny


Re: Date faceting per last hour, three days and last week

2011-08-10 Thread O. Klein
I would use facet queries:

facet.query=date:[NOW-1DAY TO NOW]
facet.query=date:[NOW-3DAY TO NOW]
facet.query=date:[NOW-7DAY TO NOW]
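Since facet.query can repeat, the three ranges go on the request as repeated parameters. A minimal sketch of building such a request string; the field name date is taken from the queries above, everything else is a placeholder:

```python
from urllib.parse import urlencode

# A list of (key, value) tuples lets the same key appear more than once,
# which is how repeated facet.query parameters are sent to Solr.
params = [
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.query", "date:[NOW-1DAY TO NOW]"),
    ("facet.query", "date:[NOW-3DAY TO NOW]"),
    ("facet.query", "date:[NOW-7DAY TO NOW]"),
]
query_string = urlencode(params)
```

Each facet.query comes back as its own count in the facet_queries section of the response.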

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Date-faceting-per-last-hour-three-days-and-last-week-tp3242364p3242574.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: How come this query string starts with wildcard?

2011-08-10 Thread Michael Ryan
I think this is because ')' is treated as a token delimiter. So '(foo)bar'
is treated the same as '(foo) bar' (that is, 'bar' is treated as a separate
word). So '(foo)*' is really parsed as '(foo) *' and thus the '*' is treated
as the start of a new word.

-Michael


[Help Wanted] Graphics and other help for new Lucene/Solr website

2011-08-10 Thread Grant Ingersoll
Hi,

We are in the process of putting up a new Lucene/Solr/PyLucene/OpenRelevance 
website.  You can see a preview at http://lucene.staging.apache.org/lucene/.  
It is more or less a look and feel copy of Mahout and Open For Biz websites.  
This new site, IMO, both looks better than the old one and will be a lot easier 
for us committers to maintain/update and for others to contribute to.

So, how can you help?  

0.  All of the code is at https://svn.apache.org/repos/asf/lucene/cms/trunk.  
Check it out the usual way using SVN.  If you want to build locally, see 
https://issues.apache.org/jira/browse/LUCENE-2748 and the links to the ASF CMS 
guide.

1. If you have any graphic design skills:
- I'd love to have some mantle/slide images along the lines of 
http://lucene.staging.apache.org/lucene/images/mantle-lucene-solr.png.  These 
are used in the slideshow at the top of the Lucene, Core and Solr pages and 
should be interesting, inviting, etc. and should give people warm fuzzy 
feelings about all of our software and the great community we have.  (Think 
Marketing!)
- Help us coordinate the color selection on the various pages, 
especially in the slides and especially on the Solr page, as I'm not sure I 
like the green and black background contrasted with the orange of the Solr logo.

2. In a few more days or maybe a week or so, patches to fix content errors, 
etc. will be welcome.  For now, we are still porting things, so I don't want to 
duplicate effort.

3. New, useful documentation is also, of course, always welcome.

4. Test with your favorite browser.  In particular, I don't have IE handy.  
I've checked the site in Chrome, Firefox and Safari.

If you come up with images (I won't guarantee they will be accepted, but I am 
appreciative of the help) or other style fixes, etc., please submit all 
content/patches to https://issues.apache.org/jira/browse/LUCENE-2748 and please 
make sure to check the donation box when attaching the file. 

-Grant

 

Re: unique terms and multi-valued fields

2011-08-10 Thread Erick Erickson
Well, it depends (tm).

If you're talking about *indexed* terms, then the value is stored only
once in both the cases you mentioned below. There's really very little
difference between a non-multi-valued field and a multi-valued field
in terms of how it's stored in the searchable portion of the index,
except for some position information.

So, having an XML doc with a single-valued field

<field name="category">computers laptops</field>

is almost identical (except for position info, the positionIncrementGap) to a

<field name="category">computers</field>
<field name="category">laptops</field>

multiValued refers to the *input*, not whether more than one word is
allowed in that field.


Now, about *stored* fields. If you store the data, verbatim copies are
kept in the
storage-specific files in each segment, and the values will be on disk for
each document.

But you probably don't care much because this data is only referenced when you
assemble a document for return to the client, it's irrelevant for searching.

Best
Erick

On Tue, Aug 9, 2011 at 8:02 PM, Kevin Osborn osbo...@yahoo.com wrote:
 Please verify my understanding. I have a field called "category" and it has a 
 value "computers". If I use this same field and value for all of my 
 documents, it is really only stored on disk once because "category:computers" 
 is a unique term. Is this correct?

 But, what about multi-valued fields? So, I have a field called "category". 
 For 100 documents, it has the values "computers" and "laptops". For 100 other 
 documents, it has the values "computers" and "tablets". Is this stored as 
 "category:computers", "category:laptops", "category:tablets", meaning 3 
 unique terms? Or is it stored as "category:computers,laptops" and 
 "category:computers,tablets"? I believe it is the first case (hopefully), but 
 I am not sure.

 Thanks.


Re: document indexing

2011-08-10 Thread lee carroll
With the first option you can be page specific in your search results
and searches.
Field collapsing/grouping will help with your normalisation issue.
(What you have listed is different from what I listed: you don't have a
unique key.)

Option 2 means you lose any ability to reference pages, but as you
note, your documents are at the level you wish your search results to
be returned.

If you are not interested in pages, then option 2.
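If you go with option 1 (page-level documents), the field collapsing/grouping mentioned above can be requested with the group parameters. A sketch of the request side, using the BookID field from the sample data; the query text and limit value are just illustrative:

```python
from urllib.parse import urlencode

# One group per book, so 8 page hits collapse to 2 book results.
params = [
    ("q", "Text:some"),
    ("group", "true"),
    ("group.field", "BookID"),
    ("group.limit", "1"),   # the top page per book is enough for a book list
]
query_string = urlencode(params)
```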

On 10 August 2011 12:22, directorscott dgul...@gmail.com wrote:
 Could you please tell me schema.xml fields tag content for such case?
 Currently index data is something like this:

 PageID BookID Text
 1         1        some text
 2         1        some text
 3         1        some text
 4         1        some text
 5         2        some text
 6         2        some text
 7         2        some text
 8         2        some text

 when i make a simple query for the word some on Text field, i will have
 all 8 rows returned. but i want to list only 2 items (Books with IDs 1 and
 2)

 I am also considering to concatenate Text columns and have the index like
 this:

 BookID     PageTexts
 1             some text some text some text
 2             some text some text some text

 I wonder which index structure is better.




 lee carroll wrote:

 It really does depend upon what you want to do in your app but from
 the info given I'd go for denormalizing by repeating the least number
 of values. So in your case that would be book

 PageID+BookID(uniqueKey), pageID, PageVal1, PageValn, BookID, BookName




  On 10 August 2011 09:46, directorscott dgul...@gmail.com wrote:
 Hello,

 First of all, I am a beginner and i am trying to develop a sample
 application using SolrNet.

 I am struggling about schema definition i need to use to correspond my
 needs. In database, i have Books(bookId, name) and Pages(pageId, bookId,
 text) tables. They have master-detail relationship. I want to be able to
 search in Text area of Pages but list the books. Should i use a schema
 for
 Pages (with pageid as unique key) or for Books (with bookId as unique
 key)
 in this scenario?

 Thanks.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/document-indexing-tp3241832p3241832.html
 Sent from the Solr - User mailing list archive at Nabble.com.




 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/document-indexing-tp3241832p3242219.html
 Sent from the Solr - User mailing list archive at Nabble.com.



AW: Problem with DIH: How to map key value pair stored in 1-N relation from a JDBC Source?

2011-08-10 Thread Christian Bordis
Thanks for this quick and enlightening answer!

I didn't consider that a Transformer can create new columns. In combination 
with dynamic fields it is exactly what I was looking for.

Thanks James ^^

-----Original Message-----
From: Dyer, James [mailto:james.d...@ingrambook.com] 
Sent: Tuesday, 9 August 2011 16:03
To: solr-user@lucene.apache.org
Subject: RE: Problem with DIH: How to map key value pair stored in 1-N relation 
from a JDBC Source?

Christian,

It looks like you should probably write a Transformer for your DIH script.  I 
assume you have a child entity set up for PriceTable.  Add a Transformer to 
this entity that will look at the value of currency and price, remove these 
from the row, then add them back in with the currency as the field name and 
the price as the field value.

By the way, it would likely be better if, instead of field names like EUR and 
CHF, you created a dynamic field entry in schema.xml like this:

<dynamicField name="CURRENCY_*" type="tfloat" indexed="true" stored="false" />

Then have your DIH Transformer prepend CURRENCY_ to the field name. This way, 
should your company ever add a new currency, you wouldn't need to change your 
schema.

For more information on writing a DIH Transformer, see 
http://wiki.apache.org/solr/DIHCustomTransformer

If you would rather use a scripting language such as javascript instead of 
writing your Transformer in java, see 
http://wiki.apache.org/solr/DataImportHandler#ScriptTransformer .
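The row manipulation James describes is straightforward in whatever language the Transformer ends up in. Here is the same idea sketched outside DIH as plain dictionary surgery; the currency/price names come from the original question, the rest is assumed:

```python
def currency_transform(row):
    """Turn {'currency': 'EUR', 'price': 9.99, ...} into
    {'CURRENCY_EUR': 9.99, ...}, mirroring the DIH Transformer idea."""
    row = dict(row)                      # don't mutate the caller's row
    currency = row.pop("currency", None)
    price = row.pop("price", None)
    if currency is not None and price is not None:
        row["CURRENCY_" + currency] = price   # matches the CURRENCY_* dynamic field
    return row

print(currency_transform({"id": 7, "currency": "EUR", "price": 9.99}))
# {'id': 7, 'CURRENCY_EUR': 9.99}
```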

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


Re: Indexing tweet and searching @keyword OR #keyword

2011-08-10 Thread Erick Erickson
Please look more carefully at the documentation for WDDF,
specifically:

split on intra-word delimiters (all non alpha-numeric characters).

WordDelimiterFilterFactory will always throw away non-alphanumeric
characters; you can't tell it to do otherwise. Try some of the other
tokenizers/analyzers to get what you want, and also look at the
admin/analysis page to see the exact effects of your fieldType
definitions.

Here's a great place to start:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

You probably want something like WhitespaceTokenizerFactory
followed by LowerCaseFilterFactory or some such...
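The difference is easy to simulate outside Solr: stripping non-alphanumerics (roughly what WDDF does here) loses the symbols, while whitespace-splitting plus lowercasing keeps them. This is a rough approximation, not the actual filter code; note the stray comma in the second output, which is exactly why the whole chain needs thought:

```python
import re

def wddf_like(text):
    """Roughly what splitting on non-alphanumerics does to a tweet."""
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]

def whitespace_lower(text):
    """Roughly WhitespaceTokenizer + LowerCaseFilter."""
    return text.lower().split()

tweet = "Loving the new #ipad, thanks @ipad"
print(wddf_like(tweet))          # ['loving', 'the', 'new', 'ipad', 'thanks', 'ipad']
print(whitespace_lower(tweet))   # ['loving', 'the', 'new', '#ipad,', 'thanks', '@ipad']
```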

But I really question whether this is what you want either. Do you
really want a search on ipad to *fail* to match input of #ipad? Or
vice-versa?

KeywordTokenizerFactory is probably not the place you want to start:
that tokenizer doesn't break anything up, and you only happen to be
getting separate tokens because of WDDF, which, as you see, can't
process things the way you want.


Best
Erick

On Wed, Aug 10, 2011 at 3:09 AM, Mohammad Shariq shariqn...@gmail.com wrote:
 I tried tweaking WordDelimiterFactory but I won't accept # OR @ symbols
 and it ignored totally.
 I need solution plz suggest.

 On 4 August 2011 21:08, Jonathan Rochkind rochk...@jhu.edu wrote:

 It's the WordDelimiterFactory in your filter chain that's removing the
 punctuation entirely from your index, I think.

 Read up on what the WordDelimiter filter does, and what it's settings are;
 decide how you want things to be tokenized in your index to get the behavior
 your want; either get WordDelimiter to do it that way by passing it
 different arguments, or stop using WordDelimiter; come back with any
 questions after trying that!



 On 8/4/2011 11:22 AM, Mohammad Shariq wrote:

 I have indexed around 1 million tweets ( using  text dataType).
 when I search the tweet with #  OR @  I dont get the exact result.
 e.g.  when I search for #ipad OR @ipad   I get the result where ipad
 is
 mentioned skipping the # and @.
 please suggest me, how to tune or what are filterFactories to use to get
 the
 desired result.
 I am indexing the tweet as text, below is text which is there in my
 schema.xml.


 <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
             minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
             catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.SnowballPorterFilterFactory"
             protected="protwords.txt" language="English"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.KeywordTokenizerFactory"/>
     <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
             minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
             catenateAll="0" splitOnCaseChange="1"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.SnowballPorterFilterFactory"
             protected="protwords.txt" language="English"/>
   </analyzer>
 </fieldType>




 --
 Thanks and Regards
 Mohammad Shariq



Re: frange not working in query

2011-08-10 Thread simon
Could you tell us what you're trying to achieve with the range query ?
It's not clear.

-Simon

On Wed, Aug 10, 2011 at 5:57 AM, Amit Sawhney sawhney.a...@gmail.com wrote:
 Hi All,

 I am trying to sort the results on a unix timestamp using this query.

 http://url.com:8983/solr/db/select/?indent=on&version=2.1&q={!frange%20l=0.25}query($qq)&qq=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

 When I run this query, it says 'no field name specified in query and no 
 defaultSearchField defined in schema.xml'

 As soon as I remove the frange query and run this, it starts working fine.

 http://url.com:8983/solr/db/select/?indent=on&version=2.1&q=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

 Any pointers?


 Thanks,
 Amit


Re: frange not working in query

2011-08-10 Thread simon
I meant the frange query, of course

On Wed, Aug 10, 2011 at 10:21 AM, simon mtnes...@gmail.com wrote:
 Could you tell us what you're trying to achieve with the range query ?
 It's not clear.

 -Simon

 On Wed, Aug 10, 2011 at 5:57 AM, Amit Sawhney sawhney.a...@gmail.com wrote:
 Hi All,

 I am trying to sort the results on a unix timestamp using this query.

  http://url.com:8983/solr/db/select/?indent=on&version=2.1&q={!frange%20l=0.25}query($qq)&qq=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

 When I run this query, it says 'no field name specified in query and no 
 defaultSearchField defined in schema.xml'

 As soon as I remove the frange query and run this, it starts working fine.

  http://url.com:8983/solr/db/select/?indent=on&version=2.1&q=nokia&sort=unix-timestamp%20desc&start=0&rows=10&qt=dismax&wt=dismax&fl=*,score&hl=on&hl.snippets=1

 Any pointers?


 Thanks,
 Amit



Re: Is optimize needed on slaves if it replicates from optimized master?

2011-08-10 Thread Erick Erickson
This is expected behavior. You might be optimizing
your index on the master after every set of changes,
in which case the entire index is copied. During this
period, the space on disk will at least double, there's no
way around that.

If you do NOT optimize, then the slave will only copy changed
segments instead of the entire index. Optimizing isn't
usually necessary except periodically (daily, perhaps weekly,
perhaps never actually).

All that said, depending on how merging happens, you will always
have the possibility of the entire index being copied sometimes
because you'll happen to hit a merge that merges all segments
into one.

There are some advanced options that can control some parts
of merging, but you need to get to the bottom of why the whole
index is getting copied every time before you go there. I'd bet
you're issuing an optimize.

Best
Erick

On Wed, Aug 10, 2011 at 5:30 AM, Pranav Prakash pra...@gmail.com wrote:
 That is not true. Replication is roughly a copy of the diff between the
 master and the slave's index.


 In my case, during replication entire index is copied from master to slave,
 during which the size of index goes a little over double. Then it shrinks to
 its original size. Am I doing something wrong? How can I get the master to
 serve only delta index instead of serving whole index and the slaves merging
 the new and old index?

 *Pranav Prakash*



Re: paging size in SOLR

2011-08-10 Thread Erick Erickson
Well, if you really want to you can specify start=0 and rows=10000 and
get them all back at once.

You can do page-by-page by incrementing the start parameter as you
indicated.

You can keep from re-executing the search by setting your queryResultCache
appropriately, but this affects all searches so might be an issue.

Best
Erick

On Wed, Aug 10, 2011 at 9:09 AM, jame vaalet jamevaa...@gmail.com wrote:
 hi,
 i want to retrieve all the data from solr (say 10,000 ids) and my page size
 is 1000.
 how do i get back the data (pages) one after the other? do i have to increment
 the start value each time by the page size, starting from 0, and iterate?
 in that case am i querying the index 10 times instead of once, or after the
 first query will the result be cached somewhere for the subsequent pages?


 JAME VAALET
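The page-by-page loop described above can be sketched as follows (the "index" here is an in-memory stand-in for Solr, so the loop shape is the point, not the fetch itself):

```java
import java.util.ArrayList;
import java.util.List;

public class PagingDemo {
    // Simulated result set: stands in for 10,000 matching Solr documents.
    static final List<Integer> INDEX = new ArrayList<>();
    static { for (int i = 0; i < 10_000; i++) INDEX.add(i); }

    // Stand-in for one Solr request with ?start=...&rows=...
    static List<Integer> fetchPage(int start, int rows) {
        int end = Math.min(start + rows, INDEX.size());
        return start >= end ? List.of() : INDEX.subList(start, end);
    }

    public static void main(String[] args) {
        int pageSize = 1000;
        List<Integer> all = new ArrayList<>();
        // Increment start by the page size until a page comes back empty.
        for (int start = 0; ; start += pageSize) {
            List<Integer> page = fetchPage(start, pageSize);
            if (page.isEmpty()) break;
            all.addAll(page);
        }
        System.out.println(all.size());  // 10000
    }
}
```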



Re: paging size in SOLR

2011-08-10 Thread simon
Worth remembering there are some performance penalties with deep
paging, if you use the page-by-page approach. may not be too much of a
problem if you really are only looking to retrieve 10K docs.

-Simon

On Wed, Aug 10, 2011 at 10:32 AM, Erick Erickson
erickerick...@gmail.com wrote:
 Well, if you really want to you can specify start=0 and rows=10000 and
 get them all back at once.

 You can do page-by-page by incrementing the start parameter as you
 indicated.

 You can keep from re-executing the search by setting your queryResultCache
 appropriately, but this affects all searches so might be an issue.

 Best
 Erick

 On Wed, Aug 10, 2011 at 9:09 AM, jame vaalet jamevaa...@gmail.com wrote:
 hi,
 i want to retrieve all the data from solr (say 10,000 ids ) and my page size
 is 1000 .
 how do i get back the data (pages) one after other ?do i have to increment
 the start value each time by the page size from 0 and do the iteration ?
 In this case am i querying the index 10 time instead of one or after first
 query the result will be cached somewhere for the subsequent pages ?


 JAME VAALET




RE: paging size in SOLR

2011-08-10 Thread Jonathan Rochkind
I would imagine the performance penalties with deep paging will ALSO be there 
if you just ask for 10000 rows all at once though, instead of in, say, 100 row 
paged batches. Yes? No?
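A toy model of the cost behind this question (not Lucene's actual collector, and the sizes are illustrative): serving hits [start, start+rows) means scanning every match while maintaining a heap of size start+rows, so the deepest page of a paged crawl needs as large a heap as one big request — and the paged crawl repeats the full scan for every page before it:

```java
import java.util.PriorityQueue;
import java.util.Random;

public class DeepPagingCost {
    // Simulates the collector work for one request: to return hits
    // [start, start+rows) the engine keeps a min-heap of size start+rows
    // while scanning every matching doc. Returns the number of heap updates.
    static long collect(int numMatches, int start, int rows, long seed) {
        int k = start + rows;
        Random rnd = new Random(seed);
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        long updates = 0;
        for (int doc = 0; doc < numMatches; doc++) {
            int score = rnd.nextInt();
            if (heap.size() < k) {
                heap.offer(score);          // heap not yet full
                updates++;
            } else if (score > heap.peek()) {
                heap.poll();                // evict current minimum
                heap.offer(score);
                updates++;
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        int matches = 200_000;
        // One big request for the first 10,000 hits: a single scan.
        long single = collect(matches, 0, 10_000, 42);
        // Page-by-page at 1,000 rows: every page re-scans all matches.
        long paged = 0;
        for (int start = 0; start < 10_000; start += 1_000) {
            paged += collect(matches, start, 1_000, 42);
        }
        System.out.println(single + " vs " + paged);
    }
}
```

So: yes, the single large request pays the big-heap cost too, but only once; the paged crawl pays a collection pass per page on top of it.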

-Original Message-
From: simon [mailto:mtnes...@gmail.com] 
Sent: Wednesday, August 10, 2011 10:44 AM
To: solr-user@lucene.apache.org
Subject: Re: paging size in SOLR

Worth remembering there are some performance penalties with deep
paging, if you use the page-by-page approach. may not be too much of a
problem if you really are only looking to retrieve 10K docs.

-Simon

On Wed, Aug 10, 2011 at 10:32 AM, Erick Erickson
erickerick...@gmail.com wrote:
 Well, if you really want to you can specify start=0 and rows=10000 and
 get them all back at once.

 You can do page-by-page by incrementing the start parameter as you
 indicated.

 You can keep from re-executing the search by setting your queryResultCache
 appropriately, but this affects all searches so might be an issue.

 Best
 Erick

 On Wed, Aug 10, 2011 at 9:09 AM, jame vaalet jamevaa...@gmail.com wrote:
 hi,
 i want to retrieve all the data from solr (say 10,000 ids ) and my page size
 is 1000 .
 how do i get back the data (pages) one after other ?do i have to increment
 the start value each time by the page size from 0 and do the iteration ?
 In this case am i querying the index 10 time instead of one or after first
 query the result will be cached somewhere for the subsequent pages ?


 JAME VAALET




Building a facet query in SolrJ

2011-08-10 Thread Simon, Richard T
Hi - I'm trying to do a (I think) simple facet query, but I'm not getting the 
results I expect. I have a field, MyField, and I want to get facets for 
specific values of that field. That is, I want a FacetField if MyField is 
ABC, DEF, etc. (a specific list of values), but not if MyField is any other 
value.

If I build my query like this:

SolrQuery query = new SolrQuery( luceneQueryStr );
query.setStart( request.getStartIndex() );
query.setRows( request.getMaxResults() );
query.setFacet(true);
query.setFacetMinCount(1);

query.addFacetField(MYFIELD);

for (String fieldValue : desiredFieldValues) {
    query.addFacetQuery(MYFIELD + ":" + fieldValue);
}


queryResponse.getFacetFields returns facets for ALL values of MyField. I 
figured that was because setting the facet field with addFacetField caused Solr 
to examine all values. But, if I take out that line, then getFacetFields 
returns an empty list.

I'm sure I'm doing something simple wrong, but I'm out of ideas right now.

-Rich
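One thing worth noting (based on the SolrJ API, not on anything stated in the post): counts for facet.query entries come back through QueryResponse.getFacetQuery(), a map keyed by the query string, not through getFacetFields(). A sketch of building those query strings — MyField and the values are placeholders from the post:

```java
import java.util.ArrayList;
import java.util.List;

public class FacetQueryBuilder {
    // Builds the facet.query strings that SolrJ's addFacetQuery() would send.
    // Values are quoted so spaces or special characters don't break the query.
    static List<String> facetQueries(String field, List<String> values) {
        List<String> out = new ArrayList<>();
        for (String v : values) {
            out.add(field + ":\"" + v.replace("\"", "\\\"") + "\"");
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(facetQueries("MyField", List.of("ABC", "DEF")));
        // Counts come back per query string via queryResponse.getFacetQuery(),
        // e.g. under the key MyField:"ABC" -- not via getFacetFields().
    }
}
```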






Re: paging size in SOLR

2011-08-10 Thread jame vaalet
when you say queryResultCache, does it cache only the results of the last
query, or the results of more than one query?


On 10 August 2011 20:14, simon mtnes...@gmail.com wrote:

 Worth remembering there are some performance penalties with deep
 paging, if you use the page-by-page approach. may not be too much of a
 problem if you really are only looking to retrieve 10K docs.

 -Simon

 On Wed, Aug 10, 2011 at 10:32 AM, Erick Erickson
 erickerick...@gmail.com wrote:
  Well, if you really want to you can specify start=0 and rows=10000 and
  get them all back at once.
 
  You can do page-by-page by incrementing the start parameter as you
  indicated.
 
  You can keep from re-executing the search by setting your
 queryResultCache
  appropriately, but this affects all searches so might be an issue.
 
  Best
  Erick
 
  On Wed, Aug 10, 2011 at 9:09 AM, jame vaalet jamevaa...@gmail.com
 wrote:
  hi,
  i want to retrieve all the data from solr (say 10,000 ids ) and my page
 size
  is 1000 .
  how do i get back the data (pages) one after other ?do i have to
 increment
  the start value each time by the page size from 0 and do the iteration
 ?
  In this case am i querying the index 10 time instead of one or after
 first
  query the result will be cached somewhere for the subsequent pages ?
 
 
  JAME VAALET
 
 




-- 

-JAME
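The queryResultCache discussed above is configured in solrconfig.xml; a sketch with illustrative sizes (tune for your own query mix — these numbers are assumptions, not recommendations):

```xml
<!-- Caches ordered result windows for many distinct queries, keyed by
     query+sort, not just the last one. queryResultWindowSize controls how
     many doc ids are cached per query, so nearby pages can be served
     without re-executing the search. -->
<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="0"/>
<queryResultWindowSize>50</queryResultWindowSize>
<queryResultMaxDocsCached>200</queryResultMaxDocsCached>
```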


Re: Solr 3.3 crashes after ~18 hours?

2011-08-10 Thread alexander sulz

Okay, with this command it hangs.
Also: I managed to get a Thread Dump (attached).

regards

Am 05.08.2011 15:08, schrieb Yonik Seeley:

On Fri, Aug 5, 2011 at 7:33 AM, alexander sulza.s...@digiconcept.net  wrote:

Usually you get a XML-Response when doing commits or optimize, in this case
I get nothing
in return, but the site ( http://[...]/solr/update?optimize=true ) DOESN'T
load forever or anything.
It doesn't hang! I just get a blank page / empty response.

Sounds like you are doing it from a browser?
Can you try it from the command line?  It should give back some sort
of response (or hang waiting for a response).

curl "http://localhost:8983/solr/update?commit=true"

-Yonik
http://www.lucidimagination.com



I use the stuff in the example folder, the only changes i made was enable
logging and changing the port to 8985.
I'll try getting a thread dump if it happens again!
So far its looking good with having allocated more memory to it.

Am 04.08.2011 16:08, schrieb Yonik Seeley:

On Thu, Aug 4, 2011 at 8:09 AM, alexander sulza.s...@digiconcept.net
  wrote:

Thank you for the many replies!

Like I said, I couldn't find anything in logs created by solr.
I just had a look at the /var/logs/messages and there wasn't anything
either.

What I mean by crash is that the process is still there and http GET
pings
would return 200
but when i try visiting /solr/admin, I'd get a blank page! The server
ignores any incoming updates or commits,

ignores means what?  The request hangs?  If so, could you get a thread
dump?

Do queries work (like /solr/select?q=*:*) ?


thus throwing no errors, no 503's... It's like the server has a blackout
and
stares blankly into space.

Are you using a different servlet container than what is shipped with
solr?
If you did start with the solr example server, what jetty
configuration changes have you made?

-Yonik
http://www.lucidimagination.com




Full thread dump Java HotSpot(TM) Server VM (19.1-b02 mixed mode):

DestroyJavaVM prio=10 tid=0x6e32e800 nid=0x5aeb waiting on condition 
[0x]
   java.lang.Thread.State: RUNNABLE

Timer-2 daemon prio=10 tid=0x6e3ff800 nid=0x5b0b in Object.wait() [0x6e6e5000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0xb0260108 (a java.util.TaskQueue)
at java.util.TimerThread.mainLoop(Unknown Source)
- locked 0xb0260108 (a java.util.TaskQueue)
at java.util.TimerThread.run(Unknown Source)

pool-1-thread-1 prio=10 tid=0x6e32dc00 nid=0x5b0a waiting on condition 
[0x6dae]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0xb02680e8 (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(Unknown Source)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Unknown
 Source)
at java.util.concurrent.LinkedBlockingQueue.take(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.getTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

Timer-1 daemon prio=10 tid=0x0874e000 nid=0x5b07 in Object.wait() [0x6eb6d000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0xb02601c0 (a java.util.TaskQueue)
at java.util.TimerThread.mainLoop(Unknown Source)
- locked 0xb02601c0 (a java.util.TaskQueue)
at java.util.TimerThread.run(Unknown Source)

8106640@qtp-25094328-9 - Acceptor0 SocketConnector@0.0.0.0:8985 prio=10 
tid=0x0832dc00 nid=0x5b06 runnable [0x6ecc7000]
   java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(Unknown Source)
- locked 0xb0260288 (a java.net.SocksSocketImpl)
at java.net.ServerSocket.implAccept(Unknown Source)
at java.net.ServerSocket.accept(Unknown Source)
at org.mortbay.jetty.bio.SocketConnector.accept(SocketConnector.java:99)
at 
org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708)
at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

9097070@qtp-25094328-8 prio=10 tid=0x0832c400 nid=0x5b05 in Object.wait() 
[0x6ed18000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0xb0264018 (a 
org.mortbay.thread.QueuedThreadPool$PoolThread)
at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)
- locked 0xb0264018 (a org.mortbay.thread.QueuedThreadPool$PoolThread)

4098499@qtp-25094328-7 prio=10 tid=0x0832ac00 nid=0x5b04 in Object.wait() 
[0x6ed69000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at 

Error loading a custom request handler in Solr 4.0

2011-08-10 Thread Tom Mortimer
Hi,

Apologies if this is really basic. I'm trying to learn how to create a
custom request handler, so I wrote the minimal class (attached), compiled
and jar'd it, and placed it in example/lib. I added this to solrconfig.xml:

<requestHandler name="/flaxtest" class="FlaxTestHandler" />

When I started Solr with java -jar start.jar, I got this:

...
SEVERE: java.lang.NoClassDefFoundError:
org/apache/solr/handler/RequestHandlerBase
at java.lang.ClassLoader.defineClass1(Native Method)
...

So I copied all the dist/*.jar files into lib and tried again. This time it
seemed to start ok, but browsing to http://localhost:8983/solr/ displayed
this:

org.apache.solr.common.SolrException: Error Instantiating Request
Handler, FlaxTestHandler is not a org.apache.solr.request.SolrRequestHandler

at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:410) ...


Any ideas?

thanks,
Tom


RE: Building a facet query in SolrJ

2011-08-10 Thread Simon, Richard T
Oops. I think I found it. My desiredFieldValues list has the wrong info. Knew 
there was something simple wrong.

From: Simon, Richard T
Sent: Wednesday, August 10, 2011 10:55 AM
To: solr-user@lucene.apache.org
Cc: Simon, Richard T
Subject: Building a facet query in SolrJ

Hi - I'm trying to do a (I think) simple facet query, but I'm not getting the 
results I expect. I have a field, MyField, and I want to get facets for 
specific values of that field. That is, I want a FacetField if MyField is 
ABC, DEF, etc. (a specific list of values), but not if MyField is any other 
value.

If I build my query like this:

SolrQuery query = new SolrQuery( luceneQueryStr );
query.setStart( request.getStartIndex() );
query.setRows( request.getMaxResults() );
query.setFacet(true);
query.setFacetMinCount(1);

query.addFacetField(MYFIELD);

for (String fieldValue : desiredFieldValues) {
    query.addFacetQuery(MYFIELD + ":" + fieldValue);
}


queryResponse.getFacetFields returns facets for ALL values of MyField. I 
figured that was because setting the facet field with addFacetField caused Solr 
to examine all values. But, if I take out that line, then getFacetFields 
returns an empty list.

I'm sure I'm doing something simple wrong, but I'm out of ideas right now.

-Rich






Re: Solr 3.3 crashes after ~18 hours?

2011-08-10 Thread Yonik Seeley
On Wed, Aug 10, 2011 at 11:00 AM, alexander sulz a.s...@digiconcept.net wrote:
 Okay, with this command it hangs.

It doesn't look like a hang from this thread dump.  It doesn't look
like any solr requests are executing at the time the dump was taken.

Did you do this from the command line?
curl "http://localhost:8983/solr/update?commit=true"

Are you saying that the curl command just hung and never returned?

-Yonik
http://www.lucidimagination.com

 Also: I managed to get a Thread Dump (attached).

 regards

 Am 05.08.2011 15:08, schrieb Yonik Seeley:

 On Fri, Aug 5, 2011 at 7:33 AM, alexander sulza.s...@digiconcept.net
  wrote:

 Usually you get a XML-Response when doing commits or optimize, in this
 case
 I get nothing
 in return, but the site ( http://[...]/solr/update?optimize=true )
 DOESN'T
 load forever or anything.
 It doesn't hang! I just get a blank page / empty response.

 Sounds like you are doing it from a browser?
 Can you try it from the command line?  It should give back some sort
 of response (or hang waiting for a response).

 curl "http://localhost:8983/solr/update?commit=true"

 -Yonik
 http://www.lucidimagination.com


 I use the stuff in the example folder, the only changes i made was enable
 logging and changing the port to 8985.
 I'll try getting a thread dump if it happens again!
 So far its looking good with having allocated more memory to it.

 Am 04.08.2011 16:08, schrieb Yonik Seeley:

 On Thu, Aug 4, 2011 at 8:09 AM, alexander sulza.s...@digiconcept.net
  wrote:

 Thank you for the many replies!

 Like I said, I couldn't find anything in logs created by solr.
 I just had a look at the /var/logs/messages and there wasn't anything
 either.

 What I mean by crash is that the process is still there and http GET
 pings
 would return 200
 but when i try visiting /solr/admin, I'd get a blank page! The server
 ignores any incoming updates or commits,

 ignores means what?  The request hangs?  If so, could you get a thread
 dump?

 Do queries work (like /solr/select?q=*:*) ?

  thus throwing no errors, no 503's... It's like the server has a
 blackout
 and
 stares blankly into space.

 Are you using a different servlet container than what is shipped with
 solr?
 If you did start with the solr example server, what jetty
 configuration changes have you made?

 -Yonik
 http://www.lucidimagination.com





Re: Cache replication

2011-08-10 Thread didier deshommes
Consider putting a cache (memcached, redis, etc) *in front* of your
solr slaves. Just make sure to update it when replication occurs.

didier

On Tue, Aug 9, 2011 at 6:07 PM, arian487 akarb...@tagged.com wrote:
 I'm wondering if the caches on all the slaves are replicated across (such as
 queryResultCache).  That is to say, if I hit one of my slaves and cache a
 result, and I make a search later and that search happens to hit a different
 slave, will that first cached result be available for use?

 This is pretty important because I'm going to have a lot of slaves and if
 this isn't done, then I'd have a high chance of running a lot uncached
 queries.

 Thanks :)

 --
 View this message in context: 
 http://lucene.472066.n3.nabble.com/Cache-replication-tp3240708p3240708.html
 Sent from the Solr - User mailing list archive at Nabble.com.
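A minimal cache-aside sketch of didier's suggestion, using an in-process map as a stand-in for memcached/redis (the class and method names here are illustrative, not any real client API; the invalidation hook is where you would react to replication):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class CacheAside {
    // In-process map standing in for a shared cache in front of all Solr
    // slaves: a result cached after hitting one slave is reusable no matter
    // which slave would serve the next identical query.
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> backend;  // stand-in for a Solr query

    CacheAside(Function<String, String> backend) { this.backend = backend; }

    String search(String q) {
        // Cache-aside: consult the cache first, fall through to Solr on a miss.
        return cache.computeIfAbsent(q, backend);
    }

    // Call this when replication updates the slaves' index, so stale entries go away.
    void invalidateAll() { cache.clear(); }

    public static void main(String[] args) {
        int[] backendHits = {0};
        CacheAside c = new CacheAside(q -> { backendHits[0]++; return "results for " + q; });
        c.search("nokia");
        c.search("nokia");                   // served from cache, no backend call
        System.out.println(backendHits[0]);  // 1
        c.invalidateAll();                   // after replication
        c.search("nokia");
        System.out.println(backendHits[0]);  // 2
    }
}
```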



Dates off by 1 day?

2011-08-10 Thread Olson, Ron
Hi all-

I apologize in advance if this turns out to be a problem between the keyboard 
and the chair, but I'm confused about why my date field is correct in the 
index, but wrong in SolrJ.

I have a field defined as a date in the index:

<field name="FILE_DATE" type="date" indexed="true" stored="true"/>

And if I use the admin site to query the data, I get the right date:

<date name="FILE_DATE">2002-05-13T00:00:00Z</date>

But in my SolrJ code:

Iterator<SolrDocument> iter = queryResponse.getResults().iterator();

while (iter.hasNext())
{
    SolrDocument resultDoc = iter.next();

    System.out.println("-- " + resultDoc.getFieldValue("FILE_DATE"));
}

I get:

-- Sun May 12 19:00:00 CDT 2002

I've been searching around through the wiki and other places, but can't seem to 
find anything that either mentions this problem or talks about date handling in 
Solr/SolrJ that might refer to something like this.

Thanks for any info,

Ron



DISCLAIMER: This electronic message, including any attachments, files or 
documents, is intended only for the addressee and may contain CONFIDENTIAL, 
PROPRIETARY or LEGALLY PRIVILEGED information.  If you are not the intended 
recipient, you are hereby notified that any use, disclosure, copying or 
distribution of this message or any of the information included in or with it 
is  unauthorized and strictly prohibited.  If you have received this message in 
error, please notify the sender immediately by reply e-mail and permanently 
delete and destroy this message and its attachments, along with any copies 
thereof. This message does not create any contractual obligation on behalf of 
the sender or Law Bulletin Publishing Company.
Thank you.


Re: Dates off by 1 day?

2011-08-10 Thread Sethi, Parampreet

The Date difference is coming because of different time zones.

In Solr the date is stored as Zulu time zone and Solrj is returning date in
CDT timezone (jvm is picking system time zone.)

 <date name="FILE_DATE">2002-05-13T00:00:00Z</date>

 I get:
 
 -- Sun May 12 19:00:00 CDT 2002

You can convert Date in different time-zones using Java Util date functions
if required.

Hope it helps!

-param
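To make the offset concrete, here is a small self-contained sketch using java.time (SolrJ itself hands back a java.util.Date, whose toString() renders the same instant in the JVM's default zone):

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class DateDemo {
    // Formats the instant Solr stored, rendered in a given zone; this is what
    // Date.toString() does implicitly with the JVM's default zone.
    static String inZone(String isoUtc, String zone, String pattern) {
        return DateTimeFormatter.ofPattern(pattern, Locale.US)
                .withZone(ZoneId.of(zone))
                .format(Instant.parse(isoUtc));
    }

    public static void main(String[] args) {
        String stored = "2002-05-13T00:00:00Z";  // what the index holds (UTC)
        // Same instant, rendered in UTC and in US Central time:
        System.out.println(inZone(stored, "UTC", "yyyy-MM-dd'T'HH:mm:ss'Z'"));
        System.out.println(inZone(stored, "America/Chicago", "EEE MMM dd HH:mm:ss yyyy"));
        // The second line reads "Sun May 12 19:00:00 2002" -- the "off by one
        // day" is just the -5h CDT offset, not a wrong value in the index.
    }
}
```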
On 8/10/11 11:20 AM, Olson, Ron rol...@lbpc.com wrote:

 Hi all-
 
 I apologize in advance if this turns out to be a problem between the keyboard
 and the chair, but I'm confused about why my date field is correct in the
 index, but wrong in SolrJ.
 
 I have a field defined as a date in the index:
 
  <field name="FILE_DATE" type="date" indexed="true" stored="true"/>
 
 And if I use the admin site to query the data, I get the right date:
 
  <date name="FILE_DATE">2002-05-13T00:00:00Z</date>
 
 But in my SolrJ code:
 
  Iterator<SolrDocument> iter = queryResponse.getResults().iterator();
  
  while (iter.hasNext())
  {
  SolrDocument resultDoc = iter.next();
  
  System.out.println("-- " + resultDoc.getFieldValue("FILE_DATE"));
  
  }
 
 I get:
 
 -- Sun May 12 19:00:00 CDT 2002
 
 I've been searching around through the wiki and other places, but can't seem
 to find anything that either mentions this problem or talks about date
 handling in Solr/SolrJ that might refer to something like this.
 
 Thanks for any info,
 
 Ron
 
 
 



RE: Dates off by 1 day?

2011-08-10 Thread Olson, Ron
Ah, great! I knew the problem was between the keyboard and the chair. Thanks!

-Original Message-
From: Sethi, Parampreet [mailto:parampreet.se...@teamaol.com]
Sent: Wednesday, August 10, 2011 10:25 AM
To: solr-user@lucene.apache.org
Subject: Re: Dates off by 1 day?


The Date difference is coming because of different time zones.

In Solr the date is stored as Zulu time zone and Solrj is returning date in
CDT timezone (jvm is picking system time zone.)

 <date name="FILE_DATE">2002-05-13T00:00:00Z</date>

 I get:

 -- Sun May 12 19:00:00 CDT 2002

You can convert Date in different time-zones using Java Util date functions
if required.

Hope it helps!

-param
On 8/10/11 11:20 AM, Olson, Ron rol...@lbpc.com wrote:

 Hi all-

 I apologize in advance if this turns out to be a problem between the keyboard
 and the chair, but I'm confused about why my date field is correct in the
 index, but wrong in SolrJ.

 I have a field defined as a date in the index:

  <field name="FILE_DATE" type="date" indexed="true" stored="true"/>

 And if I use the admin site to query the data, I get the right date:

  <date name="FILE_DATE">2002-05-13T00:00:00Z</date>

 But in my SolrJ code:

  Iterator<SolrDocument> iter = queryResponse.getResults().iterator();
  
  while (iter.hasNext())
  {
  SolrDocument resultDoc = iter.next();
  
  System.out.println("-- " + resultDoc.getFieldValue("FILE_DATE"));
  
  }

 I get:

 -- Sun May 12 19:00:00 CDT 2002

 I've been searching around through the wiki and other places, but can't seem
 to find anything that either mentions this problem or talks about date
 handling in Solr/SolrJ that might refer to something like this.

 Thanks for any info,

 Ron








Re: Error loading a custom request handler in Solr 4.0

2011-08-10 Thread simon
The attachment isn't showing up (in gmail, at least). Can you inline
the relevant bits of code ?

On Wed, Aug 10, 2011 at 11:05 AM, Tom Mortimer t...@flax.co.uk wrote:
 Hi,
 Apologies if this is really basic. I'm trying to learn how to create a
 custom request handler, so I wrote the minimal class (attached), compiled
 and jar'd it, and placed it in example/lib. I added this to solrconfig.xml:
     <requestHandler name="/flaxtest" class="FlaxTestHandler" />
 When I started Solr with java -jar start.jar, I got this:
     ...
     SEVERE: java.lang.NoClassDefFoundError:
 org/apache/solr/handler/RequestHandlerBase
 at java.lang.ClassLoader.defineClass1(Native Method)
         ...
 So I copied all the dist/*.jar files into lib and tried again. This time it
 seemed to start ok, but browsing to http://localhost:8983/solr/ displayed
 this:
     org.apache.solr.common.SolrException: Error Instantiating Request
 Handler, FlaxTestHandler is not a org.apache.solr.request.SolrRequestHandler

   at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:410) ...

 Any ideas?
 thanks,
 Tom



Re: Error loading a custom request handler in Solr 4.0

2011-08-10 Thread Tom Mortimer
Sure -

import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.response.SolrQueryResponse;
import org.apache.solr.handler.RequestHandlerBase;

public class FlaxTestHandler extends RequestHandlerBase {

    public FlaxTestHandler() { }

    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
            throws Exception
    {
        rsp.add("FlaxTest", "Hello!");
    }

    public String getDescription() { return "Flax"; }
    public String getSourceId() { return "Flax"; }
    public String getSource() { return "Flax"; }
    public String getVersion() { return "Flax"; }
}



On 10 August 2011 16:43, simon mtnes...@gmail.com wrote:

  The attachment isn't showing up (in gmail, at least). Can you inline
 the relevant bits of code ?

 On Wed, Aug 10, 2011 at 11:05 AM, Tom Mortimer t...@flax.co.uk wrote:
  Hi,
  Apologies if this is really basic. I'm trying to learn how to create a
  custom request handler, so I wrote the minimal class (attached), compiled
  and jar'd it, and placed it in example/lib. I added this to
 solrconfig.xml:
  <requestHandler name="/flaxtest" class="FlaxTestHandler" />
  When I started Solr with java -jar start.jar, I got this:
  ...
  SEVERE: java.lang.NoClassDefFoundError:
  org/apache/solr/handler/RequestHandlerBase
  at java.lang.ClassLoader.defineClass1(Native Method)
  ...
  So I copied all the dist/*.jar files into lib and tried again. This time
 it
  seemed to start ok, but browsing to http://localhost:8983/solr/ displayed
  this:
  org.apache.solr.common.SolrException: Error Instantiating Request
  Handler, FlaxTestHandler is not a
 org.apache.solr.request.SolrRequestHandler
 
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:410)
 ...
 
  Any ideas?
  thanks,
  Tom
 



Re: how to ignore case in solr search field?

2011-08-10 Thread Tom Mortimer
You can use solr.LowerCaseFilterFactory in an analyser chain for both
indexing and queries. The schema.xml supplied with example has several field
types using this (including text_general).

Tom


On 10 August 2011 16:42, nagarjuna nagarjuna.avul...@gmail.com wrote:

  Hi, please help me ..
     how do I ignore case while searching in solr?


  ex: I need the same results for the keywords abc, ABC, aBc, AbC and all
  other cases.




 Thank u in advance

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/how-to-ignore-case-in-solr-search-field-tp3242967p3242967.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Is optimize needed on slaves if it replicates from optimized master?

2011-08-10 Thread Pranav Prakash
Very well explained. Thanks. Yes, we do optimize Index before replication. I
am not particularly worried about disk space usage. I was more curious of
that behavior.

*Pranav Prakash*

temet nosce

Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com |
Google http://www.google.com/profiles/pranny


On Wed, Aug 10, 2011 at 19:55, Erick Erickson erickerick...@gmail.comwrote:

 This is expected behavior. You might be optimizing
 your index on the master after every set of changes,
 in which case the entire index is copied. During this
 period, the space on disk will at least double, there's no
 way around that.

 If you do NOT optimize, then the slave will only copy changed
 segments instead of the entire index. Optimizing isn't
 usually necessary except periodically (daily, perhaps weekly,
 perhaps never actually).

 All that said, depending on how merging happens, you will always
 have the possibility of the entire index being copied sometimes
 because you'll happen to hit a merge that merges all segments
 into one.

 There are some advanced options that can control some parts
 of merging, but you need to get to the bottom of why the whole
 index is getting copied every time before you go there. I'd bet
 you're issuing an optimize.

 Best
 Erick

 On Wed, Aug 10, 2011 at 5:30 AM, Pranav Prakash pra...@gmail.com wrote:
  That is not true. Replication is roughly a copy of the diff between the
  master and the slave's index.
 
 
  In my case, during replication entire index is copied from master to
 slave,
  during which the size of index goes a little over double. Then it shrinks
 to
  its original size. Am I doing something wrong? How can I get the master
 to
  serve only delta index instead of serving whole index and the slaves
 merging
  the new and old index?
 
  *Pranav Prakash*
 



RE: [Help Wanted] Graphics and other help for new Lucene/Solr website

2011-08-10 Thread karl.wright
The site looks great.  And thank you for including the ManifoldCF link. ;-)

Karl

-Original Message-
From: ext Grant Ingersoll [mailto:gsing...@apache.org] 
Sent: Wednesday, August 10, 2011 10:09 AM
To: solr-user@lucene.apache.org; java-u...@lucene.apache.org
Subject: [Help Wanted] Graphics and other help for new Lucene/Solr website

Hi,

We are in the process of putting up a new Lucene/Solr/PyLucene/OpenRelevance 
website.  You can see a preview at http://lucene.staging.apache.org/lucene/.  
It is more or less a look and feel copy of Mahout and Open For Biz websites.  
This new site, IMO, both looks better than the old one and will be a lot easier 
for us committers to maintain/update and for others to contribute to.

So, how can you help?  

0.  All of the code is at https://svn.apache.org/repos/asf/lucene/cms/trunk.  
Check it out the usual way using SVN.  If you want to build locally, see 
https://issues.apache.org/jira/browse/LUCENE-2748 and the links to the ASF CMS 
guide.

1. If you have any graphic design skills:
- I'd love to have some mantle/slide images along the lines of 
http://lucene.staging.apache.org/lucene/images/mantle-lucene-solr.png.  These 
are used in the slideshow at the top of the Lucene, Core and Solr pages and 
should be interesting, inviting, etc. and should give people warm fuzzy 
feelings about all of our software and the great community we have.  (Think 
Marketing!)
- Help us coordinate the color selection on the various pages, 
especially in the slides and especially on the Solr page, as I'm not sure I 
like the green and black background contrasted with the orange of the Solr logo.

2. In a few more days or maybe a week or so, patches to fix content errors, 
etc. will be welcome.  For now, we are still porting things, so I don't want to 
duplicate effort.

3. New, useful documentation is also, of course, always welcome.

4. Test with your favorite browser.  In particular, I don't have IE handy.  
I've checked the site in Chrome, Firefox and Safari.

If you come up w/  images (I won't guarantee they will be accepted, but I am 
appreciative of the help) or other style fixes, etc., please submit all 
content/patches to https://issues.apache.org/jira/browse/LUCENE-2748 and please 
make sure to check the donation box when attaching the file. 

-Grant

 


Re: [Help Wanted] Graphics and other help for new Lucene/Solr website

2011-08-10 Thread Markus Jelsma
Looks nice! The font seems too light to read comfortably, though.



Re: Error loading a custom request handler in Solr 4.0

2011-08-10 Thread simon
It's working for me. Compiled, inserted in solr/lib, added the config
line to solrconfig.

When I send a /flaxtest request I get:

<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">16</int>
</lst>
<str name="FlaxTest">Hello!</str>
</response>

I was doing this within a core defined in solr.xml

-Simon

On Wed, Aug 10, 2011 at 11:46 AM, Tom Mortimer t...@flax.co.uk wrote:
 Sure -

 import org.apache.solr.request.SolrQueryRequest;
 import org.apache.solr.response.SolrQueryResponse;
 import org.apache.solr.handler.RequestHandlerBase;

 public class FlaxTestHandler extends RequestHandlerBase {

     public FlaxTestHandler() { }

     public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
         throws Exception
     {
         rsp.add("FlaxTest", "Hello!");
     }

     public String getDescription() { return "Flax"; }
     public String getSourceId() { return "Flax"; }
     public String getSource() { return "Flax"; }
     public String getVersion() { return "Flax"; }

 }



 On 10 August 2011 16:43, simon mtnes...@gmail.com wrote:

 Th attachment isn't showing up (in gmail, at least). Can you inline
 the relevant bits of code ?

 On Wed, Aug 10, 2011 at 11:05 AM, Tom Mortimer t...@flax.co.uk wrote:
  Hi,
  Apologies if this is really basic. I'm trying to learn how to create a
  custom request handler, so I wrote the minimal class (attached), compiled
  and jar'd it, and placed it in example/lib. I added this to
 solrconfig.xml:
      <requestHandler name="/flaxtest" class="FlaxTestHandler" />
  When I started Solr with java -jar start.jar, I got this:
      ...
      SEVERE: java.lang.NoClassDefFoundError:
  org/apache/solr/handler/RequestHandlerBase
  at java.lang.ClassLoader.defineClass1(Native Method)
          ...
  So I copied all the dist/*.jar files into lib and tried again. This time
 it
  seemed to start ok, but browsing to http://localhost:8983/solr/ displayed
  this:
      org.apache.solr.common.SolrException: Error Instantiating Request
  Handler, FlaxTestHandler is not a
 org.apache.solr.request.SolrRequestHandler
 
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:410)
 ...
 
  Any ideas?
  thanks,
  Tom
 




query time problem

2011-08-10 Thread Charles-Andre Martin
Hi,

 

I've noticed poor performance for my solr queries in the past few days.

 

Queries of that type :

 

http://server:5000/solr/select?q=story_search_field_en:(water boston) OR 
story_search_field_fr:(water boston)&rows=350&start=0&sort=r_modify_date 
desc&shards=shard1:5001/solr,shard2:5002/solr&fq=type:(cch_story OR 
cch_published_story)

 

Are slow (more than 10 seconds).
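As an aside, when reproducing a query like this outside a browser, the parameter values (spaces, parentheses, colons) need to be URL-encoded. A minimal pure-Java sketch of building such a URL, reusing the host and field names from the example above purely for illustration:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SolrUrl {
    // Join parameters with '&' and URL-encode each value so that spaces,
    // parentheses and colons survive the trip to the server.
    static String build(String base, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(base).append("/select?");
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) sb.append('&');
            first = false;
            sb.append(e.getKey()).append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "story_search_field_en:(water boston) OR story_search_field_fr:(water boston)");
        p.put("rows", "350");
        p.put("start", "0");
        p.put("sort", "r_modify_date desc");
        p.put("fq", "type:(cch_story OR cch_published_story)");
        System.out.println(build("http://server:5000/solr", p));
    }
}
```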

 

I would like to know how I could investigate the problem. I tried specifying 
the parameters debugQuery=on&explainOther=on but this doesn't 
help much.

 

I also monitored the shards log. Sometimes, there is broken pipe in the shards 
logs.

 

Also, is there a way I could monitor the cache statistics ? 

 

For your information, every shards master and slaves computers have enough RAM 
and disk space.

 

 

Charles-André Martin

 

 



Re: Error loading a custom request handler in Solr 4.0

2011-08-10 Thread Tom Mortimer
Interesting.. is this in trunk (4.0)? Maybe I've broken mine somehow!

What classpath did you use for compiling? And did you copy anything other
than the new jar into lib/ ?

thanks,
Tom





RE: Building a facet query in SolrJ

2011-08-10 Thread Simon, Richard T
I take it back. I didn't find it. I corrected my values and the facet queries 
still don't find what I want.

The values I'm looking for are URIs, so they look like: http://place.org/abc/def

I add the facet query like so:

query.addFacetQuery(MyField + ":" + "\"" + uri + "\"");


I print the query, just to see what it is:

Facet Query:  MyField:"http://place.org/abc/def"

But when I examine queryResponse.getFacetFields, it's an empty list, if I do 
not set the facet field. If I set the facet field to MyField, then I get facets 
for ALL the values of MyField, not just the ones in the facet queries.

Can anyone help here?

Thanks.


From: Simon, Richard T
Sent: Wednesday, August 10, 2011 11:07 AM
To: Simon, Richard T; solr-user@lucene.apache.org
Subject: RE: Building a facet query in SolrJ

Oops. I think I found it. My desiredFieldValues list has the wrong info. Knew 
there was something simple wrong.

From: Simon, Richard T
Sent: Wednesday, August 10, 2011 10:55 AM
To: solr-user@lucene.apache.org
Cc: Simon, Richard T
Subject: Building a facet query in SolrJ

Hi - I'm trying to do a (I think) simple facet query, but I'm not getting the 
results I expect. I have a field, MyField, and I want to get facets for 
specific values of that field. That is, I want a FacetField if MyField is 
"ABC", "DEF", etc. (a specific list of values), but not if MyField is any other 
value.

If I build my query like this:

SolrQuery query = new SolrQuery( luceneQueryStr );
query.setStart( request.getStartIndex() );
query.setRows( request.getMaxResults() );
query.setFacet(true);
query.setFacetMinCount(1);

query.addFacetField(MYFIELD);

for (String fieldValue : desiredFieldValues) {
    query.addFacetQuery(MYFIELD + ":" + fieldValue);
}


queryResponse.getFacetFields returns facets for ALL values of MyField. I 
figured that was because setting the facet field with addFacetField caused Solr 
to examine all values. But, if I take out that line, then getFacetFields 
returns an empty list.

I'm sure I'm doing something simple wrong, but I'm out of ideas right now.

-Rich






Re: Error loading a custom request handler in Solr 4.0

2011-08-10 Thread simon
This is in trunk (up to date). Compiler is 1.6.0_26

classpath was  
dist/apache-solr-solrj-4.0-SNAPSHOT.jar:dist/apache-solr-core-4.0-SNAPSHOT.jar
built from trunk just prior by 'ant dist'

I'd try again with a clean trunk.

-Simon




Re: query time problem

2011-08-10 Thread simon
Off the top of my head ...

Can you tell if GC is happening more frequently than usual/expected  ?

Is the index optimized - if not, how many segments ?

It's possible that one of the shards is behind a flaky network connection.

Is the 10s performance just for the Solr query or wallclock time at
the browser ?

You can monitor cache statistics from the admin console 'statistics' page

Are you seeing anything untoward in the solr logs ?

-Simon






How to start troubleshooting a content extraction issue

2011-08-10 Thread Tim AtLee
Hello

So, I'm a newbie to Solr and Tika and whatnot, so please use simple words
for me :P

I am running Solr on Tomcat 7 on Windows Server 2008 r2, running as the
search engine for a Drupal web site.

Up until recently, everything has been fine - searching works, faceting
works, etc.

Recently a user uploaded a 5mb xltm file, which seems to be causing Tomcat
to spike in CPU usage, and eventually error out.  When the documents are
submitted to be indexed, the tomcat process spikes up to use 100% of 1
available CPU, with the eventual error in Drupal of Exception occured
sending *sites/default/files/nodefiles/533/June 30, 2011.xltm* to Solr 0
Status: Communication Error.

I am looking for some help in figuring out where to troubleshoot this.  I
assume it's this file, but I guess I'd like to be sure - so how can I submit
this file for content extraction manually to see what happens?
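One way to try this by hand (a sketch, assuming the ExtractingRequestHandler is configured at the default /update/extract path and Solr is on the usual example port) is to POST the file with extractOnly=true, which runs Tika extraction and returns the extracted content without indexing anything:

```
curl "http://localhost:8983/solr/update/extract?extractOnly=true" \
     -F "myfile=@June 30, 2011.xltm"
```

If the request hangs or errors, that points at Tika choking on this particular file rather than at the Drupal integration.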

Thanks,

Tim


Re: Error loading a custom request handler in Solr 4.0

2011-08-10 Thread Tom Mortimer
Thanks Simon. I'll try again tomorrow.

Tom



Re: Building a facet query in SolrJ

2011-08-10 Thread Erik Hatcher
Try making your queries, manually, to see this closer in action... 
q=MyField:uri and see what you get.  In this case, because your URI contains 
characters that make the default query parser unhappy, do this sort of query 
instead:

{!term f=MyField}uri

That way the query is parsed properly into a single term query.
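In SolrJ terms, that facet query string could be built something like this (a pure-Java sketch; the field name and URI are just the ones from the example above, and the final addFacetQuery call is shown only as a comment):

```java
public class TermFacetQuery {
    // The {!term} query parser treats the raw value as a single term,
    // so special characters in URIs need no escaping.
    static String termFacetQuery(String field, String value) {
        return "{!term f=" + field + "}" + value;
    }

    public static void main(String[] args) {
        // In SolrJ this string would be passed to query.addFacetQuery(...)
        System.out.println(termFacetQuery("MyField", "http://place.org/abc/def"));
    }
}
```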

I am a little confused below since you're faceting on MyField entirely 
(addFacetField) where you'd get the values of each URI facet query in that list 
anyway.

Erik




RE: Building a facet query in SolrJ

2011-08-10 Thread Simon, Richard T
Hi -- I do get facets for all the values of MyField when I specify the facet 
field, but that's not what I want. I just want facets for a subset of the 
values of MyField. That's why I'm trying to use the facet queries, to just get 
facets for those values.


-Rich




RE: query time problem

2011-08-10 Thread Charles-Andre Martin
Thanks Simon for these tracks.

Here's my answers :

Can you tell if GC is happening more frequently than usual/expected  ?

GC is OK.

Is the index optimized - if not, how many segments ?

According to the statistics page from the admin :
One shard (master/slave) has 10 segments
The other shard (master/slave) has 13 segments

Is this ok ? The optimize job is running each day during the night.


It's possible that one of the shards is behind a flaky network connection.

Will check ...


Is the 10s performance just for the Solr query or wallclock time at
the browser ?

Both

You can monitor cache statistics from the admin console 'statistics' page

Thanks


Are you seeing anything untoward in the solr logs ?

I see this stack trace:

Aug 10, 2011 1:49:13 PM org.apache.solr.common.SolrException log
SEVERE: ClientAbortException:  java.net.SocketException: Broken pipe
at 
org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:358)
at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:325)
at 
org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:381)
at 
org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:370)
at 
org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:89)
at 
org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:183)
at 
org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:89)
at 
org.apache.solr.request.BinaryResponseWriter.write(BinaryResponseWriter.java:48)
at 
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:322)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:254)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at 
org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:740)
at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:434)
at org.apache.tomcat.util.buf.ByteChunk.append(ByteChunk.java:349)
at 
org.apache.coyote.http11.InternalOutputBuffer$OutputStreamOutputBuffer.doWrite(InternalOutputBuffer.java:764)
at 
org.apache.coyote.http11.filters.IdentityOutputFilter.doWrite(IdentityOutputFilter.java:127)
at 
org.apache.coyote.http11.InternalOutputBuffer.doWrite(InternalOutputBuffer.java:573)
at org.apache.coyote.Response.doWrite(Response.java:560)
at 
org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:353)
... 21 more

Charles-André Martin


800 Square Victoria
Montréal (Québec) H4Z 0A3
Tél : (514) 504-2703



Can't mix Synonyms with Shingles?

2011-08-10 Thread Jeff Wartes

I would like to combine the ShingleFilterFactory with a SynonymFilterFactory in 
a field type. 

I've looked at something like this using the analysis.jsp tool: 

<fieldType name="TestTerm" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" 
            generateNumberParts="1" stemEnglishPosessive="1"/>
    <filter class="solr.ShingleFilterFactory" tokenSeparator=""/>
    <filter class="solr.SynonymFilterFactory" 
            synonyms="synonyms.BusinessNames.txt" ignoreCase="true" expand="true"/>
    ...
  </analyzer>
  <analyzer type="query">
  ...
  </analyzer>
</fieldType>

However, when a ShingleFilterFactory is applied first, the SynonymFilterFactory 
appears to do nothing. 
I haven't found any documentation or other warnings against this combination, 
and I don't want to apply shingles after synonyms (this works) because 
multi-word synonyms then cause severe term expansion. I don't really mind if 
the synonyms fail to match shingles, (although I'd prefer they succeed) but I'd 
at least expect that synonyms would continue to match on the original tokens, 
as they do if I remove the ShingleFilterFactory.

I'm using Solr 3.3, any clarification would be appreciated.

Thanks,
  -Jeff Wartes



Re: Error loading a custom request handler in Solr 4.0

2011-08-10 Thread Chris Hostetter

: custom request handler, so I wrote the minimal class (attached), compiled
: and jar'd it, and placed it in example/lib. I added this to solrconfig.xml:

that's the crux of the issue.

example/lib is where the jetty libraries live -- not solr plugins.

you should either put your custom jars in the lib dir of your solr home 
(ie: example/solr/lib) or put them in a directory of your choice that you 
refer to from your solrconfig.xml file using a <lib/> directive.
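For reference, such a directive might look like the following (the dir path here is a hypothetical example; relative paths are resolved against the instance dir):

```xml
<!-- in solrconfig.xml: load every jar matching the regex from this directory -->
<lib dir="../../myplugins" regex=".*\.jar" />
```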

: So I copied all the dist/*.jar files into lib and tried again. This time it

ouch ... make sure you remove *all* of those, or you will have no end of 
random obscure classpath issues at random times as jars are sometimes 
loaded from the war and sometimes loaded from that directory.


-Hoss


RE: Can't mix Synonyms with Shingles?

2011-08-10 Thread Steven A Rowe
Hi Jeff,

You have configured ShingleFilterFactory with a token separator of "" (the empty 
string), so e.g. "International Corporation" will output the shingle 
"InternationalCorporation".  If this is the form you want to use for synonym 
matching, it must exist in your synonym file.  Does it?

Steve

 -Original Message-
 From: Jeff Wartes [mailto:jwar...@whitepages.com]
 Sent: Wednesday, August 10, 2011 3:43 PM
 To: solr-user@lucene.apache.org
 Subject: Can't mix Synonyms with Shingles?
 
 
 I would like to combine the ShingleFilterFactory with a
 SynonymFilterFactory in a field type.
 
 I've looked at something like this using the analysis.jsp tool:
 
 <fieldType name="TestTerm" class="solr.TextField"
 positionIncrementGap="100">
   <analyzer type="index">
 <tokenizer class="solr.WhitespaceTokenizerFactory"/>
 <filter class="solr.WordDelimiterFilterFactory"
 generateWordParts="1" generateNumberParts="1" stemEnglishPosessive="1"/>
 <filter class="solr.ShingleFilterFactory" tokenSeparator=""/>
 <filter class="solr.SynonymFilterFactory"
 synonyms="synonyms.BusinessNames.txt" ignoreCase="true" expand="true"/>
 ...
   </analyzer>
   <analyzer type="query">
   ...
   </analyzer>
 </fieldType>
 
 However, when a ShingleFilterFactory is applied first, the
 SynonymFilterFactory appears to do nothing.
 I haven't found any documentation or other warnings against this
 combination, and I don't want to apply shingles after synonyms (this
 works) because multi-word synonyms then cause severe term expansion. I
 don't really mind if the synonyms fail to match shingles, (although I'd
 prefer they succeed) but I'd at least expect that synonyms would continue
 to match on the original tokens, as they do if I remove the
 ShingleFilterFactory.
 
 I'm using Solr 3.3, any clarification would be appreciated.
 
 Thanks,
   -Jeff Wartes



RE: Building a facet query in SolrJ

2011-08-10 Thread Chris Hostetter

: query.addFacetQuery(MyField + ":" + "\"" + uri + "\"");
...
: But when I examine queryResponse.getFacetFields, it's an empty list, if 

facet.query constraints+counts do not come back in the "facet.field" 
section of the response.  they come back in the "facet.query" section of 
the response (look at the XML in your browser and you'll see what i 
mean)...

https://lucene.apache.org/solr/api/org/apache/solr/client/solrj/response/QueryResponse.html#getFacetQuery%28%29
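The relevant part of the response Hoss describes looks roughly like this (field name and count are illustrative):

```xml
<lst name="facet_counts">
  <lst name="facet_queries">
    <!-- the key is the literal query string passed to addFacetQuery() -->
    <int name="MyField:&quot;http://example.org/some-uri&quot;">7</int>
  </lst>
  <lst name="facet_fields"/>
</lst>
```

In SolrJ this surfaces via queryResponse.getFacetQuery(), which returns a Map<String,Integer> keyed by that same query string.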


-Hoss


Re: Example Solr Config on EC2

2011-08-10 Thread Matt Shields
If I were to build a master with multiple slaves, is it possible to promote
a slave to be the new master if the original master fails?  Will all the
slaves pickup right where they left off, or any time the master fails will
we need to completely regenerate all the data?

If this is possible, are there any examples of this being automated?
 Especially on Win2k3.

Matthew Shields
Owner
BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation,
Managed Services
www.beantownhost.com
www.sysadminvalley.com
www.jeeprally.com



On Mon, Aug 8, 2011 at 5:34 PM, mboh...@yahoo.com wrote:

 Matthew,

 Here's another resource:

 http://www.lucidimagination.com/blog/2010/02/01/solr-shines-through-the-cloud-lucidworks-solr-on-ec2/


 Michael Bohlig
 Lucid Imagination



 - Original Message 
 From: Matt Shields m...@mattshields.org
 To: solr-user@lucene.apache.org
 Sent: Mon, August 8, 2011 2:03:20 PM
 Subject: Example Solr Config on EC2

 I'm looking for some examples of how to setup Solr on EC2.  The
 configuration I'm looking for would have multiple nodes for redundancy.
 I've tested in-house with a single master and slave with replication
 running in Tomcat on Windows Server 2003, but even if I have multiple
 slaves
 the single master is a single point of failure.  Any suggestions or example
 configurations?  The project I'm working on is a .NET setup, so ideally I'd
 like to keep this search cluster on Windows Server, even though I prefer
 Linux.

 Matthew Shields
 Owner
 BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation,
 Managed Services
 www.beantownhost.com
 www.sysadminvalley.com
 www.jeeprally.com




Problem with xinclude in solrconfig.xml

2011-08-10 Thread Way Cool
Hi, Guys,

Based on the document below, I should be able to include a file under the
same directory by specifying relative path via xinclude in solrconfig.xml:
http://wiki.apache.org/solr/SolrConfigXml

However I am getting the following error when I use relative path (absolute
path works fine though):
SEVERE: org.xml.sax.SAXParseException: Error attempting to parse XML file

Any ideas?

Thanks,

YH


Re: Problem with xinclude in solrconfig.xml

2011-08-10 Thread Way Cool
Sorry for the spam. I just figured it out. Thanks.

On Wed, Aug 10, 2011 at 2:17 PM, Way Cool way1.wayc...@gmail.com wrote:

 Hi, Guys,

 Based on the document below, I should be able to include a file under the
 same directory by specifying relative path via xinclude in solrconfig.xml:
 http://wiki.apache.org/solr/SolrConfigXml

 However I am getting the following error when I use relative path (absolute
 path works fine though):
 SEVERE: org.xml.sax.SAXParseException: Error attempting to parse XML file

 Any ideas?

 Thanks,

 YH
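For anyone hitting the same error, a working XInclude in solrconfig.xml looks roughly like this (a sketch; file name is illustrative). Note the included file must itself be well-formed XML, and a relative href is only resolved against the including file when the parser is given a proper base URI, so an absolute path may be needed on some setups:

```xml
<config>
  <xi:include href="shared-handlers.xml"
              xmlns:xi="http://www.w3.org/2001/XInclude"/>
  ...
</config>
```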



RE: Can't mix Synonyms with Shingles?

2011-08-10 Thread Jeff Wartes

Hi Steven,

The token separator was certainly a deliberate choice; are you saying that 
after applying shingles, synonyms can only match shingled terms? The term 
analysis suggests the original tokens still exist. 
You've made me realize that only certain synonyms seem to have problems though, 
so it's not a blanket failure.

Take this synonym definition:
wamu, washington mutual bank, washington mutual

Indexing "wamu" looks like it'll work fine - there are no shingles, and all 
three synonym expansions appear to get indexed. (expand=true) However, 
indexing "washington mutual" applies the shingles correctly (adds 
"washingtonmutual" to position 1), but the synonym expansion does not happen. I 
would still expect the synonym definition to match the original terms and index 
'wamu' along with the other stuff.

Thanks.



-Original Message-
From: Steven A Rowe [mailto:sar...@syr.edu] 
Sent: Wednesday, August 10, 2011 12:54 PM
To: solr-user@lucene.apache.org
Subject: RE: Can't mix Synonyms with Shingles?

Hi Jeff,

You have configured ShingleFilterFactory with a token separator of "" (the empty 
string), so e.g. "International Corporation" will output the shingle 
"InternationalCorporation".  If this is the form you want to use for synonym 
matching, it must exist in your synonym file.  Does it?

Steve

 -Original Message-
 From: Jeff Wartes [mailto:jwar...@whitepages.com]
 Sent: Wednesday, August 10, 2011 3:43 PM
 To: solr-user@lucene.apache.org
 Subject: Can't mix Synonyms with Shingles?
 
 
 I would like to combine the ShingleFilterFactory with a 
 SynonymFilterFactory in a field type.
 
 I've looked at something like this using the analysis.jsp tool:
 
 <fieldType name="TestTerm" class="solr.TextField"
 positionIncrementGap="100">
   <analyzer type="index">
 <tokenizer class="solr.WhitespaceTokenizerFactory"/>
 <filter class="solr.WordDelimiterFilterFactory"
 generateWordParts="1" generateNumberParts="1" stemEnglishPosessive="1"/>
 <filter class="solr.ShingleFilterFactory" tokenSeparator=""/>
 <filter class="solr.SynonymFilterFactory"
 synonyms="synonyms.BusinessNames.txt" ignoreCase="true" expand="true"/>
 ...
   </analyzer>
   <analyzer type="query">
   ...
   </analyzer>
 </fieldType>
 
 However, when a ShingleFilterFactory is applied first, the 
 SynonymFilterFactory appears to do nothing.
 I haven't found any documentation or other warnings against this 
 combination, and I don't want to apply shingles after synonyms (this
 works) because multi-word synonyms then cause severe term expansion. I 
 don't really mind if the synonyms fail to match shingles, (although 
 I'd prefer they succeed) but I'd at least expect that synonyms would 
 continue to match on the original tokens, as they do if I remove the 
 ShingleFilterFactory.
 
 I'm using Solr 3.3, any clarification would be appreciated.
 
 Thanks,
   -Jeff Wartes



Solr 3.3: DIH configuration for Oracle

2011-08-10 Thread Eugeny Balakhonov
Hello, all!

 

I want to create a good DIH configuration for my Oracle database with delta
support. Unfortunately I am not able to do it well, as DIH has some strange
restrictions.

I will explain the problem with a simple example; in reality my database
has a much more complex structure.

 

Initial conditions: Two tables with following easy structure:

 

Table1

-  ID_RECORD(Primary key)

-  DATA_FIELD1

-  ..

-  DATA_FIELD2

-  LAST_CHANGE_TIME

Table2

-  ID_RECORD(Primary key)

-  PARENT_ID_RECORD (Foreign key to Table1.ID_RECORD) 

-  DATA_FIELD1

-  ..

-  DATA_FIELD2

-  LAST_CHANGE_TIME

 

For performance reasons it is necessary to select from both tables with a
single query (via an inner join).

 

My db-data-config.xml file:

 

<?xml version="1.0" encoding="UTF-8"?>

<dataConfig>

  <dataSource jndiName="jdbc/DB1" type="JdbcDataSource" user=""
              password=""/>

  <document>

    <entity name="ent" pk="T1_ID_RECORD, T2_ID_RECORD"

        query="select * from TABLE1 t1 inner join TABLE2 t2 on
               t1.ID_RECORD = t2.PARENT_ID_RECORD"

        deltaQuery="select t1.ID_RECORD T1_ID_RECORD, t1.ID_RECORD T2_ID_RECORD
                    from TABLE1 t1 inner join TABLE2 t2 on
                    t1.ID_RECORD = t2.PARENT_ID_RECORD
                    where TABLE1.LAST_CHANGE_TIME >
                    to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')
                    or TABLE2.LAST_CHANGE_TIME >
                    to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"

        deltaImportQuery="select * from TABLE1 t1 inner join TABLE2 t2
                          on t1.ID_RECORD = t2.PARENT_ID_RECORD
                          where t1.ID_RECORD = ${dataimporter.delta.T1_ID_RECORD} and
                          t2.ID_RECORD = ${dataimporter.delta.T2_ID_RECORD}"
    />

  </document>

</dataConfig>

 

As a result I get the following error:

 

java.lang.IllegalArgumentException: deltaQuery has no column to resolve to
declared primary key pk='T1_ID_RECORD, T2_ID_RECORD'

 

I have analyzed the DIH source code and found that the collectDelta() method in
the DocBuilder class treats the value of the entity's pk attribute as a simple
string. But in my case this is a list with two values: T1_ID_RECORD,
T2_ID_RECORD

 

What do I do wrong?

 

Thanks,

Eugeny
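One workaround often suggested for this limitation is to expose a synthetic single-column key, since collectDelta() expects one pk column. Below is a sketch only (untested against this schema; Oracle string concatenation assumed, and the `t1.*, t2.*` select list would need to avoid duplicate column names):

```xml
<entity name="ent" pk="SOLR_ID"
    query="select t1.ID_RECORD || '_' || t2.ID_RECORD as SOLR_ID, t1.*, t2.*
           from TABLE1 t1 inner join TABLE2 t2 on t1.ID_RECORD = t2.PARENT_ID_RECORD"
    deltaQuery="select t1.ID_RECORD || '_' || t2.ID_RECORD as SOLR_ID
                from TABLE1 t1 inner join TABLE2 t2 on t1.ID_RECORD = t2.PARENT_ID_RECORD
                where t1.LAST_CHANGE_TIME > to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')
                   or t2.LAST_CHANGE_TIME > to_date('${dataimporter.last_index_time}', 'YYYY-MM-DD HH24:MI:SS')"
    deltaImportQuery="select t1.ID_RECORD || '_' || t2.ID_RECORD as SOLR_ID, t1.*, t2.*
                      from TABLE1 t1 inner join TABLE2 t2 on t1.ID_RECORD = t2.PARENT_ID_RECORD
                      where t1.ID_RECORD || '_' || t2.ID_RECORD = '${dataimporter.delta.SOLR_ID}'"/>
```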

 



Increasing the highlight snippet size

2011-08-10 Thread Sang Yum
Hi,

I have been trying to increase the size of the highlight snippets using
hl.fragSize parameter, without much success. It seems that hl.fragSize is
not making any difference at all in terms of snippet size.

For example, compare the following two set of query/results:

http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29rows=1sort=id+ascfl=id%2cbookCode%2cnavPointId%2csectionTitlehl=truehl.fl=contenthl.snippets=100hl.fragSize=10hl.maxAnalyzedChars=-1version=2.2

</span><span id="w20422" class="werd">to</span><span id="w20423"
class="werd"><em>write</em></span><span id="w20424" class="werd">a

http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29rows=1sort=id+ascfl=id%2cbookCode%2cnavPointId%2csectionTitlehl=truehl.fl=contenthl.snippets=100hl.fragSize=1000hl.maxAnalyzedChars=-1version=2.2

</span><span id="w20422" class="werd">to</span><span id="w20423"
class="werd"><em>write</em></span><span id="w20424" class="werd">a

Because of our particular needs, the content has been spanified, each word
with its own span id. I do apply HTMLStrip during the index time.

What I would like to do is to increase the size of snippet so that the
highlighted snippets contain more surrounding words.

Although hl.fragSize went from 10 to 1000, the result is the same.
This leads me to believe that hl.fragSize might not be the correct parameter
to achieve the effect i am looking for. If so, what parameter should I use?

Thanks!


Re: Example Solr Config on EC2

2011-08-10 Thread Akshay
Yes, you can promote a slave to be master; refer to
http://wiki.apache.org/solr/SolrReplication#enable.2BAC8-disable_master.2BAC8-slave_in_a_node

In AWS one can use an elastic IP (http://aws.amazon.com/articles/1346) to
refer to the master, and this can be assigned to a slave as it assumes the
role of master (in case of failure). All slaves will then refer to this new
master and there will be no need to regenerate data.

Automation of this may be possible through CloudWatch alarm actions. I don't
know of any available example automation scripts.

Cheers
Akshay.

On Wed, Aug 10, 2011 at 9:08 PM, Matt Shields m...@mattshields.org wrote:

 If I were to build a master with multiple slaves, is it possible to promote
 a slave to be the new master if the original master fails?  Will all the
 slaves pickup right where they left off, or any time the master fails will
 we need to completely regenerate all the data?

 If this is possible, are there any examples of this being automated?
  Especially on Win2k3.

 Matthew Shields
 Owner
 BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation,
 Managed Services
 www.beantownhost.com
 www.sysadminvalley.com
 www.jeeprally.com



 On Mon, Aug 8, 2011 at 5:34 PM, mboh...@yahoo.com wrote:

  Matthew,
 
  Here's another resource:
 
 
 http://www.lucidimagination.com/blog/2010/02/01/solr-shines-through-the-cloud-lucidworks-solr-on-ec2/
 
 
  Michael Bohlig
  Lucid Imagination
 
 
 
  - Original Message 
  From: Matt Shields m...@mattshields.org
  To: solr-user@lucene.apache.org
  Sent: Mon, August 8, 2011 2:03:20 PM
  Subject: Example Solr Config on EC2
 
  I'm looking for some examples of how to setup Solr on EC2.  The
  configuration I'm looking for would have multiple nodes for redundancy.
  I've tested in-house with a single master and slave with replication
  running in Tomcat on Windows Server 2003, but even if I have multiple
  slaves
  the single master is a single point of failure.  Any suggestions or
 example
  configurations?  The project I'm working on is a .NET setup, so ideally
 I'd
  like to keep this search cluster on Windows Server, even though I prefer
  Linux.
 
  Matthew Shields
  Owner
  BeanTown Host - Web Hosting, Domain Names, Dedicated Servers, Colocation,
  Managed Services
  www.beantownhost.com
  www.sysadminvalley.com
  www.jeeprally.com
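The enable/disable mechanism behind the wiki link above is a solrconfig.xml pattern along these lines. This is a sketch only; the host name, poll interval, and the `enable.master`/`enable.slave` system-property names are illustrative of the wiki's approach, where every node runs the same config and its role is chosen at startup (e.g. -Denable.master=true):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="enable">${enable.master:false}</str>
    <str name="replicateAfter">commit</str>
  </lst>
  <lst name="slave">
    <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

Promotion then amounts to restarting the chosen slave with enable.master=true and repointing the elastic IP at it.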
 
 



Re: Increasing the highlight snippet size

2011-08-10 Thread simon
an hl.fragsize of 1000 is problematical, as Solr parses that
parameter as a 32 bit int... that's several bits more.

-Simon

On Wed, Aug 10, 2011 at 4:59 PM, Sang Yum sang...@gmail.com wrote:
 Hi,

 I have been trying to increase the size of the highlight snippets using
 hl.fragSize parameter, without much success. It seems that hl.fragSize is
 not making any difference at all in terms of snippet size.

 For example, compare the following two set of query/results:

 http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29rows=1sort=id+ascfl=id%2cbookCode%2cnavPointId%2csectionTitlehl=truehl.fl=contenthl.snippets=100hl.fragSize=10hl.maxAnalyzedChars=-1version=2.2

 </span><span id="w20422" class="werd">to</span><span id="w20423"
 class="werd"><em>write</em></span><span id="w20424" class="werd">a

 http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29rows=1sort=id+ascfl=id%2cbookCode%2cnavPointId%2csectionTitlehl=truehl.fl=contenthl.snippets=100hl.fragSize=1000hl.maxAnalyzedChars=-1version=2.2

 </span><span id="w20422" class="werd">to</span><span id="w20423"
 class="werd"><em>write</em></span><span id="w20424" class="werd">a

 Because of our particular needs, the content has been spanified, each word
 with its own span id. I do apply HTMLStrip during the index time.

 What I would like to do is to increase the size of snippet so that the
 highlighted snippets contain more surrounding words.

 Although hl.fragSize went from 10 to 1000, the result is the same.
 This leads me to believe that hl.fragSize might not be the correct parameter
 to achieve the effect i am looking for. If so, what parameter should I use?

 Thanks!



Re: Increasing the highlight snippet size

2011-08-10 Thread Sang Yum
I was just trying to set it to a ridiculously large number to make it work.
What I am seeing is that hl.fragsize doesn't seem to make any difference in
terms of highlight snippet size... I just tried the query with hl.fragsize
set to 1000. Same result as 10.

On Wed, Aug 10, 2011 at 2:20 PM, simon mtnes...@gmail.com wrote:

 an hl.fragsize of 1000 is problematical, as Solr parses that
 parameter as a 32 bit int... that's several bits more.

 -Simon

 On Wed, Aug 10, 2011 at 4:59 PM, Sang Yum sang...@gmail.com wrote:
  Hi,
 
  I have been trying to increase the size of the highlight snippets using
  hl.fragSize parameter, without much success. It seems that hl.fragSize
 is
  not making any difference at all in terms of snippet size.
 
  For example, compare the following two set of query/results:
 
 
 http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29rows=1sort=id+ascfl=id%2cbookCode%2cnavPointId%2csectionTitlehl=truehl.fl=contenthl.snippets=100hl.fragSize=10hl.maxAnalyzedChars=-1version=2.2
 
  </span><span id="w20422" class="werd">to</span><span id="w20423"
  class="werd"><em>write</em></span><span id="w20424" class="werd">a
 
 
 http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29rows=1sort=id+ascfl=id%2cbookCode%2cnavPointId%2csectionTitlehl=truehl.fl=contenthl.snippets=100hl.fragSize=1000hl.maxAnalyzedChars=-1version=2.2
 
  </span><span id="w20422" class="werd">to</span><span id="w20423"
  class="werd"><em>write</em></span><span id="w20424" class="werd">a
 
  Because of our particular needs, the content has been spanified, each
 word
  with its own span id. I do apply HTMLStrip during the index time.
 
  What I would like to do is to increase the size of snippet so that the
  highlighted snippets contain more surrounding words.
 
  Although hl.fragSize went from 10 to 1000, the result is the
 same.
  This leads me to believe that hl.fragSize might not be the correct
 parameter
  to achieve the effect i am looking for. If so, what parameter should I
 use?
 
  Thanks!
 




-- 
http://twitter.com/sangyum


Re: Cache replication

2011-08-10 Thread arian487
Thanks for the advice, Paul, but post-processing is a must for me given the
nature of my application.  I haven't had problems yet though.

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cache-replication-tp3240708p3244202.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Can't mix Synonyms with Shingles?

2011-08-10 Thread Jeff Wartes

After some further playing around, I think I understand what's going on. 
Because the SynonymFilterFactory pays attention to term position when it 
inserts a multi-word synonym, I had assumed it scanned for matches in a way 
that respected term position as well. (ie, for a two-word synonym, I assumed it 
would try to find the second word in position n+1 if it found the first word in 
position n) 

This does not appear to be the case. It appears to find multi-word synonym 
matches by simply walking the list of terms, exhausting all the terms in 
position one before looking at any terms in position two. The ShingleFilter 
adds terms to most positions, so that throws off the 'adjacency' of the 
flattened list of terms. Meaning, a two-word synonym can only match if the 
synonym consists of the original term (position 1) followed by the added 
shingle (also in position 1).
Perhaps a better description is: if you're looking at the analysis.jsp display, 
it does not scan for multi-word synonym tokens "across, then down"; it scans 
"down, then across".


It doesn't look like there's a way to do what I'm trying to do (index shingles 
AND multi-word synonyms in one field) without writing my own filter.
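For reference, the ordering described above as working (synonyms applied before shingles) would look roughly like this; the injected synonym terms then get shingled too, which is the source of the term expansion noted earlier:

```xml
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- synonyms first: multi-word synonyms see the original, adjacent tokens -->
  <filter class="solr.SynonymFilterFactory"
          synonyms="synonyms.BusinessNames.txt" ignoreCase="true" expand="true"/>
  <!-- shingles afterwards: these also shingle the injected synonym terms -->
  <filter class="solr.ShingleFilterFactory" tokenSeparator=""/>
</analyzer>
```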


-Original Message-
From: Jeff Wartes [mailto:jwar...@whitepages.com] 
Sent: Wednesday, August 10, 2011 1:27 PM
To: solr-user@lucene.apache.org
Subject: RE: Can't mix Synonyms with Shingles?


Hi Steven,

The token separator was certainly a deliberate choice; are you saying that 
after applying shingles, synonyms can only match shingled terms? The term 
analysis suggests the original tokens still exist. 
You've made me realize that only certain synonyms seem to have problems though, 
so it's not a blanket failure.

Take this synonym definition:
wamu, washington mutual bank, washington mutual

Indexing "wamu" looks like it'll work fine - there are no shingles, and all 
three synonym expansions appear to get indexed. (expand=true) However, 
indexing "washington mutual" applies the shingles correctly (adds 
"washingtonmutual" to position 1), but the synonym expansion does not happen. I 
would still expect the synonym definition to match the original terms and index 
'wamu' along with the other stuff.

Thanks.



-Original Message-
From: Steven A Rowe [mailto:sar...@syr.edu]
Sent: Wednesday, August 10, 2011 12:54 PM
To: solr-user@lucene.apache.org
Subject: RE: Can't mix Synonyms with Shingles?

Hi Jeff,

You have configured ShingleFilterFactory with a token separator of "" (the empty 
string), so e.g. "International Corporation" will output the shingle 
"InternationalCorporation".  If this is the form you want to use for synonym 
matching, it must exist in your synonym file.  Does it?

Steve

 -Original Message-
 From: Jeff Wartes [mailto:jwar...@whitepages.com]
 Sent: Wednesday, August 10, 2011 3:43 PM
 To: solr-user@lucene.apache.org
 Subject: Can't mix Synonyms with Shingles?
 
 
 I would like to combine the ShingleFilterFactory with a 
 SynonymFilterFactory in a field type.
 
 I've looked at something like this using the analysis.jsp tool:
 
 <fieldType name="TestTerm" class="solr.TextField"
 positionIncrementGap="100">
   <analyzer type="index">
 <tokenizer class="solr.WhitespaceTokenizerFactory"/>
 <filter class="solr.WordDelimiterFilterFactory"
 generateWordParts="1" generateNumberParts="1" stemEnglishPosessive="1"/>
 <filter class="solr.ShingleFilterFactory" tokenSeparator=""/>
 <filter class="solr.SynonymFilterFactory"
 synonyms="synonyms.BusinessNames.txt" ignoreCase="true" expand="true"/>
 ...
   </analyzer>
   <analyzer type="query">
   ...
   </analyzer>
 </fieldType>
 
 However, when a ShingleFilterFactory is applied first, the 
 SynonymFilterFactory appears to do nothing.
 I haven't found any documentation or other warnings against this 
 combination, and I don't want to apply shingles after synonyms (this
 works) because multi-word synonyms then cause severe term expansion. I 
 don't really mind if the synonyms fail to match shingles, (although 
 I'd prefer they succeed) but I'd at least expect that synonyms would 
 continue to match on the original tokens, as they do if I remove the 
 ShingleFilterFactory.
 
 I'm using Solr 3.3, any clarification would be appreciated.
 
 Thanks,
   -Jeff Wartes



Re: Increasing the highlight snippet size

2011-08-10 Thread Sang Yum
Well, only after I posted this question in a public forum, I found the cause
of my problem. I was using hl.fragSize, instead of hl.fragsize. After
correcting the case, it worked as expected.

Thanks.
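For anyone else hitting this: Solr request parameters are case-sensitive, so the lowercase form is required. A minimal illustration (parameter values are illustrative):

```
...&hl=true&hl.fl=content&hl.fragsize=500    honored: lowercase "fragsize"
...&hl=true&hl.fl=content&hl.fragSize=500    silently ignored: camel-case "fragSize"
```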

On Wed, Aug 10, 2011 at 3:19 PM, Sang Yum sang...@gmail.com wrote:

 I was just trying to set it a ridiculously large number to make it work.
 What I am seeing is that hl.fragsize doesn't seem to make any difference in
 term of highlight snippet size... I just tried the query with hl.fragsize
 set to 1000. Same result as 10.


 On Wed, Aug 10, 2011 at 2:20 PM, simon mtnes...@gmail.com wrote:

 an hl.fragsize of 1000 is problematical, as Solr parses that
 parameter as a 32 bit int... that's several bits more.

 -Simon

 On Wed, Aug 10, 2011 at 4:59 PM, Sang Yum sang...@gmail.com wrote:
  Hi,
 
  I have been trying to increase the size of the highlight snippets using
  hl.fragSize parameter, without much success. It seems that hl.fragSize
 is
  not making any difference at all in terms of snippet size.
 
  For example, compare the following two set of query/results:
 
 
 http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29rows=1sort=id+ascfl=id%2cbookCode%2cnavPointId%2csectionTitlehl=truehl.fl=contenthl.snippets=100hl.fragSize=10hl.maxAnalyzedChars=-1version=2.2
 
  </span><span id="w20422" class="werd">to</span><span id="w20423"
  class="werd"><em>write</em></span><span id="w20424" class="werd">a
 
 
 http://10.1.1.51:8983/solr/select?q=%28bookCode%3abarglewargle+AND+content%3awriting+AND+id:6970%29rows=1sort=id+ascfl=id%2cbookCode%2cnavPointId%2csectionTitlehl=truehl.fl=contenthl.snippets=100hl.fragSize=1000hl.maxAnalyzedChars=-1version=2.2
 
  </span><span id="w20422" class="werd">to</span><span id="w20423"
  class="werd"><em>write</em></span><span id="w20424" class="werd">a
 
  Because of our particular needs, the content has been spanified, each
 word
  with its own span id. I do apply HTMLStrip during the index time.
 
  What I would like to do is to increase the size of snippet so that the
  highlighted snippets contain more surrounding words.
 
  Although hl.fragSize went from 10 to 1000, the result is the
 same.
  This leads me to believe that hl.fragSize might not be the correct
 parameter
  to achieve the effect i am looking for. If so, what parameter should I
 use?
 
  Thanks!
 




 --
 http://twitter.com/sangyum




-- 
http://twitter.com/sangyum


Re: Can't mix Synonyms with Shingles?

2011-08-10 Thread Robert Muir
On Wed, Aug 10, 2011 at 7:10 PM, Jeff Wartes jwar...@whitepages.com wrote:

 After some further playing around, I think I understand what's going on. 
 Because the SynonymFilterFactory pays attention to term position when it 
 inserts a multi-word synonym, I had assumed it scanned for matches in a way 
 that respected term position as well. (ie, for a two-word synonym, I assumed 
 it would try to find the second word in position n+1 if it found the first 
 word in position n)

 This does not appear to be the case. It appears to find multi-word synonym 
 matches by simply walking the list of terms, exhausting all the terms in 
 position one before looking at any terms in position two.

this is correct: and i think it would cause some serious bad
performance otherwise: if you have a tokenstream like this: (A B C) (D
E F) (G H I) ..., and are matching multiword synonyms, it can
potentially explode at least in terms of cpu time and all the
state-saving/restoring/copying and stuff it would need to start
considering the tokenstream as more of a token-confusion-network, and
it gets worse if you think about position increments  1.

at least recently in svn, the limitation is documented:
http://svn.apache.org/repos/asf/lucene/dev/trunk/modules/analysis/common/src/java/org/apache/lucene/analysis/synonym/SynonymFilter.java

-- 
lucidimagination.com


Hudson build issues

2011-08-10 Thread arian487
Whenever I try to build this on our hudson server it says it can't find
org.apache.lucene:lucene-xercesImpl:jar:4.0-SNAPSHOT.  Is the Apache repo
lacking this artifact?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Hudson-build-issues-tp3244563p3244563.html
Sent from the Solr - User mailing list archive at Nabble.com.


LockObtainFailedException

2011-08-10 Thread Naveen Gupta
Hi,

We are doing streaming updates to Solr for multiple users.

We are getting:


Aug 10, 2011 11:56:55 AM org.apache.solr.common.SolrException log

SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed
out: NativeFSLock@/var/lib/solr/data/index/write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:84)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1097)
at
org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:83)
at
org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:102)
at
org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:174)
at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:222)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:147)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at org.apache.tomcat.util.net.JIoEndpoint

Aug 10, 2011 12:00:16 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed
out: NativeFSLock@/var/lib/solr/data/index/write.lock
at org.apache.lucene.store.Lock.obtain(Lock.java:84)
at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1097)
at
org.apache.solr.update.SolrIndexWriter.init(SolrIndexWriter.java:83)
at
org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:102)
at
org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:174)
at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:222)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
at
org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:147)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:662)


Re: Indexing tweet and searching @keyword OR #keyword

2011-08-10 Thread Mohammad Shariq
 Do you really want a search on ipad to *fail* to match input of #ipad? Or
 vice-versa?
My requirement is: I want to match both '#ipad' and 'ipad' for q='ipad',
BUT for q='#ipad' I want to match ONLY '#ipad', excluding 'ipad'.


On 10 August 2011 19:49, Erick Erickson erickerick...@gmail.com wrote:

 Please look more carefully at the documentation for WDDF,
 specifically:

 split on intra-word delimiters (all non alpha-numeric characters).

 WordDelimiterFilterFactory will always throw away non alpha-numeric
 characters; you can't tell it to do otherwise. Try some of the other
 tokenizers/analyzers to get what you want, and also look at the
 admin/analysis page to see what the exact effects are of your
 fieldType definitions.

 Here's a great place to start:
 http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

 You probably want something like WhitespaceTokenizerFactory
 followed by LowerCaseFilterFactory or some such...

 But I really question whether this is what you want either. Do you
 really want a search on ipad to *fail* to match input of #ipad? Or
 vice-versa?

 KeywordTokenizerFactory is probably not the place you want to start:
 its tokenization process doesn't break anything up. You happen to be
 getting separate tokens because of WDDF, which, as you see, can't
 process things the way you want.


 Best
 Erick
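
A minimal sketch of the fieldType Erick suggests: whitespace tokenization plus lowercasing, so '#ipad' and 'ipad' survive as distinct tokens. The field name and the exact filter list are assumptions, not something from this thread; verify the actual effect on the admin/analysis page:

```xml
<!-- Hypothetical fieldType: WhitespaceTokenizerFactory leaves # and @
     intact; LowerCaseFilterFactory normalizes case. There is no
     WordDelimiterFilterFactory, so '#ipad' and 'ipad' are indexed as
     different tokens. -->
<fieldType name="text_tweet" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With this chain, q=#ipad matches only '#ipad' and q=ipad matches only 'ipad'; if the bare term should also match hashtags, expand the query at search time (e.g. ipad OR #ipad).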

 On Wed, Aug 10, 2011 at 3:09 AM, Mohammad Shariq shariqn...@gmail.com
 wrote:
  I tried tweaking WordDelimiterFilterFactory, but it won't accept # or @
  symbols; they are ignored entirely.
  I need a solution, please suggest one.
 
  On 4 August 2011 21:08, Jonathan Rochkind rochk...@jhu.edu wrote:
 
  It's the WordDelimiterFactory in your filter chain that's removing the
  punctuation entirely from your index, I think.
 
  Read up on what the WordDelimiter filter does and what its settings are;
  decide how you want things to be tokenized in your index to get the
  behavior you want; either get WordDelimiter to do it that way by passing it
  different arguments, or stop using WordDelimiter; come back with any
  questions after trying that!
 
 
 
  On 8/4/2011 11:22 AM, Mohammad Shariq wrote:
 
  I have indexed around 1 million tweets (using the text dataType).
  When I search a tweet with # OR @ I don't get the exact result.
  e.g. when I search for #ipad OR @ipad I get results where ipad is
  mentioned, skipping the # and @.
  Please suggest how to tune this, or which filter factories to use, to get
  the desired result.
  I am indexing the tweets as text; below is the text fieldType from my
  schema.xml.
 
 
  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
              minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
              generateNumberParts="1" catenateWords="1" catenateNumbers="1"
              catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory"
              protected="protwords.txt" language="English"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt"
              minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
      <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
              generateNumberParts="1" catenateWords="1" catenateNumbers="1"
              catenateAll="0" splitOnCaseChange="1"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SnowballPorterFilterFactory"
              protected="protwords.txt" language="English"/>
    </analyzer>
  </fieldType>
 
 
 
 
  --
  Thanks and Regards
  Mohammad Shariq
 




-- 
Thanks and Regards
Mohammad Shariq