Re: Solr with Auto-suggest

2009-09-23 Thread dharhsana
Hi Ryan, I have gone through your post https://issues.apache.org/jira/browse/SOLR-357 where you mention the prefix filter. Can you tell me how to use that patch? You also mentioned using the code below: <fieldType name="prefix_full" class="solr.TextField" positionIncrementGap="1"> <analyzer type="index">

Highlighting not working on a prefix_token field

2009-09-23 Thread Avlesh Singh
I have a prefix_token field defined as follows in my schema.xml: <fieldType name="prefix_token" class="solr.TextField" positionIncrementGap="1"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter

Re: Highlighting not working on a prefix_token field

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 12:23 PM, Avlesh Singh avl...@gmail.com wrote: I have a prefix_token field defined as follows in my schema.xml: <fieldType name="prefix_token" class="solr.TextField" positionIncrementGap="1"> <analyzer type="index"> <tokenizer class="solr.WhitespaceTokenizerFactory"/>

Re: Highlighting not working on a prefix_token field

2009-09-23 Thread Avlesh Singh
Hmmm .. But ngrams with KeywordTokenizerFactory instead of the WhitespaceTokenizerFactory work just fine. Related issues? Cheers Avlesh On Wed, Sep 23, 2009 at 12:27 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Sep 23, 2009 at 12:23 PM, Avlesh Singh avl...@gmail.com
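
Editor's sketch: for readers unfamiliar with prefix-token fields, the Java below illustrates the general idea of lowercasing a value and emitting its leading substrings (edge n-grams). It assumes the truncated filter in the schema above is an edge n-gram filter, which the preview does not confirm. It is not Solr's implementation; the class name, method names and gram sizes are made up for illustration. It shows why the tokenizer choice matters: splitting on whitespace first yields prefixes of every word, while treating the whole value as one token (keyword style) yields prefixes of the full string only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: mimics the idea of lowercasing plus edge n-grams,
// not Solr's actual analysis chain.
class EdgeNGramSketch {

    // Returns the leading substrings of `token` with lengths minGram..maxGram.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        String lower = token.toLowerCase(Locale.ROOT);
        int limit = Math.min(maxGram, lower.length());
        for (int len = minGram; len <= limit; len++) {
            grams.add(lower.substring(0, len));
        }
        return grams;
    }

    // Whitespace-style analysis: split into words first, then take prefixes of each word.
    static List<String> perWordPrefixes(String value, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        for (String word : value.trim().split("\\s+")) {
            grams.addAll(edgeNGrams(word, minGram, maxGram));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prefixes of each word: [n, ne, new, y, yo, yor]
        System.out.println(perWordPrefixes("New York", 1, 3));
        // Keyword-style: prefixes of the whole value, spaces included
        System.out.println(edgeNGrams("New York", 1, 5));
    }
}
```

With per-word prefixes a stored value of several words is indexed as many short tokens, which may be relevant to how a highlighter maps matches back to the original text; the thread itself does not settle the cause.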

Re: Solr with Auto-suggest

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 11:30 AM, dharhsana rekha.dharsh...@gmail.com wrote: Hi Ryan, I have gone through your post https://issues.apache.org/jira/browse/SOLR-357 where you mention the prefix filter. Can you tell me how to use that patch? You also mentioned using the code below: <fieldType

Re: Highlighting not working on a prefix_token field

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 12:31 PM, Avlesh Singh avl...@gmail.com wrote: Hmmm .. But ngrams with KeywordTokenizerFactory instead of the WhitespaceTokenizerFactory work just fine. Related issues? I'm sorry I don't understand the question. Do you mean to say that highlighting works with one

Re: Highlighting not working on a prefix_token field

2009-09-23 Thread Avlesh Singh
I'm sorry I don't understand the question. Do you mean to say that highlighting works with one but not with another? Yes. Cheers Avlesh On Wed, Sep 23, 2009 at 12:59 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Sep 23, 2009 at 12:31 PM, Avlesh Singh avl...@gmail.com

Finding near duplicates which searching Documents

2009-09-23 Thread Ninad Raut
Hi, When we have news content crawled we face the problem of the same content being repeated in many documents. We want to add a near-duplicate document filter to detect such documents. Is there a way to do that in SOLR? Regards, Ninad Raut.

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut hbase.user.ni...@gmail.com wrote: Hi, When we have news content crawled we face the problem of the same content being repeated in many documents. We want to add a near-duplicate document filter to detect such documents. Is there a way to do that in SOLR?

Phrase stopwords

2009-09-23 Thread Pooja Verlani
Hi, Is it possible to have a phrase as a stopword in Solr? If so, please share how to do it. Regards, Pooja

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Ninad Raut
Is this feature included in SOLR 1.4?? On Wed, Sep 23, 2009 at 3:29 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut hbase.user.ni...@gmail.com wrote: Hi, When we have news content crawled we face the problem of the same content being

RE: Oracle incomplete DataImport results

2009-09-23 Thread Daniel Bradley
After investigating the log files, the DataImporter was throwing an error from the Oracle DB driver: java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 2890, maximum: 2000). That is, there was a problem with the 551st item where a related item had

Re: Finding near duplicates which searching Documents

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 3:50 PM, Ninad Raut hbase.user.ni...@gmail.com wrote: Is this feature included in SOLR 1.4?? Yep. -- Regards, Shalin Shekhar Mangar.
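
Editor's sketch: the replies in this thread are truncated, so the feature being confirmed for 1.4 is not visible here. Purely as background on what "near duplicate" detection can mean, the Java below compares two texts by the overlap of their word shingles (Jaccard similarity). This is a generic technique chosen for illustration, not a description of what Solr does; the class name, method names and shingle size are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Generic near-duplicate scoring with word shingles; not Solr's mechanism.
class NearDuplicateSketch {

    // Builds the set of all runs of `size` consecutive lowercase words.
    static Set<String> shingles(String text, int size) {
        String[] words = text.toLowerCase(Locale.ROOT).trim().split("\\s+");
        Set<String> result = new HashSet<>();
        for (int i = 0; i + size <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + size)));
        }
        return result;
    }

    // Jaccard similarity: |A intersect B| / |A union B|, in the range 0..1.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // two empty texts are treated as identical
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        int unionSize = a.size() + b.size() - common.size();
        return (double) common.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<String> first = shingles("storm hits the coast on monday", 2);
        Set<String> second = shingles("storm hits the coast on tuesday", 2);
        System.out.println(jaccard(first, second));
    }
}
```

A crawler could flag a pair of articles as near duplicates when the score exceeds some threshold; the right threshold depends on the content and is not suggested by the thread.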

Re: Oracle incomplete DataImport results

2009-09-23 Thread Shalin Shekhar Mangar
On Wed, Sep 23, 2009 at 3:53 PM, Daniel Bradley daniel.brad...@adfero.co.uk wrote: After investigating the log files, the DataImporter was throwing an error from the Oracle DB driver: java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual:

Exact match

2009-09-23 Thread bhaskar chandrasekar
Hi, I am doing an exact search in Solr. In the Solr admin page I enter the search input string. For example, when I give “channeL12” as the search input string on the Solr home page, it displays search results as doc str name=urlhttp://rediff/field str name=titlefirst/field str

Re: Exact match

2009-09-23 Thread AHMET ARSLAN
Hi, I am doing an exact search in Solr. In the Solr admin page I enter the search input string. For example, when I give “channeL12” as the search input string on the Solr home page, it displays search results as doc str name=urlhttp://rediff/field str name=titlefirst/field str

about url field error

2009-09-23 Thread net_nav
Hello guys, I am a newbie to Solr. I have Solr running on Tomcat 6 and all is OK, but when I add data to the Solr server via HTTP POST it causes an error. Below is the code: SolrInputDocument solrdoc = new SolrInputDocument(); solrdoc.addField("url", request.getParameter("URL")); 2009-9-23 21:18:03

Re: Phrase stopwords

2009-09-23 Thread AHMET ARSLAN
From: Pooja Verlani pooja.verl...@gmail.com Subject: Phrase stopwords To: solr-user@lucene.apache.org Date: Wednesday, September 23, 2009, 1:15 PM Hi, Is it possible to have a phrase as a stopword in Solr? If so, please share how to do it. Regards, Pooja I think that can be

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
For 8-CPU load-stress testing of Tomcat you are probably making a mistake: - you should execute the load-stress software and wait 5-30 minutes (depending on index size) BEFORE taking measurements. 1. JVM HotSpot needs to compile everything into native code 2. Tomcat Thread Pool needs to warm up 3. SOLR
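
Editor's sketch: the advice above (let the JVM and thread pool warm up before measuring) can be illustrated with a small Java harness that runs a request a number of times untimed and only then starts timing. The class and method names are hypothetical, the `Runnable` stands in for a real Solr query, and a run count is used here in place of the 5-30 minute wait suggested in the message.

```java
// Minimal warm-up-then-measure harness; the request itself is a stand-in.
class WarmupBenchmark {

    // Runs `request` warmupRuns times without timing, then measuredRuns times
    // with timing, and returns the mean latency of the timed runs in milliseconds.
    static double averageLatencyMillis(Runnable request, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            request.run(); // results discarded: gives the JIT and caches time to settle
        }
        long totalNanos = 0;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            request.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / measuredRuns;
    }

    public static void main(String[] args) {
        // Placeholder workload; a real test would issue an HTTP query here.
        Runnable fakeQuery = () -> Math.sqrt(System.nanoTime());
        System.out.println(averageLatencyMillis(fakeQuery, 10_000, 1_000) + " ms");
    }
}
```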

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
I'm using a Solr 1.4 nightly from around July. Is that recent enough to have the improved reader implementation? I'm not sure whether you'd call my operations IO heavy -- each query has so many terms (~50) that even against a 45K document index a query takes 130ms, but the entire index is in a

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
I have 0-15ms for 50M (million) docs, Tomcat, 8-CPU: http://www.tokenizer.org == - something is obviously wrong in your case, 130ms is too high. Is it a dedicated server? Disk swapping? Etc. -Original Message- From: Michael [mailto:solrco...@gmail.com]

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
Hi Fuad, thanks for the reply. My queries are heavy enough that the difference in performance is obvious. I am using a home-grown load testing script that sends 1000 realistic queries to the server and takes the average response time. My index is on a ramfs which I've shown makes the QR and doc
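
Editor's sketch: a load script of the kind described (send many queries, report the average response time) might look roughly like the Java below when the queries are issued in parallel. Michael's actual script is not shown in the thread, so everything here is an assumption: the class and method names, the use of a fixed thread pool, and the `Runnable` that stands in for an HTTP request to Solr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sends totalRequests requests using `concurrency` worker threads and averages the latencies.
class ParallelLoadSketch {

    static double averageLatencyMillis(Runnable request, int concurrency, int totalRequests) {
        if (concurrency <= 0 || totalRequests <= 0) {
            throw new IllegalArgumentException("concurrency and totalRequests must be positive");
        }
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < totalRequests; i++) {
                tasks.add(() -> {
                    long start = System.nanoTime();
                    request.run();
                    return System.nanoTime() - start; // latency of this one request
                });
            }
            long totalNanos = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                totalNanos += result.get();
            }
            return totalNanos / 1_000_000.0 / totalRequests;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for requests", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a request failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Runnable fakeQuery = () -> Math.sqrt(System.nanoTime()); // placeholder for a real query
        System.out.println(averageLatencyMillis(fakeQuery, 8, 1000) + " ms");
    }
}
```

Running the same workload with `concurrency` set to 1 and then 8 gives the kind of side-by-side numbers the thread is arguing about.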

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
Correction: 0 - 150ms (depends on size of query results; 150ms for non-cached (new) queries returning more than 50K docs). -Original Message- From: Fuad Efendi [mailto:f...@efendi.ca] Sent: September-23-09 11:26 AM To: solr-user@lucene.apache.org Subject: RE: Parallel requests to

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
On Wed, Sep 23, 2009 at 11:26 AM, Fuad Efendi f...@efendi.ca wrote: - something obviously wrong in your case, 130ms is too high. Is it dedicated server? Disk swapping? Etc. It's that my queries are ridiculously complex. My users are very familiar with boolean searching, and I'm doing a lot

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
8 queries against 1 Tomcat average 600ms per query, while 8 queries against 8 Tomcats average 190ms per query (on a dedicated 8 CPU server w 32G RAM). I don't see how to interpret these numbers except that Tomcat is not multithreading as well as it should :) Hi Michael, I think it is very

Re: Parallel requests to Tomcat

2009-09-23 Thread Yonik Seeley
On Wed, Sep 23, 2009 at 11:17 AM, Michael solrco...@gmail.com wrote: I'm using a Solr 1.4 nightly from around July.  Is that recent enough to have the improved reader implementation? I'm not sure whether you'd call my operations IO heavy -- each query has so many terms (~50) that even against

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
Hi Fuad, On Wed, Sep 23, 2009 at 11:37 AM, Fuad Efendi f...@efendi.ca wrote: 8 queries against 1 Tomcat average 600ms per query, while 8 queries against 8 Tomcats average 190ms per query (on a dedicated 8 CPU server w 32G RAM). I don't see how to interpret these numbers except that

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
Hi Yonik, On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley yo...@lucidimagination.com wrote: This could well be IO bound - lots of seeks and reads. If this were IO bound, wouldn't I see the same results when sending my 8 requests to 8 Tomcats? There's only one disk (well, RAM) whether I'm

Re: Parallel requests to Tomcat

2009-09-23 Thread Walter Underwood
This sure seems like a good time to try LucidGaze for Solr. That would give some Solr-specific profiling data. http://www.lucidimagination.com/Downloads/LucidGaze-for-Solr wunder On Sep 23, 2009, at 8:47 AM, Michael wrote: Hi Yonik, On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
Thanks for the suggestion, Walter! I've been using Gaze 1.0 for a while now, but when I moved to a multicore approach (which was the impetus behind all of this testing) Gaze failed to start and I had to comment it out of solrconfig.xml to get Solr to start. Are you aware whether Gaze is able to

Re: Parallel requests to Tomcat

2009-09-23 Thread Yonik Seeley
On Wed, Sep 23, 2009 at 11:47 AM, Michael solrco...@gmail.com wrote: Hi Yonik, On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley yo...@lucidimagination.com wrote: This could well be IO bound - lots of seeks and reads. If this were IO bound, wouldn't I see the same results when sending my 8

Re: Parallel requests to Tomcat

2009-09-23 Thread Michael
On Wed, Sep 23, 2009 at 12:05 PM, Yonik Seeley yo...@lucidimagination.com wrote: On Wed, Sep 23, 2009 at 11:47 AM, Michael solrco...@gmail.com wrote: If this were IO bound, wouldn't I see the same results when sending my 8 requests to 8 Tomcats? There's only one disk (well, RAM) whether I'm

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
8 threads sharing something may have *some* overhead versus 8 processes, but as you say, 410ms overhead points to a different problem. - You have baseline (single-threaded load-stress script sending requests to SOLR) (1-request-in-parallel, 8 requests to 8 Tomcats); 200ms looks extremely

RE: Parallel requests to Tomcat

2009-09-23 Thread Fuad Efendi
I'm not sure whether you'd call my operations IO heavy -- each query has so many terms (~50) that even against a 45K document index a query takes 130ms, but the entire index is in a ramfs. The more terms, the longer it takes to find docset intersections (belonging to each term); something in
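
Editor's sketch: to make the point about docset intersections concrete, the Java below intersects per-term lists of matching document IDs. It assumes each list is sorted ascending and free of duplicates, and it is only an illustration of why the work grows with the number of terms. Lucene and Solr use their own doc set structures, which this does not reproduce; all names are hypothetical.

```java
import java.util.Arrays;

// Toy conjunction of per-term doc ID lists; not Lucene/Solr's implementation.
class DocSetIntersection {

    // Two-pointer merge over two ascending, duplicate-free arrays of doc IDs.
    static int[] intersect(int[] a, int[] b) {
        int[] out = new int[Math.min(a.length, b.length)];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {
                i++;
            } else if (a[i] > b[j]) {
                j++;
            } else {
                out[n++] = a[i];
                i++;
                j++;
            }
        }
        return Arrays.copyOf(out, n);
    }

    // Intersects the lists of all terms, shortest list first so the
    // running result shrinks as early as possible.
    static int[] intersectAll(int[][] postings) {
        if (postings.length == 0) {
            return new int[0];
        }
        int[][] sorted = postings.clone();
        Arrays.sort(sorted, (x, y) -> Integer.compare(x.length, y.length));
        int[] result = sorted[0];
        for (int k = 1; k < sorted.length && result.length > 0; k++) {
            result = intersect(result, sorted[k]);
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] postings = { {1, 3, 5, 7, 9}, {3, 5, 9}, {2, 3, 9, 10} };
        System.out.println(Arrays.toString(intersectAll(postings))); // [3, 9]
    }
}
```

Each additional required term adds another pass of this kind, so a 50-term boolean query does noticeably more work per request than a 2-term one even when everything is in memory.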

Re: Solr via ruby

2009-09-23 Thread Ian Connor
Hi, Thanks for the discussion. We use the distributed option so I am not sure embedded is possible. As you also guessed, we use haproxy for load balancing and failover between replicas of the shards so giving this up for a minor performance boost is probably not wise. So essentially we have:

Multiple DisMax Queries spanning across multiple fields

2009-09-23 Thread Kay Kay
For a particular requirement we have - we need to do a query that is a combination of multiple dismax queries behind the scenes. (Using solr 1.4 nightly ). The DisMaxQParser org.apache.solr.search.DisMaxQParser ( details at - http://wiki.apache.org/solr/DisMaxRequestHandler ) takes in the

ReversedWildcardFilterFactory (SOLR-1321) and KeywordTokenizerFactory

2009-09-23 Thread Ravi Kiran
Hello, Can ReversedWildcardFilterFactory be used with KeywordTokenizerFactory ? I get the following error, looks like solr expects WhitespaceTokenizerFactory...Can anybody suggest how to rectify it. My schema snippet is also given below. Data is extracted via OpenNLP and indexed into

Very big numbers

2009-09-23 Thread Jonathan Ariel
Hi! I need to index very big numbers in Solr, something like 99,999,999,999,999.99. Right now I'm using an sdouble field type because I need to make range queries on this field. The problem is that the field value is being returned in scientific notation. Is there any way to avoid that? Thanks!
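
Editor's sketch: no reply to this question appears in this part of the archive. One client-side workaround, offered here only as an assumption about what might be acceptable, is to leave the field as it is and reformat the returned string before display. The Java below uses `java.math.BigDecimal.toPlainString()` for that; the class and method names are hypothetical.

```java
import java.math.BigDecimal;

// Converts a number rendered in scientific notation into plain decimal text.
class PlainNumberFormat {

    // `solrValue` is the field value as text, e.g. "9.999999999999999E13".
    static String toPlain(String solrValue) {
        return new BigDecimal(solrValue).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(toPlain("9.999999999999999E13")); // 99999999999999.99
        System.out.println(toPlain("1.0E7"));                // 10000000
    }
}
```

This only changes presentation. A double holds roughly 15 to 17 significant decimal digits, so a value with 16 digits such as the example may not be stored exactly in the first place.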

Re: Solr http post performance seems slow - help?

2009-09-23 Thread Dan A. Dickey
On Friday 11 September 2009 11:06:20 am Dan A. Dickey wrote: ... Our JBoss expert and I will be looking into why this might be occurring. Does anyone know of any JBoss related slowness with Solr? And does anyone have any other sort of suggestions to speed indexing performance? Thanks for

java doc error local params syntax for dismax

2009-09-23 Thread Naomi Dushay
The javadoc for DisMaxQParserPlugin states: {!dismax qf=myfield,mytitle^2}foo creates a dismax query, but actually that gives an error. The correct syntax is {!dismax qf="myfield mytitle^2"}foo (could use single quotes instead of double quotes). - Naomi
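
Editor's sketch: since the qf value has to be quoted when it lists several fields separated by spaces, a client that assembles the query string programmatically might do it along these lines. The Java below only builds and URL-encodes the `q` parameter text; it does not call Solr, and the class and method names are made up for illustration. It also assumes field names contain no quote characters.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Builds a dismax local-params query string with a quoted, space-separated qf value.
class DismaxLocalParams {

    // Example result: {!dismax qf="myfield mytitle^2"}foo
    static String dismaxQuery(String userQuery, String... queryFields) {
        return "{!dismax qf=\"" + String.join(" ", queryFields) + "\"}" + userQuery;
    }

    // Encodes the query for use as the q parameter of a request URL.
    static String asUrlParam(String query) {
        return "q=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String q = dismaxQuery("foo", "myfield", "mytitle^2");
        System.out.println(q);
        System.out.println(asUrlParam(q));
    }
}
```

`URLEncoder.encode(String, Charset)` requires Java 10 or later; on older JVMs the overload taking a charset name can be used instead.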

Re: java doc error local params syntax for dismax

2009-09-23 Thread Yonik Seeley
On Wed, Sep 23, 2009 at 5:59 PM, Naomi Dushay ndus...@stanford.edu wrote: The javadoc for DisMaxQParserPlugin states: {!dismax qf=myfield,mytitle^2}foo creates a dismax query but actually, that gives an error. The correct syntax is {!dismax qf="myfield mytitle^2"}foo (could use single

Re: Solrj possible deadlock

2009-09-23 Thread pof
I had the same problem again yesterday except the process halted after about 20mins this time. pof wrote: Hello, I was running a batch index the other day using the Solrj EmbeddedSolrServer when the process abruptly froze in its tracks after running for about 4-5 hours and indexing ~400K

Re: Solrj possible deadlock

2009-09-23 Thread Ryan McKinley
do you have anything custom going on? The fact that the lock is in java2d seems suspicious... On Sep 23, 2009, at 7:01 PM, pof wrote: I had the same problem again yesterday except the process halted after about 20mins this time. pof wrote: Hello, I was running a batch index the other

Re: java doc error local params syntax for dismax

2009-09-23 Thread Naomi Dushay
It's not just the spaces - it's that the quotes (single or double flavor) are required as well. On Sep 23, 2009, at 3:10 PM, Yonik Seeley wrote: On Wed, Sep 23, 2009 at 5:59 PM, Naomi Dushay ndus...@stanford.edu wrote: The javadoc for DisMaxQParserPlugin states: {!dismax

Re: java doc error local params syntax for dismax

2009-09-23 Thread Yonik Seeley
On Wed, Sep 23, 2009 at 8:24 PM, Naomi Dushay ndus...@stanford.edu wrote: It's not just the spaces - it's that the quotes (single or double flavor) are required as well. LocalParams are space delimited, so the original example would have worked if the dismax parser accepted comma delimited

Can solr build on top of HBase

2009-09-23 Thread 梁景明
Hi, I use HBase and Solr. I now have a large amount of data to index, which means the Solr index will be large, and as the data increases it will grow even larger. So, can the solrconfig.xml setting <dataDir>/solrhome/data</dataDir> be used from the API and pointed to my distributed HBase data storage, and if the index

Re: solr caching problem

2009-09-23 Thread satya
Is there any way to analyze or see which documents are getting cached by documentCache? <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/> On Wed, Sep 23, 2009 at 8:10 AM, satya tosatyaj...@gmail.com wrote: First of all, thanks a lot for

Re: Can solr build on top of HBase

2009-09-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
Can HBase be mounted on the filesystem? Solr can only read data from a filesystem. On Thu, Sep 24, 2009 at 7:27 AM, 梁景明 futur...@gmail.com wrote: Hi, I use HBase and Solr. I now have a large amount of data to index, which means the Solr index will be large, and as the data increases it will grow even larger

Can we point a Solr server to index directory dynamically at runtime..

2009-09-23 Thread Silent Surfer
Hi, Is there any way to dynamically point the Solr servers to index/data directories at run time? We are generating 200 GB worth of index per day and we want to retain the index for approximately 1 month. So our idea is to keep the first 1 week of index available at anytime for the users

Re: Can solr build on top of HBase

2009-09-23 Thread Amit Nithian
Would FUSE (http://wiki.apache.org/hadoop/MountableHDFS) be of use? I wonder if you could take the data from HBase and index it into a Lucene index stored on HDFS. 2009/9/23 Noble Paul നോബിള്‍ नोब्ळ् noble.p...@corp.aol.com can hbase be mounted on the filesystem? Solr can only read data from a

Re: java doc error local params syntax for dismax

2009-09-23 Thread Naomi Dushay
Okay, but {!dismax qf="myfield mytitle^2"}foo works {!dismax qf=myfield mytitle^2}foo does NOT work - Naomi On Sep 23, 2009, at 5:52 PM, Yonik Seeley wrote: On Wed, Sep 23, 2009 at 8:24 PM, Naomi Dushay ndus...@stanford.edu wrote: It's not just the spaces - it's that the quotes