Hi Ryan,
I have gone through your post
https://issues.apache.org/jira/browse/SOLR-357
where you mention a prefix filter. Can you tell me how to use that
patch? You mentioned using code like the following:
<fieldType name="prefix_full" class="solr.TextField"
positionIncrementGap="1">
<analyzer type="index">
I have a prefix_token field defined as underneath in my schema.xml
<fieldType name="prefix_token" class="solr.TextField"
positionIncrementGap="1">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter
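A sketch of one common way to get prefix matching without the SOLR-357 patch, using EdgeNGramFilterFactory at index time (the field-type name and gram sizes here are assumptions, not from the thread):

```xml
<fieldType name="prefix_token" class="solr.TextField" positionIncrementGap="1">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- emit front edge n-grams so the query "cha" matches "channel" -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" side="front"/>
  </analyzer>
  <analyzer type="query">
    <!-- no n-grams at query time; the raw prefix matches an indexed gram -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```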
On Wed, Sep 23, 2009 at 12:23 PM, Avlesh Singh avl...@gmail.com wrote:
I have a prefix_token field defined as underneath in my schema.xml
<fieldType name="prefix_token" class="solr.TextField"
positionIncrementGap="1">
<analyzer type="index">
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
Hmmm .. But ngrams with KeywordTokenizerFactory instead of the
WhitespaceTokenizerFactory work just as fine. Related issues?
Cheers
Avlesh
On Wed, Sep 23, 2009 at 12:27 PM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
On Wed, Sep 23, 2009 at 12:23 PM, Avlesh Singh avl...@gmail.com
On Wed, Sep 23, 2009 at 11:30 AM, dharhsana rekha.dharsh...@gmail.comwrote:
Hi Ryan,
I have gone through your post
https://issues.apache.org/jira/browse/SOLR-357
where you mention a prefix filter. Can you tell me how to use that
patch? You mentioned using code like the following:
<fieldType
On Wed, Sep 23, 2009 at 12:31 PM, Avlesh Singh avl...@gmail.com wrote:
Hmmm .. But ngrams with KeywordTokenizerFactory instead of the
WhitespaceTokenizerFactory work just as fine. Related issues?
I'm sorry I don't understand the question. Do you mean to say that
highlighting works with one
I'm sorry I don't understand the question. Do you mean to say that
highlighting works with one but not with another?
Yes.
Cheers
Avlesh
On Wed, Sep 23, 2009 at 12:59 PM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
On Wed, Sep 23, 2009 at 12:31 PM, Avlesh Singh avl...@gmail.com
Hi,
When we have news content crawled, we face a problem of the same content being
repeated in many documents. We want to add a near-duplicate document filter
to detect such documents. Is there a way to do that in SOLR?
Regards,
Ninad Raut.
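Solr 1.4's deduplication feature (SOLR-799) can flag or drop near-duplicates at index time. A sketch of the solrconfig.xml wiring, assuming the duplicated text lives in a field named content:

```xml
<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <!-- overwriteDupes=true replaces earlier docs carrying the same signature -->
    <bool name="overwriteDupes">true</bool>
    <str name="fields">content</str>
    <!-- TextProfileSignature is fuzzy, suited to near-duplicate news text -->
    <str name="signatureClass">org.apache.solr.update.processor.TextProfileSignature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```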
On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut hbase.user.ni...@gmail.comwrote:
Hi,
When we have news content crawled, we face a problem of the same content being
repeated in many documents. We want to add a near-duplicate document
filter
to detect such documents. Is there a way to do that in SOLR?
Hi,
Is it possible to have a phrase as a stopword in Solr? If so, please share
how to do it.
regards,
Pooja
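There is no phrase-aware stopword list built into Solr; one workaround sketch (the phrase and field-type name below are invented for illustration) is to strip the phrase with a char filter before tokenization:

```xml
<fieldType name="text_phrasestop" class="solr.TextField">
  <analyzer>
    <!-- delete the unwanted phrase before the tokenizer ever sees it -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="(?i)\bas well as\b" replacement=""/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```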
Is this feature included in SOLR 1.4??
On Wed, Sep 23, 2009 at 3:29 PM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
On Wed, Sep 23, 2009 at 3:14 PM, Ninad Raut hbase.user.ni...@gmail.com
wrote:
Hi,
When we have news content crawled, we face a problem of the same content being
After investigating the log files, the DataImporter was throwing an error from
the Oracle DB driver:
java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB to
RAW conversion (actual: 2890, maximum: 2000)
In other words, there was a problem with the 551st item, where a related item had
On Wed, Sep 23, 2009 at 3:50 PM, Ninad Raut hbase.user.ni...@gmail.comwrote:
Is this feature included in SOLR 1.4??
Yep.
--
Regards,
Shalin Shekhar Mangar.
On Wed, Sep 23, 2009 at 3:53 PM, Daniel Bradley daniel.brad...@adfero.co.uk
wrote:
After investigating the log files, the DataImporter was throwing an error
from the Oracle DB driver:
java.sql.SQLException: ORA-22835: Buffer too small for CLOB to CHAR or BLOB
to RAW conversion (actual:
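For reading Oracle CLOBs through the DataImportHandler without SQL-side conversion, Solr 1.4 ships a ClobTransformer; a sketch of the data-config entity, with table and column names assumed:

```xml
<entity name="item" transformer="ClobTransformer"
        query="select id, description from items">
  <!-- clob="true" materializes the CLOB as a Java String,
       avoiding TO_CHAR and its 2000/4000-character buffer limits -->
  <field column="description" clob="true"/>
</entity>
```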
Hi,
I am doing an exact search in Solr. In the Solr admin page I am giving the
search input string.
For example, when I give “channeL12” as the search input string on the Solr
home page, it displays search results as:
<doc>
<str name="url">http://rediff</field>
<str name="title">first</field>
<str
hello guys,
I am a newbie to Solr.
I have Solr running on Tomcat 6; all is OK,
but when I add data to the Solr server via HTTP POST it causes an error.
Below is the code:
SolrInputDocument solrdoc = new SolrInputDocument();
solrdoc.addField("url", request.getParameter("URL"));
2009-9-23 21:18:03
From: Pooja Verlani pooja.verl...@gmail.com
Subject: Phrase stopwords
To: solr-user@lucene.apache.org
Date: Wednesday, September 23, 2009, 1:15 PM
Hi,
Is it possible to have a phrase as a stopword in Solr? If
so, please share
how to do it.
regards,
Pooja
I think that can be
For 8-CPU load-stress testing of Tomcat you are probably making a mistake:
- you should run the load-stress software and wait 5-30 minutes (depending on
index size) BEFORE taking measurements.
1. JVM HotSpot needs to compile everything into native code
2. The Tomcat thread pool needs warm-up
3. SOLR
I'm using a Solr 1.4 nightly from around July. Is that recent enough to
have the improved reader implementation?
I'm not sure whether you'd call my operations IO heavy -- each query has so
many terms (~50) that even against a 45K document index a query takes 130ms,
but the entire index is in a
I have 0-15ms for 50M (million) docs, Tomcat, 8-CPU:
http://www.tokenizer.org
==
- something is obviously wrong in your case; 130ms is too high. Is it a dedicated
server? Disk swapping? Etc.
-Original Message-
From: Michael [mailto:solrco...@gmail.com]
Hi Fuad, thanks for the reply.
My queries are heavy enough that the difference in performance is obvious.
I am using a home-grown load testing script that sends 1000 realistic
queries to the server and takes the average response time. My index is on a
ramfs which I've shown makes the QR and doc
Correction: 0 - 150ms (depends on size of query results; 150ms for
non-cached (new) queries returning more than 50K docs).
-Original Message-
From: Fuad Efendi [mailto:f...@efendi.ca]
Sent: September-23-09 11:26 AM
To: solr-user@lucene.apache.org
Subject: RE: Parallel requests to
On Wed, Sep 23, 2009 at 11:26 AM, Fuad Efendi f...@efendi.ca wrote:
- something is obviously wrong in your case; 130ms is too high. Is it a
dedicated
server? Disk swapping? Etc.
It's that my queries are ridiculously complex. My users are very familiar
with boolean searching, and I'm doing a lot
8 queries against 1 Tomcat average 600ms per query, while 8 queries
against
8 Tomcats average 190ms per query (on a dedicated 8 CPU server w 32G RAM).
I don't see how to interpret these numbers except that Tomcat is not
multithreading as well as it should :)
Hi Michael, I think it is very
On Wed, Sep 23, 2009 at 11:17 AM, Michael solrco...@gmail.com wrote:
I'm using a Solr 1.4 nightly from around July. Is that recent enough to
have the improved reader implementation?
I'm not sure whether you'd call my operations IO heavy -- each query has so
many terms (~50) that even against
Hi Fuad,
On Wed, Sep 23, 2009 at 11:37 AM, Fuad Efendi f...@efendi.ca wrote:
8 queries against 1 Tomcat average 600ms per query, while 8 queries
against
8 Tomcats average 190ms per query (on a dedicated 8 CPU server w 32G
RAM).
I don't see how to interpret these numbers except that
Hi Yonik,
On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley
yo...@lucidimagination.comwrote:
This could well be IO bound - lots of seeks and reads.
If this were IO bound, wouldn't I see the same results when sending my 8
requests to 8 Tomcats? There's only one disk (well, RAM) whether I'm
This sure seems like a good time to try LucidGaze for Solr. That would
give some Solr-specific profiling data.
http://www.lucidimagination.com/Downloads/LucidGaze-for-Solr
wunder
On Sep 23, 2009, at 8:47 AM, Michael wrote:
Hi Yonik,
On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley
Thanks for the suggestion, Walter! I've been using Gaze 1.0 for a while
now, but when I moved to a multicore approach (which was the impetus behind
all of this testing) Gaze failed to start and I had to comment it out of
solrconfig.xml to get Solr to start. Are you aware whether Gaze is able to
On Wed, Sep 23, 2009 at 11:47 AM, Michael solrco...@gmail.com wrote:
Hi Yonik,
On Wed, Sep 23, 2009 at 11:42 AM, Yonik Seeley yo...@lucidimagination.com
wrote:
This could well be IO bound - lots of seeks and reads.
If this were IO bound, wouldn't I see the same results when sending my 8
On Wed, Sep 23, 2009 at 12:05 PM, Yonik Seeley
yo...@lucidimagination.comwrote:
On Wed, Sep 23, 2009 at 11:47 AM, Michael solrco...@gmail.com wrote:
If this were IO bound, wouldn't I see the same results when sending my 8
requests to 8 Tomcats? There's only one disk (well, RAM) whether I'm
8 threads sharing something may have *some* overhead versus 8 processes,
but
as you say, 410ms overhead points to a different problem.
- You have baseline (single-threaded load-stress script sending requests to
SOLR) (1-request-in-parallel, 8 requests to 8 Tomcats); 200ms looks
extremely
I'm not sure whether you'd call my operations IO heavy -- each query has
so
many terms (~50) that even against a 45K document index a query takes
130ms,
but the entire index is in a ramfs.
The more terms, the longer it takes to find docset intersections (belonging to
each term); something in
Hi,
Thanks for the discussion. We use the distributed option so I am not sure
embedded is possible.
As you also guessed, we use haproxy for load balancing and failover between
replicas of the shards so giving this up for a minor performance boost is
probably not wise.
So essentially we have:
For a particular requirement we have - we need to do a query that is a
combination of multiple dismax queries behind the scenes. (Using solr
1.4 nightly ).
The DisMaxQParser org.apache.solr.search.DisMaxQParser ( details at -
http://wiki.apache.org/solr/DisMaxRequestHandler ) takes in the
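One way to combine several dismax queries in Solr 1.4 is the nested-query syntax through the magic `_query_` field (the field names, boosts, and terms below are assumptions, not from the thread):

```
q=_query_:"{!dismax qf='title^2 body'}ipod" OR _query_:"{!dismax qf='author'}lucene"
```

Each nested `{!dismax ...}` clause is parsed by the dismax parser with its own qf, and the outer Lucene query combines the results.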
Hello,
Can ReversedWildcardFilterFactory be used with KeywordTokenizerFactory?
I get the following error; it looks like Solr expects
WhitespaceTokenizerFactory... Can anybody suggest how to rectify it? My schema
snippet is also given below. Data is extracted via OpenNLP and indexed into
Hi! I need to index very big numbers in Solr. Something like
99,999,999,999,999.99
Right now i'm using an sdouble field type because I need to make range
queries on this field.
The problem is that the field value is being returned in scientific
notation. Is there any way to avoid that?
Thanks!
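On the client side, one workaround (a sketch, independent of Solr itself) is to re-render the returned value with java.math.BigDecimal, whose toPlainString() never uses scientific notation:

```java
import java.math.BigDecimal;

public class PlainNumber {
    /** Render a double without scientific notation. */
    public static String plain(double d) {
        // Double.toString may produce "1.23E8"; BigDecimal expands the exponent.
        return new BigDecimal(Double.toString(d)).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(plain(1.23E8));   // 123000000
        System.out.println(plain(100000.0)); // 100000.0
    }
}
```

Note that this preserves exactly the digits Solr returned; it does not recover precision already lost in the sdouble field.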
On Friday 11 September 2009 11:06:20 am Dan A. Dickey wrote:
...
Our JBoss expert and I will be looking into why this might be occurring.
Does anyone know of any JBoss related slowness with Solr?
And does anyone have any other sort of suggestions to speed indexing
performance? Thanks for
The javadoc for DisMaxQParserPlugin states:
{!dismax qf=myfield,mytitle^2}foo creates a dismax query
but actually, that gives an error.
The correct syntax is
{!dismax qf="myfield mytitle^2"}foo
(one could use single quotes instead of double quotes).
- Naomi
On Wed, Sep 23, 2009 at 5:59 PM, Naomi Dushay ndus...@stanford.edu wrote:
The javadoc for DisMaxQParserPlugin states:
{!dismax qf=myfield,mytitle^2}foo creates a dismax query
but actually, that gives an error.
The correct syntax is
{!dismax qf="myfield mytitle^2"}foo
(could use single
I had the same problem again yesterday except the process halted after about
20mins this time.
pof wrote:
Hello, I was running a batch index the other day using the Solrj
EmbeddedSolrServer when the process abruptly froze in its tracks after
running for about 4-5 hours and indexing ~400K
do you have anything custom going on?
The fact that the lock is in java2d seems suspicious...
On Sep 23, 2009, at 7:01 PM, pof wrote:
I had the same problem again yesterday except the process halted
after about
20mins this time.
pof wrote:
Hello, I was running a batch index the other
It's not just the spaces - it's that the quotes (single or double
flavor) are required as well.
On Sep 23, 2009, at 3:10 PM, Yonik Seeley wrote:
On Wed, Sep 23, 2009 at 5:59 PM, Naomi Dushay ndus...@stanford.edu
wrote:
The javadoc for DisMaxQParserPlugin states:
{!dismax
On Wed, Sep 23, 2009 at 8:24 PM, Naomi Dushay ndus...@stanford.edu wrote:
It's not just the spaces - it's that the quotes (single or double flavor) are
required as well.
LocalParams are space delimited, so the original example would have
worked if the dismax parser accepted comma delimited
hi, I use HBase and Solr. Now I have a large dataset to index, which means
the Solr index will be large,
and as the data increases it will grow even larger.
So for solrconfig.xml's <dataDir>/solrhome/data</dataDir>, can I set it from the
API and point it to my
distributed HBase data storage?
And if the index
Is there any way to analyze or see which documents are getting cached
by the documentCache?
<documentCache
class="solr.LRUCache"
size="512"
initialSize="512"
autowarmCount="0"/>
On Wed, Sep 23, 2009 at 8:10 AM, satya tosatyaj...@gmail.com wrote:
First of all , thanks a lot for
can hbase be mounted on the filesystem? Solr can only read data from a
filesystem
On Thu, Sep 24, 2009 at 7:27 AM, 梁景明 futur...@gmail.com wrote:
hi, I use HBase and Solr. Now I have a large dataset to index, which means
the Solr index will be large,
and as the data increases it will grow even larger
Hi,
Is there any way to dynamically point the Solr servers to an index/data
directories at run time?
We are generating 200 GB worth of index per day and we want to retain the index
for approximately 1 month. So our idea is to keep the first 1 week of index
available at anytime for the users
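One multicore sketch for this (core names and paths are invented, and you should check whether your Solr release's CoreAdmin CREATE accepts a dataDir parameter): create a core per period pointing at that period's index directory, then SWAP it in:

```
# create a core whose index lives in that week's directory
http://host:8983/solr/admin/cores?action=CREATE&name=week39&instanceDir=shared&dataDir=/indexes/week39

# atomically swap it into the name the application queries
http://host:8983/solr/admin/cores?action=SWAP&core=live&other=week39
```

SWAP exchanges the two core names without restarting Solr, so the application keeps querying "live" while the underlying index changes.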
Would FUSE (http://wiki.apache.org/hadoop/MountableHDFS) be of use?
I wonder if you could take the data from HBase and index it into a Lucene
index stored on HDFS.
2009/9/23 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com
can hbase be mounted on the filesystem? Solr can only read data from a
Okay, but
{!dismax qf="myfield mytitle^2"}foo works
{!dismax qf=myfield mytitle^2}foo does NOT work
- Naomi
On Sep 23, 2009, at 5:52 PM, Yonik Seeley wrote:
On Wed, Sep 23, 2009 at 8:24 PM, Naomi Dushay ndus...@stanford.edu
wrote:
It's not just the spaces - it's that the quotes