Re: Searching on mulit-core Solr

2009-04-09 Thread vivek sar
Hi, I've gone through the mailing archive and have read contradicting remarks on this issue. Can someone please clear this up, as I'm not able to run distributed search on multi-cores. Is there any documentation on how I can search across multiple cores that share the same schema? Here are the various co

Re: httpclient.ProtocolException using Solrj

2009-04-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
How many documents are you inserting? Maybe you can create multiple instances of CommonsHttpSolrServer and upload in parallel. On Thu, Apr 9, 2009 at 11:58 AM, vivek sar wrote: > Thanks Shalin and Paul. > > I'm not using MultipartRequest. I do share the same SolrServer between > two threads. I'

Re: solr 1.4 memory jvm

2009-04-09 Thread sunnyfr
Hi Noble, Yes, exactly that. I would like to know what people do during a replication. Do they turn off servers and set a high autowarmCount, which takes the slave offline for a while? In my case it takes 10 minutes to bring back the new index and then maybe 10 minutes more for autowarming. Otherwise I tried to p

Re: solr 1.4 facet boost field according to another field

2009-04-09 Thread sunnyfr
Do you have an idea? sunnyfr wrote: > > Hi, > > I have title, description and tag fields ... Depending on where the > searched word is found, I would like to boost other fields such as nb_views > or rating differently. > > if word is found in title then nb_views^10 and rating^10 > if word is found in d

different scoring for different types of found documents

2009-04-09 Thread Andrey Klochkov
Hi, We have a quite complex requirement concerning scoring logic customization, but I guess it's quite useful and probably something like it was done already. So we're searching through the product catalog. Products have types (e.g. "Electronics", "Apparel", "Furniture" etc). What we need is

Re: Its urgent! plz help in schema.xml- appending one field to another

2009-04-09 Thread Erik Hatcher
On Apr 8, 2009, at 9:50 PM, Udaya wrote: Hi, Need your help, I would like to know how we could append or add one field value to another field in schema.xml. My schema is as follows (only the field part is given): schema.xml stored="true" required="true"/> default="http://comp

Re: Searching on mulit-core Solr

2009-04-09 Thread Erik Hatcher
On Apr 9, 2009, at 3:00 AM, vivek sar wrote: Can someone please clear this up as I'm not able to run distributed search on multi-cores. What error or problem are you encountering when trying this? How are you trying it? Erik

Re: solr 1.4 facet boost field according to another field

2009-04-09 Thread Shalin Shekhar Mangar
I don't think conditional boosting is possible. You can boost the same field on which the match was found. But you cannot boost a different field. On Thu, Apr 9, 2009 at 2:05 PM, sunnyfr wrote: > > Do you have an idea ? > > > > sunnyfr wrote: > > > > Hi, > > > > I've title description and tag fi

Re: different scoring for different types of found documents

2009-04-09 Thread Shalin Shekhar Mangar
On Thu, Apr 9, 2009 at 2:17 PM, Andrey Klochkov wrote: > > So we're searching through the product catalog. Product have types (i.e. > "Electronics", "Apparel", "Furniture" etc). What we need is to customize > scoring of the results so that top results should contain products of all > different typ

Re: Searching on mulit-core Solr

2009-04-09 Thread Fergus McMenemie
>Any help on this issue? Would distributed search on multi-core on same >Solr instance even work? Does it has to be different Solr instances >altogether (separate shards)? As best I can tell this works fine for me. Multiple cores on the one machine. Very different schema and solrconfig.xml for eac

Multi-language support

2009-04-09 Thread revas
Hi, To reframe my earlier question: some languages have only analyzers but no stemmer from Snowball/Porter; does the analyzer then take care of stemming as well? Some languages have only the stemmer from Snowball but no analyzer. Some have both. Can we say then that solr supports all the abo

Re: Using constants with DataImportHandler and MySQL ?

2009-04-09 Thread gateway0
Here's the solution: just insert a dummy sql field 'dataci_project' in your select statement. Glen Newton wrote: > > In MySql at least, you can achieve what I think you want by > manipulating the SQL, like this: > > mysql> select "foo" as Constant1, id from Article li
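
The constant-column trick quoted above can be tried without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only to keep the example self-contained). The Article table, the id column and the Constant1 alias come from the quoted query; the row values and the helper name select_with_constant are illustrative.

```python
import sqlite3

def select_with_constant(constant, ids):
    """Return (constant, id) rows from a throwaway in-memory Article table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE Article (id INTEGER PRIMARY KEY)")
        conn.executemany("INSERT INTO Article (id) VALUES (?)", [(i,) for i in ids])
        # The constant is selected like an ordinary column, so every row carries it.
        cur = conn.execute(
            "SELECT ? AS Constant1, id FROM Article ORDER BY id", (constant,)
        )
        return cur.fetchall()
    finally:
        conn.close()

rows = select_with_constant("foo", [1, 2])
```

Each returned row pairs the constant with a real column value, which is the shape an importer would see if the constant were an actual database field.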

Analyzers and stemmer

2009-04-09 Thread revas
Hi, With respect to language support in solr, we have analyzers for some languages and stemmers for certain languages. Do we say that solr supports a particular language only if we have both an analyzer and a stemmer for the language, or also for those for which we have an analyzer but no stemmer? Regards Suja

Dataimporthandler + MySQL = Datetime offset by 2 hours ?

2009-04-09 Thread gateway0
Hi, I'm fetching entries from my MySQL database and indexing them with the DataImportHandler: MySQL table entry (for example): pr_timedate : 2009-04-14 11:00:00 entry in data-config.xml to index the mysql field: result in solr index: 2009-04-14T09:00:00Z it says 09:00:00 instead of 11

Re: Dataimporthandler + MySQL = Datetime offset by 2 hours ?

2009-04-09 Thread Shalin Shekhar Mangar
On Thu, Apr 9, 2009 at 6:18 PM, gateway0 wrote: > > Hi, > > im fetching entries from my mysql database and index them with the > Dataimporthandler: > > MySQL Table entry: (for example) > pr_timedate : 2009-04-14 11:00:00 > > entry in data-config.xml to index the mysql field: > dateTimeFormat="yy
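
A two-hour shift like the one reported is what one would expect if the MySQL value is local time in a UTC+2 zone and the indexed date is expressed in UTC (the trailing Z). The sketch below only illustrates that arithmetic; the UTC+2 offset is an assumption, not something stated in the thread, and to_solr_utc is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

def to_solr_utc(local_text, utc_offset_hours):
    """Interpret a 'YYYY-MM-DD HH:MM:SS' string at a fixed UTC offset and format it in UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=local_zone)
    # Convert to UTC and render in the ISO-8601 form with a trailing Z.
    return local_dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

solr_value = to_solr_utc("2009-04-14 11:00:00", 2)  # assumed offset: UTC+2
```

If the offset assumption holds, the stored value is the same instant as the MySQL value, just written in a different zone.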

Access HTTP headers from custom request handler

2009-04-09 Thread Giovanni De Stefano
Hello all, we are writing a custom request handler and we need to implement some business logic according to some HTTP headers. I see there is no easy way to access HTTP headers from the request handler. Moreover it seems to me that the HTTPServletness is lost way before the custom request handl

Re: Exception while solr commit

2009-04-09 Thread Michael McCandless
This is a spooky exception. Committing after every update will give very poor performance, but should be "fine" (ie, not cause exceptions like this). What filesystem are you on? Is there any possibility that two writers are open against the same index? Is this easily reproduced? Mike On Wed,

Re: Snapinstaller vs Solr Restart

2009-04-09 Thread sunnyfr
Hi Otis, Ok about that, but when it merges segments it still changes their names, and I have no choice but to replicate all the segments, which is bad for replication and CPU. Thanks Otis Gospodnetic wrote: > > Lower your mergeFactor and Lucene will merge segments(i.e. fewer index > files) and purge

Re: Any tips for indexing large amounts of data?

2009-04-09 Thread sunnyfr
Hi Otis, How did you manage that? I have an 8-core machine with 8GB of RAM and an 11GB index for 14M docs, with 5 updates every 30 minutes, but my replication kills everything. My segments are merged too often, so the full index is replicated and the cache is lost, and I have no idea what I can do now. Some help would be br

Re: Any tips for indexing large amounts of data?

2009-04-09 Thread Glen Newton
For Solr / Lucene: - use -XX:+AggressiveOpts - If available, huge pages can help. See http://zzzoot.blogspot.com/2009/02/java-mysql-increased-performance-with.html I haven't yet followed-up with my Lucene performance numbers using huge pages: it is 10-15% for large indexing jobs. For Lucene: - mu

Re: Any tips for indexing large amounts of data?

2009-04-09 Thread Glen Newton
> - As per > http://developers.sun.com/learning/javaoneonline/2008/pdf/TS-5515.pdf Sorry, the presentation covers a lot of ground: see slide #20: "Standard thread pools can have high contention for task queue and other data structures when used with fine-grained tasks" [I haven't yet implemented wo

Custom DIH: FileDataSource with additional business logic?

2009-04-09 Thread Giovanni De Stefano
Hello, here I am with another question. I am using DIH to index a DB. Additionally I also have to index some files containing Java serialized objects (and I cannot change this... :-( ). I currently have implemented a standalone Java app with the following features: 1) read all files from a give

Re: Searching on mulit-core Solr

2009-04-09 Thread vivek sar
Erik, Here is what I'd posted in this thread earlier, I tried the following with two cores (they both share the same schema and solrconfig.xml) on the same box on same solr instance, 1) http://10.4.x.x:8080/solr/core0/admin/ - works fine, shows all the cores in admin interface 2) http://10.4.

Re: Searching on mulit-core Solr

2009-04-09 Thread vivek sar
Attached is the solr.xml - note, the schema and solrconfig are located in the core0 and all other cores point to the same core0 instance for schema. Searches on individual cores work fine so I'm assuming the solr.xml is correct - I also get their status correctly. From the "NullPointerException" it

Re: httpclient.ProtocolException using Solrj

2009-04-09 Thread vivek sar
I'm inserting 10K in a batch (using addBeans method). I read somewhere in the wiki that it's better to use the same instance of SolrServer for better performance. Would MultiThreadedConnectionManager help? How do I use it? I also wanted to know how I can use EmbeddedSolrServer - does my app need to

Re: Access HTTP headers from custom request handler

2009-04-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
Well, unfortunately, no. Solr cannot assume that the request would always come from HTTP (think of EmbeddedSolrServer), so it assumes that there are only parameters. Your best bet is to modify SolrDispatchFilter and read the params and set them in the SolrRequest Object or you can just write a Filt

Re: Custom DIH: FileDataSource with additional business logic?

2009-04-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
FileDataSource is of type Reader, which means getData() returns a java.io.Reader. That is not very suitable for you. Your best bet is to write a simple DataSource which returns an Iterator<Map<String, Object>> after reading the serialized Objects. This is what JdbcDataSource does. Then you can use it with SqlEntityProcesso

Re: httpclient.ProtocolException using Solrj

2009-04-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
Using a single request is the fastest http://wiki.apache.org/solr/Solrj#head-2046bbaba3759b6efd0e33e93f5502038c01ac65 I could index at the rate of 10,000 docs/sec using this and BinaryRequestWriter On Thu, Apr 9, 2009 at 10:36 PM, vivek sar wrote: > I'm inserting 10K in a batch (using addBeans m

Re: httpclient.ProtocolException using Solrj

2009-04-09 Thread Shalin Shekhar Mangar
On Thu, Apr 9, 2009 at 10:36 PM, vivek sar wrote: > I'm inserting 10K in a batch (using addBeans method). I read somewhere > in the wiki that it's better to use the same instance of SolrServer > for better performance. Would MultiThreadedConnectionManager help? How > do I use it? > If you are no

Re: Any tips for indexing large amounts of data?

2009-04-09 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Thu, Apr 9, 2009 at 8:51 PM, sunnyfr wrote: > > Hi Otis, > How did you manage that? I've 8 core machine with 8GB of ram and 11GB index > for 14M docs and 5 update every 30mn but my replication kill everything. > My segments are merged too often sor full index replicate and cache lost and >

Dictionary lookup possibilities

2009-04-09 Thread Jaco
Hello, I'm struggling with some ideas, maybe somebody can help me with past experiences or tips. I have loaded a dictionary into a Solr index, using stemming and some stopwords in analysis part of the schema. Each record holds a term from the dictionary, which can consist of multiple words. For so

Re: Using ExtractingRequestHandler to index a large PDF ~solved

2009-04-09 Thread Grant Ingersoll
On Apr 6, 2009, at 10:16 AM, Fergus McMenemie wrote: Hmmm, Not sure how this all hangs together. But editing my solrconfig.xml as follows sorted the problem:- multipartUploadLimitInKB="2048" /> to multipartUploadLimitInKB="20048" /> We should document this on the wiki or in the

logging

2009-04-09 Thread Kevin Osborn
We built our own webapp that used the Solr JARs. We used Apache Commons/log4j logging and just put log4j.properties in the Resin conf directory. The commons-logging and log4j jars were put in the Resin lib directory. Everything worked great and we got log files for our code only. So, I upgraded

Re: httpclient.ProtocolException using Solrj

2009-04-09 Thread vivek sar
Here is what I'm doing, SolrServer server = new StreamingUpdateSolrServer(url, 1000,5); server.addBeans(dataList); //where dataList is List with 10K elements I run two threads each using the same server object and then each call server.addBeans(...). I'm able to get 50K/sec inserted using that

Re: How to get the solrhome location dynamically

2009-04-09 Thread Chris Hostetter
: Subject: How to get the solrhome location dynamically Do you really want the Solr Home Dir, or do you want the instanceDir for a specific SolrCore? If you're using a solr.xml file (ie: one or many cores), you can get the instanceDir for each core from the CoreAdminHandler -- but it doesn't

Question on Solr Distributed Search

2009-04-09 Thread vivek sar
Hi, I have another thread on multi-core distributed search, but just wanted to put a simple question here on distributed search to get some response. I have a search query, http://etsx19.co.com:8080/solr/20090409_9/select?q=usa - it returns 10 results. Now if I add the "shards" parameter to it,
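
For reference, a distributed request is an ordinary select URL with an extra shards parameter that lists each shard as host:port/path, without the http:// prefix, separated by commas. A minimal sketch of building such a URL follows; the host and the first core name are taken from the query above, while the second core name 20090408_9 and the helper build_distributed_query are purely hypothetical.

```python
from urllib.parse import urlencode

def build_distributed_query(base_core_url, query, shard_core_urls):
    """Build a select URL whose shards parameter lists each core without its scheme."""
    shards = ",".join(u.split("://", 1)[-1] for u in shard_core_urls)
    # Leave ':', '/' and ',' unescaped so the shard list stays readable.
    params = urlencode({"q": query, "shards": shards}, safe=":/,")
    return f"{base_core_url}/select?{params}"

url = build_distributed_query(
    "http://etsx19.co.com:8080/solr/20090409_9",
    "usa",
    [
        "http://etsx19.co.com:8080/solr/20090409_9",
        "http://etsx19.co.com:8080/solr/20090408_9",  # hypothetical second core
    ],
)
```

Printing url gives a single request that can be pasted into a browser or curl to compare against the non-distributed query.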

Re: Querying for multi-word synonyms

2009-04-09 Thread Chris Hostetter
: Unfortunately, I have to use SynonymFilter at query time due to the nature : of the data I'm indexing. At index time, all I have are keywords but at : query time I will have some semantic markup which allows me to expand into : synonyms. I am wondering if any progress has been made into making q

Re: Additive filter queries

2009-04-09 Thread Chris Hostetter
: Right now a document looks like this: : : : : 1598548 : 12545 : Adidas : 1, 2, 3, 4, 5, 6, 7 : AA, A, B, W, W, : Brown : : : If we went down a level, it could look like.. : : : 1598548 : 12545 : 654641654684 : Adidas : 1 : AA : Brown : If you want result at the "product" level then

Re: Question on Solr Distributed Search

2009-04-09 Thread vivek sar
I think I found the reason behind the "connection reset". Looking at the code, it points to QueryComponent.mergeIds(): resultIds.put(shardDoc.id.toString(), shardDoc); it looks like the doc unique id is returning null. I'm not sure how that is possible, as it's a required field. Right now my unique id is not store

Re: Question on Solr Distributed Search

2009-04-09 Thread vivek sar
Just an update. I changed the schema to store the unique id field, but I still get the connection reset exception. I did notice that if there is no data in the core then it returns 0 results (no exception), but if there is data and you search using the "shards" parameter I get the connection reset e

multiple tokenizers needed

2009-04-09 Thread Ashish P
I want to analyze text by splitting on the pattern ";" and on whitespace, and since it is Japanese text I also want to use CJKAnalyzer + tokenizer. In short I want to do:

Re: sorlj search

2009-04-09 Thread RajuMaddy
Tevfik Kiziloren wrote: > > Hi. I'm a newbie. I need to develop a jsf based search application by > using solr. I found nothing about the solrj implementation except a simple > example on the solr wiki. When I tried a console program similar to > the example on the solr wiki, I got the exception b