Thanks for the links - I've put a posting on the Tika ML.
I've just checked and we're using tika-0.2.jar - does anyone know which
version I can use with Solr 1.3?
Is there any info on upgrading from this far back to the latest
version - is it even possible? Or would I need to re-index everything?
I've been toying with the idea of setting up an experiment to index a large
document set (1+ TB) -- any thoughts on an open data set that one could use
for this purpose?
Thanks.
On Mon, Jan 16, 2012 at 5:00 PM, Burton-West, Tom wrote:
> Hello ,
>
> Searching real-time sounds difficult with that amount of data. [...]
Would it make sense to index in the cloud and periodically (2-4 times
/day) replicate the index to our server for searching? Which service should
we go with for Solr cloud indexing?
Any good and tried services?
Regards
Sujatha
I remember there is another implementation that uses a Lucene index file as
the lookup table instead of the in-memory FST.
FST has an advantage in speed, but if you write documents at runtime,
reconstructing the FST may cause performance issues.
On Tue, Jan 17, 2012 at 11:08 AM, Robert Muir wrote:
> looks like https://issues.apache.org/jira/browse/SOLR-2888. [...]
looks like https://issues.apache.org/jira/browse/SOLR-2888.
Previously, FST would need to hold all the terms in RAM during
construction, but with the patch it uses offline sorts/temporary
files.
I'll reopen the issue to backport this to the 3.x branch.
On Mon, Jan 16, 2012 at 8:31 PM, Dave wrote:
According to http://wiki.apache.org/solr/Suggester, FSTLookup is the least
memory-intensive of the lookupImpls. Are you suggesting a different
approach entirely, or is that a lookupImpl that is not mentioned in the
documentation?
On Mon, Jan 16, 2012 at 9:54 PM, qiu chi wrote:
> you may disable FST look up and use lucene index as the suggest method [...]
You may disable FST lookup and use the Lucene index as the suggest method.
FST lookup loads all documents into memory; you can use the Lucene
spellchecker instead.
On Tue, Jan 17, 2012 at 10:31 AM, Dave wrote:
> I've tried up to -Xmx5g
>
> On Mon, Jan 16, 2012 at 9:15 PM, qiu chi wrote:
>
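For anyone following along, here is a hedged solrconfig.xml sketch of the index-based spellchecker approach qiu chi describes (the field name and index directory are assumptions; adjust to your schema):

```xml
<!-- Sketch: use the Lucene index as the dictionary instead of an
     in-memory FST lookup. Field/dir names below are assumptions. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="field">name</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

This trades the FST's lookup speed for an on-disk dictionary that does not have to be rebuilt in RAM.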
I've tried up to -Xmx5g
On Mon, Jan 16, 2012 at 9:15 PM, qiu chi wrote:
> What is the largest -Xmx value you have tried?
> Your index size does not seem very big.
> Try -Xmx2048m; it should work.
>
> On Tue, Jan 17, 2012 at 9:31 AM, Dave wrote:
>
> > I'm trying to figure out what my memory needs are
What is the largest -Xmx value you have tried?
Your index size does not seem very big.
Try -Xmx2048m; it should work.
On Tue, Jan 17, 2012 at 9:31 AM, Dave wrote:
> I'm trying to figure out what my memory needs are for a rather large
> dataset. I'm trying to build an auto-complete system for every
>
I'm trying to figure out what my memory needs are for a rather large
dataset. I'm trying to build an auto-complete system for every
city/state/country in the world. I've got a geographic database, and have
set up the DIH to pull the proper data in. There are 2,784,937 documents
which I've formatted
I don't see why not. I'm assuming a *nix system here, so when Solr
updates an index, any deleted files would hang around.
But I have to ask why bother with the Embedded server in the
first place? You already have a Solr instance up and running,
why not just query that instead, perhaps using SolrJ?
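To illustrate the "just query the running instance" option: a minimal, self-contained sketch that builds the HTTP select URL by hand. The host, port, and query here are assumptions for illustration; with SolrJ you would use a SolrServer/SolrQuery instead of constructing the URL yourself.

```java
import java.net.URLEncoder;

public class QueryRunningSolr {

    // Build a /select URL against an already-running Solr instance,
    // instead of opening the index a second time with EmbeddedSolrServer.
    static String selectUrl(String baseUrl, String query) throws Exception {
        return baseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8")
                + "&wt=json";
    }

    public static void main(String[] args) throws Exception {
        // Host/port and the "title" field are assumptions; adjust to your setup.
        System.out.println(selectUrl("http://localhost:8983/solr",
                                     "title:foo bar"));
    }
}
```

Opening the URL (or issuing the same query through SolrJ) avoids any question of two processes sharing index files.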
Why not just up the maxBooleanClauses parameter in solrconfig.xml?
Best
Erick
On Sat, Jan 14, 2012 at 1:41 PM, Dmitry Kan wrote:
> OK, let me clarify it:
>
> if solrconfig has maxBooleanClauses set to 1000, for example, then queries
> with more than 1000 clauses will be rejected with th
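For reference, a minimal sketch of where this setting lives in solrconfig.xml (the value 4096 is just an example; the default is lower):

```xml
<!-- solrconfig.xml: raise the limit on boolean clauses per query -->
<query>
  <maxBooleanClauses>4096</maxBooleanClauses>
</query>
```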
What have you tried, and what have the results been? This is well
within Solr's out-of-the-box capabilities.
Best
Erick
On Fri, Jan 13, 2012 at 10:37 AM, vibhoreng04 wrote:
> Hi ,
>
> I want to do an 800-word multiple search across an index of 1 million
> records.
> Can anyone suggest something whi
I don't know where the commas are coming from; as far as I know that's
not part of Solr. You must have the catchall field defined
with multiValued="true", so if you set the increment gap to 0, that
should help.
When you do that, what does your return look like?
Best
Erick
P.S. It's rather un
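A hedged schema.xml sketch of what Erick describes (the field and type names are assumptions; the key part is positionIncrementGap="0" on the fieldType used by the multiValued catchall field):

```xml
<!-- Sketch: a gap of 0 makes values of a multiValued field adjacent -->
<fieldType name="text_gap0" class="solr.TextField" positionIncrementGap="0">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<field name="catchall" type="text_gap0" indexed="true" stored="true"
       multiValued="true"/>
```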
For future reference, I had this problem, and it was the debug statements in
Commons HTTP that were printing all the binary data to the log, but my
console appender was set to INFO so I wasn't seeing them. Setting Commons
HTTP to INFO level fixed my speed issue (two orders of magnitude faster).
Did this ever progress? Shall we make a jira?
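In case it helps others hitting the same slowdown, a hedged log4j.properties sketch (the logger names assume Commons HttpClient 3.x; check the actual category names appearing in your logs):

```properties
# Silence Commons HttpClient debug output, including wire-level
# logging of binary request/response bodies
log4j.logger.org.apache.commons.httpclient=INFO
log4j.logger.httpclient.wire=INFO
```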
--
View this message in context:
http://lucene.472066.n3.nabble.com/Detecting-replication-slave-health-tp677584p3664739.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hello,
Searching real-time sounds difficult with that amount of data. With large
documents, 3 million documents, and 5TB of data the index will be very large.
With indexes that large your performance will probably be I/O bound.
Do you plan on allowing phrase or proximity searches? If so, you
Hi,
is it possible to use the same index in a Solr webapp and additionally in an
EmbeddedSolrServer? The embedded one would be read only.
Thank you.
Hi,
I'm not sure which version of Solr/Tika you're using but I had a similar
experience which turned out to be the result of a design change to PDFBox.
https://issues.apache.org/jira/browse/SOLR-2886
Tricia
On Sat, Jan 14, 2012 at 12:53 AM, Wayne W wrote:
> Hi,
>
> we're using Solr running on
David,
The spellchecker normally won't give suggestions for any term that exists in
your index. So even if "wever" is misspelled in context, if it exists in the
index the spellchecker will not try correcting it. There are 3 workarounds:
1. Use the patch included with SOLR-2585 (this is for Trunk/4.x only
Hi Herman,
Try adding this to your replication config:
<str name="commitReserveDuration">00:00:10</str>
See also http://search-lucene.com/?q=commitReserveDuration&fc_project=Solr
Otis
Performance Monitoring SaaS for Solr -
http://sematext.com/spm/solr-performance-monitoring/index.html
- Original Message -
> From: H
We are at times having some difficulty achieving a 'successful' replication.
Our Operations personnel have reported the following behavior (which I cannot
attest to): a master has a set of segment files (let's say 25). A slave then
polls the master, gets the list of segment files that differ an
Wayne,
Have you asked on Tika's ML?
You may also want to watch https://issues.apache.org/jira/browse/SOLR-2901
Otis
Performance Monitoring SaaS for Solr -
http://sematext.com/spm/solr-performance-monitoring/index.html
- Original Message -
> From: Wayne W
> To: solr-user@lucene.
Johnny,
If you are indexing a catalog of songs and artists, you can write a query parser
or search component that recognizes known things like songs (you must have
"bohemian rhapsody" in your catalog) or artist names (you must have the exact
string "queen" in your catalog), or even their combinat
Hello,
>
> From: mustafozbek
>
>All documents that we use are rich text documents and we parse them with
>Tika. We need to search real time.
Because of the real-time requirement, you'll need to use an unreleased/dev
version of Solr.
>Robert Stewart wrote
>> Any idea
Hi Everyone,
Please help out if you know what is going on.
We are upgrading to Solr 3.5 (from 1.4.1) and busy with a re-index and test of
our data.
Everything seems OK, but Date Fields seem to be "broken" when using with the
MoreLikeThis handler
(I also saw the same error on Date Fields using
Okay, thx =)
But I replaced it now in my data-config ;)
-
--- System
One Server, 12 GB RAM, 2 Solr Instances, 8 Cores,
1 Core with 45 Million Documents other Cores < 200.000
- Solr1 for Search-Requests - commit every Minu
(12/01/16 19:43), stockii wrote:
> Why does this not work?
> OR
> i dont know where is my error?
> i only want to replace comma with a blank ...
Try t
Why does this not work?
OR
I don't know where my error is.
I only want to replace the comma with a blank ...
thx =)))
-
---
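Since the fix ended up in data-config, a hedged data-config.xml sketch using DIH's RegexTransformer to replace commas with a space (the entity name, SQL query, and column name are assumptions for illustration):

```xml
<!-- Sketch: RegexTransformer rewrites the column value during import -->
<entity name="doc" transformer="RegexTransformer"
        query="select id, title from docs">
  <field column="title" regex="," replaceWith=" "/>
</entity>
```

An alternative is to do the replacement at analysis time in schema.xml with solr.PatternReplaceFilterFactory, which leaves the stored value untouched.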