Re: Fastest way to import big amount of documents in SolrCloud

2014-05-02 Thread Alexander Kanarsky
If you build your index in Hadoop, read this (it is about Cloudera Search, but in my understanding it should also work with the Solr Hadoop contrib since 4.7)
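
For reference, a minimal sketch of driving the Solr Hadoop contrib's MapReduceIndexerTool programmatically; the paths, ZooKeeper address, morphline file, and collection name below are placeholders, and the exact set of flags may differ between versions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.util.ToolRunner;
    import org.apache.solr.hadoop.MapReduceIndexerTool;

    public class BuildIndexOnHadoop {
        public static void main(String[] args) throws Exception {
            // Build Lucene/Solr index shards in MapReduce, then merge them
            // into the live SolrCloud collection (--go-live).
            int rc = ToolRunner.run(new Configuration(), new MapReduceIndexerTool(), new String[] {
                "--morphline-file", "morphline.conf",         // ETL/parsing config (placeholder)
                "--output-dir", "hdfs://namenode/tmp/outdir", // where the shard indexes are written
                "--zk-host", "zk1:2181/solr",                 // SolrCloud ZooKeeper ensemble
                "--collection", "collection1",
                "--go-live",                                  // merge shards into the live collection
                "hdfs://namenode/data/input"                  // input files to index
            });
            System.exit(rc);
        }
    }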

Re: Production Release process with Solr 3.5 implementation.

2012-11-01 Thread Alexander Kanarsky
Why not change the order to this: 3. Upgrade Solr Schema (Master), replication is disabled. 4. Start Index Rebuild (if step 3). 1. Pull up Maintenance Pages. 2. Upgrade DB. 5. Upgrade UI code. 6. Index build complete? Start Replication. 7. Verify UI and Drop Maintenance Pages. So your slaves will
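
A minimal sketch of toggling replication around the rebuild, assuming the standard /replication handler; the host and core names are placeholders:

    import java.io.InputStream;
    import java.net.URL;

    public class ToggleReplication {
        // Hit the master's ReplicationHandler with enable/disable commands.
        static void command(String cmd) throws Exception {
            URL url = new URL("http://master-host:8983/solr/core0/replication?command=" + cmd);
            try (InputStream in = url.openStream()) {
                in.transferTo(System.out); // print the handler's response
            }
        }

        public static void main(String[] args) throws Exception {
            command("disablereplication"); // before the schema upgrade / rebuild (step 3)
            // ... rebuild the index on the master ...
            command("enablereplication");  // after the rebuild completes (step 6)
        }
    }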

Re: 400 MB Fields

2011-06-08 Thread Alexander Kanarsky
Otis, not sure about Solr, but with Lucene it was certainly doable. I saw fields way bigger than 400 MB indexed, sometimes having a large set of unique terms as well (think something like a log file with lots of alphanumeric tokens, a couple of gigabytes in size). While indexing and querying of such
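
A minimal sketch of how a very large field can be fed to Lucene from a Reader instead of a String, using the 3.x-era API (field and path names are placeholders). On the Solr side the maxFieldLength setting in solrconfig.xml would also have to be raised, since the default silently truncates after 10,000 tokens:

    import java.io.File;
    import java.io.FileReader;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class IndexHugeField {
        public static void main(String[] args) throws Exception {
            IndexWriterConfig cfg =
                new IndexWriterConfig(Version.LUCENE_33, new StandardAnalyzer(Version.LUCENE_33));
            IndexWriter writer = new IndexWriter(FSDirectory.open(new File("index")), cfg);

            Document doc = new Document();
            // Reader-based field: tokenized and indexed but not stored, so the
            // multi-hundred-megabyte value never has to sit in memory as one String.
            doc.add(new Field("body", new FileReader(new File("huge-log.txt"))));
            writer.addDocument(doc);
            writer.close();
        }
    }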

Re: copyField generates multiple values encountered for non multiValued field

2011-05-31 Thread Alexander Kanarsky
Alexander, I saw the same behavior in 1.4.x with non-multivalued fields when updating the document in the index (i.e. obtaining the doc from the index, modifying some fields and then adding the document with the same id back). I do not know what causes this, but it looks like the copyField logic

Re: Replication Clarification Please

2011-05-15 Thread Alexander Kanarsky
to 0 6. Changed caches from LRUCache to FastLRUCache as we had hit ratios well over 75% to increase warming speed 7. Increased the poll interval to 6 minutes and re-indexed all content. Thanks, Ravi Kiran Bhaskar On Wed, May 11, 2011 at 6:00 PM, Alexander Kanarsky alexan

Re: Replication Clarification Please

2011-05-11 Thread Alexander Kanarsky
Downloaded: 317.48 KB / 436.24 MB [0.0%] Downloading File: _ayu.nrm, Downloaded: 4 bytes / 4 bytes [100.0%] Time Elapsed: 17s, Estimated Time Remaining: 23902s, Speed: 18.67 KB/s Thanks, Ravi Kiran Bhaskar On Tue, May 10, 2011 at 4:10 AM, Alexander Kanarsky alexan...@trulia.com

Re: Replication Clarification Please

2011-05-10 Thread Alexander Kanarsky
Ravi, as far as I remember, this is how the replication logic works (see the SnapPuller class, fetchLatestIndex method): 1. Does the Slave get the whole index every time during replication or just the delta since the last replication happened? It looks at the index version AND the index
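
Roughly, the decision the slave makes on each poll can be sketched as below. This is a simplification of the SnapPuller/fetchLatestIndex logic, not the actual code, and the Master/Slave interfaces here are hypothetical stand-ins:

    import java.util.List;

    // Hypothetical interfaces, just to make the decision readable; the real
    // classes are SnapPuller and ReplicationHandler inside Solr.
    interface Master {
        long indexVersion();
        List<String> fileListFor(long version);
    }

    interface Slave {
        long indexVersion();
        boolean hasFile(String name);
        void download(String name);
    }

    class ReplicationSketch {
        // Simplified view of fetchLatestIndex(): compare versions first,
        // then pull only the files the slave does not already have.
        static void poll(Master master, Slave slave) {
            long masterVersion = master.indexVersion();
            if (masterVersion == slave.indexVersion()) {
                return; // versions match: nothing is transferred on this poll
            }
            for (String file : master.fileListFor(masterVersion)) {
                if (!slave.hasFile(file)) {
                    slave.download(file); // delta only: segments missing on the slave
                }
            }
            // If the master's index was rewritten (e.g. after an optimize), almost
            // all file names differ, so the "delta" amounts to a full copy into a
            // fresh index directory, followed by a commit/reload on the slave.
        }
    }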

Re: Multicore Relaod Theoretical Question

2011-01-24 Thread Alexander Kanarsky
response. You said that the old index files were still in use. That means Linux does not *really* delete them until Solr releases its locks on them, which happens while reloading? Thank you for sharing your experiences! Kind regards, Em Alexander Kanarsky wrote: Em, yes, you can replace

Re: Multicore Relaod Theoretical Question

2011-01-22 Thread Alexander Kanarsky
Em, yes, you can replace the index (get the new one into a separate folder like index.new and then rename it to the index folder) outside Solr, then just make the HTTP call to reload the core. Note that the old index files may still be in use (they continue to serve queries while reloading),
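
A minimal sketch of that swap-and-reload sequence, with placeholder paths and core name; the reload goes through the CoreAdmin RELOAD action:

    import java.io.File;
    import java.io.InputStream;
    import java.net.URL;

    public class SwapIndexAndReload {
        public static void main(String[] args) throws Exception {
            File dataDir = new File("/var/solr/core0/data");
            File current = new File(dataDir, "index");
            File fresh   = new File(dataDir, "index.new"); // new index built elsewhere
            File retired = new File(dataDir, "index.old");

            // Swap the directories outside Solr; the running searcher still holds
            // the old files open, so queries keep being served during the swap.
            if (!current.renameTo(retired) || !fresh.renameTo(current)) {
                throw new IllegalStateException("index directory swap failed");
            }

            // Ask Solr to reload the core so it opens a searcher on the new index.
            URL reload = new URL(
                "http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0");
            try (InputStream in = reload.openStream()) {
                in.transferTo(System.out);
            }
            // index.old can be removed once the reload has completed.
        }
    }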

Re: old index files not deleted on slave

2011-01-22 Thread Alexander Kanarsky
I see the file -rw-rw-r-- 1 feeddo feeddo 0 Dec 15 01:19 lucene-cdaa80c0fefe1a7dfc7aab89298c614c-write.lock was created on Dec. 15. At the end of the replication, as far as I remember, the SnapPuller tries to open the writer to ensure the old files are deleted, and in your case it cannot

Re: Can I host TWO separate datasets in Solr?

2011-01-21 Thread Alexander Kanarsky
Igor, you can set up two different Solr cores in solr.xml and search them separately. See the multicore example in the Solr distribution. -Alexander On Fri, Jan 21, 2011 at 3:51 PM, Igor Chudov ichu...@gmail.com wrote: I would like to have two sets of data and search them separately (they are used for
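
For illustration, a small sketch of querying two such cores separately with the SolrJ client of that era; the core names and URLs are placeholders:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class TwoCores {
        public static void main(String[] args) throws Exception {
            // Each core defined in solr.xml has its own URL and its own index.
            SolrServer products = new CommonsHttpSolrServer("http://localhost:8983/solr/core0");
            SolrServer articles = new CommonsHttpSolrServer("http://localhost:8983/solr/core1");

            QueryResponse p = products.query(new SolrQuery("ipod"));
            QueryResponse a = articles.query(new SolrQuery("solr replication"));

            System.out.println("products hits: " + p.getResults().getNumFound());
            System.out.println("articles hits: " + a.getResults().getNumFound());
        }
    }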

Re: Solr + Hadoop

2011-01-13 Thread Alexander Kanarsky
Joan, make sure that you are running the job on a Hadoop 0.21 cluster. (It looks like you have compiled the apache-solr-hadoop jar with Hadoop 0.21 but are using it on a 0.20 cluster). -Alexander

Re: Creating Solr index from map/reduce

2011-01-03 Thread Alexander Kanarsky
Joan, the current version of the patch assumes the location and names of the schema and solrconfig files ($SOLR_HOME/conf); they are hardcoded (see the SolrRecordWriter constructor). Multi-core configuration with separate configuration locations via solr.xml is not supported for now. As a

Re: Searching with wrong keyboard layout or using translit

2010-10-28 Thread Alexander Kanarsky
Pavel, I think there is no single way to implement this. Some ideas that might be helpful: 1. Consider adding additional terms while indexing. This assumes converting the Russian text to both translit and wrong-keyboard forms and indexing the converted terms along with the original terms (i.e. your
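
A small sketch of that first idea: map a term typed in the wrong (Latin) keyboard layout back to Cyrillic so both forms can be indexed. Only a few keys of the standard QWERTY/ЙЦУКЕН layout are mapped here for illustration; the full map (and a similar transliteration table) would be filled in the same way:

    import java.util.HashMap;
    import java.util.Map;

    public class WrongLayoutConverter {
        // Partial QWERTY -> ЙЦУКЕН mapping; the real map covers the whole keyboard.
        private static final Map<Character, Character> LATIN_TO_CYRILLIC = new HashMap<>();
        static {
            LATIN_TO_CYRILLIC.put('q', 'й');
            LATIN_TO_CYRILLIC.put('w', 'ц');
            LATIN_TO_CYRILLIC.put('e', 'у');
            LATIN_TO_CYRILLIC.put('r', 'к');
            LATIN_TO_CYRILLIC.put('t', 'е');
            LATIN_TO_CYRILLIC.put('y', 'н');
        }

        // Produce the "same keys, Russian layout" form of a term; index it as an
        // extra token alongside the original so either spelling matches at query time.
        static String toCyrillicLayout(String term) {
            StringBuilder sb = new StringBuilder(term.length());
            for (char c : term.toCharArray()) {
                sb.append(LATIN_TO_CYRILLIC.getOrDefault(c, c));
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            System.out.println(toCyrillicLayout("qwerty")); // -> йцукен
        }
    }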

Re: I was at a search vendor round table today...

2010-09-22 Thread Alexander Kanarsky
He said some other things about a huge petabyte hosted search collection they have, used by banks... In the context of your discussion this reference sounds really, really funny... :) -Alexander On Wed, Sep 22, 2010 at 1:17 PM, Grant Ingersoll gsing...@apache.org wrote: On Sep 22, 2010, at 2:04