lost in solr new core architecture

2014-04-11 Thread Aman Tandon
Hi, I am currently using Solr 4.2 with Tomcat. Right now I am stuck because I don't know how to upgrade to Solr 4.7: I am familiar with the core architecture of Solr 4.2, in which we defined every core name as well as its instanceDir, but not with that of Solr 4.7. Any h

Re: deleting large amount data from solr cloud

2014-04-11 Thread Aman Tandon
Vinay, please share your experience after trying this solution. On Sat, Apr 12, 2014 at 4:12 AM, Vinay Pothnis wrote: > The query is something like this: > > > *curl -H 'Content-Type: text/xml' --data '<delete><query>param1:(val1 OR > val2) AND -param2:(val3 OR val4) AND date_param:[138395520 TO > 13851648

Re: [ANN] Solr learning resources on safariflow.com (w/subscription or free trial)

2014-04-11 Thread Alexandre Rafalovitch
Looks nice. Would love to see the author-side usage/statistics too. To know which chapters of my book were most useful/recommended. Regards, Alex On 11/04/2014 8:45 pm, "Michael Sokolov" wrote: > I just wanted to let people know about some recent Solr books and videos > that are now availab

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
The query is something like this: *curl -H 'Content-Type: text/xml' --data '<delete><query>param1:(val1 OR val2) AND -param2:(val3 OR val4) AND date_param:[138395520 TO 138516480]</query></delete>' 'http://host:port/solr/coll-name1/update?commit=true'* Trying to restrict the number of documents deleted via the date par
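
One way to restrict how much each delete touches is to split the date range into smaller windows and send one delete-by-query per window. The following is a minimal sketch, not something taken from the thread: it only builds the XML bodies (Solr's XML update format wraps a query in <delete><query>...</query></delete>) and leaves the HTTP POST out. The field name date_param and the range bounds come from the message above; the window size of 30240 and the helper names are assumptions chosen for illustration.

```python
from xml.sax.saxutils import escape


def delete_windows(start, end, step):
    """Yield inclusive (lo, hi) sub-ranges that together cover [start, end]."""
    lo = start
    while lo <= end:
        hi = min(lo + step - 1, end)
        yield lo, hi
        lo = hi + 1


def delete_body(lo, hi, field="date_param"):
    """Build one delete-by-query XML body for a single window."""
    query = f"{field}:[{lo} TO {hi}]"
    return f"<delete><query>{escape(query)}</query></delete>"


# Hypothetical window size; tune it to whatever the cluster can absorb.
windows = list(delete_windows(138395520, 138516480, 30240))
bodies = [delete_body(lo, hi) for lo, hi in windows]

for body in bodies:
    print(body)
```

Each body could then be posted to the collection's /update handler in turn, committing between windows so that no single request has to process the whole range.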

Re: deleting large amount data from solr cloud

2014-04-11 Thread Shawn Heisey
On 4/10/2014 7:25 PM, Vinay Pothnis wrote: When we tried to delete the data through a query - say 1 day/month's worth of data. But after deleting just 1 month's worth of data, the master node is going out of memory - heap space. Wondering is there any way to incrementally delete the data without

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
Tried to increase the memory to 24G but that wasn't enough either. Agree that the index has now grown too much and that we should have monitored this and taken action much earlier. The search operations seem to run ok with 16G - mainly because the bulk of the data that we are trying to delete is not getting sear

Re: Solr Admin core status - Index is not "Current"

2014-04-11 Thread Chris W
Thanks, Shawn. On Fri, Apr 11, 2014 at 11:11 AM, Shawn Heisey wrote: > On 4/10/2014 2:50 PM, Chris W wrote: > >> Hi there >> >>I am using solrcloud (4.3). I am trying to get the status of a core >> from >> solr using (localhost:8000/solr/admin/cores?action=STATUS&core=) >> and >> i get the

Strange double-logging with log4j

2014-04-11 Thread Shawn Heisey
This is lucene_solr_4_7_2_r1586229, downloaded from the release manager's staging area. I configured the following in my log4j.properties file: log4j.rootLogger=WARN, file log4j.category.org.apache.solr.core.SolrCore=INFO, file Now EVERYTHING that SolrCore logs (which is all at INFO) is being
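
A common cause of this kind of double logging is that the same appender is attached both to the root logger and to a child logger, so a record is handled once by the child and once more when it is passed up the hierarchy. The sketch below is an analogy only, using Python's logging module rather than log4j; the logger names and the CountingHandler class are invented for the demonstration. In log4j 1.x the corresponding switch is the logger's additivity setting (for example log4j.additivity.org.apache.solr.core.SolrCore=false); whether that is the fix for this particular report is an assumption.

```python
import logging


class CountingHandler(logging.Handler):
    """Collects formatted messages so we can count how often one is handled."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record.getMessage())


def count_emits(propagate):
    """Log one INFO message on a child logger and count handler invocations."""
    handler = CountingHandler()
    # Distinct logger names per setting keep the two demonstrations separate.
    parent = logging.getLogger(f"demo{int(propagate)}")
    child = logging.getLogger(f"demo{int(propagate)}.core")
    parent.setLevel(logging.WARNING)
    parent.addHandler(handler)   # same handler on the parent ...
    child.setLevel(logging.INFO)
    child.addHandler(handler)    # ... and on the child
    child.propagate = propagate
    child.info("core opened")
    return len(handler.records)


print(count_emits(True))   # handled by the child and again by the parent
print(count_emits(False))  # handled by the child only
```

Note that the parent's WARNING level does not filter the propagated record: logger levels are checked only where the record is created, which mirrors why an INFO message can reach a WARN-level root appender.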

Re: Solr dosn't load index at startup: out of memory

2014-04-11 Thread Erick Erickson
My assumption is that you've been adding documents and just have finally run out of space. Is that true? Best, Erick On Fri, Apr 11, 2014 at 9:31 AM, Rafał Kuć wrote: > Hello! > > Do you have warming queries defined? > > -- > Regards, > Rafał Kuć > Performance Monitoring * Log Analytics * S

Re: deleting large amount data from solr cloud

2014-04-11 Thread Erick Erickson
Using 16G for a 360G index is probably pushing things. A lot. I'm actually a bit surprised that the problem only occurs when you delete docs. The simplest thing would be to increase the JVM memory. You should be looking at your index to see how big it is; be sure to subtract out the *.fdt and *

Re: High CPU usage after import

2014-04-11 Thread Erick Erickson
Are you storing the data? That is, the raw binary of the MP3? Because when stored="true", Solr will try to compress the data; perhaps that's what's driving the CPU utilization? Easy test: set stored="false" for everything. FWIW, Erick On Fri, Apr 11, 2014 at 5:23 AM, Александр Вандышев wrote: > I

Re: Solr Admin core status - Index is not "Current"

2014-04-11 Thread Shawn Heisey
On 4/10/2014 2:50 PM, Chris W wrote: Hi there I am using solrcloud (4.3). I am trying to get the status of a core from solr using (localhost:8000/solr/admin/cores?action=STATUS&core=) and i get the following output 100 102 2 20527 20 *false* What does current mean? A few of the cores are op

Re: Solr Admin core status - Index is not "Current"

2014-04-11 Thread Chris W
Any help on this is much appreciated. I cannot find any documentation around this and it would be good to understand what this means. Thanks On Thu, Apr 10, 2014 at 1:50 PM, Chris W wrote: > Hi there > > I am using solrcloud (4.3). I am trying to get the status of a core from > solr using (loca

RE: Relevance/Rank

2014-04-11 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, thanks Aman/Eric. I moved part of the query under q=*:* and there is a difference in the score and the order. It seems to work for me now. I will use this and move forward. Thanks, Ravi -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Friday, April 11, 2014 12:02 AM

Re: Solr dosn't load index at startup: out of memory

2014-04-11 Thread Rafał Kuć
Hello! Do you have warming queries defined? -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ > Hi, > my solr (v. 4.5) after months of work suddenly stopped indexing: it responded > to queries but didn't index

Re: Search a list of words and returned order

2014-04-11 Thread Jack Krupansky
Generally, the documents containing more of the terms should score higher and be returned first, but "relevancy" for some terms can skew that ordering, to some degree. What specific use cases are failing for you? You can always add an additional optional subquery which is the AND of all terms
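
The idea of adding an optional subquery that is the AND of all terms can be sketched as plain string building. This is an illustration under assumptions, not the query from the thread: the function name is hypothetical, the terms are assumed to be simple words that need no escaping, and no boost is applied to the AND clause.

```python
def all_terms_first_query(terms):
    """Combine an OR query with an optional AND-of-all-terms subquery.

    Documents matching any term still match; documents matching every
    term also satisfy the second clause and so pick up extra score.
    """
    if not terms:
        raise ValueError("at least one term is required")
    any_clause = "(" + " OR ".join(terms) + ")"
    all_clause = "(" + " AND ".join(terms) + ")"
    return f"{any_clause} OR {all_clause}"


print(all_terms_first_query(["solr", "index", "pdf"]))
```

The resulting string can be passed as the q parameter. How strongly the second clause lifts the full matches depends on the scoring of the individual terms, so a boost on that clause may be needed in practice.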

Solr dosn't load index at startup: out of memory

2014-04-11 Thread Erik
Hi, my Solr (v. 4.5), after months of work, suddenly stopped indexing: it responded to queries but no longer indexed new data. Here is the error message: ERROR - 2014-04-11 15:52:30.317; org.apache.solr.common.SolrException; java.lang.IllegalStateException: this writer hit an OutOfMemoryError; canno

Re: Class not found ICUFoldingFilter (SOLR-4852)

2014-04-11 Thread Shawn Heisey
On 4/11/2014 3:44 AM, ronak kirit wrote: > I am facing the same issue discussed at SOLR-4852. I am getting below error: > > Caused by: java.lang.NoClassDefFoundError: Could not initialize class > org.apache.lucene.analysis.icu.ICUFoldingFilter > at > org.apache.lucene.analysis.icu.ICUFoldingFilter

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
Sorry - yes, I meant to say leader. Each JVM has 16G of memory. On 10 April 2014 20:54, Erick Erickson wrote: > First, there is no "master" node, just leaders and replicas. But that's a > nit. > > No real clue why you would be going out of memory. Deleting a > document, even by query should jus

RE: Were changes made to facetting on multivalued fields recently?

2014-04-11 Thread Jean-Sebastien Vachon
Thanks to both of you. I finally found the issue and you were right (again) ;) The problem was not coming from the full indexation code containing the SQL replace statement but from another process whose job is to keep our index up to date. This process had no idea that commas were to be rep

Search a list of words and returned order

2014-04-11 Thread Croci Francesco Luigi (ID SWS)
When I search for a list of words, Solr uses the OR operator by default. In my case I index PDF files. What can I do so that when I search the index for a list of words, I get the list of documents ordered first by the ones that have all the words in them? Thank you Francesco

[ANN] Solr learning resources on safariflow.com (w/subscription or free trial)

2014-04-11 Thread Michael Sokolov
I just wanted to let people know about some recent Solr books and videos that are now available at safariflow.com. You can sign up for a free trial and get instant access, buy a subscription, or you may already be a subscriber. I don't normally send out announcements like this, but because we

Re: High CPU usage after import

2014-04-11 Thread Александр Вандышев
I realized what the problem was. One of the Solr threads freezes when importing MP3 files. When there are many such files, Solr loads all processors. Is there a way to free the thread? Re: High CPU usage after import That could mean that the code is hung somehow. Or, maybe Solr is just working on the

Re: DataImportHandler - Automatic scheduling of delta imports in Solr in windows 7

2014-04-11 Thread harshrossi
Yes that is all fine with me. Only thing that worries me is what needs to be coded in the batch file. I will just try a sample batch file and get back with queries if any. Thank you -- View this message in context: http://lucene.472066.n3.nabble.com/DataImportHandler-Automatic-scheduling-of-d

Re: Fails to index if unique field has special characters

2014-04-11 Thread Markus Jelsma
Well, this is somewhat of a problem if you have URLs as uniqueKey that contain exclamation marks. Isn't it an idea to allow those to be escaped and thus ignored by CompositeIdRouter? On Friday, April 11, 2014 11:43:31 AM Cool Techi wrote: > Thanks, that was helpful. > Regards,Rohit > > >
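
To see why an exclamation mark inside a URL is awkward for composite-ID routing, here is a deliberately simplified sketch. It is not Solr's CompositeIdRouter code: it only treats the text before the first '!' as the routing prefix and the rest as the document part, and it does none of the hashing the real router performs. The function name and the example.com URL are hypothetical.

```python
def split_composite_id(doc_id):
    """Return (routing_prefix, remainder); prefix is None when there is no '!'."""
    prefix, sep, rest = doc_id.partition("!")
    if not sep:
        return None, doc_id
    return prefix, rest


# An intentional composite ID, as quoted elsewhere in this thread.
print(split_composite_id("IBM!12345"))

# A URL used as uniqueKey: the '!' is data, yet it gets read as a separator.
print(split_composite_id("http://example.com/#!/page"))
```

In the second call the prefix becomes the URL text up to the '!', which is the accidental interpretation the message above would like to be able to switch off by escaping.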

RE: Fails to index if unique field has special characters

2014-04-11 Thread Cool Techi
Thanks, that was helpful. Regards, Rohit > Date: Thu, 10 Apr 2014 08:44:36 -0700 > From: iori...@yahoo.com > Subject: Re: Fails to index if unique field has special characters > To: solr-user@lucene.apache.org > > Hi Ayush, > > I think this > > "IBM!12345". The exclamation mark ('!') is criti

SOLR problem with full-import and shards

2014-04-11 Thread Wojciech Jaworski
Hi, I built an Apache SOLR cloud (version 4.7.0) with 3 shards. I chose the implicit routing mechanism when creating a new collection (one shard per month; a field with date format MM is used as the shardId). I configured DataImportHandler with a database as the data source. Finally I ran a full-import (data

highlighting displays to much

2014-04-11 Thread aowen
I am using Solr 4.3.1 and want to highlight complete sentences if possible, or at least not cut up words. If it finds something, the whole field is displayed instead of only 180 chars. The field is: solrconfig setting for highlighting: true plain_text title description

Class not found ICUFoldingFilter (SOLR-4852)

2014-04-11 Thread ronak kirit
Hello, I am facing the same issue discussed at SOLR-4852. I am getting below error: Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.lucene.analysis.icu.ICUFoldingFilter at org.apache.lucene.analysis.icu.ICUFoldingFilterFactory.create(ICUFoldingFilterFactory.java:5

Re: Solr relevancy tuning

2014-04-11 Thread Giovanni Bricconi
Hello Doug, I have just watched the Quepid demonstration video, and I strongly agree with your introduction: it is very hard to involve marketing/business people in repeated testing sessions, and spreadsheets or other kinds of files are not the right tool to use. Currently I'm quite alone in my tuning

Re: Shared Stored Field

2014-04-11 Thread StrW_dev
Erick Erickson wrote > So you're saying that you have B_1 - B_8 in one doc, B_9 - B_16 in > another doc etc? Well, yes, that could work, but this would mean we get a lot of unique dynamic fields, basically equal to the number of documents in our system, and I am not sure if that is a good practice.

Re: MapReduceIndexerTool does not respect Lucene version in solrconfig Was: converting 4.7 index to 4.3.1

2014-04-11 Thread Dmitry Kan
Thanks Shawn, perhaps the comment on the luceneMatchVersion in the example schema.xml could be changed to reflect / clarify this? this comment made me think that the parameter is affecting the index side of things too (aka index format version). I.e. I would appreciate seeing there things lik

Re: MapReduceIndexerTool does not respect Lucene version in solrconfig Was: converting 4.7 index to 4.3.1

2014-04-11 Thread Shawn Heisey
On 4/11/2014 12:42 AM, Dmitry Kan wrote: > Thanks! So solr 4.7 does not seem to respect the luceneMatchVersion on the > binary (index) level. Or perhaps, I misunderstand the meaning of the > luceneMatchVersion. luceneMatchVersion does not dictate the index format. It is a way to signal things lik

Re: Pushing content to Solr from Nutch

2014-04-11 Thread Furkan KAMACI
Hi Xavier; I think that it is better to ask this question at Nutch user list. Thanks; Furkan KAMACI 2014-04-11 7:52 GMT+03:00 Jack Krupansky : > Does your Solr schema match the data output by nutch? It's up to you to > create a Solr schema that matches the output of nutch - read up on the > nu