Re: DataImportHandler - Automatic scheduling of delta imports in Solr in windows 7

2014-04-11 Thread William Bell
You can use PowerShell in Windows to kick off a URL at a scheduled time. On Thu, Apr 10, 2014 at 11:02 PM, harshrossi harshro...@gmail.com wrote: I am using *DeltaImportHandler* for indexing data in Solr. Currently I am manually indexing the data into Solr by selecting commands full-import
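A minimal sketch of the kind of script a scheduled task could run, written in Python rather than PowerShell. It assumes Solr listens on `localhost:8983`, the core is named `collection1`, and DataImportHandler is registered at `/dataimport`; all three are assumptions for illustration, not details from the thread.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def delta_import_url(base_url, core, handler="dataimport"):
    """Build the DataImportHandler URL that starts a delta import."""
    # clean=false keeps existing documents; commit=true makes changes visible.
    query = urlencode({"command": "delta-import", "clean": "false", "commit": "true"})
    return f"{base_url.rstrip('/')}/{core}/{handler}?{query}"


def trigger_delta_import(base_url="http://localhost:8983/solr", core="collection1"):
    """Request the URL once and return the HTTP status code."""
    with urlopen(delta_import_url(base_url, core), timeout=30) as response:
        return response.status
```

A scheduler such as Windows Task Scheduler would then call `trigger_delta_import()` at the desired interval; the script itself does no scheduling.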

Re: MapReduceIndexerTool does not respect Lucene version in solrconfig Was: converting 4.7 index to 4.3.1

2014-04-11 Thread Dmitry Kan
Thanks! So solr 4.7 does not seem to respect the luceneMatchVersion on the binary (index) level. Or perhaps, I misunderstand the meaning of the luceneMatchVersion. This is what I see when loading index from hdfs via luke and launching the Index Checker tool: [clip] Segments file=segments_2

Re: Problem Integrating solr-4.7.1 with apache-nutch-1.8

2014-04-11 Thread ArunVC
Can you try this: bin/nutch solrindex http://127.0.0.1:8080/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

Re: Pushing content to Solr from Nutch

2014-04-11 Thread Furkan KAMACI
Hi Xavier; I think that it is better to ask this question on the Nutch user list. Thanks; Furkan KAMACI 2014-04-11 7:52 GMT+03:00 Jack Krupansky j...@basetechnology.com: Does your Solr schema match the data output by Nutch? It's up to you to create a Solr schema that matches the output of Nutch

Re: MapReduceIndexerTool does not respect Lucene version in solrconfig Was: converting 4.7 index to 4.3.1

2014-04-11 Thread Shawn Heisey
On 4/11/2014 12:42 AM, Dmitry Kan wrote: Thanks! So solr 4.7 does not seem to respect the luceneMatchVersion on the binary (index) level. Or perhaps, I misunderstand the meaning of the luceneMatchVersion. luceneMatchVersion does not dictate the index format. It is a way to signal things like

Re: MapReduceIndexerTool does not respect Lucene version in solrconfig Was: converting 4.7 index to 4.3.1

2014-04-11 Thread Dmitry Kan
Thanks Shawn, perhaps the comment on the luceneMatchVersion in the example schema.xml could be changed to reflect / clarify this? <!-- Controls what version of Lucene various components of Solr adhere to. Generally, you want to use the latest version to get all bug fixes and

Re: Shared Stored Field

2014-04-11 Thread StrW_dev
Erick Erickson wrote: So you're saying that you have B_1 - B_8 in one doc, B_9 - B_16 in another doc etc? Well yes, that could work, but this would mean we get a lot of unique dynamic fields, basically equal to the number of documents in our system, and I am not sure if that is a good practice.

Re: Solr relevancy tuning

2014-04-11 Thread Giovanni Bricconi
Hello Doug, I have just watched the Quepid demonstration video, and I strongly agree with your introduction: it is very hard to involve marketing/business people in repeated testing sessions, and spreadsheets or other kinds of files are not the right tool to use. Currently I'm quite alone in my

Class not found ICUFoldingFilter (SOLR-4852)

2014-04-11 Thread ronak kirit
Hello, I am facing the same issue discussed at SOLR-4852. I am getting the error below: Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.lucene.analysis.icu.ICUFoldingFilter at

highlighting displays to much

2014-04-11 Thread aowen
I am using Solr 4.3.1 and want to highlight complete sentences if possible, or at least not cut up words. If it finds something, the whole field is displayed instead of only 180 chars. The field is: <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100"> <field name="plain_text"

SOLR problem with full-import and shards

2014-04-11 Thread Wojciech Jaworski
Hi, I built an Apache SOLR cloud (version 4.7.0) with 3 shards. I chose the implicit routing mechanism while creating a new collection (one shard per month, with a field in date format MM used as the shardId). I configured DataImportHandler with a database as the data source. Finally I run full-import

RE: Fails to index if unique field has special characters

2014-04-11 Thread Cool Techi
Thanks, that was helpful. Regards, Rohit Date: Thu, 10 Apr 2014 08:44:36 -0700 From: iori...@yahoo.com Subject: Re: Fails to index if unique field has special characters To: solr-user@lucene.apache.org Hi Ayush, I thinks this IBM!12345. The exclamation mark ('!') is critical here, as

Re: Fails to index if unique field has special characters

2014-04-11 Thread Markus Jelsma
Well, this is somewhat of a problem if you have URLs as uniqueKey that contain exclamation marks. Isn't it an idea to allow those to be escaped and thus ignored by CompositeIdRouter? On Friday, April 11, 2014 11:43:31 AM Cool Techi wrote: Thanks, that was helpful. Regards, Rohit
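To make the concern concrete, here is a rough Python illustration of how an id containing `!` reads as a composite id: the part before the first `!` is treated as a shard key. This only shows the split, not Solr's actual hashing or routing code, and the function name `split_composite_id` is hypothetical.

```python
def split_composite_id(doc_id):
    """Split an id at the first '!' into (shard_key, remainder).

    Illustration only: this mimics how a composite id is read,
    not how Solr hashes the parts to pick a shard.
    """
    shard_key, separator, remainder = doc_id.partition("!")
    if not separator:
        # No '!' present, so there is no shard-key prefix.
        return None, doc_id
    return shard_key, remainder
```

With the id from the quoted reply, `split_composite_id("IBM!12345")` yields `"IBM"` as the shard key. A URL used as uniqueKey that happens to contain `!` would be split the same way, which is the situation Markus describes.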

Re: DataImportHandler - Automatic scheduling of delta imports in Solr in windows 7

2014-04-11 Thread harshrossi
Yes, that is all fine with me. The only thing that worries me is what needs to be coded in the batch file. I will just try a sample batch file and get back with queries if any. Thank you

Re: High CPU usage after import

2014-04-11 Thread Александр Вандышев
I realized what the problem was. One of the Solr threads freezes when importing MP3 files. When there are many such files, Solr loads all processors. Is there a way to free the thread? Re: High CPU usage after import That could mean that the code is hung somehow. Or, maybe Solr is just working on the

[ANN] Solr learning resources on safariflow.com (w/subscription or free trial)

2014-04-11 Thread Michael Sokolov
I just wanted to let people know about some recent Solr books and videos that are now available at safariflow.com. You can sign up for a free trial and get instant access, buy a subscription, or you may already be a subscriber. I don't normally send out announcements like this, but because

Search a list of words and returned order

2014-04-11 Thread Croci Francesco Luigi (ID SWS)
When I search for a list of words, by default Solr uses the OR operator. In my case I index PDF files. What can I do so that when I search the index for a list of words, I get the list of documents ordered first by the ones that have all the words in them? Thank you, Francesco

RE: Were changes made to facetting on multivalued fields recently?

2014-04-11 Thread Jean-Sebastien Vachon
Thanks to both of you. I finally found the issue and you were right (again) ;) The problem was not coming from the full indexation code containing the SQL replace statement but from another process whose job is to maintain our index up to date. This process had no idea that commas were to be

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
Sorry - yes, I meant to say leader. Each JVM has 16G of memory. On 10 April 2014 20:54, Erick Erickson erickerick...@gmail.com wrote: First, there is no master node, just leaders and replicas. But that's a nit. No real clue why you would be going out of memory. Deleting a document, even by

Re: Class not found ICUFoldingFilter (SOLR-4852)

2014-04-11 Thread Shawn Heisey
On 4/11/2014 3:44 AM, ronak kirit wrote: I am facing the same issue discussed at SOLR-4852. I am getting the error below: Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.lucene.analysis.icu.ICUFoldingFilter at

Solr dosn't load index at startup: out of memory

2014-04-11 Thread Erik
Hi, my Solr (v. 4.5), after months of work, suddenly stopped indexing: it responded to queries but didn't index new data anymore. Here is the error message: ERROR - 2014-04-11 15:52:30.317; org.apache.solr.common.SolrException; java.lang.IllegalStateException: this writer hit an OutOfMemoryError;

Re: Search a list of words and returned order

2014-04-11 Thread Jack Krupansky
Generally, the documents containing more of the terms should score higher and be returned first, but relevancy for some terms can skew that ordering, to some degree. What specific use cases are failing for you? You can always add an additional optional subquery which is the AND of all terms
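The suggestion of an extra optional subquery can be sketched as a small query builder in Python. It assumes the default operator is OR, that the words are already escaped for the query parser, and that a boost of 10 is a reasonable starting point; the boost value and the function name are illustrative choices, not recommendations from the thread.

```python
def all_terms_first_query(words, boost=10):
    """Return an OR query plus an optional boosted clause requiring every word."""
    # Plain terms: any document matching at least one word is a hit.
    any_clause = " ".join(words)
    # Optional clause: only documents containing all words match it,
    # and the boost lifts those documents toward the top.
    all_clause = " ".join(f"+{word}" for word in words)
    return f"{any_clause} ({all_clause})^{boost}"
```

For the words `solr`, `index` and `pdf`, this builds `solr index pdf (+solr +index +pdf)^10`. Documents with only some of the words still match, while those containing all three also satisfy the boosted clause.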

Re: Solr dosn't load index at startup: out of memory

2014-04-11 Thread Rafał Kuć
Hello! Do you have warming queries defined? -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi, my Solr (v. 4.5), after months of work, suddenly stopped indexing: it responded to queries but didn't index

RE: Relevance/Rank

2014-04-11 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, thanks Aman/Eric. I moved part of the query under q=*:* and there is a difference in the score and the order. It seems to work for me now. I will use this and move forward. Thanks, Ravi -Original Message- From: Aman Tandon [mailto:amantandon...@gmail.com] Sent: Friday, April 11, 2014 12:02

Re: Solr Admin core status - Index is not Current

2014-04-11 Thread Chris W
Any help on this is much appreciated. I cannot find any documentation around this and it would be good to understand what this means. Thanks On Thu, Apr 10, 2014 at 1:50 PM, Chris W chris1980@gmail.com wrote: Hi there, I am using SolrCloud (4.3). I am trying to get the status of a core

Re: Solr Admin core status - Index is not Current

2014-04-11 Thread Shawn Heisey
On 4/10/2014 2:50 PM, Chris W wrote: Hi there, I am using SolrCloud (4.3). I am trying to get the status of a core from Solr using (localhost:8000/solr/admin/cores?action=STATUS&core=core) and I get the following output: <int name="numDocs">100</int> <int name="maxDoc">102</int> <int name="deletedDocs">2</int>

Re: High CPU usage after import

2014-04-11 Thread Erick Erickson
Are you storing the data? That is, the raw binary of the MP3? B/c when stored=true, Solr will try to compress the data, perhaps that's what's driving the CPU utilization? Easy test: set stored=false for everything.. FWIW, Erick On Fri, Apr 11, 2014 at 5:23 AM, Александр Вандышев

Re: deleting large amount data from solr cloud

2014-04-11 Thread Erick Erickson
Using 16G for a 360G index is probably pushing things. A lot. I'm actually a bit surprised that the problem only occurs when you delete docs. The simplest thing would be to increase the JVM memory. You should be looking at your index to see how big it is, be sure to subtract out the *.fdt and

Re: Solr dosn't load index at startup: out of memory

2014-04-11 Thread Erick Erickson
My assumption is that you've been adding documents and just have finally run out of space. Is that true? Best, Erick On Fri, Apr 11, 2014 at 9:31 AM, Rafał Kuć r@solr.pl wrote: Hello! Do you have warming queries defined? -- Regards, Rafał Kuć Performance Monitoring * Log

Strange double-logging with log4j

2014-04-11 Thread Shawn Heisey
This is lucene_solr_4_7_2_r1586229, downloaded from the release manager's staging area. I configured the following in my log4j.properties file: log4j.rootLogger=WARN, file log4j.category.org.apache.solr.core.SolrCore=INFO, file Now EVERYTHING that SolrCore logs (which is all at INFO) is being

Re: Solr Admin core status - Index is not Current

2014-04-11 Thread Chris W
Thanks, Shawn. On Fri, Apr 11, 2014 at 11:11 AM, Shawn Heisey s...@elyograg.org wrote: On 4/10/2014 2:50 PM, Chris W wrote: Hi there, I am using SolrCloud (4.3). I am trying to get the status of a core from Solr using (localhost:8000/solr/admin/cores?action=STATUS&core=core) and I get

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
Tried to increase the memory to 24G but that wasn't enough either. Agree that the index has now grown too much and that we should have monitored this and taken action much earlier. The search operations seem to run ok with 16G - mainly because the bulk of the data that we are trying to delete is not getting

Re: deleting large amount data from solr cloud

2014-04-11 Thread Shawn Heisey
On 4/10/2014 7:25 PM, Vinay Pothnis wrote: We tried to delete the data through a query - say 1 day/month's worth of data. But after deleting just 1 month's worth of data, the master node is going out of memory - heap space. Wondering if there is any way to incrementally delete the data

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
The query is something like this: *curl -H 'Content-Type: text/xml' --data '<delete><query>param1:(val1 OR val2) AND -param2:(val3 OR val4) AND date_param:[138395520 TO 138516480]</query></delete>' 'http://host:port/solr/coll-name1/update?commit=true'* Trying to restrict the number of documents
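One way to restrict how many documents each delete touches is to split the date range into smaller sub-ranges and send one delete-by-query per sub-range. The Python sketch below only builds the XML bodies; the step size, the helper names, and the idea that smaller ranges will relieve the heap pressure are assumptions, not something confirmed in the thread.

```python
from xml.sax.saxutils import escape


def split_range(start, end, step):
    """Yield inclusive (low, high) sub-ranges covering [start, end] without overlap."""
    low = start
    while low <= end:
        high = min(low + step - 1, end)
        yield low, high
        low = high + 1


def delete_payloads(base_query, field, start, end, step):
    """Build one delete-by-query XML body per sub-range of the given field."""
    for low, high in split_range(start, end, step):
        query = f"{base_query} AND {field}:[{low} TO {high}]"
        # escape() protects any &, < or > that appear in the query text.
        yield f"<delete><query>{escape(query)}</query></delete>"
```

Each body could then be posted to the collection's `/update` handler in turn, with a commit between batches or at the end, depending on how much work one commit should carry.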

Re: [ANN] Solr learning resources on safariflow.com (w/subscription or free trial)

2014-04-11 Thread Alexandre Rafalovitch
Looks nice. Would love to see the author-side usage/statistics too. To know which chapters of my book were most useful/recommended. Regards, Alex On 11/04/2014 8:45 pm, Michael Sokolov msoko...@safaribooksonline.com wrote: I just wanted to let people know about some recent Solr books and

Re: deleting large amount data from solr cloud

2014-04-11 Thread Aman Tandon
Vinay, please share your experience after trying this solution. On Sat, Apr 12, 2014 at 4:12 AM, Vinay Pothnis poth...@gmail.com wrote: The query is something like this: *curl -H 'Content-Type: text/xml' --data '<delete><query>param1:(val1 OR val2) AND -param2:(val3 OR val4) AND