Re: Deploying multiple ZooKeeper ensemble on a single machine

2015-04-07 Thread Shawn Heisey
On 4/7/2015 9:16 PM, Zheng Lin Edwin Yeo wrote: > I'm using SolrCloud 5.0.0 and ZooKeeper 3.4.6 running on Windows, and now > I'm trying to deploy a multiple ZooKeeper ensemble (3 servers) on a single > machine. These are the settings which I have configured, according to the > Solr Reference Guide

Re: Deploying multiple ZooKeeper ensemble on a single machine

2015-04-07 Thread nutchsolruser
I have to choose unique client port #’s for each. Here I can see that you have the same client port for all 3 servers. You can refer to this link.
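A minimal sketch of the point made in this reply: when three ZooKeeper servers share one machine, each needs its own clientPort, and the quorum and election ports in the server.N lines must differ too. The port numbers, the `zkdata` directory name and the `zoo_cfg` helper are assumptions chosen for illustration; they are not the poster's actual settings.

```python
def zoo_cfg(server_id, data_root="zkdata", n_servers=3):
    """Return zoo.cfg text for one member of a single-host ensemble.

    All values below are illustrative assumptions, not the original settings.
    """
    lines = [
        "tickTime=2000",
        "initLimit=5",
        "syncLimit=2",
        # Each server keeps its own data directory (with its own myid file).
        f"dataDir={data_root}/{server_id}",
        # Each server on the same host needs a distinct client port.
        f"clientPort={2180 + server_id}",
    ]
    # On one host the quorum and leader-election ports must also be distinct.
    for n in range(1, n_servers + 1):
        lines.append(f"server.{n}=localhost:{2887 + n}:{3887 + n}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for sid in (1, 2, 3):
        print(f"# zoo{sid}.cfg")
        print(zoo_cfg(sid))
```

Each dataDir would additionally need a `myid` file holding that server's number (1, 2 or 3), which the sketch does not create.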

Deploying multiple ZooKeeper ensemble on a single machine

2015-04-07 Thread Zheng Lin Edwin Yeo
Hi, I'm using SolrCloud 5.0.0 and ZooKeeper 3.4.6 running on Windows, and now I'm trying to deploy a multiple ZooKeeper ensemble (3 servers) on a single machine. These are the settings which I have configured, according to the Solr Reference Guide. These files are under \conf\ directory (C:\Users

How to trace error records during POST?

2015-04-07 Thread Simon Cheng
Good morning, I used Solr 4.7 to post 186,745 XML files and 186,622 files have been indexed. That means there are 123 XML files with errors. How can I trace what these files are? Thank you in advance, Simon Cheng.

RE: Trouble GetSpans lucene 4

2015-04-07 Thread Allison, Timothy B.
Oh, ok, if that's just a regular query, you will need to convert it to a SpanQuery, and you may need to rewrite the SpanQuery after conversion. If you're trying to do a concordance or trying to retrieve windows around the hits, take a look at ConcordanceSearcher within: https://github.com/tball

Re: Config join parse in solrconfig.xml

2015-04-07 Thread Frank li
Cool. It actually works after I removed those extra columns. Thanks for your help. On Mon, Apr 6, 2015 at 8:19 PM, Erick Erickson wrote: > df does not allow multiple fields, it stands for "default field", not > "default fields". To get what you're looking for, you need to use > edismax or explic

Re: Trouble GetSpans lucene 4

2015-04-07 Thread Test Test
Re, origQuery is a Query object; I got it from a ResponseBuilder object through its getQuery method. ResponseBuilder rb; /* it's a method parameter */ Query origQuery = rb.getQuery(); Thanks for the link, I'll keep you informed. Regards, Andy On Tuesday, 7 April 2015 at 20:26, "Allison, Timothy B."

Re: Problem with new solr.xml format and core swaps

2015-04-07 Thread Shawn Heisey
On 4/7/2015 10:54 AM, Erick Erickson wrote: > I'm pretty clueless why you would be seeing this, and slammed with > other stuff so I can't dig into this right now. > > What do the "core.properties" files look like when you see this? They > should be re-written when you swap cores. Hmmm, I wonder if

Re: Merge Two Fields in SOLR

2015-04-07 Thread Damien Dykman
Ravi, what about using field aliasing at search time? Would that do the trick for your use case? http://localhost:8983/solr/mycollection/select?defType=edismax&q=name:"john doe"&f.name.qf=firstname surname For more details: https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Quer
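As a rough illustration of the field-aliasing request suggested here, the same URL can be built with proper parameter encoding. The host, collection and field names are taken from the example URL in the message; the `aliased_query_url` helper is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def aliased_query_url(base, alias, fields, query):
    """Build an edismax select URL that maps `alias` onto several real fields."""
    params = {
        "defType": "edismax",
        "q": query,
        # f.<alias>.qf lists the real fields that the alias expands to.
        f"f.{alias}.qf": " ".join(fields),
    }
    return f"{base}?{urlencode(params)}"


url = aliased_query_url(
    "http://localhost:8983/solr/mycollection/select",
    "name",
    ["firstname", "surname"],
    'name:"john doe"',
)
print(url)
# Decode the query string again to check what Solr would receive.
print(parse_qs(urlsplit(url).query))
```

Encoding the parameters matters because the query contains quotes, a colon and spaces, which the raw URL in the message leaves unescaped.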

RE: Trouble GetSpans lucene 4

2015-04-07 Thread Allison, Timothy B.
What class is origQuery? You will have to do more rewriting/calculation if you're trying to convert a PhraseQuery to a SpanNearQuery. If you dig around in org.apache.lucene.search.highlight.WeightedSpanTermExtractor in the Lucene highlighter package, you might get some inspiration. I have a h

Re: Trouble GetSpans lucene 4

2015-04-07 Thread Compte Poubelle
Up. Anyone? Best regards. > On 6 Apr. 2015, at 21:32, Test Test wrote: > > Hi, > I'm working through the Taming Text book. I'm trying to upgrade the code from Solr 3.6 to > Solr 4.10.2. At the moment, I have a problem with the method > "getSpans": "spans.next()" always returns "false". Can anyone help? > S

Re: Merge Two Fields in SOLR

2015-04-07 Thread Erick Erickson
I don't understand why copyField doesn't work. Admittedly the firstName and SurName would be separate tokens, but isn't that what you want? The fact that it's multiValued isn't really a problem, multiValued fields are really functionally identical to single valued fields if you set positionIncremen

Merge Two Fields in SOLR

2015-04-07 Thread EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS)
Hi Group, I am not sure if there is an easy way to merge the data of two fields into one field; copyField doesn’t work, as it stores the result as multivalued. Can someone suggest a workaround for this use case? FirstName:ABC SurName:XYZ I need another field with Name:ABCXYZ where I have to do a

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Erick Erickson
The disadvantages of DIH are 1> it's a black box, debugging it isn't easy 2> it puts all the work on the Solr node. Parsing documents in various forms can be pretty heavy-weight and steal cycles from indexing and searching. 2a> the extracting request handler also puts all the load on Solr FWIW. P

Re: Problem with new solr.xml format and core swaps

2015-04-07 Thread Erick Erickson
Shawn: I'm pretty clueless why you would be seeing this, and slammed with other stuff so I can't dig into this right now. What do the "core.properties" files look like when you see this? They should be re-written when you swap cores. Hmmm, I wonder if there's some condition where the files are al

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Dan Davis
Sangeetha, You can also run Tika directly from data import handler, and Data Import Handler can be made to run several threads if you can partition the input documents by directory or database id. I've done 4 "threads" by having a base configuration that does an Oracle query like this: SE
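The partitioning idea mentioned here (splitting the input by database id so that several imports can run side by side) could look roughly like the sketch below. The modulo scheme and the `partition_ids` helper are assumptions for illustration only; the Oracle query from the message is cut off in this listing and is not reproduced.

```python
def partition_ids(ids, n_partitions=4):
    """Assign each numeric id to one of n_partitions buckets by id modulo n."""
    buckets = [[] for _ in range(n_partitions)]
    for doc_id in ids:
        buckets[doc_id % n_partitions].append(doc_id)
    return buckets


if __name__ == "__main__":
    # Each bucket could then be handled by a separate import run.
    for number, bucket in enumerate(partition_ids(range(10))):
        print(number, bucket)
```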

Re: Lucene indexWriter update does not affect Solr search

2015-04-07 Thread Ali Nazemian
I did some investigation and found out that retrieving documents works fine as long as Solr has not been restarted, but searching documents does not work. After I restarted Solr, it seems that the core was corrupted and failed to start! Here is the corresponding log: org.apache.solr.common

Re: DictionaryCompoundWordTokenFilterFactory - Dictionary/Compound-Words File

2015-04-07 Thread Mike L.
Typo: *even when the user delimits with a space. (e.g. base ball should find baseball). Thanks, From: Mike L. To: "solr-user@lucene.apache.org" Sent: Tuesday, April 7, 2015 9:05 AM Subject: DictionaryCompoundWordTokenFilterFactory - Dictionary/Compound-Words File Solr User G

DictionaryCompoundWordTokenFilterFactory - Dictionary/Compound-Words File

2015-04-07 Thread Mike L.
Solr User Group - I have a case where I need to be able to search against compound words, even when the user delimits with a space. (e.g. baseball => base ball). I think I've solved this by creating a compound-words dictionary file containing the split words that I would want DictionaryCom
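To make the dictionary-based decompounding idea concrete, here is a simplified sketch: keep the original token and also emit every dictionary word found inside it. This is not Lucene's implementation, and the `decompound` function and its `min_subword` parameter are hypothetical stand-ins for the real filter's size settings.

```python
def decompound(token, dictionary, min_subword=2):
    """Return the token followed by each dictionary word found inside it."""
    token = token.lower()
    out = [token]
    for start in range(len(token)):
        for end in range(start + min_subword, len(token) + 1):
            part = token[start:end]
            # Skip the whole token itself; only report true sub-words.
            if part != token and part in dictionary:
                out.append(part)
    return out


if __name__ == "__main__":
    print(decompound("baseball", {"base", "ball"}))
```

With a dictionary containing "base" and "ball", the indexed token "baseball" would also carry those two parts, so a query for "base ball" has terms to match against.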

Re: Lucene indexWriter update does not affect Solr search

2015-04-07 Thread Ali Nazemian
Dear Upayavira, Hi, It is just the part of my code that caused the problem. I know searchComponent is not for changing the index, but for the purpose of extracting document keywords I was forced to hack searchComponent for extracting keywords and putting them into the index. For more information ab

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Yavar Husain
Well have indexed heterogeneous sources including a variety of NoSQL's, RDBMs and Rich Documents (PDF Word etc.) using SolrJ. The only prerequisite of using SolrJ is that you should have an API to fetch data from your data source (Say JDBC for RDBMS, Tika for extracting text content from rich docum

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Upayavira
On Tue, Apr 7, 2015, at 11:48 AM, sangeetha.subraman...@gtnexus.com wrote: > Hi, > > I am a newbie to SOLR and basically from database background. We have a > requirement of indexing files of different formats (x12,edifact, > csv,xml). > The files which are inputted can be of any format and we n

Re: Lucene indexWriter update does not affect Solr search

2015-04-07 Thread Upayavira
What are you trying to do? A search component is not intended for updating the index, so it really doesn’t surprise me that you aren’t seeing updates. I’d suggest you describe the problem you are trying to solve before proposing solutions. Upayavira On Tue, Apr 7, 2015, at 01:32 PM, Ali Nazemia

Lucene indexWriter update does not affect Solr search

2015-04-07 Thread Ali Nazemian
I implemented a small piece of code to extract some keywords from the Lucene index, using a search component. My problem is that when I update through the Lucene IndexWriter, the Solr index that sits on top of it is not affected. As you can see, I did the commit part. Bool

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Swaraj Kumar
You can always choose either DIH or /update/extract to index docs in Solr. There are multiple benefits of DIH, which I am listing below: 1. Clean and update using a single command. 2. DIH also optimizes indexing using optimize=true 3. You can do delta-import based on last index time where as i

Re: Collapse and Expand behaviour on result with 1 document.

2015-04-07 Thread Joel Bernstein
I believe currently issuing another query will be necessary to get the count of the expanded result set. I think it does make sense to include this information as part of the ExpandComponent output. So feel free to create a jira ticket for this and we should be able to get this into a future relea

What is the best way of Indexing different formats of documents?

2015-04-07 Thread sangeetha.subraman...@gtnexus.com
Hi, I am a newbie to SOLR and basically from database background. We have a requirement of indexing files of different formats (x12,edifact, csv,xml). The files which are inputted can be of any format and we need to do a content based search on it. From the web I understand we can use TIKA pro

RE: How do I use CachedSqlEntityProcessor?

2015-04-07 Thread chuotlac
The conversation helps me understand the cached processor a lot. I'm working on a DIH cache using MapDB as the backing engine instead of the default CachedSqlEntityProcessor

Re: Setting up SolrCloud 5.0.0 and ZooKeeper 3.4.6

2015-04-07 Thread Zheng Lin Edwin Yeo
Thanks Swaraj. It is working now, after running without start and changing the ZooKeeper port to 2888 instead. Regards, Edwin On 7 April 2015 at 14:59, Swaraj Kumar wrote: > As per http://stackoverflow.com/questions/11765015/zookeeper-not-starting >

Re: Collapse and Expand behaviour on result with 1 document.

2015-04-07 Thread Derek Poh
Hi Joel, Is the number of documents info available when using collapse and expand parameters? I can't seem to find it in the returned XML. I know the numFound in the main result set (<result maxScore="6.470696" name="response" numFound="27" start="0">) refers to the number of collapse groups. I nee

Re: Solr 4.2.0 index corruption issue

2015-04-07 Thread Puneet Jain
HI Guys, Please can someone help out here to pin-point the issue..? Thanks & Regards, Puneet On Mon, Apr 6, 2015 at 1:27 PM, Puneet Jain wrote: > Hi Guys, > > I am using 4.2.0 since more than a year and since last October 2014 facing > index corruption issue. However, now it is happening every

Re: Setting up SolrCloud 5.0.0 and ZooKeeper 3.4.6

2015-04-07 Thread Swaraj Kumar
As per http://stackoverflow.com/questions/11765015/zookeeper-not-starting Running without start will fix this. One more change you need to make: Solr by default runs on 8983 and you have used 8983 in ZooKeeper, so start Solr on diffe