Re: Setting up SolrCloud 5.0.0 and ZooKeeper 3.4.6

2015-04-07 Thread Zheng Lin Edwin Yeo
Thanks Swaraj. It is working now, after running without start and changing the ZooKeeper port to 2888 instead. Regards, Edwin On 7 April 2015 at 14:59, Swaraj Kumar swaraj2...@gmail.com wrote: As per http://stackoverflow.com/questions/11765015/zookeeper-not-starting

Re: Solr 4.2.0 index corruption issue

2015-04-07 Thread Puneet Jain
Hi guys, can someone please help pinpoint the issue? Thanks. Regards, Puneet On Mon, Apr 6, 2015 at 1:27 PM, Puneet Jain ja.pun...@gmail.com wrote: Hi guys, I have been using 4.2.0 for more than a year, and since October 2014 I have been facing an index corruption issue. However, now it is

Re: Collapse and Expand behaviour on result with 1 document.

2015-04-07 Thread Derek Poh
Hi Joel, is the number of documents available when using the collapse and expand parameters? I can't seem to find it in the returned XML. I know the numFound in the main result set (result maxScore=6.470696 name=response numFound=27 start=0) refers to the number of collapse groups. I

RE: How do I use CachedSqlEntityProcessor?

2015-04-07 Thread chuotlac
The conversation helps me understand the cached processor a lot. I'm working on a DIH cache using MapDB as the backing engine instead of the default CachedSqlEntityProcessor -- View this message in context: http://lucene.472066.n3.nabble.com/How-do-I-use-CachedSqlEntityProcessor-tp4064919p4198037.html Sent

Re: Setting up SolrCloud 5.0.0 and ZooKeeper 3.4.6

2015-04-07 Thread Swaraj Kumar
As per http://stackoverflow.com/questions/11765015/zookeeper-not-starting Running without start will fix this. One more change you need to make: Solr runs on 8983 by default and you have used 8983 in ZooKeeper, so start Solr on

What is the best way of Indexing different formats of documents?

2015-04-07 Thread sangeetha.subraman...@gtnexus.com
Hi, I am new to Solr and come from a database background. We have a requirement to index files of different formats (X12, EDIFACT, CSV, XML). The input files can be of any format, and we need to do a content-based search on them. From the web I understand we can use Tika

Re: Collapse and Expand behaviour on result with 1 document.

2015-04-07 Thread Joel Bernstein
I believe currently issuing another query will be necessary to get the count of the expanded result set. I think it does make sense to include this information as part of the ExpandComponent output. So feel free to create a jira ticket for this and we should be able to get this into a future

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Swaraj Kumar
You can always choose either DIH or /update/extract to index docs in Solr. DIH has multiple benefits, which I am listing below: 1. Clean and update using a single command. 2. DIH can also optimize the index using optimize=true. 3. You can do a delta-import based on the last index time, whereas

Lucene indexWriter update does not affect Solr search

2015-04-07 Thread Ali Nazemian
I implemented a small piece of code to extract some keywords from a Lucene index, using a search component. My problem is that when I update through the Lucene IndexWriter, the Solr index that sits on top of it is not affected. As you can see, I did the commit part.

Re: Lucene indexWriter update does not affect Solr search

2015-04-07 Thread Upayavira
What are you trying to do? A search component is not intended for updating the index, so it really doesn’t surprise me that you aren’t seeing updates. I’d suggest you describe the problem you are trying to solve before proposing solutions. Upayavira On Tue, Apr 7, 2015, at 01:32 PM, Ali

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Upayavira
On Tue, Apr 7, 2015, at 11:48 AM, sangeetha.subraman...@gtnexus.com wrote: Hi, I am new to Solr and come from a database background. We have a requirement to index files of different formats (X12, EDIFACT, CSV, XML). The input files can be of any format, and we need

Re: DictionaryCompoundWordTokenFilterFactory - Dictionary/Compound-Words File

2015-04-07 Thread Mike L.
Typo:   *even when the user delimits with a space. (e.g. base ball should find baseball). Thanks, From: Mike L. javaone...@yahoo.com To: solr-user@lucene.apache.org solr-user@lucene.apache.org Sent: Tuesday, April 7, 2015 9:05 AM Subject: DictionaryCompoundWordTokenFilterFactory -

Re: Lucene indexWriter update does not affect Solr search

2015-04-07 Thread Ali Nazemian
I did some investigation and found that retrieving documents works fine as long as Solr has not been restarted, but searching documents does not work. After I restarted Solr, it seems that the core was corrupted and failed to start! Here is the corresponding log:

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Yavar Husain
Well, I have indexed heterogeneous sources including a variety of NoSQL stores, RDBMSs and rich documents (PDF, Word etc.) using SolrJ. The only prerequisite for using SolrJ is that you have an API to fetch data from your data source (say JDBC for an RDBMS, Tika for extracting text content from rich

Re: Lucene indexWriter update does not affect Solr search

2015-04-07 Thread Ali Nazemian
Dear Upayavira, Hi, this is just the part of my code that caused the problem. I know a searchComponent is not meant for changing the index, but in order to extract document keywords I was forced to hack a searchComponent to extract the keywords and put them into the index. For more information

DictionaryCompoundWordTokenFilterFactory - Dictionary/Compound-Words File

2015-04-07 Thread Mike L.
Solr User Group -    I have a case where I need to be able to search against compound words, even when the user delimits with a space. (e.g. baseball = base ball).  I think I've solved this by creating a compound-words dictionary file containing the split words that I would want

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Dan Davis
Sangeetha, You can also run Tika directly from data import handler, and Data Import Handler can be made to run several threads if you can partition the input documents by directory or database id. I've done 4 threads by having a base configuration that does an Oracle query like this:

Re: What is the best way of Indexing different formats of documents?

2015-04-07 Thread Erick Erickson
The disadvantages of DIH are: 1) it's a black box, debugging it isn't easy; 2) it puts all the work on the Solr node. Parsing documents in various forms can be pretty heavy-weight and steal cycles from indexing and searching; 2a) the extracting request handler also puts all the load on Solr FWIW.

Merge Two Fields in SOLR

2015-04-07 Thread EXTERNAL Taminidi Ravi (ETI, AA-AS/PAS-PTS)
Hi Group, I am not sure if there is an easy way to merge the data of two fields into one field; copyField doesn't work because it stores the result as multivalued. Can someone suggest a workaround for this use case? FirstName:ABC SurName:XYZ I need another field with Name:ABCXYZ where I have to do

Re: Problem with new solr.xml format and core swaps

2015-04-07 Thread Erick Erickson
Shawn: I'm pretty clueless why you would be seeing this, and slammed with other stuff so I can't dig into this right now. What do the core.properties files look like when you see this? They should be re-written when you swap cores. Hmmm, I wonder if there's some condition where the files are

Re: Trouble GetSpans lucene 4

2015-04-07 Thread Compte Poubelle
Up. Anyone? Best regards. On 6 Apr. 2015, at 21:32, Test Test andymish...@yahoo.fr wrote: Hi, I'm working through the Taming Text book. I am trying to upgrade the code from Solr 3.6 to Solr 4.10.2. At the moment, I have a problem with the method getSpans: spans.next() always returns false. Anyone can

RE: Trouble GetSpans lucene 4

2015-04-07 Thread Allison, Timothy B.
What class is origQuery? You will have to do more rewriting/calculation if you're trying to convert a PhraseQuery to a SpanNearQuery. If you dig around in org.apache.lucene.search.highlight.WeightedSpanTermExtractor in the Lucene highlighter package, you might get some inspiration. I have a

Re: Merge Two Fields in SOLR

2015-04-07 Thread Damien Dykman
Ravi, what about using field aliasing at search time? Would that do the trick for your use case? http://localhost:8983/solr/mycollection/select?defType=edismax&q=name:john doe&f.name.qf=firstname surname For more details:
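
The alias query above carries three parameters (defType, q and f.name.qf), and two of the values contain a space or a colon, so they need URL-encoding before the request is sent. A minimal sketch of assembling that URL with the JDK only follows; the class and method names (AliasQueryUrl, build, encode) are made up for illustration and are not part of Solr or SolrJ, and the host, collection and field names are simply the ones from the message.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a Solr or SolrJ class.
class AliasQueryUrl {

    // URL-encode one key or value as UTF-8 form data (spaces become '+').
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Append the parameters to the select URL in insertion order.
    static String build(String selectUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(selectUrl);
        char separator = '?';
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(encode(param.getKey()))
               .append('=')
               .append(encode(param.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("defType", "edismax");              // use the eDisMax parser
        params.put("q", "name:john doe");              // query against the alias
        params.put("f.name.qf", "firstname surname");  // alias "name" -> two real fields
        System.out.println(build("http://localhost:8983/solr/mycollection/select", params));
    }
}
```

With this approach no merged field has to be stored: the alias "name" only exists at query time and is expanded to the firstname and surname fields.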

Re: Merge Two Fields in SOLR

2015-04-07 Thread Erick Erickson
I don't understand why copyField doesn't work. Admittedly the firstName and SurName would be separate tokens, but isn't that what you want? The fact that it's multiValued isn't really a problem, multiValued fields are really functionally identical to single valued fields if you set

Re: Problem with new solr.xml format and core swaps

2015-04-07 Thread Shawn Heisey
On 4/7/2015 10:54 AM, Erick Erickson wrote: I'm pretty clueless why you would be seeing this, and slammed with other stuff so I can't dig into this right now. What do the core.properties files look like when you see this? They should be re-written when you swap cores. Hmmm, I wonder if

Re: Trouble GetSpans lucene 4

2015-04-07 Thread Test Test
Re, origQuery is a Query object; I got it from a ResponseBuilder object via the method getQuery. Given ResponseBuilder rb (a method parameter): Query origQuery = rb.getQuery(); Thanks for the link, I'll keep you informed. Regards, Andy On Tuesday 7 April 2015 at 20:26, Allison, Timothy B.

Re: Config join parse in solrconfig.xml

2015-04-07 Thread Frank li
Cool. It actually works after I removed those extra columns. Thanks for your help. On Mon, Apr 6, 2015 at 8:19 PM, Erick Erickson erickerick...@gmail.com wrote: df does not allow multiple fields, it stands for default field, not default fields. To get what you're looking for, you need to use

RE: Trouble GetSpans lucene 4

2015-04-07 Thread Allison, Timothy B.
Oh, ok, if that's just a regular query, you will need to convert it to a SpanQuery, and you may need to rewrite the SpanQuery after conversion. If you're trying to do a concordance or trying to retrieve windows around the hits, take a look at ConcordanceSearcher within:

How to trace error records during POST?

2015-04-07 Thread Simon Cheng
Good morning, I used Solr 4.7 to post 186,745 XML files and 186,622 files have been indexed. That means there are 123 XML files with errors. How can I trace what these files are? Thank you in advance, Simon Cheng.

Re: Deploying multiple ZooKeeper ensemble on a single machine

2015-04-07 Thread nutchsolruser
I have to choose unique client port #’s for each. Here I can see that you have the same client port for all 3 servers. You can refer to this link: http://myjeeva.com/zookeeper-cluster-setup.html -- View this message in context:
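
The point about unique ports can be made concrete. When three ZooKeeper servers share one machine, each needs its own clientPort and dataDir, and each server.N line needs its own pair of quorum and election ports. The sketch below generates such zoo.cfg contents; the class name ZooCfgSketch, the base ports (2181, 2888, 3888), the timing values and the data directory root are assumptions chosen for illustration, not the settings from the original post.

```java
// Hypothetical generator for per-server zoo.cfg text; all values are illustrative.
class ZooCfgSketch {

    // Build the zoo.cfg contents for one server of a same-host ensemble.
    static String zooCfg(int serverId, int ensembleSize, int baseClientPort, String dataDirRoot) {
        StringBuilder cfg = new StringBuilder();
        cfg.append("tickTime=2000\n");
        cfg.append("initLimit=5\n");
        cfg.append("syncLimit=2\n");
        // Each server gets its own data directory (which also holds its myid file).
        cfg.append("dataDir=").append(dataDirRoot).append("/zk").append(serverId).append('\n');
        // Each server listens for clients on a different port: base, base+1, base+2, ...
        cfg.append("clientPort=").append(baseClientPort + serverId - 1).append('\n');
        // The server list is identical in every file; ports differ per server
        // because all servers run on localhost.
        for (int id = 1; id <= ensembleSize; id++) {
            int quorumPort = 2888 + (id - 1);
            int electionPort = 3888 + (id - 1);
            cfg.append("server.").append(id)
               .append("=localhost:").append(quorumPort)
               .append(':').append(electionPort).append('\n');
        }
        return cfg.toString();
    }

    public static void main(String[] args) {
        for (int id = 1; id <= 3; id++) {
            System.out.println("# zoo" + id + ".cfg");
            System.out.println(zooCfg(id, 3, 2181, "/tmp/zookeeper"));
        }
    }
}
```

Each server's data directory would also need a myid file containing just its id (1, 2 or 3), so that the server can find its own entry in the server.N list.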

Re: Deploying multiple ZooKeeper ensemble on a single machine

2015-04-07 Thread Shawn Heisey
On 4/7/2015 9:16 PM, Zheng Lin Edwin Yeo wrote: I'm using SolrCloud 5.0.0 and ZooKeeper 3.4.6 running on Windows, and now I'm trying to deploy a multiple ZooKeeper ensemble (3 servers) on a single machine. These are the settings which I have configured, according to the Solr Reference Guide.

Deploying multiple ZooKeeper ensemble on a single machine

2015-04-07 Thread Zheng Lin Edwin Yeo
Hi, I'm using SolrCloud 5.0.0 and ZooKeeper 3.4.6 running on Windows, and now I'm trying to deploy a multiple ZooKeeper ensemble (3 servers) on a single machine. These are the settings which I have configured, according to the Solr Reference Guide. These files are under ZOOKEEPER_HOME\conf\