Re: Re: solr 4.2.1 index gets slower over time

2014-04-02 Thread Dmitry Kan
Thanks, Markus, that is useful. I'm guessing the higher the weight, the longer the op takes? On Tue, Apr 1, 2014 at 10:39 PM, Markus Jelsma markus.jel...@openindex.io wrote: You may want to increase reclaimDeletesWeight for TieredMergePolicy from 2 to 3 or 4. By default it may keep too much

Re: How to add a map of key/value pairs into a solr schema?

2014-04-02 Thread Silvia Suárez
Dear Jack, I'm using SolrJ to access and query the values in the Solr collection. For example, I have a collection in Solr in which I am updating the c_perfil multivalued field, using this code: SolrInputDocument sdoc = new SolrInputDocument();

Re: Product index schema for solr

2014-04-02 Thread Ajay Patel
As per your suggestion my final schema will be like { id:unique_id ... ... [PRODUCT RELATED DATA] ... ... ... min_qty: 1 max_qty: 50 price: 4 } [OTHER SAME LIKE ABOVE DATA] Now I want to create a range facet field by combining min_qty and

Velocity template examples and hardcoded contextPath

2014-04-02 Thread Thomas Pii
The current velocity template examples in the 4.6.1 distribution have a hard coded context path for the solr web application: #macro(url_root)/solr#end in VM_global_library.vm hardcodes it to /solr I would like to change this to determine the context path at run time, so the templates do not

Re: transaction log size

2014-04-02 Thread Gurfan
Thanks Shawn for the quick reply. We are using SolrCloud version 4.6.1. Usually we see a larger transaction log on the replica. The leader's tlog size is in KBs. We also tried setting the hard commit (autoCommit) to 20 sec and autoSoftCommit to 30 sec. We have written a script to monitor the disk usage of

Re: Re: solr 4.2.1 index gets slower over time

2014-04-02 Thread elisabeth benoit
This sounds interesting, I'll check this out. Thanks! Elisabeth 2014-04-02 8:54 GMT+02:00 Dmitry Kan solrexp...@gmail.com: Thanks, Markus, that is useful. I'm guessing the higher the weight, the longer the op takes? On Tue, Apr 1, 2014 at 10:39 PM, Markus Jelsma

Issue with solr searching : words with - not able to search

2014-04-02 Thread Priti Solanki
Hello friends, I have an issue: I am trying to search for X-Ray Machine. Solr returns multiple rows even when I do an exact search [on the Solr server directly]. Secondly, I am using a PHP client to talk to Solr, but for some reason I can't search for X-Ray Machine. Solr response

Re: Issue with solr searching : words with - not able to search

2014-04-02 Thread Alexandre Rafalovitch
What's your field type definition where your X-Ray string is stored? Regards, Alex. Personal website: http://www.outerthoughts.com/ Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency On Wed, Apr 2, 2014 at 3:19 PM, Priti Solanki pritiatw...@gmail.com wrote:

Re: Block Join Parent Query across children docs

2014-04-02 Thread mertens
Hi Mikhail, Thanks for your response. Here is an example of what I'm trying to do. If I had the following documents: <doc> <field name="id">10</field> <field name="type_s">parent</field> <field name="name_s">User1</field> <doc> <field name="id">11</field> <field

Suspicious Object.wait in UnInvertedField.getUnInvertedField

2014-04-02 Thread adfel70
While debugging a problem where 400 threads were waiting for a single lock, we traced the issue to the getUnInvertedField method. public static UnInvertedField getUnInvertedField(String field, SolrIndexSearcher searcher) throws IOException { SolrCache<String,UnInvertedField> cache =

Re: Issue with solr searching : words with - not able to search

2014-04-02 Thread Ajay Patel
Can you please share the schema.xml used for this Solr instance? On Wednesday 02 April 2014 01:49 PM, Priti Solanki wrote: Hello friends, I have got one issue I am trying to searching X-Ray Machine Now Solr is returning multiple rows even if I am doing a exact search. [ On solr server

RE: Where to specify numShards when startup up a cloud setup

2014-04-02 Thread zzT
It seems that I've figured out a configuration approach to this issue. I'm having the exact same issue and the only viable solutions found on the net till now are 1) Pass -DnumShards=x when starting up Solr server 2) Use the Collections API as indicated by Shawn. What I've noticed though - after

split existing indexes into shards

2014-04-02 Thread Gastone Penzo
Hello, I have 2 shards with 2 replicas on 4 different nodes, laid out like this:
server1: shard1 master
server2: shard2 master
server3: replica of shard1
server4: replica of shard2
I have existing indexes only in shard 1

Return Solr docs in a specific order by list of ids

2014-04-02 Thread marotosg
Hi, I have a use case where I have a list of doc ids and I need to return documents from Solr in the same order as my list of ids. For instance: 459,185,569,8,1,896 Is it possible to return docs from Solr in the same order? Regards, Sergio

Re: sort by an attribute values sequence

2014-04-02 Thread santosh sidnal
Re-sending my e-mail. Any pointers or links for this issue would help me a lot. Thanks in advance. On Tue, Apr 1, 2014 at 4:25 PM, santosh sidnal sidnal.sant...@gmail.com wrote: Hi All, We have a specific requirement of sorting the products as per a specific attribute value sequence. Any pointer

Re: The word no in a query

2014-04-02 Thread François Schiettecatte
Have you looked at the debugging output? http://wiki.apache.org/solr/CommonQueryParameters#Debugging François On Apr 2, 2014, at 1:37 AM, Bob Laferriere spongeb...@icloud.com wrote: I have built an commerce search engine. I am struggling with the word “no” in queries. We have

Re: Return Solr docs in a specific order by list of ids

2014-04-02 Thread Alexandre Rafalovitch
Most anything is possible, but maybe not out of the box. Custom post filter? On 02/04/2014 5:47 pm, marotosg marot...@gmail.com wrote: Hi, I have a use case where I have a list of doc ids and I need to return Documents from solr in the same order as my list of ids. For instance:

Re: eDismax parser and the mm parameter

2014-04-02 Thread simpleliving...@gmail.com
It only works for a single-word search term, not for multi-word search terms. Sent from my HTC - Reply message - From: William Bell billnb...@gmail.com To: solr-user@lucene.apache.org solr-user@lucene.apache.org Subject: eDismax parser and the mm parameter Date: Wed, Apr 2, 2014 12:03

Flush buffer exceptions

2014-04-02 Thread ku3ia
Hi all! I'm using Solr 4.6.0 and Jetty 8. Sometimes these errors and warnings appear in Jetty's logs: ERROR - 2014-03-27 17:11:15.022; org.apache.solr.common.SolrException; null:org.eclipse.jetty.io.EofException at org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:914)

Errors after upgrading to 4.6.1

2014-04-02 Thread Christopher Gross
I get both of these errors a few times in my tomcat (7.0.52) catalina.out logfile: 2014-04-02 13:22:32,026 WARN org.apache.solr.schema.FieldTypePluginLoader - TokenFilterFactory is using deprecated LUCENE_33 emulation. You should at some point declare and reindex to at least 4.0, because 3.x

[ANNOUNCE] Apache Solr 4.7.1 released

2014-04-02 Thread Steve Rowe
April 2014, Apache Solr™ 4.7.1 available The Lucene PMC is pleased to announce the release of Apache Solr 4.7.1 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted

Re: eDismax parser and the mm parameter

2014-04-02 Thread Ahmet Arslan
Hi SL, Instead of fuzzy queries, can't you use spell checker? Generally Spell Checker (a.k.a did you mean) is a preferred tool for typos. Ahmet On Wednesday, April 2, 2014 4:13 PM, simpleliving...@gmail.com simpleliving...@gmail.com wrote: It only works for a single word search term and not

Re: Luke 4.7.0 released

2014-04-02 Thread Joshua P
Hi there! I'm receiving the following errors when trying to run luke-with-deps.jar SLF4J: Failed to load class org.slf4j.impl.StaticLoggerBinder. SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

Re: sort by an attribute values sequence

2014-04-02 Thread Ahmet Arslan
Hi, How many distinct producttype values do you have? Maybe q=C^5000 OR B^4000 OR A^3000 OR D&df=producttype could work. If you can come up with a function that takes its maximum value when producttype=C … etc., you can sort by function queries too. http://wiki.apache.org/solr/FunctionQuery Ahmet On

Re: The word no in a query

2014-04-02 Thread Ahmet Arslan
Hi Bob, Your field type would be useful here. Can you copy-paste it? Ahmet On Wednesday, April 2, 2014 2:01 PM, François Schiettecatte fschietteca...@gmail.com wrote: Have you looked at the debugging output?     http://wiki.apache.org/solr/CommonQueryParameters#Debugging François On Apr

Re: transaction log size

2014-04-02 Thread Erick Erickson
On the surface, this doesn't make sense, I'd expect that the tlogs would be roughly the same size on leaders and replicas. Or at least show the same variance. If you were to guess how much volume in terms of files being fired at the index, how much would you expect in 30 seconds? And does it

Re: Spatial maxDistErr changes

2014-04-02 Thread David Smiley
Good question Steve, You'll have to re-index right off. ~ David p.s. Sorry I didn't reply sooner; I just switched jobs and reconfigured my mailing list subscriptions Steven Bower wrote If am only indexing point shapes and I want to change the maxDistErr from 0.09 (1m res) to 0.00045

Get number of documents in a new (not visible) Searcher

2014-04-02 Thread Oliver Schrenk
Hi, We have a SolrCloud 4.7 cluster with five machines and index in a distributed fashion. When finished adding and deleting documents, we want to commit programmatically and switch to a new searcher. But before doing that we want to make a final check that the number of documents have not

Re: Luke 4.7.0 released

2014-04-02 Thread simon
Also seeing this on Mac OS X. java version = Java(TM) SE Runtime Environment (build 1.7.0_51-b13) On Wed, Apr 2, 2014 at 11:01 AM, Joshua P jpetersen...@gmail.com wrote: Hi there! I'm receiving the following errors when trying to run luke-with-deps.jar SLF4J: Failed to load class

Re: Return Solr docs in a specific order by list of ids

2014-04-02 Thread marotosg
I found an easy solution, which is to use boosting: (PersonID:459)^0.6 OR (PersonID:185)^0.5 OR (PersonID:569)^0.4 OR (PersonID:8)^0.3 OR (PersonID:1)^0.2 OR (PersonID:896)^0.1
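
The boosting approach in this message can be generated from the id list rather than written by hand. The following is a minimal sketch, not the poster's code: the function names ordered_id_query and reorder are hypothetical, integer boosts are used instead of the 0.6 ... 0.1 values above, and it assumes the ids are simple tokens that need no query escaping. Because score-based ordering depends on how the field is scored, the sketch also includes a client-side reordering step, which is a different technique that does not rely on scoring at all.

```python
def ordered_id_query(field, ids):
    """Build an OR query whose boosts decrease along the id list.

    The first id gets the highest boost, so it is intended to score highest.
    Assumes ids are plain tokens that need no query-syntax escaping.
    """
    n = len(ids)
    return " OR ".join(
        f"({field}:{doc_id})^{n - position}" for position, doc_id in enumerate(ids)
    )


def reorder(docs, field, ids):
    """Alternative: sort already-fetched docs on the client by position in ids.

    Docs whose id is not in the list are placed at the end.
    """
    rank = {str(doc_id): position for position, doc_id in enumerate(ids)}
    return sorted(docs, key=lambda doc: rank.get(str(doc[field]), len(rank)))


if __name__ == "__main__":
    wanted = [459, 185, 569, 8, 1, 896]
    print(ordered_id_query("PersonID", wanted))
    fetched = [{"PersonID": 8}, {"PersonID": 459}, {"PersonID": 185}]
    print(reorder(fetched, "PersonID", wanted))
```

The client-side variant may be the safer choice when the list is short enough to fetch in one request, since it stays correct even if other query clauses affect the score.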

Re: how do I get search for fort st john to match ft saint john

2014-04-02 Thread solr-user
Hi Eric. No, that doesn't fix the problem either (I have tested this previously and did so again just now). Since the PatternTokenizerFactory is not tokenizing on whitespace (by design, since I want the user to search by phrase), the phrase marina former fort ord (for example) does not get turned

Re: how do I get search for fort st john to match ft saint john

2014-04-02 Thread Jack Krupansky
Query by phrase is a core feature of tokenized text in Lucene and Solr, so there is no need to use a pattern token filter for that purpose. And yes, doing so pretty much breaks most token filters that would assume that the text is tokenized. -- Jack Krupansky -Original Message-

Postings Format for Span queries on big index

2014-04-02 Thread Gopal Agarwal
Does lucene 4.6 use Lucene41PostingsFormat for Postings.nextdoc() while executing the span queries? When I am debugging the lucene 4.6 test cases for span queries, it is showing that for above nextdoc() call it is utilizing DirectPostingsFormat. My requirement is to run multiple span queries

Analysis of Japanese characters

2014-04-02 Thread Shawn Heisey
My company is setting up a system for a customer from Japan. We have an existing system that handles primarily English. Here's my general text analysis chain: http://apaste.info/xa5 After talking to the customer about problems they are encountering with search, we have determined that some

Re: Analysis of Japanese characters

2014-04-02 Thread Tom Burton-West
Hi Shawn, I'm not sure I understand the problem and why you need to solve it at the ICUTokenizer level rather than the CJKBigramFilter Can you perhaps give a few examples of the problem? Have you looked at the flags for the CJKBigramfilter? You can tell it to make bigrams of different Japanese

Re: Analysis of Japanese characters

2014-04-02 Thread Shawn Heisey
On 4/2/2014 11:33 AM, Tom Burton-West wrote: Hi Shawn, I'm not sure I understand the problem and why you need to solve it at the ICUTokenizer level rather than the CJKBigramFilter Can you perhaps give a few examples of the problem? Have you looked at the flags for the CJKBigramfilter? You can

PDF Indexing

2014-04-02 Thread Sujatha Arun
Hi, I am able to use Tika and DIH to index a PDF as a single document. However, I need each page to be a single document. Is there any built-in mechanism to achieve this, or do I have to use PDFBox or another tool? Regards

Re: AND not as a boolean operator in Phrase

2014-04-02 Thread abhishek jain
Hi, OK, thanks. I want to search for the phrase A and B, with the word *and* sandwiched between A and B. I don't want and to be treated as a boolean operator when it is within quotes. I have and as a stop word and I don't want to reindex the data. What is my best bet? Thanks, abhishek jain On Sun, Mar 30, 2014

Re: how do I get search for fort st john to match ft saint john

2014-04-02 Thread Erick Erickson
No, there isn't a tokenizer that'll do what you want that I know about. Really, I suspect you need to back up a bit and re-think the problem. It looks to me like you've taken a path that's going to cause you endless grief when, as Jack says, phrase searches are built in to the tokenization

Re: PDF Indexing

2014-04-02 Thread Ahmet Arslan
Hi Sujatha, There is no built in mechanism. Prepare page documents outside of the solr.  http://searchhub.org/2012/02/14/indexing-with-solrj/ And you may want to save text content somewhere too. If you change something in index analysis/schema you need to reindex. If you save text data, you

Re: Get number of documents in a new (not visible) Searcher

2014-04-02 Thread Ahmet Arslan
Hi Oliver, You can see docsPending:30 adds:30 in  plugin stats section.  http://localhost:8983/solr/#/collection1/plugins/updatehandler?entry=updateHandler These parameters are exposed via JMX.  https://cwiki.apache.org/confluence/display/solr/Using+JMX+with+Solr Alternative way is to use : 

Re: AND not as a boolean operator in Phrase

2014-04-02 Thread Ahmet Arslan
Hi Abhishek, Your best bet is the dismax query parser, which does not recognize and/AND as an operator. q=A and B&qf=someField&defType=dismax Ahmet On Wednesday, April 2, 2014 10:01 PM, abhishek jain abhishek.netj...@gmail.com wrote: Hi, Ok thanks, i want to search for phrase A and B with the *and
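
For reference, the request parameters suggested in this reply can be assembled and URL-encoded as below. This is only a sketch: the function name dismax_params is hypothetical, someField is the placeholder field name from the message, and how the resulting query string is sent to Solr (handler path, client library) is left out.

```python
from urllib.parse import urlencode


def dismax_params(user_query, field):
    """Encode q, qf and defType=dismax as a URL query string."""
    return urlencode({"q": user_query, "qf": field, "defType": "dismax"})


if __name__ == "__main__":
    # Spaces in the user query are encoded as '+'.
    print(dismax_params("A and B", "someField"))
```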

Re: Analysis of Japanese characters

2014-04-02 Thread Tom Burton-West
Hi Shawn, I may still be missing your point. Below is an example where the ICUTokenizer splits Now, I'm beginning to wonder if I really understand what those flags on the CJKBigramFilter do. The ICUTokenizer spits out unigrams and the CJKBigramFilter will put them back together into bigrams. I

Re: Analysis of Japanese characters

2014-04-02 Thread Shawn Heisey
On 4/2/2014 2:19 PM, Tom Burton-West wrote: Hi Shawn, I may still be missing your point. Below is an example where the ICUTokenizer splits Now, I'm beginning to wonder if I really understand what those flags on the CJKBigramFilter do. The ICUTokenizer spits out unigrams and the CJKBigramFilter

Re: Get number of documents in a new (not visible) Searcher

2014-04-02 Thread Ahmet Arslan
Hi Oliver, You can get docsPending from http://localhost:8983/solr/admin/mbeans?cat=UPDATEHANDLER&stats=true https://cwiki.apache.org/confluence/display/solr/MBean+Request+Handler Ahmet On Wednesday, April 2, 2014 7:42 PM, Oliver Schrenk oliver.schr...@gmail.com wrote: Hi, We have a SolrCloud

How to search one field and highlight another

2014-04-02 Thread Tang, Rebecca
Hi there, For dates we create two Solr fields: date_display and date. date_display: stored = true, indexed = false, it's for display purpose only date: stored = false, indexed = true, it's used for searching, ordering and faceting When users search on date, I need to be able to highlight

Re: eDismax parser and the mm parameter

2014-04-02 Thread simpleliving...@gmail.com
Ahmet, thanks. I will look into this option. Does the spellchecker support multi-word search terms? Sent from my HTC - Reply message - From: Ahmet Arslan iori...@yahoo.com To: solr-user@lucene.apache.org solr-user@lucene.apache.org Subject: eDismax parser and the mm parameter Date:

Re: Errors after upgrading to 4.6.1

2014-04-02 Thread Chris Hostetter
: What should I be doing to fix them? Is there a replacement for those : classes? Do I just need to change the luceneMatchVersion to be LUCENE_461 : or something? that's pretty much exactly what that warning message is trying to tell you -- your config says to use LUCENE_33 mode, but that

Re: Block Join Parent Query across children docs

2014-04-02 Thread Chris Hostetter
: Thanks for your response. Here is an example of what I'm trying to do. If I : had the following documents: what you are attempting is fairly trivial -- you want to query for all parent documents, then apply 3 filters: * parent of a child matching item1 * parent of a child matching item2

Re: eDismax parser and the mm parameter

2014-04-02 Thread Ahmet Arslan
Yes, it has spellcheck.collate parameter. I mean it has lots of parameters and with correct combination of parameters  it can suggest White Siberian Ginseng from Whte Sberia Ginsng https://cwiki.apache.org/confluence/display/solr/Spell+Checking On Thursday, April 3, 2014 1:57 AM,
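
As a rough illustration of the parameters mentioned here, the sketch below builds a request query string that turns on spell checking with collation. It assumes the target request handler already has a spellcheck component configured, which is not shown; the function name spellcheck_params is hypothetical, and the choice of one collation is arbitrary.

```python
from urllib.parse import urlencode


def spellcheck_params(user_query, max_collations=1):
    """Encode a query plus spellcheck and collation parameters."""
    params = {
        "q": user_query,
        "spellcheck": "true",
        # Ask for whole-query rewrites, not only per-word suggestions.
        "spellcheck.collate": "true",
        "spellcheck.maxCollations": str(max_collations),
    }
    return urlencode(params)


if __name__ == "__main__":
    print(spellcheck_params("Whte Sberia Ginsng"))
```

Whether a collation such as the one described in the message is actually returned depends on the dictionary and the other spellcheck settings in use.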

Re: eDismax parser and the mm parameter

2014-04-02 Thread S.L
Thanks Ahmet, I would definitely look into this . I appreciate that. On Wed, Apr 2, 2014 at 7:47 PM, Ahmet Arslan iori...@yahoo.com wrote: Yes, it has spellcheck.collate parameter. I mean it has lots of parameters and with correct combination of parameters it can suggest White Siberian

Re: Errors on index in SolrCloud: ConcurrentUpdateSolrServer$Runner.run()

2014-04-02 Thread rulinma
org.apache.solr.common.SolrException: Bad Request request: http://192.168.22.35:8080/solr/collection_networkSchool_shard1_replica3/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F192.168.22.34%3A8080%2Fsolr%2Fcollection_networkSchool_shard1_replica1%2F&wt=javabin&version=2 at

Re: Errors on index in SolrCloud: ConcurrentUpdateSolrServer$Runner.run()

2014-04-02 Thread rulinma
I see it is a config problem. -- View this message in context: http://lucene.472066.n3.nabble.com/Errors-on-index-in-SolrCloud-ConcurrentUpdateSolrServer-Runner-run-tp4107661p4128751.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: PDF Indexing

2014-04-02 Thread Jack Krupansky
I see that the PDFBox library (which is what Tika uses for PDF files) has methods to manipulate individual pages: http://stackoverflow.com/questions/6839787/reading-a-particular-page-from-a-pdf-document-using-pdfbox -- Jack Krupansky -Original Message- From: Ahmet Arslan Sent:

Re: how do I get search for fort st john to match ft saint john

2014-04-02 Thread Jack Krupansky
And, if you use the pf, pf2, and pf3 parameters of edismax, with boosting, you can assure that the closest matches always appear first. And assuming you do index-time synonym expansion. -- Jack Krupansky -Original Message- From: Erick Erickson Sent: Wednesday, April 2, 2014 3:09 PM
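
The pf, pf2 and pf3 suggestion can be pictured as the following set of request parameters. This is a sketch under assumptions: the function name edismax_phrase_params, the field name name_text and the boost values are all made up for illustration, and index-time synonym expansion is assumed to be configured separately in the schema.

```python
from urllib.parse import urlencode


def edismax_phrase_params(user_query, field, pf_boost=10, pf2_boost=5, pf3_boost=7):
    """Encode an edismax request that boosts phrase, bigram and trigram matches."""
    params = {
        "q": user_query,
        "defType": "edismax",
        "qf": field,
        # pf boosts docs where the whole query appears as a phrase;
        # pf2 and pf3 do the same for word pairs and word triples.
        "pf": f"{field}^{pf_boost}",
        "pf2": f"{field}^{pf2_boost}",
        "pf3": f"{field}^{pf3_boost}",
    }
    return urlencode(params)


if __name__ == "__main__":
    print(edismax_phrase_params("fort st john", "name_text"))
```

The relative boost values would need tuning against real data; the ones here only show where each parameter goes.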