Re: Unable to start solr 4.8

2014-06-23 Thread atp
Hi, after rebooting all the cluster machines, we are unable to start Solr. We are getting an error message in the Solr log file. We removed the index directory contents (solr/collection_name/data/index.*/) and restarted, but we are getting 'Recovery failed' from one of the nodes. Please help to resolve this.

Formatting Issue

2014-06-23 Thread Venkata krishna
Hi, I am indexing RTF documents, but in the search results I am not getting the text in the same format as in the indexed RTF documents. I would like the search results to show the text with the same formatting. Could you please suggest how to resolve this formatting problem? Thanks, Venkata

Re: Formatting Issue

2014-06-23 Thread Alexandre Rafalovitch
Solr indexes RTF by using Apache Tika to extract text content of RTF. It does not know how to process the RTF. Tika knows how to process RTF, but not how to display it (which is what you are asking). You can probably store the original document either in Solr (not recommended) or in a separate

Re: Solr alternates returning different versions of the same document

2014-06-23 Thread yann
Hi Erik, thanks for your answer. I didn't manually assign docs to shards, I indexed all docs on one server, which then assigned it to shards (based on the default Solr behaviour, based on the document ID I believe). If I understood you correctly - this means the update section of the admin

Re: [ANN] Heliosearch 0.06 released, native code faceting

2014-06-23 Thread Bram Van Dam
On 06/20/2014 06:48 PM, Yonik Seeley wrote: Heliosearch is a Solr fork that will hopefully find its way back to the ASF in the future. There are about 50 instances of sun.misc.unsafe in heliosearch's code at this point. Has this been tested on non-oracle VMs? Particularly IBM? Also: please

Paging while indexes

2014-06-23 Thread Bram Van Dam
Is there any way to take the current index version (or commit number or something) into account in paged queries? When navigating through a large result set in an NRT environment, I want the navigation to remain *fixed* on the initial results. I'm trying to avoid a scenario where a user has a

How-to get results of comparison between documents

2014-06-23 Thread Moshe Recanati
Hi, I have several documents that describe mobile phone specifications, with an index on release date. Assume I want to query these documents and get the latest document based on release date. Please describe how I can do it (if at all) and which query I need to execute. Regards, Moshe Recanati SVP

Evaluate function only on subset of documents

2014-06-23 Thread Costi Muraru
Hi guys, I'm running some tests and I can't seem to figure this one out. Suppose we have a real estate index, containing homes for rent and purchase. The first kind of query I want to make is like so: - type:purchase AND {!frange u=10}mycustomfunction() The function is expensive and, in order to

Re: deep faceting issues in distributed mode

2014-06-23 Thread Dmitry Kan
any clues? Too little detail? On Thu, Jun 19, 2014 at 1:26 PM, Dmitry Kan solrexp...@gmail.com wrote: Hello, We face an issue with deep faceting in a distributed non-SolrCloud setting. A query comes in through the solr frontend (router) and broadcasts to each shard. The exception below

Can SolrRequestHandler choose Solr collection?

2014-06-23 Thread Lee Chunki
Hi, If there is one “RequestHandler Server” and two “Index Server”s, could the RequestHandler Server choose the index server for a request? I am testing with three Solr servers: blog index, news index and request handler. I need to set up the news and blog indexes separately. I want to send query to

Re: docFreq coming to be more than 1 for unique id field

2014-06-23 Thread Apoorva Gaurav
Hello Markus, Ahmet, Forgot to update the thread; optimization works i.e. after optimizing all unique keys have docFreq as 1. On Wed, Jun 18, 2014 at 1:58 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : text in it, query is of the type keywords:(word1 OR word2 ... OR wordN). : The

Restricting access to reading full text document field

2014-06-23 Thread Bjørn Axelsen
Dear Solr users, I am building a Solr 4.8 search engine that will hold documents containing subscription-only content. We want potential customers to be able to search the full content. And we also want to show them highlighted context snippets from the full contents. So, I have included the

Re: Restricting access to reading full text document field

2014-06-23 Thread Michael Della Bitta
Unfortunately, it's not really advisable to allow open access to Solr to the open web. There are many avenues of DOSing a Solr install otherwise, and depending on how it's configured, some more intrusive vulnerabilities. Michael Della Bitta Applications Developer o: +1 646 532 3062 appinions

Re: Restricting access to reading full text document field

2014-06-23 Thread Bjørn Axelsen
Thanks, Michael ... so if I plan to do client-side ajax, you would suggest to call back an ajax proxy rather than query the Solr instance directly? 2014-06-23 14:57 GMT+02:00 Michael Della Bitta michael.della.bi...@appinions.com: Unfortunately, it's not really advisable to allow open access to

Re: Restricting access to reading full text document field

2014-06-23 Thread Michael Della Bitta
Yes, that's the general model. Use a layer in between your clients and Solr to restrict access to what you wish to let people do. Generally speaking, you should expose a SearchHandler that hardcodes the fl param to prevent retrieval of your full text field, and uses a filter query param to
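The "layer in between" idea can be sketched roughly as follows. This is only a minimal sketch of the same principle applied on the proxy side, not the Solr-side handler configuration Michael describes. The parameter whitelist, the field list (`id,title`), the filter (`public:true`) and the row cap are all hypothetical values chosen for illustration.

```python
from urllib.parse import urlencode

# Hypothetical values: the whitelist, field list, filter and row cap are
# illustrative assumptions, not taken from the thread.
ALLOWED_CLIENT_PARAMS = {"q", "start", "rows"}
FORCED_PARAMS = {"fl": "id,title", "fq": "public:true", "wt": "json"}
MAX_ROWS = 50


def build_solr_query(client_params):
    """Return the query string a proxy would forward to Solr."""
    # Drop every parameter the client is not explicitly allowed to set.
    safe = {k: v for k, v in client_params.items() if k in ALLOWED_CLIENT_PARAMS}
    # Cap rows so a client cannot request an arbitrarily large page.
    try:
        rows = int(safe.get("rows", 10))
    except ValueError:
        rows = 10
    safe["rows"] = str(max(0, min(rows, MAX_ROWS)))
    # Forced parameters always win over anything the client sent.
    safe.update(FORCED_PARAMS)
    return urlencode(sorted(safe.items()))
```

With this shape, a client that sends `fl=fulltext` or `qt=/admin` simply has those parameters discarded before the request reaches Solr.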

Re: Error creating collection

2014-06-23 Thread pravin
I have also been facing this issue recently. Is there any solution? I have 3000+ cores created and am adding some more. Please let me know if there is a restriction on the number of cores, shards and collections. Here is the trace: Jun 23, 2014 9:01:45 AM org.apache.solr.common.SolrException log

No results for a wildcard query for text_general field in solr 4.1

2014-06-23 Thread Sven Schönfeldt
Hi Solr-Users, i am trying to do a wildcard query on a dynamic textfield (_t), but don’t get the right result. The configuration for the field type is „text_general“, the default configuration: fieldType name=text_general class=solr.TextField positionIncrementGap=100 analyzer

Re: SolrCloud multiple data center support

2014-06-23 Thread Arcadius Ahouansou
On 3 February 2014 22:16, Daniel Collins danwcoll...@gmail.com wrote: One other option is in ZK trunk (but not yet in a release) is the ability to dynamically reconfigure ZK ensembles ( https://issues.apache.org/jira/browse/ZOOKEEPER-107). That would give the ability to create new ZK

Solrcloud- index lock

2014-06-23 Thread atp
Hi, we have configured a three-node SolrCloud with ZooKeeper and Tomcat, but when we start Solr, one of the nodes throws an error. We configured DataImportHandler as well. SolrCore 'collection1' is not available due to init failure: Index locked for write for core collection1 not yet

RE: Spell checker - limit on number of misspelt words in a search term.

2014-06-23 Thread Dyer, James
I do not believe there is such a setting. Most likely you will need to increase the value for maxCollationTries to get it to discover the correct combination. Just be sure not to set this too high as queries with a lot of misspelled words (or for something your index simply doesn't have) will

Re: Unable to start solr 4.8

2014-06-23 Thread Erick Erickson
Try nuking the entire data directory. As in rm -rf .../data. Although why it should report a problem with the lock file I'm not quite sure. Best, Erick On Mon, Jun 23, 2014 at 12:10 AM, atp annamalai...@hcl.com wrote: hi , After rebooting all the cluster machine , we unable to start the solr

Re: How-to get results of comparison between documents

2014-06-23 Thread lboutros
Hi Moshe, If I understand correctly your needs, I think you want to use the CollapsingQParser post filter: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=40509582 I think that, basically, adding this filter to your query should solve your problem: fq={!collapse

Re: SolrCloud multiple data center support

2014-06-23 Thread Daniel Collins
https://issues.apache.org/jira/browse/ZOOKEEPER-107 may be implemented, but it isn't in a release as yet :) It's slated for 3.5.0, but 3.4.0 came out in November 2011 and there has been no minor release since then. The 3.4.x release is only releasing critical fixes now, so any new

Re: Multivalue wild card search

2014-06-23 Thread Ethan
Ahmet, Yes, they were part of the JSON output. Here is the XML response: <arr name="Name"> <str>[[Hifte, Grop, , ]]</str> <str>[]</str> <str>[[Ethan, G, , ],[Steve, Wonder, , ]]</str> </arr> The solution suggested by Jack to look up Steve Wonder doesn't work, as the asterisk is replaced by the default search field. Any

RE: How-to get results of comparison between documents

2014-06-23 Thread Moshe Recanati
Hi Ludovic, Thanks a lot. I looked into this reference guide and I would like to make sure I understand it correctly. Can you give a specific example of how to use it? It'll help me a lot. Regards, Moshe Original message From: lboutros Date:06/23/2014 19:10 (GMT+02:00) To:

Connection Time out Issue

2014-06-23 Thread Venkata krishna
Hi, In my project we need to index a million files, but connection timeout problems occur after indexing 1000 to 2000 files. What connection values would be preferable to set on the SolrServer object to avoid connection problems?

RE: running Post jar from different server

2014-06-23 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, does anyone have any reference for this type of execution? -Original Message- From: EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) [mailto:external.ravi.tamin...@us.bosch.com] Sent: Friday, June 20, 2014 1:46 PM To: solr-user@lucene.apache.org Subject: RE: running Post

POST Vs GET

2014-06-23 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, I am executing a Solr query that runs to 10 to 12 lines with all the boosting and conditions. I changed the HTTP request type from GET to POST, as POST doesn't have any size restriction, but I am getting an error. I am using Tomcat 7. Is there any place in Tomcat where we need to specify that it should accept POST..

Re: SolrCloud multiple data center support

2014-06-23 Thread Mark Miller
We have been waiting for that issue to be finished before thinking too hard about how it can improve things. There have been a couple ideas (I’ve mostly wanted it for improving the internal zk mode situation), but no JIRAs yet that I know of. --  Mark Miller about.me/markrmiller On June 23,

Re: Multivalue wild card search

2014-06-23 Thread Ahmet Arslan
Hi Ethan, The XML response is helpful. So you still have brackets, commas and quotes in the field value? What field type do you use for the Name field? If you tokenize it with StandardTokenizer, a simple phrase query would do the trick: q=Name:"Steve Wonder" Also consider cleaning up your values. Why would you

SOLR in Production

2014-06-23 Thread EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Hi, We are planning to move Solr to a production environment. I would like to get some real-world experience or a checklist of things to take care of. We have 1 instance of Solr with 2 cores. What should be taken care of in case the Solr instance is down? What do I have to do if I have 2 instances, the first one is

Re: Connection Time out Issue

2014-06-23 Thread Shalin Shekhar Mangar
Those methods set the same attributes on the underlying HttpClient object so look at the docs for the commons-httpclient libraries. How are you indexing the documents? One at a time? In batches? If an indexing request is taking a lot of time then you may need to increase the read timeout
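If documents are currently being sent one at a time, grouping them into batches is one way to cut down the number of requests. The helper below is a generic sketch of batching only; the batch size is an assumption to be tuned, and the actual call that sends each batch to Solr is deliberately left out.

```python
from itertools import islice


def batched(iterable, batch_size):
    """Yield lists of at most batch_size items from iterable, in order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

Each yielded list would then be sent in a single update request instead of one request per file.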

Re: SOLR in Production

2014-06-23 Thread Shalin Shekhar Mangar
You need to decide between SolrCloud and non-solrcloud mode. A SolrCloud cluster will need external ZooKeeper instances and will provide failover, replication, sharding and load balancing between replicas automatically. But if your needs are small then you can go with a non-solrcloud cluster as

Re: POST Vs GET

2014-06-23 Thread Shalin Shekhar Mangar
Why don't you just use the jetty shipped with Solr? It has all the correct defaults. In future, we may not even support shipping a war file. On Mon, Jun 23, 2014 at 11:07 PM, EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) external.ravi.tamin...@us.bosch.com wrote: Hi, I am

Re: running Post jar from different server

2014-06-23 Thread Shalin Shekhar Mangar
You said that SQLDB and Solr are on different servers and that you are running post.jar from a network drive mapped to your SQLDB. If so, then why are you trying to post to localhost? That would resolve to the SQLDB host where Solr is not running. Instead of using localhost in the -Durl part of

Re: How-to get results of comparison between documents

2014-06-23 Thread Shalin Shekhar Mangar
Hi Moshe, The CollapsingQParser will group documents on a given field. In this case, you can group by the mobile phone's id (which is unique across all mobile phones) and then ask Solr to return the document with the maximum revision value. That is exactly what Ludovic's example does. On Mon,
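To make the "group by id, keep the document with the maximum value" behaviour concrete, here is a small in-memory sketch of the same selection logic. It does not talk to Solr. The field names `phone_id` and `revision` are hypothetical, and the filter string shows the documented `field` and `max` local parameters of the collapse parser rather than the exact filter from Ludovic's (truncated) message.

```python
# Hypothetical field names; this is the general shape of a collapse filter.
COLLAPSE_FQ = "{!collapse field=phone_id max=revision}"


def collapse_max(docs, group_field, max_field):
    """Keep, for each group_field value, the doc with the largest max_field."""
    best = {}
    for doc in docs:
        key = doc[group_field]
        # Replace the stored doc only when this one has a strictly larger value.
        if key not in best or doc[max_field] > best[key][max_field]:
            best[key] = doc
    return list(best.values())
```

Given two revisions of phone `a` and one of phone `b`, the function returns one document per phone, namely the one with the highest `revision`.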

RE: How-to get results of comparison between documents

2014-06-23 Thread Moshe Recanati
Got it, thank you. Regards, Moshe Original message From: Shalin Shekhar Mangar Date:06/23/2014 21:38 (GMT+02:00) To: solr-user@lucene.apache.org Subject: Re: How-to get results of comparison between documents Hi Moshe, The CollapsingQParser will group documents on a given

Re: Solr alternates returning different versions of the same document

2014-06-23 Thread Erick Erickson
bq: If I understood you correctly - this means the update section of the admin should be avoided when using a sharded install, because it doesn't guarantee a given document ID will be sent to the same shard as the previous version of the same document? You've got it, but I want to emphasize

Re: Multivalue wild card search

2014-06-23 Thread Ethan
Hey Ahmet, Yes, brackets, commas and quotes are part of the field's value. It's something I inherited and am working on improving. The field is of type solr.TextField. Adding StandardTokenizer solves the problem for new documents. It doesn't work on already indexed docs. Is there a solution

Re: Error creating collection

2014-06-23 Thread Erick Erickson
I suspect (but don't know for sure) that the problem here is that ZK is limited to 1M at the moment, although this is configurable (sorry, don't have the reference handy). But at 3,000 cores you're certainly going into relatively uncharted territory so I'm unsurprised that you're running into

Block Join Not Working - what am I doing wrong?

2014-06-23 Thread Vinay B,
Hi, I've been trying to experiment with block joins and parent / child docs as described in this thread (input described in my first post of the thread, .. and block join in my second post, as per the suggestions given). What else am I missing? Thanks

Re: Multivalue wild card search

2014-06-23 Thread Ahmet Arslan
Hi Ethan, I understand that you are dealing with a legacy system. Can you paste the analysis chain used for the already indexed docs? I mean the XML snippet taken from schema.xml. With this, we will figure out how that text is indexed, and write our query according to that info. Ahmet On Monday, June

Getting stats on Date facet groups

2014-06-23 Thread Andrew Shumway
I am using Solr 4.5 with an index of commercial flight data and am getting record counts by faceted date. I also want to get the total of an integer field by faceted date but am having difficulty. Here are the fields: departureDateGMT carrier seatingCapacity If I set facet=true,

Re: No results for a wildcard query for text_general field in solr 4.1

2014-06-23 Thread Erick Erickson
Well, you can do more than guess by looking at the admin/analysis page and trying your input on the field in question. That'll show you what actual transformations are performed. You're probably right though. Try adding debug=query to your URL to see what the actual parsed query looks like and

Re: Block Join Not Working - what am I doing wrong?

2014-06-23 Thread Erick Erickson
Well, what do you mean by not working? You might review: http://wiki.apache.org/solr/UsingMailingLists Best, Erick On Mon, Jun 23, 2014 at 12:20 PM, Vinay B, vybe3...@gmail.com wrote: Hi, I've been trying to experiment with block joins and parent / child docs as described in this thread

Re: Getting stats on Date facet groups

2014-06-23 Thread Chris Hostetter
: record counts by faceted date. I also want to get the total of an integer : field by faceted date but am having difficulty. Unfortunately, what you are asking about isn't currently possible. FWIW: lately i've been thinking a lot about stats and accumulating stats over facets, and i

Re: Evaluate function only on subset of documents

2014-06-23 Thread Ahmet Arslan
Hi Costi, This is untested, but in theory you could use ReRankingQParserPlugin http://heliosearch.org/solrs-new-re-ranking-feature/ Your expensive query will be the reRankQuery used to re-rank sample documents. Please let us know if that works for you. Ahmet On Monday, June 23, 2014 1:03 PM,

Re: Multivalue wild card search

2014-06-23 Thread Ethan
Ahmet, Here the xml for the field Name - Let me know if I need to update it. field name=Name type=token2 indexed=true stored=true multiValued=true omitTermFreqAndPositions=false/ types fieldType name=token2 class=solr.TextField omitNorms=true positionIncrementGap=1 analyzer

Solr on S3FileSystem, Kosmos, GlusterFS, etc….

2014-06-23 Thread Jay Vyas
Hi folks. Does anyone deploy solr indices on other HCFS implementations (S3FileSystem, for example) regularly ? If so I'm wondering 1) Where are the docs for doing this - or examples? Seems like everything, including parameter names for dfs setup, are based around hdfs. Maybe I should

Re: Multivalue wild card search

2014-06-23 Thread Ahmet Arslan
Hi Ethan, With that type, a standard phrase query should work. If you paste your sample text in the analysis page, you will see the indexed terms. q=Name:"steve wonder" should work. You don't need wildcard search in this case. Just do a phrase query. (surrounded with quotes) Ahmet  On Tuesday, June 24,

Re: Multivalue wild card search

2014-06-23 Thread Ethan
Hi Ahmet, I have tested this and it doesn't work for existing documents. I couldn't make much sense of the field analysis. I didn't find an option to see indexed terms in Analysis tab. Instead you feed it the value you want analyzed and it prints index or query time analysis. Is this what

Get position of first occurrence in search result

2014-06-23 Thread Jorge Luis Betancourt Gonzalez
I’m using Solr for an analytic use case, one of the requirements is basically given a search query get the position of the first hit. I’m indexing web pages, so given a search criteria the client wants to know the position (first occurrence) of his webpage in the result set (if it appears at

Re: Evaluate function only on subset of documents

2014-06-23 Thread Alexandre Rafalovitch
On Mon, Jun 23, 2014 at 5:02 PM, Costi Muraru costimur...@gmail.com wrote: q=*:*&fq={!cost=1}type:purchase{!frange u=0 cost=3}mycustomfunction() The function is applied on all documents, instead of only those that match the *purchase* type. I verified this assumption, by checking the query time
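One documented way to have an expensive function range filter evaluated only on documents that already passed the cheaper filters is to run it as a post filter, which for `frange` means `cache=false` together with a `cost` of 100 or more. The sketch below builds such a request. It assumes the function name from the question, one `fq` per clause, and an arbitrary cost of 200; it is not necessarily what the (truncated) reply goes on to recommend.

```python
from urllib.parse import urlencode


def expensive_filter_params(upper_bound):
    """Build request params with the costly frange clause as a post filter."""
    return [
        ("q", "*:*"),
        # Cheap filter: evaluated (and cached) as a normal filter query.
        ("fq", "type:purchase"),
        # cache=false plus cost >= 100 asks Solr to run frange as a post filter.
        ("fq", "{!frange u=%d cache=false cost=200}mycustomfunction()" % upper_bound),
    ]


query_string = urlencode(expensive_filter_params(10))
```

The resulting `query_string` can be appended to the select URL of the core being queried.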

Re: Evaluate function only on subset of documents

2014-06-23 Thread Chris Hostetter
: Now, if I want to make a query that also contains some OR, it is impossible : to do so with this approach. This is because fq with OR operator is not : supported (SOLR-1223). As an alternative I've tried these queries: : : county='New York' AND (location:Maylands OR location:Holliscort or :

Re: Multivalue wild card search

2014-06-23 Thread Erick Erickson
Nope, got to re-index. bq: Assuming there is a multiValued field called Name of type string stored in index - bq: I tested both cases with empty index. When I inserted the document after changing fieldType to StandardTokenizerFactory, it worked fine with the standard phrase query. But I was

Problem : limit results when using edismax way .

2014-06-23 Thread xiaoqi
When a user enters keywords like white T shirt to search products, I want to list all T shirts in white colour, so I am using edismax with item_category^1.4 item_colour^0.5, but the results still include some other products which are white but not T shirts. Is there any way to limit the results to only T shirt

Re: Problem : limit results when using edismax way .

2014-06-23 Thread Aman Tandon
Can you please paste the whole query url here. With Regards Aman Tandon On Tue, Jun 24, 2014 at 8:48 AM, xiaoqi belivexia...@gmail.com wrote: when user input keywords like white T shirt to search products, i want to list all T shirt with white colour , so i using edismax like

Re: Problem : limit results when using edismax way .

2014-06-23 Thread xiaoqi
http://localhost:7080/solr/select?l=*,score&start=0&q=%E7%99%BD%E8%89%B2%E5%B8%BD%E5%AD%90&qf=item_category^1.4+item_colour^0.5&bf=mul(ctr,2.5)&wt=xml&fq=item_id:[0+TO+*]&fq=&rows=60&defType=edismax&version=2&debugQuery=on -- View this message in context:

Re: Get position of first occurrence in search result

2014-06-23 Thread Aman Tandon
What kind of search criteria, could you please explain With Regards Aman Tandon On Tue, Jun 24, 2014 at 4:30 AM, Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu wrote: I’m using Solr for an analytic use case, one of the requirements is basically given a search query get the position of

Re: Problem : limit results when using edismax way .

2014-06-23 Thread Walter Underwood
The edismax query handler will match the best field, not all fields. So for some documents it matches t-shirt and for some it matches white. Try creating a single field with all the description information. That field will have both white and t-shirt. Then add pf2, pf3, and pf config lines.
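The suggestion above, a single combined field plus the phrase-boost parameters, could be expressed as request parameters along these lines. This is a sketch only: the combined field name `item_text` is hypothetical (it would have to be created in the schema, for example by copying the category and colour fields into it), and no boosts are shown because the thread does not give any.

```python
from urllib.parse import urlencode


def edismax_params(user_query, combined_field="item_text"):
    """Build edismax params that query and phrase-boost one combined field."""
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": combined_field,   # match terms against the combined field
        "pf": combined_field,   # boost docs containing the whole query as a phrase
        "pf2": combined_field,  # boost docs containing adjacent word pairs
        "pf3": combined_field,  # boost docs containing adjacent word triples
        "fl": "*,score",
        "rows": "60",
    }


query_string = urlencode(edismax_params("white t-shirt"))
```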

Re: Get position of first occurrence in search result

2014-06-23 Thread Jorge Luis Betancourt Gonzalez
Basically, given a few search terms (a query), the idea is to find out in which position your website is ranked for those specific terms. On Jun 24, 2014, at 12:12 AM, Aman Tandon amantandon...@gmail.com wrote: What kind of search criteria, could you please explain With

Re: Get position of first occurrence in search result

2014-06-23 Thread Walter Underwood
Solr is designed to do exactly this very, very fast. So there isn't a faster way to do it. But you only need to fetch the URL field. You can ignore everything else. wunder On Jun 23, 2014, at 9:32 PM, Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu wrote: Basically given a few search

Re: Get position of first occurrence in search result

2014-06-23 Thread Jorge Luis Betancourt Gonzalez
Yes, but I’m looking for the position of the url field of interest in the response of solr. Solr matches the terms against the collection of documents and returns a list sorted by score; what I’m trying to do is get the position of a specific id in this sorted response. The response could be

Re: Error creating collection

2014-06-23 Thread pravin
Thanks Erick for your suggestion. Increasing the znode data size limit from 1M to 2M helped. Here is the reference for changing this configuration: https://zookeeper.apache.org/doc/r3.3.2/zookeeperAdmin.html I used this parameter in the JAVA_OPTS -Djute.maxbuffer=2M which helped me

Re: Get position of first occurrence in search result

2014-06-23 Thread Tri Cao
Oh, I see what you are trying to do, you were confusing :) To get the exact position of a particular document in the ranked list, you will need to loop through the whole list, as that's exactly what Solr has to do to get to that document. However, you could do some optimization with the
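The "loop through the list" approach can be sketched as a paging loop that stops as soon as the target document is seen. In this sketch `fetch_page` is a hypothetical callback standing in for a Solr request that returns only document ids for a given `start` and `rows`; the in-memory fetcher exists purely so the example runs on its own.

```python
def find_rank(fetch_page, target_id, page_size=100, max_rank=10000):
    """Return the 1-based rank of target_id, or None if it is not found.

    fetch_page(start, rows) must return the ids at positions
    start .. start + rows - 1 of the ranked result list. No further pages
    are requested once start reaches max_rank.
    """
    start = 0
    while start < max_rank:
        ids = fetch_page(start, page_size)
        if not ids:
            return None  # ran out of results
        for offset, doc_id in enumerate(ids):
            if doc_id == target_id:
                return start + offset + 1
        start += len(ids)
    return None


def make_fake_fetcher(ranked_ids):
    """In-memory stand-in for a Solr query returning a page of ids."""
    def fetch_page(start, rows):
        return ranked_ids[start:start + rows]
    return fetch_page
```

Fetching only the id field, as Walter suggests for the URL field, keeps each page small.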

Re: Get position of first occurrence in search result

2014-06-23 Thread Aman Tandon
Jorge, I don't think that Solr provides this functionality; you have to iterate, and Solr is very fast at this. You can create a script that searches for the pattern (term) and parses the records until it reaches the record with the desired URL. I don't think 1/3 seconds to find it is too much.

Double cast exception with grouping and sort function

2014-06-23 Thread Nate Dire
Hi, I recently tried upgrading our setup from 4.5.1 to 4.7+, and I'm seeing an exception when I use (1) a function to sort and (2) result grouping. The same query works fine with either (1) or (2) alone. Example below. At a glance, it looks similar to:

Re: Get position of first occurrence in search result

2014-06-23 Thread Jorge Luis Betancourt Gonzalez
Basically this is for analytical purposes, essentially we want to help people (whose sites we’ve indexed in our app) to find out for which particular terms (in theory related with their domain) they are badly positioned in our index. Initially we’re starting with this basic “position per term”