Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
Hi Erik, thanks for your concerns and thoughts. There is no XY problem because we decouple input (storing) from searching, faceting, ... What you see is just the input for storing and output of the original text in the results. There is no need to do any analysis on this. So don't worry, it works

Sorting on multivalues field in Solr

2015-05-11 Thread nutchsolruser
Is there any way we can sort on a multivalued field in Solr? I have two documents with field custom_code and values as below: Doc 1 : 11, 78, 45, 22 Doc 2 : 56, 74, 62, 10 When I sort in ascending order the order should be: Doc 2 : 56, 74, 62, 10 Doc 1 : 11, 78, 45, 22 Here Doc 2 will come fi
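The ordering asked for above can be illustrated with a small sketch. It assumes the intended sort key is the smallest value in each document's custom_code list (the message is cut off before it says so explicitly), and the `docs` dictionary and function name are hypothetical stand-ins, not Solr API calls.

```python
# Hypothetical in-memory stand-in for the two documents in the question.
docs = {
    "Doc 1": [11, 78, 45, 22],
    "Doc 2": [56, 74, 62, 10],
}


def sort_by_smallest_value(documents):
    """Return document names ordered by the smallest value in each list."""
    return sorted(documents, key=lambda name: min(documents[name]))


print(sort_by_smallest_value(docs))
```

This only models the expected result on the client side; it says nothing about how Solr itself would produce that order.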

Re: Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Naresh Yadav
Thanks Andrew, you got my problem precisely, but the solutions you suggested may not work for me. In my API I get only the list of authorized tags, i.e. [T1, T2, T3], and based on that alone I need to construct my Solr query. So the first solution with NOT (T4 OR T5) will not work. In real case tag ids T1, T2 are

Re: Best way to backup and restore an index for a cloud setup in 4.6.1?

2015-05-11 Thread Shalin Shekhar Mangar
Hi John, There are a few HTTP APIs for replication, one of which can let you take a backup of the index. Restoring can be as simple as just copying over the index in the right location on the disk. A new restore API will be released with the next version of Solr which will make some of these tasks

Re: SolrJ vs. plain old HTTP post

2015-05-11 Thread Shalin Shekhar Mangar
On Mon, May 11, 2015 at 8:20 PM, Steven White wrote: > Thanks Erik and Emir. > > > > To close the loop on this question, I will need to enable Jetty's SSL (the > jetty that comes with Solr 5.1). If I do so, will SolrJ still work, can I > assume that SolrJ supports SSL? > > Yes, SolrJ can work

Solr Multiword Synonym Problem

2015-05-11 Thread solrnovice
Hi all, I am trying to solve the solr multiword synonym issue at our installation, I am currently using SOLR-4.9.x version. I used the "com.lucidworks.analysis.AutoPhrasingTokenFilterFactory" from Lucidworks git repo and used this in my schema.xml and also used their "com.lucidworks.analysis.Auto

Re: Queries on SynonymFilterFactory

2015-05-11 Thread Zheng Lin Edwin Yeo
Yes sure, thanks for your advice. I'm still waiting for my server to come before I can scale up my system and do the testing. Now the Solr running on my 4GB RAM system will crash if I try to scale up my system as there's not enough memory to support it. Regards, Edwin On 11 May 2015 at 19:11, A

Re: PatternReplaceCharFilter + solr.WhitespaceTokenizerFactory behaviour

2015-05-11 Thread Erick Erickson
This trips up _everybody_ at one point or other. The problem is that the input goes through the query _parsing_ prior to getting to the field analysis, and the parser is sensitive to spaces. Consider the input (without quotes) of "my dog". That gets broken up into default_field:my default_field:do

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Erick Erickson
I've got to ask _how_ are you intending to search this field? On the surface, this feels like an XY problem. It's a "string" type. Therefore, if this is the input: 102, 111, 114, 32, 97, 32, 114, 101, 118, 105, 101, 119, 32, 115, 101, 101, 32, 66, 114 you'll only ever get a match if you search ex

SOLR plugin: Retrieve all values of multivalued field

2015-05-11 Thread Costi Muraru
Hi folks, I'm playing with a custom SOLR plugin and I'm trying to retrieve the value for a multivalued field, using the code below. == schema.xml: == input data: 83127 somevalue some other value some other value 3 some other value 4 == plugin: SortedDoc

PatternReplaceCharFilter + solr.WhitespaceTokenizerFactory behaviour

2015-05-11 Thread Mihran Shahinian
I must be missing something obvious. I have a simple regex that removes a pattern. The unit test below works fine, but when I plug it into the schema and query, the regex does not match, since the input already gets split by space (further below). My understanding is that charFilter would operate on raw input stri

Help to index nested document

2015-05-11 Thread Vishal Swaroop
Need your valuable inputs... I am indexing data from database (one table) which is in this example format : id name value 1 Joe 102724904 2 Joe 100996643 - id is primary/ unique key - there can be same "name" but different "value" - If I try "name" as unique key then SOLR removes duplicate and in

boolean operators OR/NOT get highlighted by solr

2015-05-11 Thread Tang, Rebecca
Hi, We have a SOLR query like this q=ddmdate%3A2012-05-01T00%3A00%3A00Z+NOT+dddate%3A2010-06-11T00%3A00%3A00Z&wt=json&indent=true&hl=true&hl.simple.pre=%3Ch1%3E&hl.simple.post=%3C%2Fh1%3E&hl.requireFieldMatch=true&hl.preserveMulti=true&hl.fl=ot&f.ot.hl.fragsize=300&f.ot.hl.alternateField=ot&f.ot.

Re: Solr custom component issue

2015-05-11 Thread j 90
unsubscribe On Mon, May 11, 2015 at 6:58 PM, Upayavira wrote: > attaching them to each request, then just add qf= as a param to the URL, > easy. > > On Mon, May 11, 2015, at 12:17 PM, nutchsolruser wrote: > > These boosting parameters will be configured outside Solr and there is > > separate mod

Re: Solr custom component issue

2015-05-11 Thread Upayavira
attaching them to each request, then just add qf= as a param to the URL, easy. On Mon, May 11, 2015, at 12:17 PM, nutchsolruser wrote: > These boosting parameters will be configured outside Solr and there is > separate module from which these values get populated , I am reading > those > values fr

RE: Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Andrew Chillrud
Based on his example, it sounds like Naresh not only wants the tags field to contain at least one of the values [T1, T2, T3] but also wants to exclude documents that contain a tag other than T1, T2, or T3 (Doc3 should not be retrieved). If the set of possible values in the tags field is limited

Re: schema modification issue

2015-05-11 Thread Steve Rowe
Hi, Thanks for reporting, I’m working on a test to reproduce. Can you please create a Solr JIRA issue for this?: https://issues.apache.org/jira/browse/SOLR/ Thanks, Steve > On May 7, 2015, at 5:40 AM, User Zolr wrote: > > Hi there, > > I have come across a problem that when using managed

Re: Completion Suggester in Solr

2015-05-11 Thread Pradeep Bhattiprolu
Bumping this thread again in the group, haven't received any responses for this. I have been stuck on this problem since last week, any help is highly appreciated. Thanks Pradeep On Wed, May 6, 2015 at 5:00 PM, Pradeep Bhattiprolu wrote: > Hi > > Is there an equivalent of Completion suggester of El

Re: Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Alessandro Benedetti
A simple OR query should be fine : tags:(T1 T2 T3) Cheers 2015-05-11 15:39 GMT+01:00 Sujit Pal : > Hi Naresh, > > Couldn't you just model this as an OR query since your requirement is > at least one (but can be more than one), ie: > > tags:T1 tags:T2 tags:T3 > > -sujit > > > On Mon, May 1
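Since the list of authorized tags arrives through an API, the suggested query string could be assembled along these lines. This is a minimal sketch, assuming the default query operator is OR and that tag ids contain no characters needing escaping; the function name is hypothetical.

```python
def build_or_query(field, tags):
    """Build a query such as tags:(T1 T2 T3) from a list of tag ids."""
    if not tags:
        raise ValueError("at least one tag is required")
    return "{}:({})".format(field, " ".join(tags))


print(build_or_query("tags", ["T1", "T2", "T3"]))
```

As pointed out elsewhere in this thread, a plain OR query matches any document holding at least one listed tag, so it does not by itself exclude documents that also carry a tag outside the list.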

Re: indexing java byte code in classes / jars

2015-05-11 Thread Walter Underwood
How about Krugle? http://opensearch.krugle.org/ Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On May 11, 2015, at 3:18 AM, Tomasz Borek wrote: > There's also Perl-backed ACK. http://beyondgrep.com/ > > Which does the job of searching code really well. > >

Re: SolrJ vs. plain old HTTP post

2015-05-11 Thread Steven White
Thanks Erik and Emir. Erik: The fact that SolrJ is aware of SolrCloud is enough to put it over plain old HTTP post. Emir: I looked into Solr's data import handler, unfortunately, it won't work for my need. To close the loop on this question, I will need to enable Jetty's SSL (the jetty that come

Re: Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Sujit Pal
Hi Naresh, Couldn't you just model this as an OR query since your requirement is at least one (but can be more than one), i.e.: tags:T1 tags:T2 tags:T3 -sujit On Mon, May 11, 2015 at 4:14 AM, Naresh Yadav wrote: > Hi all, > > Also asked this here : http://stackoverflow.com/questions/3016

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
It turned out that I didn't recognize that dcdescription is not indexed, only stored. So the next in the "chain" is f_dcperson, where dccreator and dcdescription are combined and indexed. And this is why the error shows up on f_dcperson. ("delay of error") Thanks for your help, regards. Bernd On 11.

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Shawn Heisey
On 5/11/2015 7:19 AM, Bernd Fehling wrote: > After reading https://issues.apache.org/jira/browse/LUCENE-5472 > one question still remains. > > Why is it complaining about f_dcperson which is a copyField when the > original problem field is dcdescription which definitely is much larger > than 32766? >

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
Hi Emir, ahhh, yes you're right. I missed that. Now I understand why it is not complaining about dcdescription and the error shows up on f_dcperson. "delay of error" ;-) Thanks Bernd On 11.05.2015 at 15:25, Emir Arnautovic wrote: > Hi Bernd, > dcdescription field is not indexed. > > Thanks,

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
Hi Shawn, does that mean that if I set a length limit on dcdescription or make dcdescription multivalued, then the problem is solved because f_dcperson is already multivalued? Regards Bernd On 11.05.2015 at 15:17, Shawn Heisey wrote: > On 5/11/2015 6:13 AM, Bernd Fehling wrote: >> Caused by: java.lang.Il

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Emir Arnautovic
Hi Bernd, dcdescription field is not indexed. Thanks, Emir -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On 11.05.2015 15:22, Bernd Fehling wrote: Hi Emir, the dcdescription field is definitely too big. But why i

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
After reading https://issues.apache.org/jira/browse/LUCENE-5472 one question still remains. Why is it complaining about f_dcperson, which is a copyField, when the original problem field is dcdescription, which definitely is much larger than 32766? I would assume it complains about dcdescription field.

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
Hi Emir, the dcdescription field is definitely too big. But why is it complaining about f_dcperson and not dcdescription? Regards Bernd On 11.05.2015 at 15:12, Emir Arnautovic wrote: > Hi Bernd, > Issue is with f_dcperson and what ends up in that field. It is configured to > be string, which m

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Shawn Heisey
On 5/11/2015 6:13 AM, Bernd Fehling wrote: > Caused by: java.lang.IllegalArgumentException: Document contains at least one > immense term > in field="f_dcperson" (whose UTF8 encoding is longer than the max length > 32766), all of which were skipped. > Please correct the analyzer to not produce su

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Emir Arnautovic
Hi Bernd, Issue is with f_dcperson and what ends up in that field. It is configured to be string, which means it is not tokenized, so if some huge value is in either dccreator or dccontributor it will end up as a single term. Names suggest that it should not contain such values, but double check
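To double check which source values would become an oversized single term, the UTF-8 length of each value can be measured before indexing. The sketch below takes the 32766-byte limit from the error message quoted in this thread; the helper names and the sample document are hypothetical, and the check runs on the client, not inside Solr.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the error message in this thread


def is_immense_term(value, limit=MAX_TERM_BYTES):
    """True if the value's UTF-8 encoding is longer than the limit."""
    return len(value.encode("utf-8")) > limit


def oversized_fields(doc, field_names, limit=MAX_TERM_BYTES):
    """Return the names of the given fields whose value exceeds the limit."""
    return [
        name
        for name in field_names
        if name in doc and is_immense_term(doc[name], limit)
    ]


# Hypothetical document: one huge value, one normal value.
sample = {"dccreator": "x" * 40000, "dccontributor": "short value"}
print(oversized_fields(sample, ["dccreator", "dccontributor"]))
```

Note that the limit applies to encoded bytes, not characters, so text with multi-byte characters reaches it sooner than its character count suggests.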

storeOffsetsWithPositions does not reflect in the index

2015-05-11 Thread Dmitry Kan
Hi, Using solr 4.10.2. Looks like storeOffsetsWithPositions has no effect, i.e. it does not store offsets in addition to positions. If we use termVectors="true" termPositions="true" termOffsets="true", then offsets and positions are available fine. Any ideas how to make storeOffsetsWithPositions

Re: SolrJ vs. plain old HTTP post

2015-05-11 Thread Erik Hatcher
Another advantage to SolrJ is with SolrCloud (ZK) awareness, and taking advantage of some routing optimizations client-side so the cluster has less hops to make. — Erik Hatcher, Senior Solutions Architect http://www.lucidworks.com > On May 11, 2015, at 8:21 AM, S

Re: SolrJ vs. plain old HTTP post

2015-05-11 Thread Emir Arnautovic
Hi Steve, Main advantage is that it uses binary format so XML/JSON overhead is avoided. You should also check out if SOLR's Data Import Handler is good fit for you. Thanks, Emir -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sem

SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
I'm getting the following error with 4.10.4 WARN org.apache.solr.handler.dataimport.SolrWriter – Error creating document : SolrInputDocument(fields: [dcautoclasscode=310, dclang=unknown, ..., dcdocid=dd05ad427a58b49150a4ca36148187028562257a77643062382a1366250112ac]) org.apache.solr.comm

SolrJ vs. plain old HTTP post

2015-05-11 Thread Steven White
Hi Everyone, If all that I need to do is send data to Solr to add / delete a Solr document, which tool is better for the job: SolrJ or plain old HTTP post? In other words, what are the advantages of using SolrJ when the need is to push data to Solr for indexing? Thanks, Steve

Re: Solr custom component issue

2015-05-11 Thread nutchsolruser
These boosting parameters will be configured outside Solr and there is a separate module from which these values get populated. I am reading those values from an external datasource and I want to attach them to each request. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-

Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Naresh Yadav
Hi all, Also asked this here : http://stackoverflow.com/questions/30166116 For example i have SOLR docs in which tags field is indexed : Doc1 -> tags:T1 T2 Doc2 -> tags:T1 T3 Doc3 -> tags:T1 T4 Doc4 -> tags:T1 T2 T3 Query1 : get all docs with "tags:T1 AND tags:T3" then it works and will give

Re: Queries on SynonymFilterFactory

2015-05-11 Thread Alessandro Benedetti
2015-05-11 4:44 GMT+01:00 Zheng Lin Edwin Yeo : > I've managed to run the synonyms with 10 different synonyms file. Each of > the synonym file size is 1MB, which consist of about 1000 tokens, and each > token has about 40-50 words. These lists of files are more extreme, which I > probably won't us

Re: Slow highlighting on Solr 5.0.0

2015-05-11 Thread Ere Maijala
Thanks for the pointers. Using hl.usePhraseHighlighter=false does indeed make it a lot faster. Obviously it's not really a solution, though, since in 4.10 it wasn't a problem and turning it off has consequences. I'm looking forward to the improvements in the next releases. --Ere 8.5.2015, 19

Re: indexing java byte code in classes / jars

2015-05-11 Thread Tomasz Borek
There's also Perl-backed ACK. http://beyondgrep.com/ Which does the job of searching code really well. And I think at least once I came across something that stemmed from ACK and claimed it was faster/better... googling... aah! The Silver Searcher it was. :-) http://betterthanack.com/ pozdrawiam

Re: Solr custom component issue

2015-05-11 Thread Upayavira
On Mon, May 11, 2015, at 10:30 AM, nutchsolruser wrote: > I can not set qf in solrconfig.xml file because my qf and boost values > will > be changing frequently . I am reading those values from external source. > > Can we not set qf value from searchComponent? Or is there any other way > to > d

Re: Re: How to get the docs id after commit

2015-05-11 Thread 李文
You are right. I get the last commit time and the current commit time in the newSearcher listener, then query from the last commit time to the current commit time, so that I can get the newest committed docs. Thanks. Best, WenLi -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: May 11, 2015 9

Re: Solr custom component issue

2015-05-11 Thread nutchsolruser
I can not set qf in solrconfig.xml file because my qf and boost values will be changing frequently . I am reading those values from external source. Can we not set qf value from searchComponent? Or is there any other way to do this? -- View this message in context: http://lucene.472066.n3.nab

Re: Solr custom component issue

2015-05-11 Thread Upayavira
If all you want to do is to hardwire a qf, you can do that in your requestHandler config in solrconfig.xml. If you want to extend how the edismax query parser works, you may well be better off subclassing the edismax query parser, and passing in modified request parameters, but I'd explore getting

Re: Solr custom component issue

2015-05-11 Thread nutchsolruser
Thanks Upayavira, I tried it by changing it to first-component in solrconfig.xml but no luck . Am I missing something here ? Here I want to add my own qf fields with boost in query. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-custom-component-issue-tp4204799p420

Re: Solr custom component issue

2015-05-11 Thread Upayavira
You are adding a "search component", and adding it as a "last-component", meaning, it will come after the Query component which actually does the work. Given the parameters you have set, you will be using the default Lucene query parser which doesn't honour the qf parameter, so it isn't surprising

Solr custom component issue

2015-05-11 Thread nutchsolruser
Hi , I am trying to add my own query parameters in Solr query using solr component . In below example I am trying to add qf parameter in the query. Below is my prepare method of component. But Solr is not considering qf parameter while searching It is using df parameter that I have added in schema

Re: Query multiple collections together

2015-05-11 Thread Zheng Lin Edwin Yeo
Ok, thank you so much. Regards, Edwin On 11 May 2015 16:15, "Anshum Gupta" wrote: > FWIR, you just need to make sure that it's a valid collection. It doesn't > have to be one from the list of collections that you want to query, but the > collection name you use in the URL should exist. > e.g, as

Re: Query multiple collections together

2015-05-11 Thread Anshum Gupta
FWIR, you just need to make sure that it's a valid collection. It doesn't have to be one from the list of collections that you want to query, but the collection name you use in the URL should exist. e.g, assuming you have 2 collections foo (10 docs) and bar (5 docs): */solr/foo/select?q=*:*&collec

Re: Query multiple collections together

2015-05-11 Thread Zheng Lin Edwin Yeo
Thank you for the query. Just to confirm, for the 'gettingstarted' in the query, does it matter which collection name I put? Regards, Edwin On 11 May 2015 15:51, "Anshum Gupta" wrote: > You can query multiple collections by specifying the list of collections > e.g.: > > http://hostname:port >

Re: Unable to identify why faceting is taking so much time

2015-05-11 Thread Toke Eskildsen
On Mon, 2015-05-11 at 05:48 +, Abhishek Gupta wrote: > According to this there are 137 records. Now I am faceting over these 137 > records with facet.method=fc. Ideally it should just iterate over these 137 > records and sum up the facets. That is only the ideal method if you are not planning

Re: Upgraded to 4.10.3, highlighting performance unusably slow

2015-05-11 Thread William Bell
Has anyone looked at it? On Sun, May 3, 2015 at 10:18 AM, jaime spicciati wrote: > We ran into this as well on 4.10.3 (not related to an upgrade). It was > identified during load testing when a small percentage of queries would > take more than 20 seconds to return. We were able to isolate it by

Re: Query multiple collections together

2015-05-11 Thread Anshum Gupta
You can query multiple collections by specifying the list of collections e.g.: http://hostname:port /solr/gettingstarted/select?q=test&collection=collection1,collection2,collection3 On Sun, May 10, 2015 at 11:49 PM, Zheng Lin Edwin Yeo wrote: > Hi, > > Would like to check, is there a way to que
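For reference, the URL shape shown above could be built programmatically as in the following sketch. The `collection` parameter and the path layout come from the message; the port 8983, the function name, and the argument names are assumptions made for illustration only.

```python
from urllib.parse import urlencode


def multi_collection_url(base_url, url_collection, collections, query):
    """Build a select URL that queries several collections at once.

    url_collection is the collection named in the URL path; collections
    is the list passed through the 'collection' request parameter.
    """
    params = {"q": query, "collection": ",".join(collections)}
    return "{}/solr/{}/select?{}".format(
        base_url.rstrip("/"),
        url_collection,
        urlencode(params, safe=","),  # keep the commas readable
    )


print(
    multi_collection_url(
        "http://hostname:8983",  # hypothetical host and port
        "gettingstarted",
        ["collection1", "collection2", "collection3"],
        "test",
    )
)
```

Passing `safe=","` leaves the comma-separated list unescaped so the result matches the form given in the message; percent-encoded commas would be an equally valid URL.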