Query multiple collections together

2015-05-11 Thread Zheng Lin Edwin Yeo
Hi, Would like to check, is there a way to query multiple collections together in a single query and return the results in one result set? For example, I have 2 collections and I want to search for records with the word 'solr' in both of the collections. Is there a query to do that, or must I

Re: Query multiple collections together

2015-05-11 Thread Anshum Gupta
You can query multiple collections by specifying the list of collections e.g.: http://hostname:port/solr/gettingstarted/select?q=test&collection=collection1,collection2,collection3 On Sun, May 10, 2015 at 11:49 PM, Zheng Lin Edwin Yeo edwinye...@gmail.com wrote: Hi, Would like to check, is
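A minimal sketch of how such a request URL could be assembled, assuming the `collection` parameter takes a comma-separated list as in the reply above. The host, port, and helper name `multi_collection_url` are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode


def multi_collection_url(base, url_collection, query, collections):
    """Build a select URL that searches several collections in one request."""
    params = {"q": query, "collection": ",".join(collections)}
    # safe="," keeps the comma-separated list readable instead of encoding it as %2C.
    return f"{base}/solr/{url_collection}/select?{urlencode(params, safe=',')}"


url = multi_collection_url(
    "http://localhost:8983",  # hypothetical host and port
    "gettingstarted",         # collection named in the URL path
    "test",
    ["collection1", "collection2", "collection3"],
)
print(url)
```

The collection named in the path is kept separate from the list being searched, since the thread discusses them as two different things.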

Re: Upgraded to 4.10.3, highlighting performance unusably slow

2015-05-11 Thread William Bell
Has anyone looked at it? On Sun, May 3, 2015 at 10:18 AM, jaime spicciati jaime.spicci...@gmail.com wrote: We ran into this as well on 4.10.3 (not related to an upgrade). It was identified during load testing when a small percentage of queries would take more than 20 seconds to return. We

Re: Unable to identify why faceting is taking so much time

2015-05-11 Thread Toke Eskildsen
On Mon, 2015-05-11 at 05:48 +, Abhishek Gupta wrote: According to this there are 137 records. Now I am faceting over these 137 records with facet.method=fc. Ideally it should just iterate over these 137 records and sum up the facets. That is only the ideal method if you are not planning on

Re: Query multiple collections together

2015-05-11 Thread Zheng Lin Edwin Yeo
Thank you for the query. Just to confirm, for the 'gettingstarted' in the query, does it matter which collection name I put? Regards, Edwin On 11 May 2015 15:51, Anshum Gupta ans...@anshumgupta.net wrote: You can query multiple collections by specifying the list of collections e.g.:

Re: Re: How to get the docs id after commit

2015-05-11 Thread 李文
You are right. I get the last commit time and the current commit time in the newSearcher listener, then query from the last commit time to the current commit time so that I can get the newest committed docs. Thanks. Best, WenLi -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: May 11, 2015

Re: Queries on SynonymFilterFactory

2015-05-11 Thread Alessandro Benedetti
2015-05-11 4:44 GMT+01:00 Zheng Lin Edwin Yeo edwinye...@gmail.com: I've managed to run the synonyms with 10 different synonym files. Each synonym file is 1MB in size and consists of about 1000 tokens, and each token has about 40-50 words. These lists of files are more extreme, which I

Re: Query multiple collections together

2015-05-11 Thread Anshum Gupta
FWIR, you just need to make sure that it's a valid collection. It doesn't have to be one from the list of collections that you want to query, but the collection name you use in the URL should exist. e.g, assuming you have 2 collections foo (10 docs) and bar (5 docs):

Re: Query multiple collections together

2015-05-11 Thread Zheng Lin Edwin Yeo
Ok, thank you so much. Regards, Edwin On 11 May 2015 16:15, Anshum Gupta ans...@anshumgupta.net wrote: FWIR, you just need to make sure that it's a valid collection. It doesn't have to be one from the list of collections that you want to query, but the collection name you use in the URL

Re: Solr custom component issue

2015-05-11 Thread nutchsolruser
Thanks Upayavira, I tried it by changing it to first-component in solrconfig.xml but no luck. Am I missing something here? Here I want to add my own qf fields with boost in the query. -- View this message in context:

Re: indexing java byte code in classes / jars

2015-05-11 Thread Tomasz Borek
There's also Perl-backed ACK. http://beyondgrep.com/ Which does the job of searching code really well. And I think at least once I came across something that stemmed from ACK and claimed it was faster/better... googling... aah! The Silver Searcher it was. :-) http://betterthanack.com/

Re: Slow highlighting on Solr 5.0.0

2015-05-11 Thread Ere Maijala
Thanks for the pointers. Using hl.usePhraseHighlighter=false does indeed make it a lot faster. Obviously it's not really a solution, though, since in 4.10 it wasn't a problem and turning it off has consequences. I'm looking forward to the improvements in the next releases. --Ere 8.5.2015,

Re: Solr custom component issue

2015-05-11 Thread Upayavira
On Mon, May 11, 2015, at 10:30 AM, nutchsolruser wrote: I can not set qf in solrconfig.xml file because my qf and boost values will be changing frequently . I am reading those values from external source. Can we not set qf value from searchComponent? Or is there any other way to do

Re: Solr custom component issue

2015-05-11 Thread Upayavira
You are adding a search component, and adding it as a last-component, meaning, it will come after the Query component which actually does the work. Given the parameters you have set, you will be using the default Lucene query parser which doesn't honour the qf parameter, so it isn't surprising

Re: Solr custom component issue

2015-05-11 Thread Upayavira
If all you want to do is to hardwire a qf, you can do that in your requestHandler config in solrconfig.xml. If you want to extend how the edismax query parser works, you may well be better off subclassing the edismax query parser, and passing in modified request parameters, but I'd explore

Solr custom component issue

2015-05-11 Thread nutchsolruser
Hi, I am trying to add my own query parameters to a Solr query using a Solr component. In the example below I am trying to add the qf parameter to the query. Below is the prepare method of my component. But Solr is not considering the qf parameter while searching. It is using the df parameter that I have added in

Re: Solr custom component issue

2015-05-11 Thread nutchsolruser
I can not set qf in solrconfig.xml file because my qf and boost values will be changing frequently . I am reading those values from external source. Can we not set qf value from searchComponent? Or is there any other way to do this? -- View this message in context:

SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
I'm getting the following error with 4.10.4 WARN org.apache.solr.handler.dataimport.SolrWriter – Error creating document : SolrInputDocument(fields: [dcautoclasscode=310, dclang=unknown, ..., dcdocid=dd05ad427a58b49150a4ca36148187028562257a77643062382a1366250112ac])

Re: SolrJ vs. plain old HTTP post

2015-05-11 Thread Emir Arnautovic
Hi Steve, Main advantage is that it uses binary format so XML/JSON overhead is avoided. You should also check out if SOLR's Data Import Handler is good fit for you. Thanks, Emir -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support *

SolrJ vs. plain old HTTP post

2015-05-11 Thread Steven White
Hi Everyone, If all that I need to do is send data to Solr to add / delete a Solr document, which tool is better for the job: SolrJ or plain old HTTP post? In other words, what are the advantages of using SolrJ when the need is to push data to Solr for indexing? Thanks, Steve
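For comparison, a sketch of what the "plain old HTTP post" side of the question might look like, assuming documents are sent as a JSON array to the collection's `/update` endpoint. The host, the collection name `mycollection`, and the helper `build_add_request` are hypothetical; the request is only built here, not sent.

```python
import json
from urllib.request import Request


def build_add_request(base, collection, docs):
    """Build (but do not send) a POST that adds documents as a JSON array."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        f"{base}/solr/{collection}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_add_request(
    "http://localhost:8983",  # hypothetical host and port
    "mycollection",           # hypothetical collection name
    [{"id": "1", "title": "hello"}],
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. Everything beyond that (batching, retries, cluster awareness) is left to the caller, which is roughly the gap the SolrJ replies in this thread point at.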

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Emir Arnautovic
Hi Bernd, Issue is with f_dcperson and what ends up in that field. It is configured to be string, which means it is not tokenized, so if some huge value is in either dccreator or dccontributor it will end up as a single term. Names suggest that it should not contain such values, but double check

Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Naresh Yadav
Hi all, Also asked this here : http://stackoverflow.com/questions/30166116 For example i have SOLR docs in which tags field is indexed : Doc1 - tags:T1 T2 Doc2 - tags:T1 T3 Doc3 - tags:T1 T4 Doc4 - tags:T1 T2 T3 Query1 : get all docs with tags:T1 AND tags:T3 then it works and will give Doc2

Re: Solr custom component issue

2015-05-11 Thread nutchsolruser
These boosting parameters will be configured outside Solr and there is a separate module from which these values get populated. I am reading those values from an external datasource and I want to attach them to each request. -- View this message in context:

Re: SolrJ vs. plain old HTTP post

2015-05-11 Thread Erik Hatcher
Another advantage to SolrJ is with SolrCloud (ZK) awareness, and taking advantage of some routing optimizations client-side so the cluster has fewer hops to make. -- Erik Hatcher, Senior Solutions Architect http://www.lucidworks.com/ On May 11, 2015, at 8:21 AM,

storeOffsetsWithPositions does not reflect in the index

2015-05-11 Thread Dmitry Kan
Hi, Using solr 4.10.2. Looks like storeOffsetsWithPositions has no effect, i.e. it does not store offsets in addition to positions. If we use termVectors=true termPositions=true termOffsets=true, then offsets and positions are available fine. Any ideas how to make storeOffsetsWithPositions

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
After reading https://issues.apache.org/jira/browse/LUCENE-5472 one question still remains. Why is it complaining about f_dcperson, which is a copyField, when the original problem field is dcdescription, which definitely is much larger than 32766? I would assume it complains about dcdescription

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
Hi Emir, the dcdescription field is definitely too big. But why is it complaining about f_dcperson and not dcdescription? Regards Bernd Am 11.05.2015 um 15:12 schrieb Emir Arnautovic: Hi Bernd, Issue is with f_dcperson and what ends up in that field. It is configured to be string, which

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Emir Arnautovic
Hi Bernd, dcdescription field is not indexed. Thanks, Emir -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On 11.05.2015 15:22, Bernd Fehling wrote: Hi Emir, the dcdescription field is definitely too big. But why

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
Hi Shawn, that means if I set a length limit on dcdescription or make dcdescription multivalued, then the problem is solved because f_dcperson is already multivalued? Regards Bernd Am 11.05.2015 um 15:17 schrieb Shawn Heisey: On 5/11/2015 6:13 AM, Bernd Fehling wrote: Caused by:

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
It turned out that I didn't recognize that dcdescription is not indexed, only stored. So the next in the chain is f_dcperson, where dccreator and dcdescription are combined and indexed. And this is why the error shows up on f_dcperson. (delay of error) Thanks for your help, regards. Bernd Am

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Bernd Fehling
Hi Emir, ahhh, yes you're right. I missed that. Now I understand why it is not complaining about dcdescription and the error shows up on f_dcperson. delay of error ;-) Thanks Bernd Am 11.05.2015 um 15:25 schrieb Emir Arnautovic: Hi Bernd, dcdescription field is not indexed. Thanks,

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Shawn Heisey
On 5/11/2015 6:13 AM, Bernd Fehling wrote: Caused by: java.lang.IllegalArgumentException: Document contains at least one immense term in field=f_dcperson (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such
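One way to catch this before indexing is a client-side check of UTF-8 byte lengths against the 32766 limit quoted in the error message. This is only a sketch: the helper name `oversized_fields` and the sample document are hypothetical, and it inspects raw field values, not the terms an analyzer or copyField would actually produce.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the error message above


def oversized_fields(doc, limit=MAX_TERM_BYTES):
    """Return names of fields holding a value whose UTF-8 encoding exceeds limit."""
    too_long = []
    for name, value in doc.items():
        # Treat single values and multivalued fields uniformly.
        values = value if isinstance(value, list) else [value]
        if any(len(str(v).encode("utf-8")) > limit for v in values):
            too_long.append(name)
    return too_long


# Hypothetical document: one short field, one multivalued field with a huge value.
doc = {"dcdocid": "dd05", "f_dcperson": ["short name", "x" * 40000]}
print(oversized_fields(doc))
```

The length is measured in encoded bytes, not characters, so a value with many non-ASCII characters can exceed the limit well before it reaches 32766 characters.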

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Shawn Heisey
On 5/11/2015 7:19 AM, Bernd Fehling wrote: After reading https://issues.apache.org/jira/browse/LUCENE-5472 one question still remains. Why is it complaining about f_dcperson, which is a copyField, when the original problem field is dcdescription, which definitely is much larger than 32766? I

Re: Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Sujit Pal
Hi Naresh, Couldn't you just model this as an OR query since your requirement is at least one (but can be more than one), ie: tags:T1 tags:T2 tags:T3 -sujit On Mon, May 11, 2015 at 4:14 AM, Naresh Yadav nyadav@gmail.com wrote: Hi all, Also asked this here :

Re: SolrJ vs. plain old HTTP post

2015-05-11 Thread Steven White
Thanks Erik and Emir. Erik: The fact that SolrJ is aware of SolrCloud is enough to put it over plain old HTTP post. Emir: I looked into Solr's data import handler, unfortunately, it won't work for my need. To close the loop on this question, I will need to enable Jetty's SSL (the jetty that

Re: indexing java byte code in classes / jars

2015-05-11 Thread Walter Underwood
How about Krugle? http://opensearch.krugle.org/ Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) On May 11, 2015, at 3:18 AM, Tomasz Borek tomasz.bo...@gmail.com wrote: There's also Perl-backed ACK. http://beyondgrep.com/ Which does the job of searching

PatternReplaceCharFilter + solr.WhitespaceTokenizerFactory behaviour

2015-05-11 Thread Mihran Shahinian
I must be missing something obvious. I have a simple regex that removes the space-hyphen-space pattern. The unit test below works fine, but when I plug it into schema and query, regex does not match, since input already gets split by space (further below). My understanding that charFilter would operate

Re: Completion Suggester in Solr

2015-05-11 Thread Pradeep Bhattiprolu
Bumping this thread again in the group, haven't received any responses for this. I have been kind of stuck with this problem since last week, any help is highly appreciated. Thanks Pradeep On Wed, May 6, 2015 at 5:00 PM, Pradeep Bhattiprolu pbhatt...@gmail.com wrote: Hi Is there an equivalent of

Re: Solr custom component issue

2015-05-11 Thread j 90
unsubscribe On Mon, May 11, 2015 at 6:58 PM, Upayavira u...@odoko.co.uk wrote: attaching them to each request, then just add qf= as a param to the URL, easy. On Mon, May 11, 2015, at 12:17 PM, nutchsolruser wrote: These boosting parameters will be configured outside Solr and there is

SOLR plugin: Retrieve all values of multivalued field

2015-05-11 Thread Costi Muraru
Hi folks, I'm playing with a custom SOLR plugin and I'm trying to retrieve the value for a multivalued field, using the code below. == schema.xml: <field name="my_field_name" type="string" indexed="true" stored="false" multiValued="true"/> == input data: <add><doc><field name="id">83127</field>

Help to index nested document

2015-05-11 Thread Vishal Swaroop
Need your valuable inputs... I am indexing data from database (one table) which is in this example format : id name value 1 Joe 102724904 2 Joe 100996643 - id is primary/ unique key - there can be same name but different value - If I try name as unique key then SOLR removes duplicate and indexes

Re: PatternReplaceCharFilter + solr.WhitespaceTokenizerFactory behaviour

2015-05-11 Thread Erick Erickson
This trips up _everybody_ at one point or other. The problem is that the input goes through the query _parsing_ prior to getting to the field analysis, and the parser is sensitive to spaces. Consider the input (without quotes) of my dog. That gets broken up into default_field:my default_field:dog
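The ordering problem described here can be illustrated with a toy model. This is not Solr's parser or analysis chain, only a sketch assuming a regex that strips a space-hyphen-space sequence; `PATTERN`, `analyze_whole_input`, and `parse_then_analyze` are hypothetical names.

```python
import re

# Assumed stand-in for the char filter's regex: remove space-hyphen-space.
PATTERN = re.compile(r"\s-\s")


def char_filter(text):
    """Apply the pattern replacement to whatever text it is handed."""
    return PATTERN.sub("", text)


def analyze_whole_input(text):
    # The char filter sees the full string, then whitespace tokenization runs.
    return char_filter(text).split()


def parse_then_analyze(text):
    # The parser splits on whitespace first; each piece is analyzed on its own,
    # so the filter never sees the surrounding spaces.
    return [tok for piece in text.split() for tok in char_filter(piece).split()]


print(analyze_whole_input("foo - bar"))
print(parse_then_analyze("foo - bar"))
```

In the second function the regex has nothing to match, because each piece arrives without the whitespace the pattern depends on. That matches the symptom reported at the start of this thread: the unit test passes, but the query does not.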

Re: SOLR 4.10.4 - error creating document

2015-05-11 Thread Erick Erickson
I've got to ask _how_ are you intending to search this field? On the surface, this feels like an XY problem. It's a string type. Therefore, if this is the input: 102, 111, 114, 32, 97, 32, 114, 101, 118, 105, 101, 119, 32, 115, 101, 101, 32, 66, 114 you'll only ever get a match if you search

Re: SolrJ vs. plain old HTTP post

2015-05-11 Thread Shalin Shekhar Mangar
On Mon, May 11, 2015 at 8:20 PM, Steven White swhite4...@gmail.com wrote: Thanks Erik and Emir. <snip/> To close the loop on this question, I will need to enable Jetty's SSL (the jetty that comes with Solr 5.1). If I do so, will SolrJ still work, can I assume that SolrJ supports SSL?

Re: Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Naresh Yadav
Thanks Andrew, You got my problem precisely. But the solutions you suggested may not work for me. In my API I get only the list of authorized tags, i.e. [T1, T2, T3], and based on that alone I need to construct my Solr query. So the first solution with NOT (T4 OR T5) will not work. In real case tag ids T1, T2

Re: Best way to backup and restore an index for a cloud setup in 4.6.1?

2015-05-11 Thread Shalin Shekhar Mangar
Hi John, There are a few HTTP APIs for replication, one of which can let you take a backup of the index. Restoring can be as simple as just copying over the index in the right location on the disk. A new restore API will be released with the next version of Solr which will make some of these

Re: Queries on SynonymFilterFactory

2015-05-11 Thread Zheng Lin Edwin Yeo
Yes sure, thanks for your advice. I'm still waiting for my server to come before I can scale up my system and do the testing. Now the Solr running on my 4GB RAM system will crash if I try to scale up my system as there's not enough memory to support it. Regards, Edwin On 11 May 2015 at 19:11,

Solr Multiword Synonym Problem

2015-05-11 Thread solrnovice
Hi all, I am trying to solve the solr multiword synonym issue at our installation, I am currently using SOLR-4.9.x version. I used the com.lucidworks.analysis.AutoPhrasingTokenFilterFactory from Lucidworks git repo and used this in my schema.xml and also used their

boolean operators OR/NOT get highlighted by solr

2015-05-11 Thread Tang, Rebecca
Hi, We have a SOLR query like this

RE: Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Andrew Chillrud
Based on his example, it sounds like Naresh not only wants the tags field to contain at least one of the values [T1, T2, T3] but also wants to exclude documents that contain a tag other than T1, T2, or T3 (Doc3 should not be retrieved). If the set of possible values in the tags field is
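The requirement as restated here amounts to a subset test: a document qualifies only if every one of its tags is in the authorized list. A sketch of that rule as a client-side check, using the four example documents from the original question. This is not a Solr query and not a solution proposed in the thread; `docs_within_allowed` is a hypothetical helper.

```python
def docs_within_allowed(docs, allowed):
    """Return ids of docs that have at least one tag and no tag outside allowed."""
    allowed = set(allowed)
    return [doc_id for doc_id, tags in docs.items() if tags and set(tags) <= allowed]


# Example documents from the original question.
docs = {
    "Doc1": ["T1", "T2"],
    "Doc2": ["T1", "T3"],
    "Doc3": ["T1", "T4"],
    "Doc4": ["T1", "T2", "T3"],
}
print(docs_within_allowed(docs, ["T1", "T2", "T3"]))
```

Doc3 is excluded because T4 is not authorized, which is the behaviour a plain OR query over T1, T2, and T3 does not give.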

Re: Solr query which return only those docs whose all tokens are from given list

2015-05-11 Thread Alessandro Benedetti
A simple OR query should be fine : tags:(T1 T2 T3) Cheers 2015-05-11 15:39 GMT+01:00 Sujit Pal sujit@comcast.net: Hi Naresh, Couldn't you just model this as an OR query since your requirement is at least one (but can be more than one), ie: tags:T1 tags:T2 tags:T3 -sujit On

Re: schema modification issue

2015-05-11 Thread Steve Rowe
Hi, Thanks for reporting, I’m working on a test to reproduce. Can you please create a Solr JIRA issue for this?: https://issues.apache.org/jira/browse/SOLR/ Thanks, Steve On May 7, 2015, at 5:40 AM, User Zolr zolr.u...@gmail.com wrote: Hi there, I have come across a problem that when

Re: Solr custom component issue

2015-05-11 Thread Upayavira
attaching them to each request, then just add qf= as a param to the URL, easy. On Mon, May 11, 2015, at 12:17 PM, nutchsolruser wrote: These boosting parameters will be configured outside Solr and there is a separate module from which these values get populated. I am reading those values from
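A sketch of what "add qf= as a param to the URL" could look like when the boosts come from an external source. It assumes the request selects the edismax parser via `defType`, since an earlier reply in this thread notes that the default Lucene parser does not honour qf. The field names, boost values, and the helper `edismax_params` are hypothetical.

```python
from urllib.parse import urlencode


def edismax_params(query, boosts):
    """Encode q, defType=edismax, and a qf string built from field->boost pairs."""
    qf = " ".join(f"{field}^{boost}" for field, boost in boosts.items())
    return urlencode({"q": query, "defType": "edismax", "qf": qf})


# Hypothetical boosts, e.g. as loaded from the external module mentioned above.
params = edismax_params("solr", {"title": 2.0, "body": 0.5})
print(params)
```

The resulting string can be appended after `select?` on each request, so the boosts can change per request without touching solrconfig.xml.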