returning only certain fields from the docs - parsing on the server side
Hi, I only want to return one field from the documents returned by my query. I know there is the 'fl' parameter, which the documentation (http://wiki.apache.org/solr/CommonQueryParameters) describes as: "This parameter can be used to specify a set of fields to return, limiting the amount of information in the response. When returning the results to the client, only fields in this list will be included." But it seems like 'fl' works on the client side, after the results have been constructed on the server side and the whole docs have been passed back over the wire. Is my assumption wrong? Is there a way to filter things out directly on the Solr side and return only the field I want to the client? Thanks, Matt NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.
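For reference, a minimal sketch of a request that uses 'fl'. To the best of my knowledge Solr applies 'fl' on the server when it writes the response, so fields that are not listed should not be sent over the wire. The base URL, the core name and the wt=json choice below are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields):
    """Build a Solr select URL that asks for only the given fields via fl."""
    params = {
        "q": query,
        "fl": ",".join(fields),  # comma-separated list of fields to return
        "wt": "json",            # assumed response format
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical host and core name.
url = build_select_url("http://localhost:8983/solr/core", "*:*", ["id"])
print(url)
```

Fetching this URL and comparing the response size with and without 'fl' is a quick way to check where the filtering happens.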
Processing a lot of results in Solr
Hello Solr users, A question about processing a lot of docs returned from a query: I potentially have millions of documents coming back from a single query. What is the common design for dealing with this? The two ideas I have are: - create a multithreaded client service to handle this - use Solr pagination to retrieve a batch of rows at a time (start and rows in the Solr Admin console). Any other ideas that I may be missing? Thanks, Matt
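The pagination idea above can be sketched as a generator of start/rows pairs. This is only an illustration: the batch size and the total are made-up numbers, and the HTTP call itself is left out. Note that very large start offsets are known to get slower in Solr, so this works best for moderate result sizes.

```python
def page_params(total, rows):
    """Yield start/rows parameter dicts that cover `total` results in batches."""
    for start in range(0, total, rows):
        # Solr simply returns fewer docs on the last page, so rows stays constant.
        yield {"start": start, "rows": rows}


# Hypothetical numbers: 25 matching docs fetched 10 at a time.
for params in page_params(25, 10):
    print(params)
```

In a real client the total would come from numFound in the first response, and each parameter dict would be merged into the query before it is sent.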
Re: Processing a lot of results in Solr
That sounds like a satisfactory solution for the time being. I am assuming you dump the data from Solr in CSV format? How did you implement the streaming processor? (What tool did you use for this? I am not familiar with that.) You say it takes only a few minutes to dump the data. How long does it take to stream it back in, and is the performance acceptable (within minutes)? Thanks, Matt
On 7/23/13 6:57 PM, Roman Chyla roman.ch...@gmail.com wrote: Hello Matt, You can consider writing a batch processing handler, which receives a query and, instead of sending results back, writes them into a file which is then available for streaming (it has its own UUID). I am dumping many GBs of data from Solr in a few minutes. Your query plus a streaming writer can go a very long way :) roman
On Tue, Jul 23, 2013 at 5:04 PM, Matt Lieber mlie...@impetus.com wrote: Hello Solr users, Question regarding processing a lot of docs returned from a query; I potentially have millions of documents returned back from a query. What is the common design to deal with this ? 2 ideas I have are: - create a client service that is multithreaded to handle this - Use the Solr pagination to retrieve a batch of rows at a time (start, rows in Solr Admin console ) Any other ideas that I may be missing ? Thanks, Matt
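A rough sketch of the dump-then-stream idea described in this thread: results are written to a file named after a fresh UUID, and the file is later read back line by line. This is not Roman's actual handler, which the thread does not show. The CSV format, the function names and the sample docs are all assumptions, and the sketch runs on the client side rather than inside Solr.

```python
import csv
import os
import tempfile
import uuid


def dump_docs(docs, fieldnames, out_dir):
    """Write docs to <out_dir>/<uuid>.csv and return (dump_id, path)."""
    dump_id = str(uuid.uuid4())
    path = os.path.join(out_dir, dump_id + ".csv")
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        for doc in docs:  # docs can be a generator, so nothing is held in memory
            writer.writerow(doc)
    return dump_id, path


def stream_lines(path):
    """Yield the dumped file one line at a time."""
    with open(path, newline="") as f:
        for line in f:
            yield line


# Hypothetical documents standing in for query results.
sample = ({"id": str(i), "title": "doc %d" % i} for i in range(3))
dump_id, path = dump_docs(sample, ["id", "title"], tempfile.mkdtemp())
for line in stream_lines(path):
    print(line.strip())
```

The UUID gives each dump a unique handle that a client could use to request the file later.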
Re: Solr with Hadoop
Rajesh, If you need integration between Solr and Hadoop or NoSQL, I would recommend using a commercial distribution. I think most are free to use as long as you don't require support. I inquired about the Cloudera Search capability, but it seems that so far it is just preliminary: there is no tight integration yet between HBase and Solr, for example, other than full text search on the HDFS data (I believe enabled in Hue). I am not too familiar with what MapR's M7 has to offer. However, DataStax does a good job of tightly integrating Solr with Cassandra, and lets you query over the data ingested from Solr in Hive, for example, which is pretty nice. Solr would not trigger Hadoop jobs, though. Cheers, Matt
On 7/17/13 7:37 PM, Rajesh Jain rjai...@gmail.com wrote: I have a newbie question on integrating Solr with Hadoop. There are some vendors like Cloudera/MapR who have announced Solr Search for Hadoop. If I use the Apache distro, how can I use Solr Search on docs in HDFS/Hadoop? Is there a tutorial on how to use it or get started? I am using Flume to sink CSV docs into Hadoop/HDFS and I would like to use Solr to provide search. Does Solr Search trigger MapReduce jobs (like Splunk-Hunk does)? Thanks, Rajesh
Re: How to set a condition on the number of docs found
Thanks William, I'll do that. Matt
On 7/12/13 7:38 AM, William Bell billnb...@gmail.com wrote: Hmmm. One way is: http://localhost:8983/solr/core/select/?q=*%3A*&facet=true&facet.field=id&facet.offset=10&rows=0&facet.limit=1 If you get a result, you have more than 10 results. Another way is to just look at it with a facet.query and have your app deal with it: http://localhost:8983/solr/core/select/?q=*%3A*&facet=true&facet.query={!lucene%20key=numberofresults}state:CO&rows=0
On Thu, Jul 11, 2013 at 11:45 PM, Matt Lieber mlie...@impetus.com wrote: Hello there, I would like to be able to know whether I got over a certain threshold of doc results. I.e. Test (Result.numFound > 10) -> true. Is there a way to do this ? I can't seem to find how to do this; (other than have to do this test on the client app, which is not great). Thanks, Matt
-- Bill Bell billnb...@gmail.com cell 720-256-8076
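If the test does end up in the client app, it can stay cheap: a rows=0 query returns only the count, and the comparison is one line. The sketch below assumes a JSON response (wt=json) and a hypothetical base URL; the sample response is made up and only mirrors the usual response/numFound layout.

```python
import json
from urllib.parse import urlencode


def count_url(base_url, query):
    """Build a select URL that returns no docs, only the result count."""
    params = {"q": query, "rows": 0, "wt": "json"}
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def exceeds_threshold(response_text, threshold):
    """Return True if numFound in a JSON Solr response is above threshold."""
    data = json.loads(response_text)
    return data["response"]["numFound"] > threshold


# Made-up response body for illustration.
sample = '{"response": {"numFound": 42, "start": 0, "docs": []}}'
print(count_url("http://localhost:8983/solr/core", "state:CO"))
print(exceeds_threshold(sample, 10))
```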
How to set a condition over stats result
Hello, I am trying to see how I can test the sum of the values of an attribute across docs, i.e. whether sum(myfieldvalue) > 100. I know I can use the stats module, which computes the sum of my attribute for a certain facet, but how can I perform a test on this result (i.e. is sum > 100) within my stats query? From what I read, performing a function on the stats module output is not supported yet. Is there any other way to do this? Cheers, Matt
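One fallback, sketched here under assumptions, is to run the stats query as usual and apply the comparison to the returned sum on the client. The sample response is made up; it follows the stats/stats_fields layout of a JSON stats response, and the field name is taken from the question.

```python
import json


def sum_exceeds(response_text, field, threshold):
    """Return True if the stats sum for `field` is above threshold."""
    data = json.loads(response_text)
    stats = data["stats"]["stats_fields"][field]
    return stats["sum"] > threshold


# Made-up stats response for illustration.
sample = '{"stats": {"stats_fields": {"myfieldvalue": {"sum": 150.0, "count": 3}}}}'
print(sum_exceeds(sample, "myfieldvalue", 100))
```

This does not move the test into Solr; it only keeps the client-side part to a single comparison.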
How to set a condition on the number of docs found
Hello there, I would like to be able to know whether I got over a certain threshold of doc results, i.e. test (Result.numFound > 10) -> true. Is there a way to do this? I can't seem to find how (other than doing this test in the client app, which is not great). Thanks, Matt