RE: Invalid version (expected 2, but 60) or the data in not in 'javabin'
Thanks Otis.
I went through every piece of info that I could lay my hands on. Most of it is about incompatible SolrJ versions (that's not my case), and there was one message from Mark Miller saying that Solr may respond with XML instead of javabin when some kind of HTTP error is returned (that's not my case either).
I'm using distributed search. I added some debug output to print out the response once the "Invalid version" exception is caught (in JavaBinCodec.unmarshal()). What I saw is that the response actually contains the facet response in XML format, yet I also noticed that the response is corrupt (i.e. as if a chunk of text has been taken out of the middle of the reply; some kind of overrun perhaps?).
Any help would be appreciated.
Thanks,
Shahar.
-----Original Message-----
From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
Sent: Friday, December 21, 2012 6:23 AM
To: solr-user@lucene.apache.org
Subject: Re: Invalid version (expected 2, but 60) or the data in not in 'javabin'
Hi,
Have a look at http://search-lucene.com/?q=invalid+version+javabin
Otis
--
Solr Monitoring - http://sematext.com/spm/index.html
Search Analytics - http://sematext.com/search-analytics/index.html
On Wed, Dec 19, 2012 at 11:23 AM, Shahar Davidson shah...@checkpoint.com wrote:
Hi,
I'm encountering this error randomly when running a distributed facet (i.e. I'm sending the exact same request, yet the error does not reproduce consistently). I have about 180 shards that are being queried. It seems that when Solr distributes the request to the shards, one or perhaps more shards return an XML reply instead of javabin. I added some debug output to JavaBinCodec.unmarshal (as done in the debugging.patch of SOLR-3258) to check whether the XML reply holds an error or not, and I noticed that the XML actually holds the response from one of the shards. I'm using the patch provided in SOLR-2894 on top of trunk 1404975.
Has anyone encountered such an issue? Any ideas?
Thanks,
Shahar.
MoreLikeThis supporting multiple document IDs as input?
I'm unclear on this point from the documentation. Is it possible to give Solr X # of document IDs and tell it that I want documents similar to those X documents?
Example:
- The user is browsing 5 different articles
- I send Solr the IDs of these 5 articles so I can present the user other similar articles
I see this example for sending it 1 document ID:
http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10
But can I send it 2+ document IDs as the query?
how to use RemoveDuplicatesTokenFilterFactory?
I want to avoid duplicate values in one multivalued field. I am using the DataImportHandler to import data, and this particular multivalued field is filled from an XML source. That XML has duplicate values, but I want only unique values in the multivalued field, e.g. the XML:
<data> a1 b1 a1 a1 </data>
I have added RemoveDuplicatesTokenFilterFactory to the field's type, in the index analyzer. It still gives the output below:
<arr name="field"><str>a1</str><str>b1</str><str>a1</str><str>a1</str></arr>
I am using Solr 3.5. How can I avoid importing duplicate values into the field?
--
View this message in context: http://lucene.472066.n3.nabble.com/how-to-use-RemoveDuplicatesTokenFilterFactory-tp4029004.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to use RemoveDuplicatesTokenFilterFactory?
RemoveDuplicatesTokenFilterFactory (RDTF) only removes duplicate tokens that are at the same position: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.RemoveDuplicatesTokenFilterFactory
An elegant solution would be to subclass http://lucene.apache.org/solr/api-4_0_0-ALPHA/org/apache/solr/update/processor/FieldValueSubsetUpdateProcessorFactory.html and create a DistinctFieldValueUpdateProcessorFactory or something like that. MinFieldValueUpdateProcessorFactory can be used as an example.
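To make the suggested update-processor approach a little more concrete, here is a minimal sketch of just the value-filtering step such a processor would need. It deliberately does not depend on Solr: the class name DistinctFieldValues and the method distinct are hypothetical, and wiring the logic into a real FieldValueSubsetUpdateProcessorFactory subclass is left to that class's Javadoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical helper, not part of Solr: the core of a "distinct values" processor. */
final class DistinctFieldValues {
    private DistinctFieldValues() {}

    /** Returns the values with later duplicates dropped, keeping first-seen order. */
    static List<Object> distinct(Collection<?> values) {
        // LinkedHashSet discards repeats (by equals/hashCode) but remembers insertion order.
        return new ArrayList<Object>(new LinkedHashSet<Object>(values));
    }

    public static void main(String[] args) {
        // The values from the question: a1 b1 a1 a1
        System.out.println(distinct(Arrays.asList("a1", "b1", "a1", "a1")));
    }
}
```

Because the filtering happens before the document is indexed, the stored values change as well, which analysis-time filters cannot do.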
Re: how to use RemoveDuplicatesTokenFilterFactory?
The values are at same logical position.
Do you mean positionIncrementGap is set to 0? Can you see that the duplicates are removed on the analysis page?
By the way, the returned values are the original (stored) values. Analysis (tokenizers, token filters, etc.) only affects indexed values. An UpdateProcessorFactory can change the stored (returned) values.
Re: solr java API for fuzzy query
Otis, stop teasing people! You know as well as I do that 2 is the maximum edit distance for fuzzy query in 4.0. So,
Keyword~5
is treated as:
Keyword~2
Check with debug=query to see.
-- Jack Krupansky
-----Original Message-----
From: Otis Gospodnetic
Sent: Tuesday, December 25, 2012 1:38 AM
To: solr-user@lucene.apache.org
Subject: Re: solr java API for fuzzy query
Hi Alexey,
You can use the Lucene query syntax with Solr, does that help? Try Keyword~5 for example.
Otis
--
SOLR Performance Monitoring - http://sematext.com/spm/index.html
Search Analytics - http://sematext.com/search-analytics/index.html
On Tue, Dec 25, 2012 at 12:19 AM, Yakubovich Alexey (Nokia-LC/Chicago) alexey.yakubov...@nokia.com wrote:
Is there any Java API available in Solr for fuzzy query, similar to the Lucene org.apache.lucene.search.FuzzyQuery class? More generally: is there any general way to define a query with the Lucene Java API and invoke it through Solr (a kind of Lucene-Solr bridge)?
Thanks
Alexey
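The capping behaviour described in this message can be illustrated with a few lines of plain Java. This is only a sketch of the rule as stated in the thread (a maximum edit distance of 2 in 4.0); the class FuzzyEdits and its methods are hypothetical names and are not Lucene or Solr API.

```java
/** Illustrative only: mirrors the rule stated in the thread, not Lucene's source. */
final class FuzzyEdits {
    /** Maximum fuzzy edit distance in 4.0, per the message above. */
    static final int MAX_EDITS = 2;

    private FuzzyEdits() {}

    /** Edit distance actually applied when a query asks for requestedEdits. */
    static int effectiveEdits(int requestedEdits) {
        if (requestedEdits < 0) {
            throw new IllegalArgumentException("edit distance must be >= 0");
        }
        return Math.min(requestedEdits, MAX_EDITS);
    }

    /** Rewrites term~N into the form the thread says it is treated as. */
    static String effectiveQuery(String term, int requestedEdits) {
        return term + "~" + effectiveEdits(requestedEdits);
    }

    public static void main(String[] args) {
        System.out.println(effectiveQuery("Keyword", 5)); // Keyword~2
    }
}
```

Running a real query with debug=query, as suggested above, remains the way to confirm what a given Solr version actually does.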
[ANNOUNCE] Apache Solr 3.6.2 released
25 December 2012, Apache Solr™ 3.6.2 available
The Lucene PMC and Santa Claus are pleased to announce the release of Apache Solr 3.6.2.
Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.
This release is a bug fix release for version 3.6.1. It contains numerous bug fixes, optimizations, and improvements, some of which are highlighted below. The release is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-3x-redir.html (see note below).
See the CHANGES.txt file included with the release for a full list of details.
Solr 3.6.2 Release Highlights:
* Fixed ConcurrentModificationException during highlighting, if all fields were requested.
* Fixed edismax queryparser to apply minShouldMatch to implicit boolean queries.
* Several bugfixes to the DataImportHandler.
* Bug fixes from Apache Lucene 3.6.2.
Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access.
Happy holidays and happy searching,
Lucene/Solr developers
Re: facet query
Please see http://wiki.apache.org/solr/SimpleFacetParameters for more details.
On Friday, December 21, 2012, hank williams wrote:
Great, thank you.
Date: Fri, 21 Dec 2012 14:42:13 +0100
From: r@solr.pl
To: solr-user@lucene.apache.org
Subject: Re: facet query
Hello!
Try facet.mincount=1, that should help.
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch
Hi, is there a way with facets to say, return facets that are not 0? I have facet=true&facet.field=office&facet.field=name as my facet parameters, and with some of my queries it brings back people that have a value of 0.
Thanks
--
Anirudha P. Jadhav
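For anyone assembling these parameters in client code, the sketch below builds the facet part of the request with facet.mincount=1 added, using only the JDK. The class FacetParams and its method are hypothetical names; the parameter names themselves (facet, facet.field, facet.mincount) are the ones used in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles the facet parameters discussed above. */
final class FacetParams {
    private FacetParams() {}

    /** Builds e.g. facet=true&facet.field=office&facet.mincount=1 */
    static String facetQueryString(int minCount, String... fields) {
        List<String> parts = new ArrayList<String>();
        parts.add("facet=true");
        for (String field : fields) {
            parts.add("facet.field=" + encode(field));
        }
        // facet.mincount=1 hides facet values whose count is 0.
        parts.add("facet.mincount=" + minCount);
        return String.join("&", parts);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(facetQueryString(1, "office", "name"));
    }
}
```

The resulting string would be appended to the rest of the select URL alongside the q parameter.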
Re: solr java API for fuzzy query
But you are assuming 4.0 :)
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Dec 25, 2012 10:47 AM, Jack Krupansky j...@basetechnology.com wrote:
Otis, stop teasing people! You know as well as I do that 2 is the maximum edit distance for fuzzy query in 4.0. So, Keyword~5 is treated as: Keyword~2. Check with debug=query to see.
-- Jack Krupansky
Spatial filter in solr 4.0 - Intersects operation with parameters
Hi,
I went through the example for spatial search in Solr 4.0 (http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4). Both indexing and searching work fine. The example is:
fq=geo:"Intersects(-74.093 41.042 -69.347 44.558)"
My problem is how to send values to the Intersects operation as parameters. I would like to send custom parameters in the URL:
...&lon1=-74.093&lat1=41.042&lon2=-69.347&lat2=44.558
and have a default filter query:
fq=geo:"Intersects($lon1 $lat1 $lon2 $lat2)"
I tried this approach, but it did not work. How do I do this?
Using {!bbox} is not documented in the 4.0 wiki. Anyway, I tried to use it against the geo field but got the following error:
field does not support spatial filtering ...
Can I use {!bbox} in 4.0?
Thanks.
Mladen
Re: Dynamic collections in SolrCloud for log indexing
I've been thinking about aliases for a while as well. They seem very handy and fairly easy to implement. So far there have just always been higher-priority things (need to finish collection API responses this week…), but this is something I'd def help work on.
- Mark
On Dec 25, 2012, at 1:49 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
Hi,
Right, this is not really about routing in the ElasticSearch sense. What's handy for indexing logs are index aliases, which I thought I had added to JIRA a while back, but it looks like I have not. Index aliases would let you keep a "last 7 days" alias fixed while underneath you push and pop an index every day without the client app having to adjust.
Otis
--
Performance Monitoring - http://sematext.com/spm/index.html
Search Analytics - http://sematext.com/search-analytics/index.html
On Mon, Dec 24, 2012 at 4:30 AM, Per Steffensen st...@designware.dk wrote:
I believe it is a misunderstanding to use custom routing (or sharding, as Erick calls it) for this kind of stuff. Custom routing is nice if you want to control which slice/shard under a collection a specific document goes to: mainly to be able to ensure that two (or more) documents are indexed on the same slice/shard, but also just to be able to control on which slice/shard a specific document is indexed. Knowing/controlling this kind of stuff can be used for a lot of nice purposes. But you don't want to move slices/shards around among collections, or delete/add slices from/to a collection, unless it's for elasticity reasons.
I think you should fill a collection every week/month and just keep those collections as they are. Instead of ending up with one big historic collection containing many slices/shards/cores (one for each historic week/month), you will end up with many historic collections (one for each historic week/month). To search historic data you will have to cross-search those historic collections, but that is no problem at all. If SolrCloud is made as it is supposed to be made (and I believe it is), it shouldn't require more resources or be harder in any way to cross-search X slices across many collections than it is to cross-search X slices under the same collection.
Besides that, see my answer in the topic "Will SolrCloud always slice by ID hash?" a few days back.
Regards, Per Steffensen
On 12/24/12 1:07 AM, Erick Erickson wrote:
I think this is one of the primary use-cases for custom sharding. Solr 4.0 doesn't really lend itself to this scenario, but I _believe_ that the patch for custom sharding has been committed...
That said, I'm not quite sure how you drop off the old shard if you don't need to keep old data. I'd guess it's possible, but haven't implemented anything like that myself.
FWIW,
Erick
On Fri, Dec 21, 2012 at 12:17 PM, Upayavira u...@odoko.co.uk wrote:
I'm working on a system for indexing logs. We're probably looking at filling one core every month. We'll maintain a short-term index containing the last 7 days; that one is easy to handle. For the longer-term stuff, we'd like to maintain a collection that will query across all the historic data, but that means every month we need to add another core to an existing collection, which as I understand it is not possible in 4.0.
How do people handle this sort of situation where you have rolling new content arriving? I'm sure I've heard of people using SolrCloud for this sort of thing. Given it is logs, distributed IDF has no real bearing.
Upayavira
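As a rough illustration of the "one collection per month, cross-search them" idea from Per's message, the sketch below computes the names of the monthly collections a client would need to query. The naming scheme (a prefix such as logs_ followed by yyyy-MM) and the class RollingCollections are assumptions made for this example; the comma-separated collection request parameter is, as far as I know, how SolrCloud accepts a list of collections to search, but please check that against your version.

```java
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical client-side helper for month-per-collection log indexing. */
final class RollingCollections {
    private RollingCollections() {}

    /** Names of the collections covering `months` months ending at `latest`, oldest first. */
    static List<String> monthlyCollections(String prefix, YearMonth latest, int months) {
        List<String> names = new ArrayList<>();
        for (int i = months - 1; i >= 0; i--) {
            // YearMonth.toString() yields yyyy-MM, e.g. 2012-12
            names.add(prefix + latest.minusMonths(i));
        }
        return names;
    }

    /** Request parameter listing the collections to cross-search (assumed syntax). */
    static String collectionParam(List<String> names) {
        return "collection=" + String.join(",", names);
    }

    public static void main(String[] args) {
        List<String> names = monthlyCollections("logs_", YearMonth.of(2012, 12), 3);
        System.out.println(collectionParam(names));
    }
}
```

Until index aliases exist, keeping this list on the client side is one way to let old months drop out of the search window without touching the collections themselves.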
Re: Invalid version (expected 2, but 60) or the data in not in 'javabin'
The problem is not necessarily XML; it seems to be anything that is not valid javabin. I've just most often seen it with 404s that return an HTML error. I'm not sure if there is a JIRA issue or not, but this type of thing should be failing in a more user-friendly way.
As to why your response is corrupt, I have no guesses. Is this easily repeatable? Is it happening every time, or randomly?
- Mark
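A note on the numbers in the error message: 60 is the ASCII code for '<', so "expected 2, but 60" means the reader found the start of markup (XML or HTML) where it expected the javabin version byte. The sketch below shows that check in isolation; JavabinSniffer is a hypothetical debugging helper, not part of SolrJ, and the expected version byte of 2 is taken from the error message itself.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical debugging helper: classifies a response by its first byte. */
final class JavabinSniffer {
    /** Version byte the client expects, taken from the error message. */
    static final int EXPECTED_VERSION = 2;

    private JavabinSniffer() {}

    static String describe(byte[] response) {
        if (response.length == 0) {
            return "empty response";
        }
        int first = response[0] & 0xFF; // treat the byte as unsigned
        if (first == EXPECTED_VERSION) {
            return "looks like javabin (version byte 2)";
        }
        if (first == '<') {
            return "starts with '<' (byte 60): markup such as XML or HTML, not javabin";
        }
        return "unexpected first byte " + first;
    }

    public static void main(String[] args) {
        byte[] xml = "<?xml version=\"1.0\"?>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(describe(xml));
    }
}
```

Logging this alongside the shard URL for each failing response may help narrow down which shard is replying in the wrong format.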
Re: MoreLikeThis supporting multiple document IDs as input?
MLT has both a request handler and a search component. The MLT handler returns similar documents only for the first document that the query matches. The MLT search component returns similar documents for each of the documents in the search results, but processes each base document one at a time and keeps its similar documents segregated by base document.
It sounds like you wanted to merge the base search results and then find documents similar to that merged super-document. Is that what you were really seeking, as opposed to what the MLT component does? Unfortunately, you can't do that with the components as they are. You would have to manually merge the values from the base documents, and then you could POST that text back to the MLT handler and find similar documents using the posted text rather than a query. Kind of messy, but in theory that should work.
-- Jack Krupansky
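The "manually merge the values" step could look something like the sketch below. It assumes the base documents have already been fetched and are represented as simple field-to-text maps; the class MltMerge, the map representation, and the newline separator are all choices made for this example. Posting the merged text to the MLT handler is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: builds one "super-document" text from several base documents. */
final class MltMerge {
    private MltMerge() {}

    /** Concatenates the chosen fields of every base document, one value per line. */
    static String mergeFieldText(List<Map<String, String>> baseDocs, List<String> fields) {
        StringBuilder merged = new StringBuilder();
        for (Map<String, String> doc : baseDocs) {
            for (String field : fields) {
                String value = doc.get(field);
                if (value == null || value.isEmpty()) {
                    continue; // skip fields this document does not have
                }
                if (merged.length() > 0) {
                    merged.append('\n');
                }
                merged.append(value);
            }
        }
        return merged.toString();
    }

    public static void main(String[] args) {
        Map<String, String> first = new HashMap<>();
        first.put("title", "Solr faceting");
        Map<String, String> second = new HashMap<>();
        second.put("title", "Lucene scoring");
        List<Map<String, String>> docs = new ArrayList<>(Arrays.asList(first, second));
        System.out.println(mergeFieldText(docs, Arrays.asList("title")));
    }
}
```

The fields passed in would normally be the same ones listed in mlt.fl, so that the merged text is compared on the fields MLT actually analyzes.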
Re: Spatial filter in solr 4.0 - Intersects operation with parameters
Hi Mladen,
Despite some similarities at first glance, the Solr 4 spatial fields are not implemented with Solr query parsers, unlike Solr 3 spatial. Everything in quotes is handled by the field type. What you're looking for is for the Solr 3 geospatial functions to be adapted to support the Solr 4 spatial fields. I created an issue, SOLR-4230, to track this. I never got around to doing this before because it wasn't strictly necessary to use the new fields, but it is of course a nice-to-have.
~ David
-
Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
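Since the $param substitution is not applied inside the quoted value, one workaround (not suggested in the thread, just a client-side assumption) is to assemble the whole filter string before sending the request. The sketch below does that with plain Java; IntersectsFilter is a hypothetical name, and the Intersects(minX minY maxX maxY) layout simply follows the example in the question.

```java
import java.math.BigDecimal;

/** Hypothetical client-side builder for the Solr 4 Intersects filter shown in this thread. */
final class IntersectsFilter {
    private IntersectsFilter() {}

    /** Builds field:"Intersects(lon1 lat1 lon2 lat2)" from numeric corner values. */
    static String fq(String field, double lon1, double lat1, double lon2, double lat2) {
        return field + ":\"Intersects("
                + plain(lon1) + " " + plain(lat1) + " "
                + plain(lon2) + " " + plain(lat2) + ")\"";
    }

    /** Formats a double without scientific notation, e.g. -74.093 */
    private static String plain(double value) {
        return BigDecimal.valueOf(value).toPlainString();
    }

    public static void main(String[] args) {
        // Prints: geo:"Intersects(-74.093 41.042 -69.347 44.558)"
        System.out.println(fq("geo", -74.093, 41.042, -69.347, 44.558));
    }
}
```

The returned string still has to be URL-encoded when it is placed in the fq parameter of a request.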