Re: Short is buggish ?
Hi. Thanks for the quick response.

We have looked through the shards trying to find a value that would not parse as a short in radix 10 and thus throw this exception. We did not find any; we have values between 0 and 100 in that field. Would not SOLR complain if we tried to index a non-short, for example a float or an integer-sized number?

Come to think of it: would this really explain why sorting works without the shards parameter? It only spits out the string version when we use sharding. I will show by a few examples what I mean. Note that the fields feedType and sentimentScore, which are the only shorts, both get skewed when using the shards param.

---

1. A specific blog entry by id, without sharding:

GET http://192.168.10.11:8110/solr/blogosphere-sv-2010Q1/select?q=feedItemId:137768916&rows=1&indent=on

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="sort">sentimentScore asc</str>
      <str name="indent">on</str>
      <str name="q">feedItemId:137768916</str>
      <str name="rows">1</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <str name="author"/>
      <str name="description">hej bloggen!vissa av er klagar på att jag tar bort alla kommentarer, oh jag har ju sagt det flera gånger, ATT JAG TAR BORT ALLA ANONYMA KOMMENTARER!thats it!vafan ge er nån jävla gång, så jävla trött på att ni sitter bakom eran data o klagar på alla?visa ...</str>
      <int name="feedId">2958282</int>
      <long name="feedItemId">137768916</long>
      <short name="feedType">1</short>
      <str name="hashedLink">-92603838263017753</str>
      <str name="link">http://emmathorslund.blogg.se/2010/january/ge-er-vafan.html</str>
      <date name="publishedDate">2010-01-03T13:32:54Z</date>
      <date name="publishedDateDay">2010-01-03T12:00:00Z</date>
      <date name="publishedDateMonth">2009-12-01T12:00:00Z</date>
      <date name="publishedDateWeek">2009-12-28T12:00:00Z</date>
      <short name="sentimentScore">0</short>
      <str name="title">ge er vafan!</str>
      <date name="tstamp">2010-01-29T16:55:39.52Z</date>
      <date name="tstampDay">2010-01-29T12:00:00Z</date>
    </doc>
  </result>
</response>

---

2. A specific blog entry by id, with sharding:

GET http://192.168.10.11:8110/solr/blogosphere-sv-2010Q1/select?q=feedItemId:137768916&rows=1&shards=192.168.10.11:8110/solr/blogosphere-sv-2010Q1&indent=on

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">11</int>
    <lst name="params">
      <str name="shards">192.168.10.11:8110/solr/blogosphere-sv-2010Q1</str>
      <str name="indent">on</str>
      <str name="q">feedItemId:137768916</str>
      <str name="rows">1</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <long name="feedItemId">137768916</long>
      <int name="feedId">2958282</int>
      <str name="feedType">java.lang.Short:1</str>
      <date name="publishedDate">2010-01-03T13:32:54Z</date>
      <date name="publishedDateDay">2010-01-03T12:00:00Z</date>
      <date name="publishedDateWeek">2009-12-28T12:00:00Z</date>
      <date name="publishedDateMonth">2009-12-01T12:00:00Z</date>
      <date name="tstamp">2010-01-29T16:55:39.52Z</date>
      <date name="tstampDay">2010-01-29T12:00:00Z</date>
      <str name="author"/>
      <str name="description">hej bloggen!vissa av er klagar på att jag tar bort alla kommentarer, oh jag har ju sagt det flera gånger, ATT JAG TAR BORT ALLA ANONYMA KOMMENTARER!thats it!vafan ge er nån jävla gång, så jävla trött på att ni sitter bakom eran data o klagar på alla?visa ...</str>
      <str name="title">ge er vafan!</str>
      <str name="link">http://emmathorslund.blogg.se/2010/january/ge-er-vafan.html</str>
      <str name="hashedLink">-92603838263017753</str>
      <str name="sentimentScore">java.lang.Short:0</str>
    </doc>
  </result>
</response>

Hope it makes sense to you; it does not to us :) Oh, and changing the field to integer solves the issue.

Cheers
//Marcus

On Fri, Feb 5, 2010 at 9:19 PM, Grant Ingersoll gsing...@apache.org wrote:

In looking at the code, I see:

try {
  short val = Short.parseShort(s);
  writer.writeShort(name, val);
} catch (NumberFormatException e) {
  // can't parse - write out the contents as a string so nothing is lost and
  // clients don't get a parse error.
  writer.writeStr(name, s, true);
}

And it makes me wonder if you are hitting the NFE. Can you recreate this in a self-contained test?

-Grant

On Feb 5, 2010, at 4:10 AM, Marcus Herou wrote:

Hi. When using the field type solr.ShortField in combination with sharding we get results like this back:

<str name="sentimentScore">java.lang.Short:40</str>

Making it impossible to sort on that value. Changing the field to IntegerField solves it. Example search: GET
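That fallback is easy to exercise outside Solr. A minimal sketch (the class name and the return-value encoding here are invented; only the Short.parseShort call mirrors the quoted writer code) shows that a value which already round-tripped as "java.lang.Short:40" hits exactly this catch branch:

```java
// Sketch of the quoted fallback: try to parse the stored value as a short,
// and if that fails, write it back out as a plain string. The "short:"/"str:"
// prefixes stand in for the real writer calls, purely for illustration.
public class ShortFallbackSketch {
    static String writeShortOrString(String s) {
        try {
            short val = Short.parseShort(s);
            return "short:" + val;       // would be writer.writeShort(name, val)
        } catch (NumberFormatException e) {
            // can't parse - fall back to the raw string, as in the quoted code
            return "str:" + s;           // would be writer.writeStr(name, s, true)
        }
    }

    public static void main(String[] args) {
        System.out.println(writeShortOrString("40"));                 // parses fine
        System.out.println(writeShortOrString("java.lang.Short:40")); // hits the NFE
    }
}
```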
Re: Short is buggish ?
More info from the registry.jsp page:

Solr Specification Version: 1.4.0
Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40
Lucene Specification Version: 2.9.1
Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25

/M

On Mon, Feb 8, 2010 at 9:46 AM, Marcus Herou marcus.he...@tailsweep.com wrote:
[quoted message snipped - identical to the previous post in this thread]
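A side note on why the string-typed values reported above break sorting: once a shard response carries a short field as a string, comparisons over it are lexicographic rather than numeric. A small self-contained illustration (the values are made up, not from the index above):

```java
import java.util.Arrays;

// Shows the difference between numeric and lexicographic ordering, which is
// what a client sees once shorts come back as strings from the shards.
public class StringSortSketch {
    static String[] sortedAsStrings(String[] a) {
        String[] copy = a.clone();
        Arrays.sort(copy);               // lexicographic String ordering
        return copy;
    }

    public static void main(String[] args) {
        Integer[] nums = {2, 10, 100};
        Arrays.sort(nums);               // numeric: [2, 10, 100]
        System.out.println(Arrays.toString(nums));
        // lexicographic: "10" < "100" < "2"
        System.out.println(Arrays.toString(sortedAsStrings(new String[]{"2", "10", "100"})));
    }
}
```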
[jira] Commented: (SOLR-1722) Allowing changing the special default core name, and as a default default core name, switch to using collection1 rather than DEFAULT_CORE
[ https://issues.apache.org/jira/browse/SOLR-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830943#action_12830943 ] Mark Miller commented on SOLR-1722: --- If no one objects, I'd like to commit this soon. I think it's a clear improvement on what is there now, so I'd like to get it in. I think we can talk about how the normalization occurs in another issue. Doing things differently has its own back-compat issues, and it would be nice if this configurability wasn't caught up in it. Another option we have is to leave the normalization as it is, but just change getName so that it returns the default name rather than "". Allowing changing the special default core name, and as a default default core name, switch to using collection1 rather than DEFAULT_CORE --- Key: SOLR-1722 URL: https://issues.apache.org/jira/browse/SOLR-1722 Project: Solr Issue Type: Improvement Reporter: Mark Miller Assignee: Mark Miller Priority: Minor Fix For: 1.5 Attachments: SOLR-1722.patch see http://search.lucidimagination.com/search/document/f5f2af7c5041a79e/default_core -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Solr Associating documents
Please re-post your question to the solr-user list, which is a more appropriate place. -- Jan Høydahl - search architect Cominvent AS - www.cominvent.com On 8. feb. 2010, at 07.11, swapnilagarwal wrote: I am creating .xml files which are then posted to Solr for creating documents. The XML files have <add></add> tags in which there are several <doc></doc> tags. Now consider an example where we have data for a car. We would like to add one doc related to some features of the car, like mileage, horsepower etc. We will add another doc containing information about the price and service of the same car. Both of the above docs will contain an ID representing the same car. Now when a search is performed, how do I get the result combined from both docs? For instance, if I search for feature:13KM/L price:60K, I want the search results to have a combined score from the docs containing the same car ID. Merging these docs is not favourable as I want to add, delete and update the features for a particular car. Thanks in advance!! -- View this message in context: http://old.nabble.com/Solr-Associating-documents-tp27495702p27495702.html Sent from the Solr - Dev mailing list archive at Nabble.com.
[jira] Created: (SOLR-1763) Integrate Solr Cell/Tika as an UpdateRequestProcessor
Integrate Solr Cell/Tika as an UpdateRequestProcessor - Key: SOLR-1763 URL: https://issues.apache.org/jira/browse/SOLR-1763 Project: Solr Issue Type: New Feature Components: update Reporter: Jan Høydahl From Chris Hostetter's original post in solr-dev: As someone with very little knowledge of Solr Cell and/or Tika, I find myself wondering if ExtractingRequestHandler would make more sense as an extractingUpdateProcessor -- where it could be configured to take either binary fields (or string fields containing URLs) out of the Documents, parse them with Tika, and add the various XPath-matching hunks of text back into the document as new fields. Then ExtractingRequestHandler just becomes a handler that slurps up its ContentStreams and adds them as binary data fields and adds the other literal params as fields. Wouldn't that make things like SOLR-1358, and using Tika with URLs/filepaths in XML and CSV based updates, fairly trivial? -Hoss I couldn't agree more, so I decided to add it as an issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1763) Integrate Solr Cell/Tika as an UpdateRequestProcessor
[ https://issues.apache.org/jira/browse/SOLR-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831108#action_12831108 ] Jan Høydahl commented on SOLR-1763: --- Re-posting my comment from solr-dev in this ticket: Good match. UpdateProcessors are the way to go for functionality which modifies documents prior to indexing. With this, we can mix and match any type of content source with other processing needs. I think it can be beneficial to have the choice to do extraction on the SolrJ side. But you don't always have that choice; if your source is a crawler without built-in Tika, some base64-encoded field in an XML, or some other random source, you want to do the extraction at an arbitrary place in the chain. Examples: Crawler (httpheaders, binarybody) -> TikaUpdateProcessor (+title, +text, +meta...) -> index; XML (title, pdfurl) -> GetUrlProcessor (+pdfbin) -> TikaUpdateProcessor (+text, +meta) -> index; DIH (city, street, lat, lon) -> LatLon2GeoHashProcessor (+geohash) -> index. I propose to model the document processor chain more after FAST ESP's flexible processing chain, which must be seen as an industry best practice. I'm thinking of starting a Wiki page to model what direction we should go. -- Jan Høydahl - search architect Cominvent AS - www.cominvent.com Integrate Solr Cell/Tika as an UpdateRequestProcessor - Key: SOLR-1763 URL: https://issues.apache.org/jira/browse/SOLR-1763 Project: Solr Issue Type: New Feature Components: update Reporter: Jan Høydahl From Chris Hostetter's original post in solr-dev: As someone with very little knowledge of Solr Cell and/or Tika, I find myself wondering if ExtractingRequestHandler would make more sense as an extractingUpdateProcessor -- where it could be configured to take either binary fields (or string fields containing URLs) out of the Documents, parse them with Tika, and add the various XPath-matching hunks of text back into the document as new fields. Then ExtractingRequestHandler just becomes a handler that slurps up its ContentStreams and adds them as binary data fields and adds the other literal params as fields. Wouldn't that make things like SOLR-1358, and using Tika with URLs/filepaths in XML and CSV based updates, fairly trivial? -Hoss I couldn't agree more, so I decided to add it as an issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: Solr Cell revamped as an UpdateProcessor?
I created an issue for this improvement idea to make sure it doesn't just die away: https://issues.apache.org/jira/browse/SOLR-1763 -- Jan Høydahl - search architect Cominvent AS - www.cominvent.com On 22. jan. 2010, at 23.37, Jan Høydahl / Cominvent wrote: On 8. des. 2009, at 00.29, Grant Ingersoll wrote: On Dec 7, 2009, at 3:51 PM, Chris Hostetter wrote: As someone with very little knowledge of Solr Cell and/or Tika, I find myself wondering if ExtractingRequestHandler would make more sense as an extractingUpdateProcessor -- where it could be configured to take either binary fields (or string fields containing URLs) out of the Documents, parse them with Tika, and add the various XPath-matching hunks of text back into the document as new fields. Then ExtractingRequestHandler just becomes a handler that slurps up its ContentStreams and adds them as binary data fields and adds the other literal params as fields. Wouldn't that make things like SOLR-1358, and using Tika with URLs/filepaths in XML and CSV based updates, fairly trivial? It probably could, but I am not sure how it works in a processor chain. However, I'm not sure I understand how they work all that much either. I also plan on adding, BTW, a SolrJ client for Tika that does the extraction on the client. In many cases, the ExtrReqHandler is really only designed for lighter-weight extraction cases, as one would simply not want to send that much rich content over the wire. Good match. UpdateProcessors are the way to go for functionality which modifies documents prior to indexing. With this, we can mix and match any type of content source with other processing needs. I think it can be beneficial to have the choice to do extraction on the SolrJ side. But you don't always have that choice; if your source is a crawler without built-in Tika, some base64-encoded field in an XML, or some other random source, you want to do the extraction at an arbitrary place in the chain. Examples: Crawler (httpheaders, binarybody) -> TikaUpdateProcessor (+title, +text, +meta...) -> index; XML (title, pdfurl) -> GetUrlProcessor (+pdfbin) -> TikaUpdateProcessor (+text, +meta) -> index; DIH (city, street, lat, lon) -> LatLon2GeoHashProcessor (+geohash) -> index. I propose to model the document processor chain more after FAST ESP's flexible processing chain, which must be seen as an industry best practice. I'm thinking of starting a Wiki page to model what direction we should go. -- Jan Høydahl - search architect Cominvent AS - www.cominvent.com
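The chain idea sketched in those examples can be modeled with plain JDK types: a document as a Map and each processor as a function that enriches it. The processor names below (GetUrlProcessor, TikaUpdateProcessor) are stand-ins from the examples, not Solr classes; this only illustrates the "mix and match" wiring:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Toy model of an update-processor chain: run each processor in order,
// each one returning the (possibly enriched) document.
public class ChainSketch {
    static Map<String, Object> runChain(Map<String, Object> doc,
                                        List<UnaryOperator<Map<String, Object>>> chain) {
        for (UnaryOperator<Map<String, Object>> p : chain) {
            doc = p.apply(doc);
        }
        return doc;
    }

    // Stand-in for: XML (title, pdfurl) -> GetUrlProcessor (+pdfbin)
    //               -> TikaUpdateProcessor (+text) -> index
    static Map<String, Object> demo() {
        UnaryOperator<Map<String, Object>> getUrl =
            d -> { d.put("pdfbin", "<bytes>"); return d; };        // fetches the URL
        UnaryOperator<Map<String, Object>> tika =
            d -> { d.put("text", "extracted text"); return d; };   // extracts content

        Map<String, Object> doc = new HashMap<>();
        doc.put("title", "a report");
        doc.put("pdfurl", "http://example.com/r.pdf");
        return runChain(doc, List.of(getUrl, tika));
    }

    public static void main(String[] args) {
        System.out.println(demo().keySet());
    }
}
```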
Re: Real-time deletes
Hey Guys, haven't heard back from anyone - would really appreciate any response whatsoever (even an "extremely not feasible right now"), just so I know whether to pursue this direction or abandon it. Thanks, -Chak On Fri, Feb 5, 2010 at 11:41 AM, KaktuChakarabati jimmoe...@gmail.com wrote: Hey, some time ago I asked around and found out that Lucene has built-in support for propagating deletes to the active index without a lengthy commit (I do not remember the exact semantics, but I believe it involves using an IndexReader reopen() method or so). I wanted to check back and find out whether Solr now makes use of this in any way. Otherwise, is anyone working on such a feature? And otherwise, if I'd like to pick up the glove on this, what would be a correct way, architecture-wise, to go about it? Implement it as a separate UpdateHandler / flag? Thanks, -Chak -- View this message in context: http://old.nabble.com/Real-time-deletes-tp27472975p27472975.html Sent from the Solr - Dev mailing list archive at Nabble.com.
Re: Real-time deletes
Hello there dude... I started on this: http://issues.apache.org/jira/browse/SOLR-1606 However, since then things have changed, so it may not work... You're welcome to continue on it... Cheers, Jason On Tue, Feb 9, 2010 at 3:20 PM, Kaktu Chakarabati jimmoe...@gmail.com wrote: [quoted message snipped - identical to the previous post in this thread]
[jira] Commented: (SOLR-1568) Implement Spatial Filter
[ https://issues.apache.org/jira/browse/SOLR-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12831140#action_12831140 ] Grant Ingersoll commented on SOLR-1568: --- So, one of the things I'm not sure of here is how best to associate the filtering information with the FieldType. On the one hand, we could have a base class or a small interface that defines the filter callback on the FieldType, and then the PointType and other spatial FieldTypes could implement/extend that capability. Taking this approach means that if someone wants to provide a different way of filtering for a FieldType, they would have to implement a derived class overriding the method. For instance, on the PointType, the base implementation may be to just generate a range query for each field based on distance. However, if someone wanted a different approach, they would then have to extend PointType and register a whole other FieldType, let's call it NewFilterPointType. An alternative approach would be to separate the filter calculation into a different class and then somehow associate it with the FieldType (maybe as a map). I've started this to some extent on the last NON WORKING patch, but don't feel great about the actual implementation just yet. In the case above, Solr would provide a default implementation (automatically registered) and then it could be overridden by configuring in solrconfig.xml. I'm also open to other suggestions. I still am pretty open to taking baby steps here by defining the API as Yonik described above (more or less, see my last patch) but only providing a single implementation right now for the Spatial Tile Field Type (Cartesian Tier). Thoughts and suggestions welcome. I'd like to get something in Solr pretty soon.
Implement Spatial Filter Key: SOLR-1568 URL: https://issues.apache.org/jira/browse/SOLR-1568 Project: Solr Issue Type: New Feature Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 1.5 Attachments: CartesianTierQParserPlugin.java, SOLR-1568.patch Given an index with spatial information (either as a geohash, SpatialTileField (see SOLR-1586) or just two lat/lon pairs), we should be able to pass in a filter query that takes in the field name, lat, lon and distance and produces an appropriate Filter (i.e. one that is aware of the underlying field type) for use by Solr. The interface _could_ look like: {code} fq={!sfilt dist=20}location:49.32,-79.0 {code} or it could be: {code} fq={!sfilt lat=49.32 lat=-79.0 f=location dist=20} {code} or: {code} fq={!sfilt p=49.32,-79.0 f=location dist=20} {code} or: {code} fq={!sfilt lat=49.32,-79.0 fl=lat,lon dist=20} {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1761) Command line Solr check softwares
[ https://issues.apache.org/jira/browse/SOLR-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated SOLR-1761: --- Attachment: SOLR-1761.patch No-commit. Here's a couple of apps that: 1) check the query time, 2) check the last replication time. They exit with error code 1 on failure, 0 on success. Command line Solr check softwares - Key: SOLR-1761 URL: https://issues.apache.org/jira/browse/SOLR-1761 Project: Solr Issue Type: New Feature Affects Versions: 1.4 Reporter: Jason Rutherglen Fix For: 1.5 Attachments: SOLR-1761.patch I'm in need of a command-line tool Nagios and the like can execute that verifies a Solr server is working... Basically it'll be a jar with apps that return error codes if a given criterion isn't met. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
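The check apps described here boil down to mapping a measurement to a Nagios-style process exit code. A minimal sketch of that logic (the class name, method name and threshold are invented; the actual HTTP request and timing against Solr are omitted):

```java
// Sketch of a SOLR-1761-style query-time check: compare an observed query
// time against a threshold and map it to a monitoring exit code
// (0 = OK, 1 = failure), as the attached apps are described to do.
public class QueryTimeCheck {
    static int exitCodeFor(long observedMillis, long maxAllowedMillis) {
        return observedMillis <= maxAllowedMillis ? 0 : 1;
    }

    public static void main(String[] args) {
        // In the real app these numbers would come from timing a Solr request
        System.out.println(exitCodeFor(120, 500));  // fast enough -> 0
        System.out.println(exitCodeFor(900, 500));  // too slow    -> 1
        // A real check app would finish with: System.exit(code);
    }
}
```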
[jira] Commented: (SOLR-236) Field collapsing
[ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831150#action_12831150 ] Martijn van Groningen commented on SOLR-236: bq. Regarding Patrick's comment about a memory leak, we are seeing something similar - very large memory usage and eventually using all the available memory. Were there any confirmed issues that may have been addressed with the later patches? We're using the 12-24 patch. Any toggles we can switch to still get the feature, yet minimize the memory footprint? Are you using any other features besides plain collapsing? The field collapse cache gets large very quickly, I suggest you turn it off (if you are using it). Also you can try to make your filterCache smaller. bq. What fixes would we be missing if ran Solr 1.4 with the last field-collapse-5.patch patch? Not much I believe, some are using it in production without too many problems. Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, 
field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch This patch includes a new feature called field collapsing, used in order to collapse a group of results with a similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection. http://www.fastsearch.com/glossary.aspx?m=48&amid=299 The implementation adds 3 new query parameters (SolrParams): collapse.field to choose the field used to group results, collapse.type normal (default value) or adjacent, collapse.max to select how many continuous results are allowed before collapsing. TODO (in progress): - More documentation (on source code) - Test cases. Two patches: - field_collapsing.patch for the current development version - field_collapsing_1.1.0.patch for Solr-1.1.0 P.S.: Feedback and misspelling corrections are welcome ;-) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: [jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr
Hi Uri, This is a very good app. I applied the patch into the contrib directory on my machine. I am using Eclipse, and I do not know how to compile it. Are there any libs to be put in the classpath? Please explain in detail how to set it up using the Eclipse IDE. Thanks in advance, Pradeep. --- On Wed, 2/3/10, Uri Boness ubon...@gmail.com wrote: From: Uri Boness ubon...@gmail.com Subject: Re: [jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr To: solr-dev@lucene.apache.org Date: Wednesday, February 3, 2010, 4:36 PM If you're looking for the new version, then no, I haven't created a patch for it yet. The old one you can find in JIRA, and applying it will create the contrib folder for it. Uri Pradeep Pujari wrote: I checked the contrib directory. I did not find this patch. Have you committed this code? Pradeep. --- On Wed, 1/27/10, Uri Boness (JIRA) j...@apache.org wrote: From: Uri Boness (JIRA) j...@apache.org Subject: [jira] Commented: (SOLR-1163) Solr Explorer - A generic GWT client for Solr To: solr-dev@lucene.apache.org Date: Wednesday, January 27, 2010, 10:22 AM [ https://issues.apache.org/jira/browse/SOLR-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805601#action_12805601 ] Uri Boness commented on SOLR-1163: -- Actually I've been working on a new version of the explorer which I plan to put up soon as a patch here. Solr Explorer - A generic GWT client for Solr - Key: SOLR-1163 URL: https://issues.apache.org/jira/browse/SOLR-1163 Project: Solr Issue Type: New Feature Components: web gui Affects Versions: 1.3 Reporter: Uri Boness Attachments: graphics.zip, solr-explorer.patch, solr-explorer.patch The attached patch is a GWT generic client for Solr. It is currently standalone, meaning that once built, one can open the generated HTML file in a browser and communicate with any deployed Solr. It is configured with its own configuration file, where one can configure the Solr instance/core to connect to.
Since it's currently standalone and completely client-side based, it uses JSON with padding (cross-site scripting) to connect to remote Solr servers. Some of the supported features: - Simple query search - Sorting - one can dynamically define new sort criteria - Search results are rendered very much like Google search results are rendered. It is also possible to view all stored field values for every hit. - Custom hit rendering - it is possible to show thumbnails (images) per hit and also customize a view for a hit based on HTML templates - Faceting - one can dynamically define field and query facets via the UI. It is also possible to pre-configure these facets in the configuration file. - Highlighting - you can dynamically configure highlighting. It can also be pre-configured in the configuration file - Spellchecking - you can dynamically configure spell checking. Can also be done in the configuration file. Supports collation. It is also possible to send build and reload commands. - Data import handler - if used, it is possible to send a full-import and status command (delta-import is not implemented yet, but it's easy to add) - Console - for development time, there's a small console which can help to better understand what's going on behind the scenes. One can use it to: ** view the client logs ** browse the Solr schema ** view a breakdown of the current search context ** view a breakdown of the query URL that is sent to Solr ** view the raw JSON response returning from Solr This client is actually a platform that can be greatly extended for more things. The goal is to have a client where the explorer part is just one view of it. Other future views include: Monitoring, Administration, Query Builder, DataImportHandler configuration, and more... To get a better view of what's currently possible, we've set up a public version of this client at: http://search.jteam.nl/explorer.
This client is configured with one Solr instance where crawled YouTube movies were indexed. You can also check out a screencast for this deployed client: http://search.jteam.nl/help The patch created a new folder in the contrib. directory. Since the patch doesn't contain binaries, an additional zip file is provided that needs to be extracted to add all the
[jira] Updated: (SOLR-1757) DIH multithreading sometimes throws NPE
[ https://issues.apache.org/jira/browse/SOLR-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Henson updated SOLR-1757: - Attachment: solr-1757-abort-threaddump.zip This is the ../admin/threaddump.jsp page for the core configured with threads=3, running a full-import, after having sent it the abort command. DIH multithreading sometimes throws NPE --- Key: SOLR-1757 URL: https://issues.apache.org/jira/browse/SOLR-1757 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 1.4 Environment: tomcat 6.0.x, jdk 1.6.x on windows xp 32bit Reporter: Michael Henson Assignee: Noble Paul Attachments: solr-1352-threads-bt.txt, solr-1757-abort-threaddump.zip, SOLR-1757.patch When the threads attribute is set on a root entity in the DIH's data-config.xml, the multithreading code sometimes throws a NullPointerException after the full-import command is given. I haven't yet been able to figure out exactly which reference holds the null or why, but it does happen consistently with the same backtrace. My configuration is: 1. Multi-core Solr under Tomcat 2. Using JdbcDataSource and the default SqlEntityProcessor To reproduce: 1. Add the attribute threads=2 to the root entity declaration in data-config.xml 2. Send the full-import command either directly to .../core/dataimport?command=full-import or through the /admin/dataimport.jsp control panel. 3. Wait for the NPE to show up in the logs/console -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-1761) Command line Solr check softwares
[ https://issues.apache.org/jira/browse/SOLR-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Rutherglen updated SOLR-1761: --- Attachment: SOLR-1761.patch Here's a cleaned up, committable version Command line Solr check softwares - Key: SOLR-1761 URL: https://issues.apache.org/jira/browse/SOLR-1761 Project: Solr Issue Type: New Feature Affects Versions: 1.4 Reporter: Jason Rutherglen Fix For: 1.5 Attachments: SOLR-1761.patch, SOLR-1761.patch I'm in need of a command-line tool Nagios and the like can execute that verifies a Solr server is working... Basically it'll be a jar with apps that return error codes if a given criterion isn't met. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
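The kind of check Jason describes could look roughly like the sketch below. The ping URL, timeouts, and exit-code mapping here are assumptions for illustration, not taken from the attached SOLR-1761.patch.

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class SolrPingCheck {
    // Nagios convention: 0 = OK, 2 = CRITICAL. Anything but HTTP 200 is critical.
    static int exitCode(int httpStatus) {
        return httpStatus == 200 ? 0 : 2;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default URL; pass the real one as the first argument.
        String target = args.length > 0 ? args[0]
                : "http://localhost:8983/solr/admin/ping";
        HttpURLConnection conn = (HttpURLConnection) new URL(target).openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        System.exit(exitCode(conn.getResponseCode()));
    }
}
```

Nagios (or any monitoring harness) would then treat a nonzero exit status as a failed service check.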
[jira] Created: (SOLR-1764) While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown - Key: SOLR-1764 URL: https://issues.apache.org/jira/browse/SOLR-1764 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4 Environment: Windows XP, JBoss 4.2.3 GA Reporter: Michael McGowan Priority: Blocker I get an exception while indexing. It seems that I'm unable to see the root cause of the exception because it is masked by another java.lang.IllegalStateException: Can't overwrite cause exception. Here is the stacktrace : 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {} 0 15 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log SEVERE: java.lang.IllegalStateException: Can't overwrite cause at java.lang.Throwable.initCause(Throwable.java:320) at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70) at com.ctc.wstx.exc.WstxException.init(WstxException.java:46) at com.ctc.wstx.exc.WstxIOException.init(WstxIOException.java:16) at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:536) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:592) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:648) at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:319) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:68) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:182) at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446) at java.lang.Thread.run(Thread.java:619) 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.core.SolrCore execute INFO: [] webapp=/solr path=/update params={wt=xmlversion=2.2} status=500 QTime=15 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log SEVERE: java.lang.IllegalStateException: Can't overwrite cause at java.lang.Throwable.initCause(Throwable.java:320) at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70) at com.ctc.wstx.exc.WstxException.init(WstxException.java:46) at 
com.ctc.wstx.exc.WstxIOException.init(WstxIOException.java:16) at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:536) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:592) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:648) at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:319) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:68) at
Re: [jira] Commented: (SOLR-236) Field collapsing
I also think the isTokenized() check/exception should be removed. It is probably a common use-case to have a single-valued tokenized field - i.e. a case-insensitive string (a text field where the only filter applied is a LowerCaseFilterFactory). I think that as long as it's documented that field collapsing doesn't work for fields with multiple tokens then it shouldn't be an issue. That certainly seems better to me than preventing a perfectly valid use case, since you wouldn't get any results anyway. if (schemaField.getType().isTokenized()) { throw new RuntimeException("Could not collapse, because collapse field is tokenized"); } I agree that it would be better to check if the field has multiple values or not. In the meantime, though, perhaps the remove-the-check-and-log-a-warning approach would suffice? -Trey On Tue, Jan 19, 2010 at 5:46 AM, Martijn van Groningen (JIRA) j...@apache.org wrote: [ https://issues.apache.org/jira/browse/SOLR-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12802186#action_12802186] Martijn van Groningen commented on SOLR-236: If the field is tokenized and has more than one token, your field collapse result will become incorrect. What happens, if I remember correctly, is that it will only collapse on the field's last token. This of course leads to weird collapse groups. Users that only have one token per collapse field are, because of this check, out of luck. Somehow I think we should make the user know that it is not possible to collapse on a tokenized field (at least with multiple tokens). Maybe adding a warning in the response. Still I think the exception is clearer, but it also prohibits it, of course. bq. Or someone could come after me and write a patch that checks for multi-tokened fields somehow and throws an exception. Checking if a tokenized field contains only one token is really inefficient, because you have to check every collapse field of all documents. 
Now the check is done based on the field's definition in the schema. Field collapsing Key: SOLR-236 URL: https://issues.apache.org/jira/browse/SOLR-236 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.3 Reporter: Emmanuel Keller Assignee: Shalin Shekhar Mangar Fix For: 1.5 Attachments: collapsing-patch-to-1.3.0-dieter.patch, collapsing-patch-to-1.3.0-ivan.patch, collapsing-patch-to-1.3.0-ivan_2.patch, collapsing-patch-to-1.3.0-ivan_3.patch, field-collapse-3.patch, field-collapse-4-with-solrj.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-5.patch, field-collapse-solr-236-2.patch, field-collapse-solr-236.patch, field-collapsing-extended-592129.patch, field_collapsing_1.1.0.patch, field_collapsing_1.3.patch, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, field_collapsing_dsteigerwald.diff, quasidistributed.additional.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236-FieldCollapsing.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, SOLR-236.patch, solr-236.patch, SOLR-236_collapsing.patch, SOLR-236_collapsing.patch This patch includes a new feature called Field collapsing. Used in order to collapse a group of results with similar value for a given field to a single entry in the result set. Site collapsing is a special case of this, where all results for a given web site are collapsed into one or two entries in the result set, typically with an associated "more documents from this site" link. See also Duplicate detection. 
http://www.fastsearch.com/glossary.aspx?m=48amid=299 The implementation adds 3 new query parameters (SolrParams): collapse.field to choose the field used to group results collapse.type normal (default value) or adjacent collapse.max to select how many continuous results are allowed before collapsing TODO (in progress): - More documentation (on source code) - Test cases Two patches: - field_collapsing.patch for current development version - field_collapsing_1.1.0.patch for Solr-1.1.0 P.S.: Feedback and misspelling corrections are welcome ;-) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
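The failure mode Martijn describes - collapsing on only the last token of a tokenized field - can be illustrated outside Solr with a toy sketch. The whitespace tokenizer and sample data below are invented for illustration; Solr's actual analysis chain is schema-driven.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TokenizedCollapse {
    // Group field values by their last token, mimicking what happens when the
    // collapse key comes from a tokenized field: unrelated documents that
    // merely share a final token fall into the same collapse group.
    static Map<String, List<String>> collapseOnLastToken(List<String> values) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String value : values) {
            String[] tokens = value.toLowerCase().split("\\s+"); // toy tokenizer
            String key = tokens[tokens.length - 1];              // only the last token survives
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = collapseOnLastToken(
                Arrays.asList("Apache Solr", "Plain Solr", "Apache Lucene"));
        // "Apache Solr" and "Plain Solr" wrongly collapse into one group
        // under the key "solr", even though the full field values differ.
        System.out.println(g);
    }
}
```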
[jira] Commented: (SOLR-1764) While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
[ https://issues.apache.org/jira/browse/SOLR-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831268#action_12831268 ] Fuad Efendi commented on SOLR-1764: --- Michael, Which version of Java are you using? I believe something wrong with XML (upload) file, and specific Java version classes conflict with WoodStox, although SOLR may need improvement too: http://forums.sun.com/thread.jspa?threadID=5150576 While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown - Key: SOLR-1764 URL: https://issues.apache.org/jira/browse/SOLR-1764 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4 Environment: Windows XP, JBoss 4.2.3 GA Reporter: Michael McGowan Priority: Blocker I get an exception while indexing. It seems that I'm unable to see the root cause of the exception because it is masked by another java.lang.IllegalStateException: Can't overwrite cause exception. Here is the stacktrace : 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {} 0 15 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log SEVERE: java.lang.IllegalStateException: Can't overwrite cause at java.lang.Throwable.initCause(Throwable.java:320) at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70) at com.ctc.wstx.exc.WstxException.init(WstxException.java:46) at com.ctc.wstx.exc.WstxIOException.init(WstxIOException.java:16) at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:536) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:592) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:648) at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:319) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:68) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:182) at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446) at java.lang.Thread.run(Thread.java:619) 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.core.SolrCore execute INFO: [] webapp=/solr path=/update 
params={wt=xml&version=2.2} status=500 QTime=15 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log SEVERE: java.lang.IllegalStateException: Can't overwrite cause at java.lang.Throwable.initCause(Throwable.java:320) at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70) at
[jira] Issue Comment Edited: (SOLR-1764) While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
[ https://issues.apache.org/jira/browse/SOLR-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831268#action_12831268 ] Fuad Efendi edited comment on SOLR-1764 at 2/9/10 2:48 AM: --- Michael, Which version of Java are you using? I believe something wrong with XML (upload) file, and specific Java version classes conflict with WoodStox, although SOLR may need improvement too: http://forums.sun.com/thread.jspa?threadID=5150576 It says that text nodes such as prim name=y[-A-Z0-9.,()/='+:?!%amp;amp;*; ]/prim can be split (for instance, to porcess entities), depending on implementation, and, to be safe, SOLR needs something like {code} while (reader.isCharacters()) { sb.append(reader.getText()); reader.next(); } {code} was (Author: funtick): Michael, Which version of Java are you using? I believe something wrong with XML (upload) file, and specific Java version classes conflict with WoodStox, although SOLR may need improvement too: http://forums.sun.com/thread.jspa?threadID=5150576 While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown - Key: SOLR-1764 URL: https://issues.apache.org/jira/browse/SOLR-1764 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4 Environment: Windows XP, JBoss 4.2.3 GA Reporter: Michael McGowan Priority: Blocker I get an exception while indexing. It seems that I'm unable to see the root cause of the exception because it is masked by another java.lang.IllegalStateException: Can't overwrite cause exception. 
Here is the stacktrace : 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {} 0 15 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log SEVERE: java.lang.IllegalStateException: Can't overwrite cause at java.lang.Throwable.initCause(Throwable.java:320) at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70) at com.ctc.wstx.exc.WstxException.init(WstxException.java:46) at com.ctc.wstx.exc.WstxIOException.init(WstxIOException.java:16) at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:536) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:592) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:648) at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:319) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:68) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at 
org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:182) at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at
[jira] Issue Comment Edited: (SOLR-1764) While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
[ https://issues.apache.org/jira/browse/SOLR-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831268#action_12831268 ] Fuad Efendi edited comment on SOLR-1764 at 2/9/10 2:51 AM: --- Michael, Which version of Java are you using? I believe something wrong with XML (upload) file, and specific Java version classes conflict with WoodStox, although SOLR may need improvement too: http://forums.sun.com/thread.jspa?threadID=5150576 It says that text nodes such as prim name=y[-A-Z0-9.,()/='+:?!%amp;amp;*; ]/prim can be split (for instance, to process entities), depending on implementation, and, to be safe, SOLR needs something like {code} while (reader.isCharacters()) { sb.append(reader.getText()); reader.next(); } {code} was (Author: funtick): Michael, Which version of Java are you using? I believe something wrong with XML (upload) file, and specific Java version classes conflict with WoodStox, although SOLR may need improvement too: http://forums.sun.com/thread.jspa?threadID=5150576 It says that text nodes such as prim name=y[-A-Z0-9.,()/='+:?!%amp;amp;*; ]/prim can be split (for instance, to porcess entities), depending on implementation, and, to be safe, SOLR needs something like {code} while (reader.isCharacters()) { sb.append(reader.getText()); reader.next(); } {code} While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown - Key: SOLR-1764 URL: https://issues.apache.org/jira/browse/SOLR-1764 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4 Environment: Windows XP, JBoss 4.2.3 GA Reporter: Michael McGowan Priority: Blocker I get an exception while indexing. It seems that I'm unable to see the root cause of the exception because it is masked by another java.lang.IllegalStateException: Can't overwrite cause exception. 
Here is the stacktrace : 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {} 0 15 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log SEVERE: java.lang.IllegalStateException: Can't overwrite cause at java.lang.Throwable.initCause(Throwable.java:320) at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70) at com.ctc.wstx.exc.WstxException.init(WstxException.java:46) at com.ctc.wstx.exc.WstxIOException.init(WstxIOException.java:16) at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:536) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:592) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:648) at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:319) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:68) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at 
org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:182) at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
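Fuad's suggested coalescing loop can be exercised standalone with the JDK's StAX API. The element name and input below are invented for illustration. Appending every CHARACTERS event yields the complete text of the node no matter how the parser chooses to split it (for instance, around entity references):

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class CoalesceText {
    // Read the text content of a single-element document, concatenating
    // consecutive CHARACTERS events: a StAX parser is allowed to deliver
    // one text node as several events.
    static String textOf(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        StringBuilder sb = new StringBuilder();
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.CHARACTERS) {
                sb.append(reader.getText());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textOf("<prim name=\"y\">a&amp;b</prim>")); // prints a&b
    }
}
```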
Re: priority queue in query component
At this point, Distributed Search does not support any recovery if one or more shards fail. If any fail or time out, the whole query fails. On Sat, Feb 6, 2010 at 9:34 AM, mike anderson saidthero...@gmail.com wrote: so if we received the response from shard2 before shard1, we would just queue it up and wait for the response to shard1. This crossed my mind, but my concern was how to handle the case when shard1 never responds. Is this something I need to worry about? -mike On Sat, Feb 6, 2010 at 11:33 AM, Yonik Seeley yo...@lucidimagination.com wrote: It seems like changing an element in a priority queue breaks the invariants, and hence it's not doable with a priority queue and with the current strategy of adding sub-responses as they are received. One way to continue using a priority queue would be to add sub-responses to the queue in the preferred order... so if we received the response from shard2 before shard1, we would just queue it up and wait for the response to shard1. -Yonik http://www.lucidimagination.com On Sat, Feb 6, 2010 at 10:35 AM, mike anderson saidthero...@gmail.com wrote: I have a need to favor documents from one shard over another when duplicates occur. I found this code in the query component: String prevShard = uniqueDoc.put(id, srsp.getShard()); if (prevShard != null) { // duplicate detected numFound--; // For now, just always use the first encountered since we can't currently // remove the previous one added to the priority queue. If we switched // to the Java5 PriorityQueue, this would be easier. continue; // make which duplicate is used deterministic based on shard // if (prevShard.compareTo(srsp.shard) >= 0) { // TODO: remove previous from priority queue // continue; // } } Is there a ticket open for this issue? What would it take to fix? Thanks, Mike -- Lance Norskog goks...@gmail.com
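The commented-out deterministic rule in that snippet can be sketched in isolation. The winningShards() helper and the shard names below are invented for illustration; the real fix would also have to replace the document already sitting in the priority queue, which is the part the comment says is currently missing.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ShardDedupe {
    // For each duplicate id, keep the document from the shard whose name
    // compares lowest, so the winner no longer depends on which response
    // happened to arrive first.
    static Map<String, String> winningShards(List<String[]> idShardPairs) {
        Map<String, String> winner = new HashMap<>();
        for (String[] pair : idShardPairs) {
            String id = pair[0], shard = pair[1];
            String prev = winner.get(id);
            if (prev == null || shard.compareTo(prev) < 0) {
                winner.put(id, shard); // deterministic: lexicographically lowest shard wins
            }
        }
        return winner;
    }

    public static void main(String[] args) {
        // shard2's response arrives first, but shard1 still wins the duplicate.
        List<String[]> responses = Arrays.asList(
                new String[]{"doc1", "shard2"},
                new String[]{"doc1", "shard1"});
        System.out.println(winningShards(responses).get("doc1")); // prints shard1
    }
}
```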
[jira] Commented: (SOLR-1764) While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown
[ https://issues.apache.org/jira/browse/SOLR-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831294#action_12831294 ] Michael McGowan commented on SOLR-1764: --- Hi Fuad, Thanks for the info. I have this Java version : java version 1.6.0_10 Java(TM) SE Runtime Environment (build 1.6.0_10-b33) Let me know if you'd like me to try anything. Michael While indexing a java.lang.IllegalStateException: Can't overwrite cause exception is thrown - Key: SOLR-1764 URL: https://issues.apache.org/jira/browse/SOLR-1764 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 1.4 Environment: Windows XP, JBoss 4.2.3 GA Reporter: Michael McGowan Priority: Blocker I get an exception while indexing. It seems that I'm unable to see the root cause of the exception because it is masked by another java.lang.IllegalStateException: Can't overwrite cause exception. Here is the stacktrace : 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.update.processor.LogUpdateProcessor finish INFO: {} 0 15 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log SEVERE: java.lang.IllegalStateException: Can't overwrite cause at java.lang.Throwable.initCause(Throwable.java:320) at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70) at com.ctc.wstx.exc.WstxException.init(WstxException.java:46) at com.ctc.wstx.exc.WstxIOException.init(WstxIOException.java:16) at com.ctc.wstx.stax.WstxInputFactory.doCreateSR(WstxInputFactory.java:536) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:592) at com.ctc.wstx.stax.WstxInputFactory.createSR(WstxInputFactory.java:648) at com.ctc.wstx.stax.WstxInputFactory.createXMLStreamReader(WstxInputFactory.java:319) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:68) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.jboss.web.tomcat.filters.ReplyHeaderFilter.doFilter(ReplyHeaderFilter.java:96) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:182) at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:84) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:157) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:262) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:446) at java.lang.Thread.run(Thread.java:619) 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.core.SolrCore execute INFO: [] webapp=/solr path=/update 
params={wt=xml&version=2.2} status=500 QTime=15 16:59:04,292 ERROR [STDERR] Feb 8, 2010 4:59:04 PM org.apache.solr.common.SolrException log SEVERE: java.lang.IllegalStateException: Can't overwrite cause at java.lang.Throwable.initCause(Throwable.java:320) at com.ctc.wstx.compat.Jdk14Impl.setInitCause(Jdk14Impl.java:70) at com.ctc.wstx.exc.WstxException.init(WstxException.java:46) at
[jira] Commented: (SOLR-1757) DIH multithreading sometimes throws NPE
[ https://issues.apache.org/jira/browse/SOLR-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12831298#action_12831298 ] Noble Paul commented on SOLR-1757: -- I guess you have pasted the wrong stacktrace. We can close this issue and open another for the persistent threads after abort command DIH multithreading sometimes throws NPE --- Key: SOLR-1757 URL: https://issues.apache.org/jira/browse/SOLR-1757 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 1.4 Environment: tomcat 6.0.x, jdk 1.6.x on windows xp 32bit Reporter: Michael Henson Assignee: Noble Paul Attachments: solr-1352-threads-bt.txt, solr-1757-abort-threaddump.zip, SOLR-1757.patch When the threads attribute is set on a root entity in the DIH's data-config.xml, the multithreading code sometimes throws a NullPointerException after the full-index command is given. I haven't yet been able to figure out exactly which reference holds the null or why, but it does happen consistently with the same backtrace. My configuration is: 1. Multi-core Solr under tomcat 2. Using JdbcDataSource and the default SqlEntityProcessor To reproduce: 1. Add the attribute threads=2 to the root entity declaration in data-config.xml 2. Send the full-import command either directly to .../core/dataimport?command=full-import or through the /admin/dataimport.jsp control panel. 3. Wait for the NPE to show up in the logs/console -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (SOLR-1757) DIH multithreading sometimes throws NPE
[ https://issues.apache.org/jira/browse/SOLR-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul resolved SOLR-1757. -- Resolution: Fixed Fix Version/s: 1.5 committed r907935 DIH multithreading sometimes throws NPE --- Key: SOLR-1757 URL: https://issues.apache.org/jira/browse/SOLR-1757 Project: Solr Issue Type: Bug Components: contrib - DataImportHandler Affects Versions: 1.4 Environment: tomcat 6.0.x, jdk 1.6.x on windows xp 32bit Reporter: Michael Henson Assignee: Noble Paul Fix For: 1.5 Attachments: solr-1352-threads-bt.txt, solr-1757-abort-threaddump.zip, SOLR-1757.patch When the threads attribute is set on a root entity in the DIH's data-config.xml, the multithreading code sometimes throws a NullPointerException after the full-import command is given. I haven't yet been able to figure out exactly which reference holds the null or why, but it does happen consistently with the same backtrace. My configuration is: 1. Multi-core Solr under tomcat 2. Using JdbcDataSource and the default SqlEntityProcessor To reproduce: 1. Add the attribute threads=2 to the root entity declaration in data-config.xml 2. Send the full-import command either directly to .../core/dataimport?command=full-import or through the /admin/dataimport.jsp control panel. 3. Wait for the NPE to show up in the logs/console -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.