[jira] [Commented] (LUCENE-5771) Review semantics of SpatialOperation predicates
[ https://issues.apache.org/jira/browse/LUCENE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038546#comment-14038546 ] Ryan McKinley commented on LUCENE-5771: --- change looks good. I *think* the reason for queryNeedsArea and targetNeedsArea was so that we could decide if the input was valid simply based on the shape -- but I think we find that out anyway, so we could just drop these. I don't think 'scoreIsMeaningful' is used, so it can also be dropped. Review semantics of SpatialOperation predicates --- Key: LUCENE-5771 URL: https://issues.apache.org/jira/browse/LUCENE-5771 Project: Lucene - Core Issue Type: Improvement Components: modules/spatial Reporter: David Smiley Attachments: LUCENE-5771_SpatialOperation_semantics.patch SpatialOperation (which I wish were named SpatialPredicate) is a bunch of predicates -- methods that return true/false based on a pair of shapes. Some of them don't seem to be defined in a way consistent with their definitions on ESRI's site: http://edndoc.esri.com/arcsde/9.1/general_topics/understand_spatial_relations.htm (which is linked as a reference, and is in turn equivalent to OGC spec definitions, I believe). Problems: * the definitions make no mention of needing to have area or not, yet some of our predicates are defined so as to require area on either the indexed or query shape. * the definitions distinguish the boundary of a shape, yet in Lucene-spatial there is none. That suggests our predicates are wrongly chosen, since there *are* official predicates that are boundary-neutral -- namely Covers and CoveredBy in lieu of Contains and Within, respectively. If we don't rename our predicates, we should at least support the correct predicate names! * Overlaps appears totally wrong. It should be defined as indexedShape.relate(queryShape) == Intersects (and thus not Within or Contains or Disjoint). It's presently defined as the same as Intersects plus the query shape needing area.
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
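The corrected Overlaps semantics discussed above — indexedShape.relate(queryShape) == Intersects, i.e. the shapes intersect but neither is within nor contains the other — can be illustrated with a toy rectangle sketch. Everything here (Rect, Relation, overlaps) is a hypothetical stand-in for illustration, not Lucene-spatial's actual SpatialOperation API:

```java
// Sketch of the corrected Overlaps semantics discussed above.
// Rect and Relation are hypothetical stand-ins, not Lucene's API.
public class OverlapsSketch {
    enum Relation { DISJOINT, WITHIN, CONTAINS, INTERSECTS }

    static final class Rect {
        final double minX, maxX, minY, maxY;
        Rect(double minX, double maxX, double minY, double maxY) {
            this.minX = minX; this.maxX = maxX; this.minY = minY; this.maxY = maxY;
        }
        // Relation of this rectangle (the indexed shape) to the query shape.
        Relation relate(Rect q) {
            if (maxX < q.minX || minX > q.maxX || maxY < q.minY || minY > q.maxY)
                return Relation.DISJOINT;
            if (minX >= q.minX && maxX <= q.maxX && minY >= q.minY && maxY <= q.maxY)
                return Relation.WITHIN;
            if (q.minX >= minX && q.maxX <= maxX && q.minY >= minY && q.maxY <= maxY)
                return Relation.CONTAINS;
            return Relation.INTERSECTS;
        }
    }

    // Overlaps holds only when the relation is INTERSECTS proper:
    // the shapes intersect, but neither Within nor Contains nor Disjoint.
    static boolean overlaps(Rect indexed, Rect query) {
        return indexed.relate(query) == Relation.INTERSECTS;
    }
}
```

A rectangle fully inside the query (Within) or fully enclosing it (Contains) fails the predicate, matching the boundary-neutral ESRI/OGC definition cited above; no area requirement on the query shape is involved.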
[jira] [Commented] (LUCENE-5779) Improve BBox AreaSimilarity algorithm to consider lines and points
[ https://issues.apache.org/jira/browse/LUCENE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038548#comment-14038548 ] Ryan McKinley commented on LUCENE-5779: --- +1 thanks for looking at this Improve BBox AreaSimilarity algorithm to consider lines and points -- Key: LUCENE-5779 URL: https://issues.apache.org/jira/browse/LUCENE-5779 Project: Lucene - Core Issue Type: Improvement Components: modules/spatial Reporter: David Smiley Attachments: LUCENE-5779__Improved_bbox_AreaSimilarity_algorithm.patch GeoPortal's area overlap algorithm didn't consider lines and points; they end up turning the score to 0. I've thought about this for a bit and I've come up with an alternative scoring algorithm (already coded and tested and documented). New Javadocs: {code:java}
/**
 * The algorithm is implemented as envelope on envelope overlays rather than
 * complex polygon on complex polygon overlays.
 * <p/>
 * <p/>
 * Spatial relevance scoring algorithm:
 * <dl>
 *   <dt>queryArea</dt> <dd>the area of the input query envelope</dd>
 *   <dt>targetArea</dt> <dd>the area of the target envelope (per Lucene document)</dd>
 *   <dt>intersectionArea</dt> <dd>the area of the intersection between the query and target envelopes</dd>
 *   <dt>queryTargetProportion</dt> <dd>a 0-1 factor that divides the score proportion between query and target.
 *     0.5 splits it evenly.</dd>
 *
 *   <dt>queryRatio</dt> <dd>intersectionArea / queryArea; (see note)</dd>
 *   <dt>targetRatio</dt> <dd>intersectionArea / targetArea; (see note)</dd>
 *   <dt>queryFactor</dt> <dd>queryRatio * queryTargetProportion;</dd>
 *   <dt>targetFactor</dt> <dd>targetRatio * (1 - queryTargetProportion);</dd>
 *   <dt>score</dt> <dd>queryFactor + targetFactor;</dd>
 * </dl>
 * Note: The actual computation of queryRatio and targetRatio is more complicated so that it considers
 * points and lines. Lines have the ratio of overlap, and points are either 1.0 or 0.0 depending on whether
 * it intersects or not.
 * <p/>
 * Based on Geoportal's
 * <a href="http://geoportal.svn.sourceforge.net/svnroot/geoportal/Geoportal/trunk/src/com/esri/gpt/catalog/lucene/SpatialRankingValueSource.java">
 * SpatialRankingValueSource</a> but modified. GeoPortal's algorithm will yield a score of 0
 * if either a line or point is compared, and it doesn't output a 0-1 normalized score (it multiplies the factors).
 *
 * @lucene.experimental
 */
{code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
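The scoring arithmetic in the javadoc above is easy to sketch stand-alone. The class and method names below are illustrative, not the patch's code, and this minimal version assumes both envelopes have positive area (the patch additionally special-cases lines and points, as the note explains):

```java
// Minimal sketch of the envelope-on-envelope scoring arithmetic described
// in the javadoc above. Names are illustrative; not the patch's code.
// Assumes positive areas (no line/point special-casing).
public class BBoxScoreSketch {
    /**
     * @param queryArea             area of the query envelope
     * @param targetArea            area of the target (indexed) envelope
     * @param intersectionArea      area of their intersection
     * @param queryTargetProportion 0-1 split of the score between query and target
     */
    static double score(double queryArea, double targetArea,
                        double intersectionArea, double queryTargetProportion) {
        double queryRatio = intersectionArea / queryArea;
        double targetRatio = intersectionArea / targetArea;
        double queryFactor = queryRatio * queryTargetProportion;
        double targetFactor = targetRatio * (1 - queryTargetProportion);
        return queryFactor + targetFactor; // 0-1 when inputs are consistent
    }
}
```

For example, a query envelope of area 100 whose intersection with a target of area 50 covers 25 units scores 0.25 * 0.5 + 0.5 * 0.5 = 0.375 with an even split; identical envelopes score 1.0. Unlike GeoPortal's multiplicative formula, the sum of the two weighted factors stays normalized to 0-1.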
[jira] [Commented] (LUCENE-5714) Improve tests for BBoxStrategy then port to 4x.
[ https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038549#comment-14038549 ] Ryan McKinley commented on LUCENE-5714: --- +1 thanks Improve tests for BBoxStrategy then port to 4x. --- Key: LUCENE-5714 URL: https://issues.apache.org/jira/browse/LUCENE-5714 Project: Lucene - Core Issue Type: Improvement Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.9 Attachments: LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch BBoxStrategy needs better tests before I'm comfortable seeing it in 4x. Specifically it should use random-rectangle-based validation (with rectangles that may cross the dateline), akin to the other tests. And I think I see an equals/hashcode bug to be fixed in there too. One particular thing I'd like to see added is how to handle a zero-area case for AreaSimilarity. I think an additional feature in which you declare a minimum % area (relative to the query shape) would be good. It should be possible for the user to combine rectangle center-point to query shape center-point distance sorting as well. I think it is, but I need to make sure it's possible without _having_ to index a separate center point field. Another possibility (probably not to be addressed here) is a minimum ratio between width/height, perhaps 10%. A long line with nearly no height should not be massively disadvantaged, relevancy-wise, compared to an equivalently long diagonal road that has a square bbox. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038564#comment-14038564 ] Uwe Schindler commented on SOLR-6178: - I think for now, we should at least make the factory deprecated. As we deprecated Jaspell already in 4.9, this should be done before 4.9. Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Fix For: 4.9, 5.0 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-6178: Priority: Blocker (was: Major) Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Priority: Blocker Fix For: 4.9, 5.0 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-6178: Component/s: spellchecker Fix Version/s: 5.0 4.9 Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Fix For: 4.9, 5.0 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-6178: Attachment: SOLR-6178.patch Simple patch for the deprecation. Maybe we should change the default to FSTLookupFactory; functionally, it is the most similar to the Jaspell one. I am not sure how this affects users that rely on the default impl (because they have not specified one in the solrconfig). Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Priority: Blocker Fix For: 4.9, 5.0 Attachments: SOLR-6178.patch Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6184) Replication fetchLatestIndex always failed, that will occur the recovery error.
Raintung Li created SOLR-6184: - Summary: Replication fetchLatestIndex always failed, that will occur the recovery error. Key: SOLR-6184 URL: https://issues.apache.org/jira/browse/SOLR-6184 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1, 4.6 Environment: the index file size is more than 70G Reporter: Raintung Li Copying a full 70 GB index usually takes at least 20 minutes with 100 MB/s network or disk read/write throughput. If a hard commit happens within those 20 minutes, the full-index snap pull fails and the temp folder is removed because the pull task failed. In production, index updates happen every minute, so the redone pull task always fails because the index is always changing, and constantly redoing the pull keeps network and disk usage at a high level. My suggestion: fetchLatestIndex could be retried at some frequency without removing the tmp folder, copying the largest index files first. A redone fetchLatestIndex would then not download the same biggest files again; it would only copy the files from the latest commit, so the task would easily succeed. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
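The retry idea above can be sketched generically. This is not Solr's replication code; RemoteFile, filesToFetch, and the length-equality check are illustrative assumptions about how a retried pull might skip already-downloaded files in the kept tmp folder and fetch the largest files first:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Generic sketch of the suggested retry behavior: keep the tmp dir between
// attempts, skip files a previous attempt already fully downloaded, and
// transfer the largest remaining files first. Not Solr's actual code;
// RemoteFile and the length-equality check are illustrative assumptions.
public class ResumablePullSketch {
    // Hypothetical descriptor of a file in the remote commit point.
    record RemoteFile(String name, long length) {}

    static List<RemoteFile> filesToFetch(List<RemoteFile> remote, File tmpDir) {
        List<RemoteFile> todo = new ArrayList<>();
        for (RemoteFile rf : remote) {
            File local = new File(tmpDir, rf.name());
            // Skip files whose local copy already has the expected length.
            if (!(local.exists() && local.length() == rf.length()))
                todo.add(rf);
        }
        // Largest first, so the expensive transfers survive across retries.
        todo.sort((a, b) -> Long.compare(b.length(), a.length()));
        return todo;
    }
}
```

On a retry only the files changed by the intervening commit (plus anything not yet finished) would be fetched, rather than the whole 70 GB again.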
[jira] [Updated] (SOLR-6184) Replication fetchLatestIndex always failed, that will occur the recovery error.
[ https://issues.apache.org/jira/browse/SOLR-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6184: -- Description: Usually the copy full index 70G need 20 minutes at least, 100M read/write network or disk r/w. If in the 20 minutes happen one hard commit, that means the copy full index snap pull will be failed, the temp folder will be removed because it is failed pull task. In the production, update index will happen in every minute, redo pull task always failed because index always change. And also always redo the pull it will occur the network and disk usage keep the high level. For my suggestion, the fetchLatestIndex can be do again in some frequency. Don't need remove the tmp folder, and copy the largest index at first. Redo the fetchLatestIndex don't download the same biggest file again, only will copy the commit index just now, at last the task will be easy success. was: Usually the copy full index 70G need 20 minutes at least, 100M read/write network or disk r/w. If in the 20 minutes happen one hard commit, that means the copy full index snap pull will be failed, the temp folder will be removed because it is failed pull task. In the production, update index will happen in every minute, redo pull task always failed because index always change. And also always redo the pull it will occur the network and disk usage keep the high level. For my suggestion, the fetchLatestIndex can be done again in some frequency. Don't need remove the tmp folder, and copy the largest index at first. Redo the fetchLatestIndex don't download the same biggest file again, only will copy the commit index just now, at last the task will be easy success. Replication fetchLatestIndex always failed, that will occur the recovery error. 
--- Key: SOLR-6184 URL: https://issues.apache.org/jira/browse/SOLR-6184 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6, 4.6.1 Environment: the index file size is more than 70G Reporter: Raintung Li Usually the copy full index 70G need 20 minutes at least, 100M read/write network or disk r/w. If in the 20 minutes happen one hard commit, that means the copy full index snap pull will be failed, the temp folder will be removed because it is failed pull task. In the production, update index will happen in every minute, redo pull task always failed because index always change. And also always redo the pull it will occur the network and disk usage keep the high level. For my suggestion, the fetchLatestIndex can be do again in some frequency. Don't need remove the tmp folder, and copy the largest index at first. Redo the fetchLatestIndex don't download the same biggest file again, only will copy the commit index just now, at last the task will be easy success. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6184) Replication fetchLatestIndex always failed, that will occur the recovery error.
[ https://issues.apache.org/jira/browse/SOLR-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raintung Li updated SOLR-6184: -- Attachment: Solr-6184.txt Replication fetchLatestIndex always failed, that will occur the recovery error. --- Key: SOLR-6184 URL: https://issues.apache.org/jira/browse/SOLR-6184 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6, 4.6.1 Environment: the index file size is more than 70G Reporter: Raintung Li Attachments: Solr-6184.txt Usually the copy full index 70G need 20 minutes at least, 100M read/write network or disk r/w. If in the 20 minutes happen one hard commit, that means the copy full index snap pull will be failed, the temp folder will be removed because it is failed pull task. In the production, update index will happen in every minute, redo pull task always failed because index always change. And also always redo the pull it will occur the network and disk usage keep the high level. For my suggestion, the fetchLatestIndex can be do again in some frequency. Don't need remove the tmp folder, and copy the largest index at first. Redo the fetchLatestIndex don't download the same biggest file again, only will copy the commit index just now, at last the task will be easy success. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038625#comment-14038625 ] Robert Muir commented on SOLR-6178: --- Not a blocker. Sorry guys. Not even a bug! Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Priority: Minor Fix For: 4.9, 5.0 Attachments: SOLR-6178.patch Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated SOLR-6178: -- Priority: Minor (was: Blocker) Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Priority: Minor Fix For: 4.9, 5.0 Attachments: SOLR-6178.patch Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038627#comment-14038627 ] Robert Muir commented on SOLR-6178: --- This is *WAY* too risky to do right before a release. Please do not commit anything at all to the 4.9 branch. This can definitely be done in 4.10 Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Priority: Minor Fix For: 4.9, 5.0 Attachments: SOLR-6178.patch Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
Adrien Grand created LUCENE-5780: Summary: OrdinalMap's mapping from global ords to segment ords is sometimes wasteful Key: LUCENE-5780 URL: https://issues.apache.org/jira/browse/LUCENE-5780 Project: Lucene - Core Issue Type: Bug Reporter: Adrien Grand Assignee: Adrien Grand Fix For: 4.9, 5.0 Robert found a case where the ordinal map can be quite wasteful in terms of memory usage: in order to be able to resolve values given a global ordinal, it stores two things: - an identifier of the segment where the value is - the difference between the ordinal on the segment and the global ordinal The issue is that OrdinalMap currently picks any of the segments that contain the value but we can do better: we can pick the first segment that has the value. This will help for two reasons: - it will potentially require fewer bits per value to store the segment ids if NRT segments don't introduce new values - if all values happen to appear in the first segment, then the map from global ords to deltas only stores zeros. I just tested on an index where all values are in the first segment and this helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
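The "pick the first segment that has the value" idea can be sketched outside Lucene with a toy model. FirstSegmentOrdMapSketch and its fields are illustrative names, not OrdinalMap's internals; each segment's value list is assumed sorted, as term dictionaries are:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Toy sketch of the idea: map each global ord to the FIRST segment containing
// the value, plus the delta (globalOrd - segmentOrd). If the first segment
// holds all values, every delta is 0 and the segment ids compress well.
// Illustrative only; not Lucene's OrdinalMap.
public class FirstSegmentOrdMapSketch {
    final int[] firstSegment; // per global ord: lowest segment containing the value
    final long[] delta;       // per global ord: globalOrd - segmentOrd

    FirstSegmentOrdMapSketch(List<List<String>> segmentValues) {
        // The global ord space is the sorted union of all segments' values.
        TreeSet<String> union = new TreeSet<>();
        segmentValues.forEach(union::addAll);
        List<String> global = new ArrayList<>(union);
        firstSegment = new int[global.size()];
        delta = new long[global.size()];
        for (int g = 0; g < global.size(); g++) {
            String v = global.get(g);
            for (int s = 0; s < segmentValues.size(); s++) {
                int segOrd = Collections.binarySearch(segmentValues.get(s), v);
                if (segOrd >= 0) {            // first segment containing v wins
                    firstSegment[g] = s;
                    delta[g] = g - segOrd;
                    break;
                }
            }
        }
    }
}
```

With segments ["a","b","c"] and ["b","d"], only "d" needs a non-zero delta and a non-zero segment id; if the first segment contained everything, both arrays would be all zeros, which is the case the 4x memory reduction above exploits.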
[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038634#comment-14038634 ] Robert Muir commented on LUCENE-5780: - I agree conceptually it's a bug, but I think this should be 4.10. It's not anything introduced in 4.9; it's been this way since 4.2. I am already pushing back on other issues such as SOLR-6178, because I ran 125 rounds of tests last night and want to keep things stable. OrdinalMap's mapping from global ords to segment ords is sometimes wasteful --- Key: LUCENE-5780 URL: https://issues.apache.org/jira/browse/LUCENE-5780 Project: Lucene - Core Issue Type: Bug Reporter: Adrien Grand Assignee: Adrien Grand Fix For: 4.9, 5.0 Robert found a case where the ordinal map can be quite wasteful in terms of memory usage: in order to be able to resolve values given a global ordinal, it stores two things: - an identifier of the segment where the value is - the difference between the ordinal on the segment and the global ordinal The issue is that OrdinalMap currently picks any of the segments that contain the value but we can do better: we can pick the first segment that has the value. This will help for two reasons: - it will potentially require fewer bits per value to store the segment ids if NRT segments don't introduce new values - if all values happen to appear in the first segment, then the map from global ords to deltas only stores zeros. I just tested on an index where all values are in the first segment and this helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5780: - Fix Version/s: (was: 4.9) 4.10 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful --- Key: LUCENE-5780 URL: https://issues.apache.org/jira/browse/LUCENE-5780 Project: Lucene - Core Issue Type: Bug Reporter: Adrien Grand Assignee: Adrien Grand Fix For: 5.0, 4.10 Robert found a case when the ordinal map can be quite wasteful in terms of memory usage: in order to be able to resolve values given a global ordinals, it stores two things: - an identifier of the segment where the value is - the difference between the ordinal on the segment and the global ordinal The issue is that OrdinalMap currently picks any of the segments that contain the value but we can do better: we can pick the first segment that has the value. This will help for two reasons: - it will potentially require fewer bits per value to store the segment ids if NRT segments don't introduce new values - if all values happen to appear in the first segment, then the map from global ords to deltas only stores zeros. I just tested on an index where all values are in the first segment and this helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038637#comment-14038637 ] Adrien Grand commented on LUCENE-5780: -- No worries I just updated the fix version. OrdinalMap's mapping from global ords to segment ords is sometimes wasteful --- Key: LUCENE-5780 URL: https://issues.apache.org/jira/browse/LUCENE-5780 Project: Lucene - Core Issue Type: Bug Reporter: Adrien Grand Assignee: Adrien Grand Fix For: 5.0, 4.10 Robert found a case when the ordinal map can be quite wasteful in terms of memory usage: in order to be able to resolve values given a global ordinals, it stores two things: - an identifier of the segment where the value is - the difference between the ordinal on the segment and the global ordinal The issue is that OrdinalMap currently picks any of the segments that contain the value but we can do better: we can pick the first segment that has the value. This will help for two reasons: - it will potentially require fewer bits per value to store the segment ids if NRT segments don't introduce new values - if all values happen to appear in the first segment, then the map from global ords to deltas only stores zeros. I just tested on an index where all values are in the first segment and this helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5714) Improve tests for BBoxStrategy then port to 4x.
[ https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038639#comment-14038639 ] Robert Muir commented on LUCENE-5714: - This isn't going to make 4.9 Improve tests for BBoxStrategy then port to 4x. --- Key: LUCENE-5714 URL: https://issues.apache.org/jira/browse/LUCENE-5714 Project: Lucene - Core Issue Type: Improvement Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.9 Attachments: LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch BBoxStrategy needs better tests before I'm comfortable seeing it in 4x. Specifically it should use random-rectangle-based validation (with rectangles that may cross the dateline), akin to the other tests. And I think I see an equals/hashcode bug to be fixed in there too. One particular thing I'd like to see added is how to handle a zero-area case for AreaSimilarity. I think an additional feature in which you declare a minimum % area (relative to the query shape) would be good. It should be possible for the user to combine rectangle center-point to query shape center-point distance sorting as well. I think it is, but I need to make sure it's possible without _having_ to index a separate center point field. Another possibility (probably not to be addressed here) is a minimum ratio between width/height, perhaps 10%. A long line with nearly no height should not be massively disadvantaged, relevancy-wise, compared to an equivalently long diagonal road that has a square bbox. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5780: - Attachment: LUCENE-5780.patch Here is a patch (built on 4.x since I did the tests on 4.x but it should apply well on trunk too). OrdinalMap's mapping from global ords to segment ords is sometimes wasteful --- Key: LUCENE-5780 URL: https://issues.apache.org/jira/browse/LUCENE-5780 Project: Lucene - Core Issue Type: Bug Reporter: Adrien Grand Assignee: Adrien Grand Fix For: 5.0, 4.10 Attachments: LUCENE-5780.patch Robert found a case when the ordinal map can be quite wasteful in terms of memory usage: in order to be able to resolve values given a global ordinals, it stores two things: - an identifier of the segment where the value is - the difference between the ordinal on the segment and the global ordinal The issue is that OrdinalMap currently picks any of the segments that contain the value but we can do better: we can pick the first segment that has the value. This will help for two reasons: - it will potentially require fewer bits per value to store the segment ids if NRT segments don't introduce new values - if all values happen to appear in the first segment, then the map from global ords to deltas only stores zeros. I just tested on an index where all values are in the first segment and this helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB). -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038644#comment-14038644 ] Uwe Schindler edited comment on SOLR-6178 at 6/20/14 10:10 AM: --- OK, let's split this issue: in any case we need the deprecation, not the change of defaults. I will commit that now. was (Author: thetaphi): OK, let's split this issue: in any case we need the deprecation, not the change of defaults. Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Fix For: 4.9, 5.0 Attachments: SOLR-6178.patch Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038644#comment-14038644 ] Uwe Schindler commented on SOLR-6178: - OK, let's split this issue: in any case we need the deprecation, not the change of defaults. Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Fix For: 4.9, 5.0 Attachments: SOLR-6178.patch Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-6178: Priority: Major (was: Minor) Deprecate Jaspell suggester and change the default -- Key: SOLR-6178 URL: https://issues.apache.org/jira/browse/SOLR-6178 Project: Solr Issue Type: Improvement Components: spellchecker Reporter: Michael McCandless Fix For: 4.9, 5.0 Attachments: SOLR-6178.patch Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 ... and in trunk I'd like to remove it. But first we need to fix Solr to not default to it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-5780:
Attachment: LUCENE-5780.patch
Same patch but with a better assertion.

OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
---
Key: LUCENE-5780
URL: https://issues.apache.org/jira/browse/LUCENE-5780
Project: Lucene - Core
Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
Fix For: 5.0, 4.10
Attachments: LUCENE-5780.patch, LUCENE-5780.patch

Robert found a case where the ordinal map can be quite wasteful in terms of memory usage: in order to be able to resolve values given a global ordinal, it stores two things:
- an identifier of the segment where the value is
- the difference between the ordinal on the segment and the global ordinal

The issue is that OrdinalMap currently picks any of the segments that contain the value, but we can do better: we can pick the first segment that has the value. This will help for two reasons:
- it will potentially require fewer bits per value to store the segment ids if NRT segments don't introduce new values
- if all values happen to appear in the first segment, then the map from global ords to deltas only stores zeros.

I just tested on an index where all values are in the first segment and this helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).
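The (segment id, delta) encoding described in the issue can be sketched roughly as follows. This is an illustrative toy, not Lucene's actual OrdinalMap code; `OrdinalMapSketch` and `encode` are hypothetical names. With `preferFirstSegment` enabled, an index whose first segment contains every value yields all-zero segment ids and all-zero deltas, which is exactly the case the issue reports compressing well:

```java
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

public class OrdinalMapSketch {

    /**
     * segmentTerms: each segment's sorted, de-duplicated term dictionary.
     * Returns { segmentIds, deltas }: for each global ordinal, the segment to
     * look the value up in, and (globalOrd - segmentOrd) within that segment.
     */
    public static int[][] encode(List<List<String>> segmentTerms, boolean preferFirstSegment) {
        // The global ordinal space is the sorted union of all segment values.
        TreeSet<String> global = new TreeSet<>();
        for (List<String> terms : segmentTerms) {
            global.addAll(terms);
        }
        int[] segmentIds = new int[global.size()];
        int[] deltas = new int[global.size()];
        int globalOrd = 0;
        for (String value : global) {
            int chosenSeg = -1, chosenOrd = -1;
            for (int seg = 0; seg < segmentTerms.size(); seg++) {
                int ord = Collections.binarySearch(segmentTerms.get(seg), value);
                if (ord >= 0) {
                    chosenSeg = seg;
                    chosenOrd = ord;
                    if (preferFirstSegment) {
                        break; // LUCENE-5780: stop at the first segment that has the value
                    }
                    // otherwise keep scanning and end up with the last match,
                    // standing in for the old "any segment that contains it" behaviour
                }
            }
            segmentIds[globalOrd] = chosenSeg;
            deltas[globalOrd] = globalOrd - chosenOrd;
            globalOrd++;
        }
        return new int[][] { segmentIds, deltas };
    }
}
```

With two segments `[a,b,c,d]` and `[b,c]`, preferring the first segment produces all-zero segment ids and deltas, while the "any segment" policy mixes segment ids and non-zero deltas; fewer distinct values means fewer bits per entry in a packed representation.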
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038652#comment-14038652 ] ASF subversion and git services commented on SOLR-6178:
---
Commit 1604124 from [~thetaphi] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1604124 ]
Merged revision(s) 1604122 from lucene/dev/trunk: SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038648#comment-14038648 ] ASF subversion and git services commented on SOLR-6178:
---
Commit 1604122 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1604122 ]
SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory
[jira] [Commented] (LUCENE-5775) JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError
[ https://issues.apache.org/jira/browse/LUCENE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038653#comment-14038653 ] ASF subversion and git services commented on LUCENE-5775:
---
Commit 1604124 from [~thetaphi] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1604124 ]
Merged revision(s) 1604122 from lucene/dev/trunk: SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory

JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError
---
Key: LUCENE-5775
URL: https://issues.apache.org/jira/browse/LUCENE-5775
Project: Lucene - Core
Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
Fix For: 4.9, 5.0

I hit this when trying to run LookupBenchmarkTest for LUCENE-5752:
{noformat}
[junit4] 2 NOTE: reproduce with: ant test -Dtestcase=LookupBenchmarkTest -Dtests.method=testStorageNeeds -Dtests.seed=EA0FADB2EE37D385 -Dtests.locale=es_ES -Dtests.timezone=Etc/Greenwich -Dtests.file.encoding=UTF-8
[junit4] ERROR 1.89s | LookupBenchmarkTest.testStorageNeeds
[junit4] Throwable #1: java.lang.StackOverflowError
[junit4] at __randomizedtesting.SeedInfo.seed([EA0FADB2EE37D385:DF8106BCB29C472F]:0)
[junit4] at java.lang.Class.getMethod0(Class.java:2774)
[junit4] at java.lang.Class.isCheckMemberAccessOverridden(Class.java:2214)
[junit4] at java.lang.Class.checkMemberAccess(Class.java:2233)
[junit4] at java.lang.Class.getDeclaredFields(Class.java:1805)
[junit4] at org.apache.lucene.util.RamUsageEstimator.shallowSizeOfInstance(RamUsageEstimator.java:351)
[junit4] at org.apache.lucene.util.RamUsageEstimator.shallowSizeOf(RamUsageEstimator.java:329)
[junit4] at org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:100)
[junit4] at org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4] ... (the frame at JaspellTernarySearchTrie.java:103 repeats until the stack overflows)
{noformat}
I think we should just remove/deprecate this suggester? The FST based suggesters are far more RAM efficient...
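The stack trace above shows `TSTNode.ramBytesUsed` recursing once per trie node, so a sufficiently deep (e.g. degenerate, list-shaped) trie exhausts the thread stack. A minimal sketch of the usual fix, replacing recursion with an explicit work stack; `TrieNode`, its child fields, and the fixed 48-byte shallow size are illustrative stand-ins, not the actual Jaspell code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TrieNode {
    TrieNode loKid, eqKid, hiKid;
    long shallowSizeBytes = 48; // stand-in for RamUsageEstimator.shallowSizeOf(this)

    /** Iterative traversal: accounting cost is bounded by the heap, not the thread stack. */
    long ramBytesUsed() {
        long total = 0;
        Deque<TrieNode> stack = new ArrayDeque<>();
        stack.push(this);
        while (!stack.isEmpty()) {
            TrieNode node = stack.pop();
            total += node.shallowSizeBytes;
            if (node.loKid != null) stack.push(node.loKid);
            if (node.eqKid != null) stack.push(node.eqKid);
            if (node.hiKid != null) stack.push(node.hiKid);
        }
        return total;
    }
}
```

A chain of 100,000 nodes, which would overflow a default thread stack under the recursive version, traverses fine here because the explicit deque never holds more than the pending siblings.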
[jira] [Commented] (LUCENE-5775) JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError
[ https://issues.apache.org/jira/browse/LUCENE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038649#comment-14038649 ] ASF subversion and git services commented on LUCENE-5775:
---
Commit 1604122 from [~thetaphi] in branch 'dev/trunk' [ https://svn.apache.org/r1604122 ]
SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory
[jira] [Commented] (LUCENE-5775) JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError
[ https://issues.apache.org/jira/browse/LUCENE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038655#comment-14038655 ] ASF subversion and git services commented on LUCENE-5775:
---
Commit 1604125 from [~thetaphi] in branch 'dev/branches/lucene_solr_4_9' [ https://svn.apache.org/r1604125 ]
Merged revision(s) 1604122 from lucene/dev/trunk: SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038656#comment-14038656 ] Robert Muir commented on SOLR-6178:
---
This didn't make 4.9.
[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-6178:
Summary: Deprecate Jaspell suggester (was: Deprecate Jaspell suggester and change the default)
[jira] [Resolved] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved SOLR-6178.
Resolution: Fixed
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038654#comment-14038654 ] ASF subversion and git services commented on SOLR-6178:
---
Commit 1604125 from [~thetaphi] in branch 'dev/branches/lucene_solr_4_9' [ https://svn.apache.org/r1604125 ]
Merged revision(s) 1604122 from lucene/dev/trunk: SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory
[jira] [Updated] (SOLR-6185) Change suggester default in Solr, because Jaspell is deprecated
[ https://issues.apache.org/jira/browse/SOLR-6185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated SOLR-6185:
Affects Version/s: 4.9
Fix Version/s: 5.0
Issue Type: Task (was: Bug)

Change suggester default in Solr, because Jaspell is deprecated
---
Key: SOLR-6185
URL: https://issues.apache.org/jira/browse/SOLR-6185
Project: Solr
Issue Type: Task
Affects Versions: 4.9
Reporter: Uwe Schindler
Fix For: 5.0

In SOLR-6178 and LUCENE-5775 we deprecated JaspellLookup, but it is still the default in 4.9; Solr warns only if you explicitly use it. We should change the default for 5.0.
[jira] [Created] (SOLR-6185) Change suggester default in Solr, because Jaspell is deprecated
Uwe Schindler created SOLR-6185:
---
Summary: Change suggester default in Solr, because Jaspell is deprecated
Key: SOLR-6185
URL: https://issues.apache.org/jira/browse/SOLR-6185
Project: Solr
Issue Type: Bug
Reporter: Uwe Schindler
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038657#comment-14038657 ] Uwe Schindler commented on SOLR-6178:
---
Sorry, I don't see a release.
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038659#comment-14038659 ] Robert Muir commented on SOLR-6178:
---
Well, you can vote however you like. I already spun the RC and it's in progress. It's risky as shit to add such deprecations, which will cause warning messages for Solr users and create controversy. There is no need to rush such things in at the last minute.
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038664#comment-14038664 ] Michael McCandless commented on SOLR-6178:
---
In hindsight I should have just backported the bug fix in LUCENE-5775 (stack overflow when calling .ramBytesUsed()), and not the deprecation, for 4.9. Next time I'll make separate issues. Sorry for the hassle.
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038663#comment-14038663 ] Uwe Schindler commented on SOLR-6178:
---
This was not last minute. Mike just missed adding the deprecation in Solr as well. If we go this way, please also revert Mike's commit. Sorry.
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038669#comment-14038669 ] Robert Muir commented on SOLR-6178:
---
It's last minute because it is potentially controversial. Deprecating the Lucene suggester has no impact on Solr users; putting a big warning in the default configuration is a different story. I didn't revert anything, your commit just didn't make it in time for the release candidate. I don't want such controversy: I care about keeping things stable and about bad bugs. I can EASILY see such a warning turning into "we should change the Solr default" and spinning totally out of control and somehow breaking the example. I waited extra time for SOLR-6182 because it looked like a really bad bug, and if a similar bug pops up I won't have a second thought about respinning for it. But by no means is this a blocker.
[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038674#comment-14038674 ] Robert Muir commented on LUCENE-5780:
---
This looks good (+1 to commit to trunk/4.10), but I think we can do better by explicitly sorting. E.g. take a long[] sizes parameter (it can be optional; all zeros would give us what we have today if we use a stable sort) that the user could populate with either valueCount or the number of docs in the segment (both are probably fine heuristics). I know this means we will need an array to remap lookups, but this only happens once per segment with the new LongValues API, so it won't impact performance.
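The stable-sort-plus-remap idea in the comment above can be sketched as follows; `SegmentOrderSketch` and `newToOld` are hypothetical illustration names, not Lucene's actual OrdinalMap code. Because the sort is stable, an all-zero weights array reproduces the original segment order, matching today's behaviour:

```java
import java.util.Arrays;

public class SegmentOrderSketch {

    /**
     * Returns newToOld: newToOld[i] is the original index of the segment to
     * visit i-th, ordering segments by descending weight (e.g. valueCount or
     * doc count). Stable sort, so equal weights keep their relative order.
     */
    public static int[] newToOld(long[] weights) {
        Integer[] order = new Integer[weights.length];
        for (int i = 0; i < weights.length; i++) {
            order[i] = i;
        }
        // Arrays.sort on an Object[] is guaranteed stable.
        Arrays.sort(order, (a, b) -> Long.compare(weights[b], weights[a]));
        int[] remap = new int[weights.length];
        for (int i = 0; i < weights.length; i++) {
            remap[i] = order[i];
        }
        return remap;
    }
}
```

The returned array is the "array to remap lookups" the comment mentions: a lookup against reordered segment i consults original segment newToOld[i], and that indirection is paid once per segment rather than per value.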
[jira] [Created] (SOLR-6186) Highlight on parent in block join query.
Rob Bamberg created SOLR-6186:
---
Summary: Highlight on parent in block join query.
Key: SOLR-6186
URL: https://issues.apache.org/jira/browse/SOLR-6186
Project: Solr
Issue Type: Improvement
Components: highlighter
Affects Versions: 4.8.1
Reporter: Rob Bamberg
Fix For: 4.9, 5.0

The issue I have is with the highlighting component during block joins, where I am searching on the parent document but returning the children. Since I am returning the children, the snippets also apply to the children, but I am actually searching in the parent. (Search query example: {!child of=type:parent}q ) Could it be possible to return snippets for the parent on which I am actually searching?
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038726#comment-14038726 ] Uwe Schindler commented on SOLR-6178:
---
In my opinion, we should revert this one *and* the deprecation inside Lucene. To me it is also very controversial if Solr defaults to something that is deprecated in Lucene because it uses too much memory! I agree to revert this issue if the deprecation in Lucene is also reverted.
[VOTE] 4.9.0
Artifacts here: http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/
Here's my +1
SUCCESS! [0:35:36.654925]
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038732#comment-14038732 ] Robert Muir commented on SOLR-6178:
---
Well, that's your opinion. There is absolutely no way I am going to let a Lucene deprecation annotation have a destabilizing snowball effect on this release: first it's a deprecation warning, then someone doesn't want the warning in the default configuration and wants a change of defaults, then the change of defaults breaks the example, then the change to the example breaks the documentation or tutorial, and on and on. There is just no reason for such risks.
[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038734#comment-14038734 ] ASF subversion and git services commented on LUCENE-5780:
---
Commit 1604157 from [~jpountz] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1604157 ]
LUCENE-5780: Make OrdinalMap more memory-efficient.
[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038733#comment-14038733 ]

Adrien Grand commented on LUCENE-5780:
---
+1 I had planned to open a follow-up issue about it. Will open it shortly.
[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038736#comment-14038736 ]

ASF subversion and git services commented on LUCENE-5780:
---
Commit 1604158 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1604158 ]
LUCENE-5780: Make OrdinalMap more memory-efficient.
[jira] [Updated] (LUCENE-5714) Improve tests for BBoxStrategy then port to 4x.
[ https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley updated LUCENE-5714:
---
Fix Version/s: (was: 4.9)
               4.10

That's right [~rcmuir]; it was assigned as such prior to your release branch. I bumped it to 4.10.

> Improve tests for BBoxStrategy then port to 4x.
> ---
> Key: LUCENE-5714
> URL: https://issues.apache.org/jira/browse/LUCENE-5714
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial
> Reporter: David Smiley
> Assignee: David Smiley
> Fix For: 4.10
> Attachments: LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch
>
> BBoxStrategy needs better tests before I'm comfortable seeing it in 4x. Specifically, it should use random-rectangle-based validation (rectangles that may cross the dateline), akin to the other tests. And I think I see an equals/hashcode bug to be fixed in there too.
> One particular thing I'd like to see addressed is how to handle a zero-area case for AreaSimilarity. I think an additional feature in which you declare a minimum % area (relative to the query shape) would be good. It should be possible for the user to combine rectangle center-point to query shape center-point distance sorting as well. I think it is, but I need to make sure it's possible without _having_ to index a separate center point field.
> Another possibility (probably not to be addressed here) is a minimum ratio between width/height, perhaps 10%. A long line with nearly no height should not be massively disadvantaged relevancy-wise compared to an equivalently long diagonal road that has a square bbox.
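[Editor's note] The two relevancy guards floated in the description above (a minimum % area relative to the query shape, and a minimum width/height ratio of roughly 10%) could look something like the following. All names, thresholds, and the overlap formula here are hypothetical illustrations, not the actual AreaSimilarity patch:

```python
# Boxes are (min_x, min_y, max_x, max_y) tuples.

def bbox_area(box, min_aspect=0.1):
    """Area with an aspect-ratio floor: a long, nearly flat box keeps at
    least min_aspect of the other dimension, so it is not scored as zero."""
    w = box[2] - box[0]
    h = box[3] - box[1]
    w = max(w, min_aspect * h)
    h = max(h, min_aspect * w)
    return w * h

def area_similarity(indexed_box, query_box, min_area_frac=0.01):
    """Crude overlap score: intersection area over query area, with boxes
    below a minimum fraction of the query's area scored as 0."""
    qa = bbox_area(query_box)
    ia = bbox_area(indexed_box)
    if qa == 0 or ia < min_area_frac * qa:
        return 0.0
    ix = max(0.0, min(indexed_box[2], query_box[2]) - max(indexed_box[0], query_box[0]))
    iy = max(0.0, min(indexed_box[3], query_box[3]) - max(indexed_box[1], query_box[1]))
    return (ix * iy) / qa

print(area_similarity((0, 0, 5, 10), (0, 0, 10, 10)))  # 0.5
```

Under this sketch a zero-height "road" bbox still gets a nonzero area thanks to the aspect floor, which is the fairness point the last paragraph makes.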
[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester
[ https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038739#comment-14038739 ]

Robert Muir commented on SOLR-6178:
---
Also, changing the default gets even more complex. It's been Jaspell for a long time, and changing the default could cause a ton of confusion (users think 4.9 cannot read their index). Even though we know that suggester data is really not part of the index, the user may not see it that way: to them it's their data, and the upgrade broke them. Such a change needs to be planned out from a docs and release notes perspective, not shoved in hastily at the last minute. It's too risky.
[jira] [Resolved] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
[ https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrien Grand resolved LUCENE-5780.
---
Resolution: Fixed
Re: [VOTE] 4.9.0
+1
SUCCESS! [0:47:26.115239]

Mike McCandless
http://blog.mikemccandless.com

On Fri, Jun 20, 2014 at 8:13 AM, Robert Muir <rcm...@gmail.com> wrote:
> Artifacts here:
> http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/
> Here's my +1
> SUCCESS! [0:35:36.654925]
[jira] [Created] (SOLR-6187) facet.mincount ignored in range date faceting using distributed search
Zaccheo Bagnati created SOLR-6187:
---
Summary: facet.mincount ignored in range date faceting using distributed search
Key: SOLR-6187
URL: https://issues.apache.org/jira/browse/SOLR-6187
Project: Solr
Issue Type: Bug
Components: search
Affects Versions: 4.8.1, 4.8
Reporter: Zaccheo Bagnati

While I was trying to do range faceting with gap +1YEAR using shards, I noticed that the facet.mincount parameter seems to be ignored. The issue can be reproduced in this way. Create 2 cores, testshard1 and testshard2, with:

solrconfig.xml:
{code:xml}
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_41</luceneMatchVersion>
  <lib dir="/opt/solr/dist" regex="solr-cell-.*\.jar"/>
  <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
  <updateHandler class="solr.DirectUpdateHandler2" />
  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">10</int>
      <str name="df">id</str>
    </lst>
  </requestHandler>
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/admin/" class="org.apache.solr.handler.admin.AdminHandlers" />
  <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
    <lst name="invariants">
      <str name="q">solrpingquery</str>
    </lst>
    <lst name="defaults">
      <str name="echoParams">all</str>
    </lst>
  </requestHandler>
</config>
{code}

schema.xml:
{code:xml}
<?xml version="1.0" ?>
<schema name="${solr.core.name}" version="1.5" xmlns:xi="http://www.w3.org/2001/XInclude">
  <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
  <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
  <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
  <field name="_version_" type="long" indexed="true" stored="true"/>
  <field name="id" type="int" indexed="true" stored="true" multiValued="false" />
  <field name="date" type="date" indexed="true" stored="true" multiValued="false" />
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>id</defaultSearchField>
</schema>
{code}

Insert into testshard1:
{code:xml}
<add><doc><field name="id">1</field><field name="date">2014-06-20T12:51:00Z</field></doc></add>
{code}

Insert into testshard2:
{code:xml}
<add><doc><field name="id">2</field><field name="date">2013-06-20T12:51:00Z</field></doc></add>
{code}

Now if I execute:
{noformat}
curl "http://localhost:8983/solr/testshard1/select?q=id:1&facet=true&facet.mincount=1&facet.range=date&f.date.facet.range.start=1900-01-01T00:00:00Z&f.date.facet.range.end=NOW&f.date.facet.range.gap=%2B1YEAR&shards=localhost%3A8983%2Fsolr%2Ftestshard1%2Clocalhost%3A8983%2Fsolr%2Ftestshard2&shards.info=true&wt=json"
{noformat}

I obtain:
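[Editor's note] For context on where mincount has to be applied in a distributed request, here is a toy merge of per-shard range-facet counts. Function names and data shapes are hypothetical, not Solr's FacetComponent; the point is that pruning must happen only after all shard counts are summed:

```python
# Each shard reports a count for every range bucket (most are 0 when the
# range starts at 1900). The coordinator sums them, then applies mincount.

def merge_range_facets(shard_counts, mincount=1):
    merged = {}
    for counts in shard_counts:
        for bucket, n in counts.items():
            merged[bucket] = merged.get(bucket, 0) + n
    # prune only after all shards are merged; skipping this step is the
    # symptom reported in this issue (empty buckets leak into the response)
    return {b: n for b, n in merged.items() if n >= mincount}

shard1 = {"1900": 0, "2013": 0, "2014": 1}
shard2 = {"1900": 0, "2013": 1, "2014": 0}
print(merge_range_facets([shard1, shard2]))  # {'2013': 1, '2014': 1}
```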
[jira] [Created] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter
Ahmet Arslan created SOLR-6188:
---
Summary: solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter
Key: SOLR-6188
URL: https://issues.apache.org/jira/browse/SOLR-6188
Project: Solr
Issue Type: Bug
Components: Schema and Analysis
Affects Versions: 4.8.1
Reporter: Ahmet Arslan

When the fully qualified class name {{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}} is used in schema.xml, it works. However, as documented in Confluence and the wiki, when {{solr.ICUFoldingFilterFactory}} is used it throws the following exception. This is true for both the released 4.8.1 version and trunk r1604168.

The following type works:
{code:xml}
<fieldType name="folded2" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>
{code}

This does not:
{code:xml}
<fieldType name="folded" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
</fieldType>
{code}

{noformat}
257 [main] ERROR org.apache.solr.core.SolrCore – Error loading core: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: org/apache/lucene/analysis/icu/ICUFoldingFilter
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:188)
	at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301)
	at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190)
	at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137)
	at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
	at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719)
	at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265)
	at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252)
	at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710)
	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
	at org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39)
	at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186)
	at org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494)
	at org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141)
	at org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145)
	at org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:56)
	at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:609)
	at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:540)
	at org.eclipse.jetty.util.Scanner.scan(Scanner.java:403)
	at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:337)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
	at org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:121)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
	at org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:555)
	at org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:230)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
	at org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:81)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:58)
	at org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:96)
	at org.eclipse.jetty.server.Server.doStart(Server.java:280)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
	at org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1259)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1182)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at
{noformat}
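[Editor's note] For background on the failing lookup: the {{solr.}} prefix is not a real Java package; Solr's resource loader rewrites it against a list of candidate packages until a class loads. A toy model of that resolution follows; the package list and function names here are illustrative, not the actual SolrResourceLoader code:

```python
# Hypothetical candidate packages tried for the "solr." shorthand.
CANDIDATE_PACKAGES = [
    "org.apache.solr.analysis",
    "org.apache.lucene.analysis.core",
    "org.apache.lucene.analysis.icu",
]

def resolve(short_name, loadable):
    """loadable: set of fully qualified class names visible to the loader.
    Returns the resolved fully qualified name, or None if nothing loads."""
    if not short_name.startswith("solr."):
        return short_name if short_name in loadable else None
    suffix = short_name[len("solr."):]
    for pkg in CANDIDATE_PACKAGES:
        fq = pkg + "." + suffix
        if fq in loadable:
            return fq
    return None
```

Under this model, if the fully qualified name loads but the shorthand does not, the shorthand's candidate-package search is failing rather than the classpath itself, which is the distinction debated in the comments below.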
[jira] [Updated] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ahmet Arslan updated SOLR-6188:
---
Fix Version/s: 4.9
[jira] [Updated] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated SOLR-6188:
---
Fix Version/s: (was: 4.9)
               4.10
[jira] [Commented] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038783#comment-14038783 ]

Robert Muir commented on SOLR-6188:
---
Usually this is because you have not configured the correct classpath.
[jira] [Commented] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038790#comment-14038790 ]

Ahmet Arslan commented on SOLR-6188:
---
Hi Robert, I thought the same at first, but I have these two jars, {{icu4j-53.1.jar}} and {{lucene-analyzers-icu-5.0-SNAPSHOT.jar}}, inside the solr-trunk/solr/example/solr/collection1/lib directory. Besides, if it were a classpath problem, it shouldn't work when org.apache.lucene.analysis.icu.ICUFoldingFilterFactory is used, either. I am downloading your 4.9 release candidate; I will test it with that too.
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038797#comment-14038797 ] Elaine Cario commented on SOLR-4793: I'm finding that setting jute.maxbuffer both in Solr (a -D option at Tomcat startup) and ZooKeeper (zoo.cfg) doesn't seem to work (at least with Solr 4.8). This is really becoming a blocker for us, as we are using index-time synonym replacement as a poor man's lemmatization, and these files get quite large. It would be nice to have some option to manage these files outside of ZooKeeper. Solr Cloud can't upload large config files (> 1MB) to Zookeeper - Key: SOLR-4793 URL: https://issues.apache.org/jira/browse/SOLR-4793 Project: Solr Issue Type: Improvement Reporter: Son Nguyen Zookeeper sets the znode size limit to 1MB by default, so we can't start Solr Cloud with some large config files, like synonyms.txt. Jan Høydahl has a good idea: SolrCloud is designed with an assumption that you should be able to upload your whole disk-based conf folder into ZK, and that you should be able to add an empty Solr node to a cluster and it would download all config from ZK. So a splitting strategy automatically handled by ZkSolrResourceLoader for large files could be one way forward, i.e. store synonyms.txt as e.g. __001_synonyms.txt __002_synonyms.txt -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
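For context on the workaround discussed in this thread: jute.maxbuffer is read only as a JVM system property, so it has to reach the command line of both the ZooKeeper server and every Solr node. A minimal sketch (the 50 MB value and the variable usage are assumptions for illustration, not from this thread):

```shell
# jute.maxbuffer is a byte count; both sides must agree on it (assumed 50 MB).
MAXBUF=50000000

# ZooKeeper side: flags that must end up on the server JVM command line.
ZK_FLAGS="-Djute.maxbuffer=${MAXBUF}"

# Solr side: append the same flag to the servlet container's startup options.
SOLR_OPTS="${SOLR_OPTS:-} -Djute.maxbuffer=${MAXBUF}"

echo "zk:   ${ZK_FLAGS}"
echo "solr: ${SOLR_OPTS}"
```

Setting it only on one side typically still fails, since both the client writing the znode and the server accepting it enforce the limit.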
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038810#comment-14038810 ] Yago Riveiro commented on SOLR-4793: I think version 4.8 updated the bundled ZooKeeper to 3.4.6. If the workaround doesn't work, then it is a serious issue if you have a large number of collections and replicas, because all the metadata about the cluster is in the clusterstate.json file. [~ecario], how did you notice that the workaround doesn't work? Do you have any logs? And one last question: did you upgrade Solr from 4.7 to 4.8, or is this a fresh install?
Re: [VOTE] 4.9.0
Hi,
here is what I do:
* download solr-4.9.0.tgz
* add icu4j-53.1.jar, solr-analysis-extras-4.9.0.jar and lucene-analyzers-icu-4.9.0.jar to solr-4.9.0/example/solr/collection1/lib/
* confirm they are loaded:
INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/icu4j-53.1.jar' to classloader
INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/lucene-analyzers-icu-4.9.0.jar' to classloader
INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/solr-analysis-extras-4.9.0.jar' to classloader
icu4j-53.1.jar is loaded twice:
INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/contrib/extraction/lib/icu4j-53.1.jar' to classloader
* add <filter class="solr.ICUFoldingFilterFactory"/> to the example schema.xml
* java -jar start.jar yields the exception reported in SOLR-6188
When <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/> is used everything works fine.
Thanks,
Ahmet
On Friday, June 20, 2014 3:55 PM, Michael McCandless luc...@mikemccandless.com wrote:
+1 SUCCESS! [0:47:26.115239]
Mike McCandless
http://blog.mikemccandless.com
On Fri, Jun 20, 2014 at 8:13 AM, Robert Muir rcm...@gmail.com wrote:
Artifacts here: http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/
Here's my +1 SUCCESS! [0:35:36.654925]
Re: [VOTE] 4.9.0
I don't think this issue needs to cause a respin: it's in a contrib module, there is a workaround, and the fix is either some classpath problem or, in case it is a bug, likely involves dangerous territory (e.g. SolrResourceLoader)...
On Fri, Jun 20, 2014 at 10:03 AM, Ahmet Arslan iori...@yahoo.com.invalid wrote: [...]
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038834#comment-14038834 ] Nicole Lacoste commented on SOLR-4793: -- Elaine, we got stuck on the same thing, and if I remember right we put the synonym files in a matching folder on each of the machines and put the full path in the schema. It means that if there is an update you have to take care of it yourself. I am not sure whether the REST API for synonyms works in this case; you'd have to test that.
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038835#comment-14038835 ] Elaine Cario commented on SOLR-4793: I got this exception in the logs, they always happen on the lemmatization files, which run anywhere from 2MB to 20MB in size: ERROR - 2014-06-20 09:29:59.082; org.apache.solr.core.ZkContainer; org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /configs/dictindex_uk/en_lemmatization.txt at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) at org.apache.solr.common.cloud.SolrZkClient$10.execute(SolrZkClient.java:432) at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:73) at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:429) at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:339) at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:1318) at org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:1355) at org.apache.solr.cloud.ZkController.bootstrapConf(ZkController.java:1565) at org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:188) at org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:67) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:216) at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:189) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:136) at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:279) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:260) at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:105) at 
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4809) at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5485) at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:632) at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:1073) at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1857) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) ERROR - 2014-06-20 09:29:59.087; org.apache.solr.servlet.SolrDispatchFilter; Could not start Solr. Check solr/home property and the logs We were working around it by splitting the files up into 1MB chunks, but that sometimes worked and sometimes didn't (we'd still get errors), and for the larger files it's, well, a pain to list 20 files in the schema. We tried moving the files, but then it complains that relative paths aren't allowed in the specification. This is a fresh install of 4.8 (technically it was an upgrade, but we removed the index and re-fed all our content). But we didn't upgrade ZooKeeper, so I will try that and report back.
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038842#comment-14038842 ] Nicole Lacoste commented on SOLR-4793: -- I also had trouble with relative paths; try full paths.
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038845#comment-14038845 ] Elaine Cario commented on SOLR-4793: Scratch that, we're already on ZooKeeper 3.4.6. Thanks Nicole - that was going to be my very next step. Updates aren't an issue: the files are under version control, and our deployment scripts can take care of moving them wherever we need to make it work :-). Will report back...
[jira] [Resolved] (SOLR-5661) PriorityQueue has OOM (Requested array size exceeds VM limit) issue
[ https://issues.apache.org/jira/browse/SOLR-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-5661. - Resolution: Fixed Fix Version/s: (was: 4.9) 4.7 This was resolved in 4.7 PriorityQueue has OOM (Requested array size exceeds VM limit) issue --- Key: SOLR-5661 URL: https://issues.apache.org/jira/browse/SOLR-5661 Project: Solr Issue Type: Bug Components: contrib - Solr Cell (Tika extraction) Affects Versions: 4.3.1, 4.4, 4.5, 4.5.1, 4.6 Environment: JDK 7 Reporter: Raintung Li Assignee: Michael McCandless Fix For: 5.0, 4.7 Attachments: patch-5661.txt It looks like JDK 7 changed the max_array_length logic: it isn't max_jint, it is max_jint - header_size(type). If you pass Integer.MAX_VALUE when creating the PriorityQueue and have enough memory, it is fine on JVM 6 but does not work on JVM 7: JVM 7 throws an OOM error while doing array range checking. It appears to be a compatibility issue between JVM 6 and JVM 7. We may need to guard against it in the code logic; throwing OOM looks like a big issue for customers.
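The JDK 7 behaviour described above can be sketched generically. This is not Lucene's actual fix, only an illustration of the guard; the headroom constant of 8 elements is an assumption (a common conservative choice, since JVMs reserve a few header words per array):

```java
// Sketch: requesting an array of Integer.MAX_VALUE elements can throw
// OutOfMemoryError ("Requested array size exceeds VM limit") on JDK 7 even
// with enough heap, because the VM's real cap is max_jint minus the array
// header. Clamping the requested size avoids that failure mode.
public class MaxArraySketch {
    // Assumed portable cap: a few elements below Integer.MAX_VALUE.
    static final int MAX_ARRAY_LENGTH = Integer.MAX_VALUE - 8;

    static int clampedSize(int requested) {
        if (requested < 0) {
            throw new IllegalArgumentException("negative size: " + requested);
        }
        return Math.min(requested, MAX_ARRAY_LENGTH);
    }

    public static void main(String[] args) {
        System.out.println(clampedSize(Integer.MAX_VALUE)); // clamped below the VM limit
        System.out.println(clampedSize(1024));              // small sizes pass through
    }
}
```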
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038868#comment-14038868 ] Elaine Cario commented on SOLR-4793: Absolute paths didn't work either - it seems to turn it into a relative path, and still uses the ZkSolrResourceLoader to find it. I may have to either split the files or dig into the FSTSynonymFilterFactory to figure out some other workaround. ERROR - 2014-06-20 10:45:13.522; org.apache.solr.core.CoreContainer; Unable to create core: dictindex_uk org.apache.solr.common.SolrException: Could not load core configuration for core dictindex_uk at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:66) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:261) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:253) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.lang.RuntimeException: java.io.IOException: Error opening /configs/dictindex_uk//apps/solr/collections/en_lemmatization.txt at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:169) at org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55) at org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69) at org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:89) at org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:62) ...
9 more Caused by: java.io.IOException: Error opening /configs/dictindex_uk//apps/solr/collections/en_lemmatization.txt at org.apache.solr.cloud.ZkSolrResourceLoader.openResource(ZkSolrResourceLoader.java:83) at org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory.loadSynonyms(FSTSynonymFilterFactory.java:137) at org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory.inform(FSTSynonymFilterFactory.java:112) at org.apache.lucene.analysis.synonym.SynonymFilterFactory.inform(SynonymFilterFactory.java:90) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:675) at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:167) ... 13 more
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038874#comment-14038874 ] Yago Riveiro commented on SOLR-4793: Elaine, can you paste the configuration for Tomcat and ZooKeeper that you have for jute.maxbuffer?
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038889#comment-14038889 ] Elaine Cario commented on SOLR-4793: Here is the SOLR_OPTS for Tomcat:
SOLR_OPTS="-Dhost.port=8080 -Dhost.context=solr -Dsolr.install.dir=${SOLR_DIST} -Dsolr.solr.home=${SOLR_HOME} -Dsolr.base.data.dir=$SOLR_DATA -Dbootstrap_conf=true -Dsolr.log=${SOLR_LOGDIR} -Dcollection.configName=wkcontent -DnumShards=2 -DzkHost=10.208.152.230:2181 -Djute.maxbuffer=5000 -Dlog4j.configuration=file:///apps/solr/collections/log4j.properties"
Here is zoo.cfg:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/apps/zookeeper/zk_datadir
dataLogDir=/apps/zookeeper/zk_logs
clientPort=2181
server.1=zookeep01:2881:3881
jute.maxbuffer=5000
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038898#comment-14038898 ] Yago Riveiro commented on SOLR-4793: About Tomcat's configuration, I have the same configuration. In the case of ZooKeeper I have all custom configuration in a file named zookeeper-env.sh, located in the bin/conf folder, with this content: {code} #!/usr/bin/env bash ZOO_ENV=-Djute.maxbuffer=5000 {code}
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038913#comment-14038913 ] Elaine Cario commented on SOLR-4793: Yago, I was going to ask how that gets loaded, but I just found an old post of yours with more info; let me try that (reposting it here): In my case I have: - ZooKeeper: a file in the conf folder named zookeeper-env.sh with: #!/usr/bin/env bash ZOO_ENV=-Djute.maxbuffer=10485761 The zookeeper-env.sh is loaded automatically by zkEnv.sh if the file exists.
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038918#comment-14038918 ] Yago Riveiro commented on SOLR-4793: Indeed, after diving into the zkEnv.sh file, I realised that if zookeeper-env.sh exists, ZooKeeper appends those configurations to the init command.
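The sourcing mechanism described in this comment can be sketched as follows. The paths are assumptions, and the snippet only reproduces the conditional-source idiom rather than running ZooKeeper itself:

```shell
# Create a throwaway conf dir with a zookeeper-env.sh, then source it the
# way zkEnv.sh does when the file exists.
ZOOCFGDIR="$(mktemp -d)"                      # stand-in for ZooKeeper's conf/
cat > "${ZOOCFGDIR}/zookeeper-env.sh" <<'EOF'
ZOO_ENV="-Djute.maxbuffer=10485761"
EOF

# The conditional-source idiom used by zkEnv.sh:
if [ -f "${ZOOCFGDIR}/zookeeper-env.sh" ]; then
  . "${ZOOCFGDIR}/zookeeper-env.sh"
fi

echo "${ZOO_ENV}"      # the flag is now visible to the launching script
rm -r "${ZOOCFGDIR}"
```

Because the file is sourced (not executed), any variables it sets land directly in the launcher's environment, which is how the -D flag reaches the server's java command line.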
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038948#comment-14038948 ] Elaine Cario commented on SOLR-4793: Yago - we have partial success! I added the file as you did, but for some reason the parameter was still not being added to the ZooKeeper startup (I checked the PID using ps -ww -fp PID and didn't see it). So for the moment I modified zkServer.sh to add the param directly to the java startup, and that seemed to work - everything started up with the big files. Now I just need to figure out what's going wrong with the scripting, but that's just some investigation on my part; I'm sure I did something wrong there. Thanks so much for your help!! This has been bugging me for weeks.
[jira] [Updated] (LUCENE-5771) Review semantics of SpatialOperation predicates
[ https://issues.apache.org/jira/browse/LUCENE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley updated LUCENE-5771: - Attachment: LUCENE-5771_SpatialOperation_semantics.patch The attached patch removes the "needs area" and "score is meaningful" notions from SpatialOperation. I also added aliases to the various predicates to align with the standard names. Ryan, why did you choose the non-standard names? E.g. why IsEqualTo when Equals is the standard name? And why the BBoxIntersects and BBoxWithin predicates, which I'm not aware we use, which are also non-standard, and which seem better addressed in other ways (e.g. some sort of function one adds that bbox'es a shape)? Review semantics of SpatialOperation predicates --- Key: LUCENE-5771 URL: https://issues.apache.org/jira/browse/LUCENE-5771 Project: Lucene - Core Issue Type: Improvement Components: modules/spatial Reporter: David Smiley Attachments: LUCENE-5771_SpatialOperation_semantics.patch, LUCENE-5771_SpatialOperation_semantics.patch SpatialOperation (which I wish was named SpatialPredicate) is a bunch of predicates -- methods that return true/false based on a pair of shapes. Some of them don't seem to be defined in a way consistent with their definitions on ESRI's site: http://edndoc.esri.com/arcsde/9.1/general_topics/understand_spatial_relations.htm (which is linked as a reference, and is in turn equivalent to OGC spec definitions, I believe). Problems: * the definitions make no mention of needing to have area or not, yet some of our predicates are defined so as to require area on either the indexed or query shape. * the definitions make a distinction of the boundary of a shape, yet in Lucene-spatial, there is none. That suggests our predicates are wrongly chosen since there *are* official predicates that are boundary-neutral -- namely Covers and CoveredBy in lieu of Contains and Within, respectively. 
If we don't rename our predicates, we should at least support the correct predicate names! * Overlaps appears totally wrong. It should be defined as indexedShape.relate(queryShape) == Intersects (and thus not Within, Contains, or Disjoint). It's presently defined the same as Intersects, plus the query shape needing area.
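The proposed correction to Overlaps can be modeled in a few lines. The Relation enum below is a stand-in for Spatial4j's SpatialRelation, not the actual API; both predicate methods are illustrative reconstructions of the definitions discussed above.

```java
// Minimal model of the Overlaps semantics discussed in the issue.
// "Relation" stands in for Spatial4j's SpatialRelation enum.
enum Relation { DISJOINT, WITHIN, CONTAINS, INTERSECTS }

final class OverlapsSketch {
    // Proposed: Overlaps holds only for a genuine intersection, i.e. the
    // relation is INTERSECTS exactly (not Within, Contains, or Disjoint).
    static boolean overlaps(Relation indexedRelateQuery) {
        return indexedRelateQuery == Relation.INTERSECTS;
    }

    // Current (questioned) definition as described above: same as
    // Intersects (any non-disjoint relation), plus the query shape
    // needing area -- the part the issue argues is wrong.
    static boolean overlapsCurrent(Relation r, boolean queryHasArea) {
        return r != Relation.DISJOINT && queryHasArea;
    }
}
```

The difference shows up for containment: under the current definition an indexed shape Within an area query shape "overlaps" it, while under the proposed definition it does not.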
Re: [VOTE] 4.9.0
On 6/20/2014 8:03 AM, Ahmet Arslan wrote: * download solr-4.9.0.tgz * add icu4j-53.1.jar and solr-analysis-extras-4.9.0.jar and lucene-analyzers-icu-4.9.0.jar to solr-4.9.0/example/solr/collection1/lib/ * confirm they are loaded INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/icu4j-53.1.jar' to class loader INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/lucene-analyzers-icu-4.9.0.jar' to classloader INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/solr-analysis-extras-4.9.0.jar' to class loader icu4j-53.1.jar loaded twice INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/contrib/extraction/lib/icu4j-53.1.jar' to classloader * add <filter class="solr.ICUFoldingFilterFactory"/> to example schema.xml * java -jar start.jar yields the exception reported in SOLR-6188 When <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/> is used everything works fine. One issue that can cause jars to get loaded twice is this: https://issues.apache.org/jira/browse/SOLR-4852 This doesn't look like that specific issue, though. I'm wondering why you copied the jars, since the example config already has a <lib> line that loads them directly from the contrib directory. I think it's getting loaded twice because you copied it into the core-specific lib directory and there's probably another <lib> config element that loads everything in that directory, in addition to the one that loads it directly from contrib. My experience with SOLR-4852 combined with this problem report makes me think that the fact that it gets loaded twice is what causes it to fail. Using the fully-qualified class name is not something I would have thought of as a workaround to SOLR-4852.
Thanks, Shawn
RE: [VOTE] 4.9.0
I agree, it's not a problem. The message of SolrResourceCorrumpter is a bit misleading: it does not load the JAR, it just adds it to the classpath 2 times, which is not a problem at all. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Robert Muir [mailto:rcm...@gmail.com] Sent: Friday, June 20, 2014 4:10 PM To: dev@lucene.apache.org; Ahmet Arslan Subject: Re: [VOTE] 4.9.0 I don't think this issue needs to cause a respin; it's in a contrib module, there is a workaround, and the fix is either some classpath problem or, in case it is a bug, likely involves dangerous territory (e.g. SolrResourceLoader)... On Fri, Jun 20, 2014 at 10:03 AM, Ahmet Arslan iori...@yahoo.com.invalid wrote: Hi, here is what I do * download solr-4.9.0.tgz * add icu4j-53.1.jar and solr-analysis-extras-4.9.0.jar and lucene-analyzers-icu-4.9.0.jar to solr-4.9.0/example/solr/collection1/lib/ * confirm they are loaded INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/icu4j-53.1.jar' to class loader INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/lucene-analyzers-icu-4.9.0.jar' to classloader INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/solr-analysis-extras-4.9.0.jar' to class loader icu4j-53.1.jar loaded twice INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Volumes/datadisk/Desktop/solr-4.9.0/contrib/extraction/lib/icu4j-53.1.jar' to classloader * add <filter class="solr.ICUFoldingFilterFactory"/> to example schema.xml * java -jar start.jar yields the exception reported in SOLR-6188 When <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/> is used everything works fine.
Thanks, Ahmet On Friday, June 20, 2014 3:55 PM, Michael McCandless luc...@mikemccandless.com wrote: +1 SUCCESS! [0:47:26.115239] Mike McCandless http://blog.mikemccandless.com On Fri, Jun 20, 2014 at 8:13 AM, Robert Muir rcm...@gmail.com wrote: Artifacts here: http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/ Here's my +1 SUCCESS! [0:35:36.654925]
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038970#comment-14038970 ] Yago Riveiro commented on SOLR-4793: Elaine, now it's easier to debug, since you know where the problem is :). Note: I'm using version 3.4.5 of ZooKeeper; I don't know if zkServer.sh has changed.
[jira] [Updated] (SOLR-5468) Option to notify client when desired replication factor not achieved for an update request.
[ https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated SOLR-5468: - Summary: Option to notify client when desired replication factor not achieved for an update request. (was: Option to enforce a majority quorum approach to accepting updates in SolrCloud) Option to notify client when desired replication factor not achieved for an update request. --- Key: SOLR-5468 URL: https://issues.apache.org/jira/browse/SOLR-5468 Project: Solr Issue Type: New Feature Components: SolrCloud Affects Versions: 4.5 Environment: All Reporter: Timothy Potter Assignee: Timothy Potter Priority: Minor Attachments: SOLR-5468.patch, SOLR-5468.patch, SOLR-5468.patch I've been thinking about how SolrCloud deals with write-availability using in-sync replica sets, in which writes will continue to be accepted so long as there is at least one healthy node per shard. For a little background (and to verify my understanding of the process is correct), SolrCloud only considers active/healthy replicas when acknowledging a write. Specifically, when a shard leader accepts an update request, it forwards the request to all active/healthy replicas and only considers the write successful if all active/healthy replicas ack the write. Any down / gone replicas are not considered and will sync up with the leader when they come back online using peer sync or snapshot replication. For instance, if a shard has 3 nodes, A, B, C with A being the current leader, then writes to the shard will continue to succeed even if B and C are down. The issue is that if a shard leader continues to accept updates even if it loses all of its replicas, then we have acknowledged updates on only 1 node. If that node, call it A, then fails and one of the previous replicas, call it B, comes back online before A does, then any writes that A accepted while the other replicas were offline are at risk of being lost. 
SolrCloud does provide a safeguard mechanism for this problem with the leaderVoteWait setting, which puts any replicas that come back online before node A into a temporary wait state. If A comes back online within the wait period, then all is well as it will become the leader again and no writes will be lost. As a side note, sys admins definitely need to be made more aware of this situation, as when I first encountered it in my cluster, I had no idea what it meant. My question is whether we want to consider an approach where SolrCloud will not accept writes unless there is a majority of replicas available to accept the write? For my example, under this approach, we wouldn't accept writes if both B and C failed, but would if only C did, leaving A and B online. Admittedly, this lowers the write-availability of the system, so may be something that should be tunable? From Mark M: Yeah, this is kind of like one of many little features that we have just not gotten to yet. I’ve always planned for a param that lets you say how many replicas an update must be verified on before responding success. Seems to make sense to fail that type of request early if you notice there are not enough replicas up to satisfy the param to begin with.
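The majority-quorum idea floated above reduces to a simple count check at write time. This sketch shows only the accept/reject decision for the A/B/C example; it deliberately ignores the rollback problem that makes true quorum enforcement hard, and none of it is actual SolrCloud code.

```java
// Sketch of the proposed decision: a shard leader would accept a write
// only when a strict majority of the shard's replicas are healthy.
final class QuorumCheck {
    static boolean acceptWrite(int healthyReplicas, int totalReplicas) {
        return 2 * healthyReplicas > totalReplicas; // strict majority
    }

    public static void main(String[] args) {
        // 3-replica shard A, B, C: accept with only C down, reject with B and C down.
        System.out.println(acceptWrite(2, 3)); // only C down
        System.out.println(acceptWrite(1, 3)); // B and C down
    }
}
```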
[JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 2012 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java7/2012/ All tests passed Build Log: [...truncated 29626 lines...] check-licenses: [echo] License check under: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr [licenses] MISSING sha1 checksum file for: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/example/lib/ext/log4j-1.2.16.jar [licenses] EXPECTED sha1 checksum file : /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/licenses/log4j-1.2.16.jar.sha1 [...truncated 1 lines...] BUILD FAILED /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:467: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:70: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/build.xml:254: The following error occurred while executing this line: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/lucene/tools/custom-tasks.xml:62: License check failed. Check the logs. Total time: 122 minutes 50 seconds Build step 'Invoke Ant' marked build as failure Archiving artifacts Sending artifact delta relative to Lucene-Solr-Tests-4.x-Java7 #2008 Archived 1 artifacts Archive block size is 32768 Received 0 blocks and 464 bytes Compression is 0.0% Took 35 ms Recording test results Email was triggered for: Failure Sending email for trigger: Failure
[jira] [Resolved] (SOLR-5468) Option to notify client when desired replication factor not achieved for an update request.
[ https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter resolved SOLR-5468. -- Resolution: Fixed Fix Version/s: 4.9
[jira] [Commented] (SOLR-5468) Option to notify client when desired replication factor not achieved for an update request.
[ https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039012#comment-14039012 ] Timothy Potter commented on SOLR-5468: -- I've renamed this ticket because the previous name was misleading as to what the actual solution does. If you're interested in this subject, I encourage you to read my comments above. Basically, Solr's replication mechanism isn't well suited to enforcing a majority quorum before accepting a write. In a nutshell, if you can't roll back a write that succeeded on some nodes when the majority isn't reached, then you can't enforce it ;-) The solution in this ticket is very basic: it allows a client to request that Solr return the achieved replication factor for a given update request (single or batch). All this allows is for the client application to take some additional measures, such as resending the updates, if so desired. Put simply, this moves the problem out to the client. It's useful for environments with RF=3: if 2 of the replicas are down, then there is a risk of losing writes if the leader fails and doesn't come back. This gives the client application the ability to know which writes were accepted by a degraded shard and perhaps re-try them in the future if needed. That's it!
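The client-side pattern described here (ask Solr for the achieved replication factor, then decide whether to re-send) might look like the following. Only the decision logic is real content; the class, method, and batch-id names are hypothetical and do not reflect the SolrJ API or the patch.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative client-side bookkeeping: if the replication factor Solr
// reports for an update batch is below what the application wants,
// queue the batch for a later re-send. Names here are hypothetical.
final class RfAwareClient {
    private final int desiredRf;
    private final Deque<String> retryQueue = new ArrayDeque<>();

    RfAwareClient(int desiredRf) { this.desiredRf = desiredRf; }

    // Called with the achieved replication factor Solr returned for a batch.
    void onBatchResult(String batchId, int achievedRf) {
        if (achievedRf < desiredRf) {
            retryQueue.add(batchId); // degraded shard: remember the batch for re-send
        }
    }

    int pendingRetries() { return retryQueue.size(); }
}
```

This matches the ticket's intent: Solr only reports the degradation; whether and when to re-send stays entirely with the client.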
[jira] [Commented] (LUCENE-5714) Improve tests for BBoxStrategy then port to 4x.
[ https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039061#comment-14039061 ] David Smiley commented on LUCENE-5714: -- Another change to the API: I don't think a BBoxSimilarity interface is needed. DistanceSimilarity can be tossed, and so could BBoxSimilarityValueSource. Instead, AreaSimilarity can be a ShapeAreaValueSource that takes a ValueSource producing shapes from its objectVal(doc). This is in the same vein as DistanceToShapeValueSource. This underscores the pluggability with, say, SerializedDVStrategy with ValueSources. It's plausible it will be faster to decode 4 numbers from a contiguous byte array than to retrieve a number 4 times via DocValues. And the code shouldn't have to change accordingly -- it's plug and play. Continuing this (definitely a separate JIRA issue), looking at the TODOs: these two methods move to SpatialStrategy: {code:java} /** * Provides access to each rectangle per document as a ValueSource in which * {@link org.apache.lucene.queries.function.FunctionValues#objectVal(int)} returns a {@link * Shape}. */ //TODO raise to SpatialStrategy public ValueSource makeShapeValueSource() { return new BBoxValueSource(this); } @Override public ValueSource makeDistanceValueSource(Point queryPoint, double multiplier) { //TODO if makeShapeValueSource gets lifted to the top; this could become a generic impl. return new DistanceToShapeValueSource(makeShapeValueSource(), queryPoint, multiplier, ctx); } {code} Improve tests for BBoxStrategy then port to 4x. --- Key: LUCENE-5714 URL: https://issues.apache.org/jira/browse/LUCENE-5714 Project: Lucene - Core Issue Type: Improvement Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.10 Attachments: LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch BBoxStrategy needs better tests before I'm comfortable seeing it in 4x. 
Specifically, it should use random-rectangle-based validation (rectangles that may cross the dateline), akin to the other tests. And I think I see an equals/hashcode bug to be fixed in there too. One particular thing I'd like to see added is how to handle a zero-area case for AreaSimilarity. I think an additional feature in which you declare a minimum % area (relative to the query shape) would be good. It should be possible for the user to combine rectangle center-point to query shape center-point distance sorting as well. I think it is, but I need to make sure it's possible without _having_ to index a separate center point field. Another possibility (probably not to be addressed here) is a minimum ratio between width/height, perhaps 10%. A long line with nearly no height should not be massively disadvantaged, relevancy-wise, relative to an equivalently long diagonal road that has a square bbox.
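A minimum-%-area gate of the kind suggested above could sit on top of a plain area-ratio score. This is only a sketch under stated assumptions: the real AreaSimilarity algorithm in the attached patch is not reproduced here, and the scoring formula (intersection area over query area) is chosen for illustration.

```java
// Hedged sketch of an area-overlap score with a minimum-fraction gate.
// Zero-area query shapes (points, lines) score 0, addressing the
// zero-area case raised above. Not actual Lucene spatial code.
final class AreaScoreSketch {
    static double score(double intersectionArea, double queryArea, double minFraction) {
        if (queryArea <= 0) return 0.0;          // zero-area query shape
        double frac = intersectionArea / queryArea;
        return frac >= minFraction ? frac : 0.0; // below threshold: no score
    }
}
```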
[jira] [Commented] (SOLR-6157) ReplicationFactorTest hangs
[ https://issues.apache.org/jira/browse/SOLR-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039104#comment-14039104 ] ASF subversion and git services commented on SOLR-6157: --- Commit 1604223 from [~thelabdude] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1604223 ] SOLR-6157: Fix hanging unit test. ReplicationFactorTest hangs --- Key: SOLR-6157 URL: https://issues.apache.org/jira/browse/SOLR-6157 Project: Solr Issue Type: Bug Components: replication (java) Reporter: Uwe Schindler Assignee: Timothy Potter See: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10517/ You can download all logs from there.
[jira] [Resolved] (SOLR-6157) ReplicationFactorTest hangs
[ https://issues.apache.org/jira/browse/SOLR-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter resolved SOLR-6157. -- Resolution: Fixed Fix Version/s: 4.10 Haven't seen a test hang since committing the previous changes, so marking this one resolved for now.
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039111#comment-14039111 ] Elaine Cario commented on SOLR-4793: Not sure either, but I couldn't find where ZOO_ENV was actually referenced in any of the ZooKeeper scripts, so I changed zookeeper-env.sh to this: JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=5000" And then it all came together.
[jira] [Resolved] (SOLR-6182) Using rest manager endpoint to create a new (yet unused) managed resource results in corrupt file, solr can't restart
[ https://issues.apache.org/jira/browse/SOLR-6182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter resolved SOLR-6182. -- Resolution: Fixed Fix Version/s: 4.9 Using rest manager endpoint to create a new (yet unused) managed resource results in corrupt file, solr can't restart - Key: SOLR-6182 URL: https://issues.apache.org/jira/browse/SOLR-6182 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Timothy Potter Fix For: 4.9 The ref guide documents that... bq. You can also create new managed resource using PUT/POST to the appropriate URL – before ever configuring anything that uses these resources. ...however attempting to do this results in the creation of a _rest_managed.json file that is not in the format expected by solr, and prevents solr from starting up.
[jira] [Resolved] (SOLR-5956) SnapShooter is using getRawInstanceDir, which is sometimes not a valid directory
[ https://issues.apache.org/jira/browse/SOLR-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter resolved SOLR-5956. -- Resolution: Fixed Fix Version/s: 4.9 SnapShooter is using getRawInstanceDir, which is sometimes not a valid directory Key: SOLR-5956 URL: https://issues.apache.org/jira/browse/SOLR-5956 Project: Solr Issue Type: Bug Components: replication (java), SolrCloud Environment: SolrCloud Reporter: Timothy Potter Assignee: Timothy Potter Fix For: 4.9 Attachments: SOLR-5956.patch Ran into an issue where the getRawInstanceDir method on CoreDescriptor returns an invalid directory on the server. Need to track down where this bad value comes from and fix it. I suspect this has to do with me using symlinks e.g. I have a server with solr.solr.home set to /home/ec2-user/lucene-solr/solr/cloud87/solr, which in reality is: /vol0/cloud87/solr as /home/ec2-user/lucene-solr/solr/cloud87 is a symlink to /vol0/cloud87 getRawInstanceDir was returning /vol0/cloud87/demo_shard1_replica1 which is missing the /solr part of the directory path; it should be: /vol0/cloud87/solr/demo_shard1_replica1
[jira] [Resolved] (SOLR-6106) Sometimes all the cores on a SolrCloud node cannot find their config when intializing the ManagedResourceStorage storageIO impl
[ https://issues.apache.org/jira/browse/SOLR-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter resolved SOLR-6106. -- Resolution: Won't Fix Only occurs if a full GC pause causes the ZK session to expire during server initialization (when there are many cores). There's not much you can do with this other than restart (with more memory or better GC settings). Sometimes all the cores on a SolrCloud node cannot find their config when intializing the ManagedResourceStorage storageIO impl --- Key: SOLR-6106 URL: https://issues.apache.org/jira/browse/SOLR-6106 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Timothy Potter Assignee: Timothy Potter Priority: Minor Attachments: SOLR-6106_prelim.patch Had one of my many nodes have problems initializing all cores due to the following problem. It was resolved by restarting the node (hence the minor classification). 2014-05-21 20:39:17,898 [coreLoadExecutor-4-thread-27] ERROR solr.core.CoreContainer - Unable to create core: small46_shard1_replica1 org.apache.solr.common.SolrException: Could not find config name for collection:small46 at org.apache.solr.core.SolrCore.<init>(SolrCore.java:858) at org.apache.solr.core.SolrCore.<init>(SolrCore.java:641) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:556) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:261) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:253) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: org.apache.solr.common.SolrException: Could not find config name for collection:small46 at org.apache.solr.rest.ManagedResourceStorage.newStorageIO(ManagedResourceStorage.java:99) at org.apache.solr.core.SolrCore.initRestManager(SolrCore.java:2339) at org.apache.solr.core.SolrCore.<init>(SolrCore.java:845) ... 10 more
[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper
[ https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039198#comment-14039198 ] Yago Riveiro commented on SOLR-4793: it's probably because I tweaked the zkServer file a bit ... :P Solr Cloud can't upload large config files (> 1MB) to Zookeeper - Key: SOLR-4793 URL: https://issues.apache.org/jira/browse/SOLR-4793 Project: Solr Issue Type: Improvement Reporter: Son Nguyen Zookeeper sets its znode size limit to 1MB by default, so we can't start Solr Cloud with some large config files, like synonyms.txt. Jan Høydahl has a good idea: SolrCloud is designed with an assumption that you should be able to upload your whole disk-based conf folder into ZK, and that you should be able to add an empty Solr node to a cluster and it would download all config from ZK. So a splitting strategy for large files, automatically handled by ZkSolrResourceLoader, could be one way forward, i.e. store synonyms.txt as e.g. __001_synonyms.txt __002_synonyms.txt -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
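Jan's splitting idea could be sketched roughly as below. This is purely a hypothetical helper: the `ZnodeChunker` name, its API, and the `__NNN_` part-name format are invented for illustration; nothing like this exists in `ZkSolrResourceLoader`.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: split a config file's bytes into chunks below
 * ZooKeeper's default 1 MB znode limit, naming the parts
 * __001_synonyms.txt, __002_synonyms.txt, ... as suggested in the issue.
 */
public class ZnodeChunker {
    // ZooKeeper's default znode payload limit (jute.maxbuffer)
    static final int DEFAULT_LIMIT = 1024 * 1024;

    /** Generates the part names for a file of the given size. */
    public static List<String> partNames(String fileName, byte[] data, int limit) {
        List<String> names = new ArrayList<>();
        int parts = (data.length + limit - 1) / limit; // ceiling division
        for (int i = 1; i <= parts; i++) {
            names.add(String.format("__%03d_%s", i, fileName));
        }
        return names;
    }

    /** Returns the bytes of the chunk at the given zero-based index. */
    public static byte[] chunk(byte[] data, int limit, int index) {
        int from = index * limit;
        int to = Math.min(from + limit, data.length);
        byte[] out = new byte[to - from];
        System.arraycopy(data, from, out, 0, out.length);
        return out;
    }
}
```

On download, an empty node would reassemble the file by concatenating the parts in name order, which is why zero-padded numeric prefixes are convenient.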
[jira] [Commented] (SOLR-6120) zkcli.sh class not found error /opt/solr-4.8.1/example/solr-webapp is empty
[ https://issues.apache.org/jira/browse/SOLR-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039223#comment-14039223 ] Cassandra Targett commented on SOLR-6120: - [~shalinmangar]: should this be updated to Fix Version = 4.9 and also Resolved? Or is Windows support blocking that? zkcli.sh class not found error /opt/solr-4.8.1/example/solr-webapp is empty --- Key: SOLR-6120 URL: https://issues.apache.org/jira/browse/SOLR-6120 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.8.1 Reporter: sebastian badea Assignee: Shalin Shekhar Mangar Attachments: SOLR-6120-windows.patch, SOLR-6120-windows.patch, SOLR-6120-windows.patch, SOLR-6120.patch When calling /solr-4.8.1/example/scripts/cloud-scripts/zkcli.sh the org.apache.solr.cloud.ZkCLI class is not found. The cause is that /opt/solr-4.8.1/example/solr-webapp is empty. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4852) If sharedLib is set to lib, classloader fails to find classes in lib
[ https://issues.apache.org/jira/browse/SOLR-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039276#comment-14039276 ] Ahmet Arslan commented on SOLR-4852: This is not all about loading the same jar twice. Here is an interesting finding. I removed all lib directives in the example solrconfig.xml and put icu4j-53.1.jar and lucene-analyzers-icu-4.8.1.jar into the collection1/lib folder. solr.ICUFoldingFilterFactory works fine. Just add the following line to solrconfig.xml: {code:xml} <lib dir="../../../dist/" regex="solr-velocity-\d.*\.jar" /> {code} and it fails. I have a feeling that this has nothing to do with loading twice. It looks like the order of the processed lib directives is causing something. If sharedLib is set to lib, classloader fails to find classes in lib Key: SOLR-4852 URL: https://issues.apache.org/jira/browse/SOLR-4852 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Linux bigindy5 2.6.32-358.6.1.el6.centos.plus.x86_64 #1 SMP Wed Apr 24 03:21:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux java version 1.7.0_21 Java(TM) SE Runtime Environment (build 1.7.0_21-b11) Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode) Reporter: Shawn Heisey Fix For: 4.9, 5.0 Attachments: SOLR-4852-test-failhard.txt, SOLR-4852.patch, SOLR-4852.patch I have some jars in the lib directory under solr.solr.home - DIH, ICU, and MySQL. If I set sharedLib in solr.xml to lib then the ICUTokenizer class is not found, even though the jar is loaded (twice) during Solr startup. If I set sharedLib to another location that doesn't exist, the jars are only loaded once and there is no problem. I'm using the old-style solr.xml on branch_4x revision 1485566. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039280#comment-14039280 ] Ahmet Arslan commented on SOLR-6188: Same symptom: solr.ICU*Factory in schema.xml causes a not-found exception for an org.apache.lucene.* class. solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter -- Key: SOLR-6188 URL: https://issues.apache.org/jira/browse/SOLR-6188 Project: Solr Issue Type: Bug Components: Schema and Analysis Affects Versions: 4.8.1 Reporter: Ahmet Arslan Labels: ICUFoldingFilterFactory Fix For: 4.10 When the fully qualified class name {{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}} is used in schema.xml, it works. However, as documented in Confluence and the wiki, when {{solr.ICUFoldingFilterFactory}} is used it throws the following exception. This is true for both the released 4.8.1 version and trunk r1604168. The following type works: {code:xml} <fieldType name="folded2" class="solr.TextField"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/> </analyzer> </fieldType> {code} This does not: {code:xml} <fieldType name="folded" class="solr.TextField"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.ICUFoldingFilterFactory"/> </analyzer> </fieldType> {code} {noformat} 257 [main] ERROR org.apache.solr.core.SolrCore – Error loading core:java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: org/apache/lucene/analysis/icu/ICUFoldingFilter at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301) at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137) at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119) at
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719) at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265) at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252) at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710) at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39) at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186) at org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494) at org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141) at org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145) at org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:56) at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:609) at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:540) at org.eclipse.jetty.util.Scanner.scan(Scanner.java:403) at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:337) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:121) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:555) at org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:230) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:81) at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:58) at org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:96) at org.eclipse.jetty.server.Server.doStart(Server.java:280) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at
[jira] [Commented] (SOLR-4852) If sharedLib is set to lib, classloader fails to find classes in lib
[ https://issues.apache.org/jira/browse/SOLR-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039312#comment-14039312 ] Ahmet Arslan commented on SOLR-4852: In the above setting no duplicate jars are loaded. Only one explicit lib directive is defined in solrconfig.xml, along with the implicit one, core/lib. The following lines are printed during startup. {noformat} 1751 [coreLoadExecutor-4-thread-1] INFO org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for directory: '/Users/iorixxx/Desktop/solr-4.8.1/example/solr/collection1/' 1752 [coreLoadExecutor-4-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Users/iorixxx/Desktop/solr-4.8.1/example/solr/collection1/lib/.DS_Store' to classloader 1752 [coreLoadExecutor-4-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Users/iorixxx/Desktop/solr-4.8.1/example/solr/collection1/lib/icu4j-53.1.jar' to classloader 1752 [coreLoadExecutor-4-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Users/iorixxx/Desktop/solr-4.8.1/example/solr/collection1/lib/lucene-analyzers-icu-4.8.1.jar' to classloader 1833 [coreLoadExecutor-4-thread-1] INFO org.apache.solr.core.SolrConfig – Adding specified lib dirs to ClassLoader 1835 [coreLoadExecutor-4-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/Users/iorixxx/Desktop/solr-4.8.1/dist/solr-velocity-4.8.1.jar' to classloader {noformat} This fails for no apparent reason. 
By the way, why does this add the non-jar hidden .DS_Store file (4.8.1/example/solr/collection1/lib/.DS_Store) to the classloader? If sharedLib is set to lib, classloader fails to find classes in lib Key: SOLR-4852 URL: https://issues.apache.org/jira/browse/SOLR-4852 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Linux bigindy5 2.6.32-358.6.1.el6.centos.plus.x86_64 #1 SMP Wed Apr 24 03:21:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux java version 1.7.0_21 Java(TM) SE Runtime Environment (build 1.7.0_21-b11) Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode) Reporter: Shawn Heisey Fix For: 4.9, 5.0 Attachments: SOLR-4852-test-failhard.txt, SOLR-4852.patch, SOLR-4852.patch I have some jars in the lib directory under solr.solr.home - DIH, ICU, and MySQL. If I set sharedLib in solr.xml to lib then the ICUTokenizer class is not found, even though the jar is loaded (twice) during Solr startup. If I set sharedLib to another location that doesn't exist, the jars are only loaded once and there is no problem. I'm using the old-style solr.xml on branch_4x revision 1485566. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6184) Replication fetchLatestIndex always fails, which causes the recovery error.
[ https://issues.apache.org/jira/browse/SOLR-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039320#comment-14039320 ] Shawn Heisey commented on SOLR-6184: Would the commitReserveDuration parameter on the replication handler be useful in keeping Solr from deleting the commit point that is being replicated until after the replication is complete? Normally it's not recommended to have any config parameters for replication, but if a very large index is having problems recovering when there is a lot of update activity, perhaps that would be an exception. Replication fetchLatestIndex always fails, which causes the recovery error. --- Key: SOLR-6184 URL: https://issues.apache.org/jira/browse/SOLR-6184 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6, 4.6.1 Environment: the index file size is more than 70G Reporter: Raintung Li Attachments: Solr-6184.txt Copying a full 70G index usually needs at least 20 minutes on a 100M read/write network or disk r/w. If one hard commit happens within those 20 minutes, the full-index snap pull fails and the temp folder is removed because it is a failed pull task. In production, index updates happen every minute, so the redone pull task always fails because the index always changes; constantly redoing the pull also keeps network and disk usage high. My suggestion: fetchLatestIndex could be retried at some frequency without removing the tmp folder, copying the largest index files first. A redone fetchLatestIndex would then not download the same biggest files again, only the index committed just now, so the task would easily succeed. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4852) If sharedLib is set to lib, classloader fails to find classes in lib
[ https://issues.apache.org/jira/browse/SOLR-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039331#comment-14039331 ] Shawn Heisey commented on SOLR-4852: Another theory I've considered is that the problem is caused by the resource loader object being replaced, which from my research apparently has to be done anytime you need to add another directory or list of jars. This theory would be disproved by a setup that has more than one lib directive and uses classes in jars from both locations ... and there are probably some of those out there. I willingly admit that I do not understand what causes the problems that I documented here. I looked into the Solr code and poked around the Java 7 API javadocs, but I wasn't able to make sense of it. If sharedLib is set to lib, classloader fails to find classes in lib Key: SOLR-4852 URL: https://issues.apache.org/jira/browse/SOLR-4852 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Linux bigindy5 2.6.32-358.6.1.el6.centos.plus.x86_64 #1 SMP Wed Apr 24 03:21:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux java version 1.7.0_21 Java(TM) SE Runtime Environment (build 1.7.0_21-b11) Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode) Reporter: Shawn Heisey Fix For: 4.9, 5.0 Attachments: SOLR-4852-test-failhard.txt, SOLR-4852.patch, SOLR-4852.patch I have some jars in the lib directory under solr.solr.home - DIH, ICU, and MySQL. If I set sharedLib in solr.xml to lib then the ICUTokenizer class is not found, even though the jar is loaded (twice) during Solr startup. If I set sharedLib to another location that doesn't exist, the jars are only loaded once and there is no problem. I'm using the old-style solr.xml on branch_4x revision 1485566. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4852) If sharedLib is set to lib, classloader fails to find classes in lib
[ https://issues.apache.org/jira/browse/SOLR-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039336#comment-14039336 ] Uwe Schindler commented on SOLR-4852: - Hi Shawn, this is the issue here: I explain the problem in the linked issue: SOLR-6188 If sharedLib is set to lib, classloader fails to find classes in lib Key: SOLR-4852 URL: https://issues.apache.org/jira/browse/SOLR-4852 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Linux bigindy5 2.6.32-358.6.1.el6.centos.plus.x86_64 #1 SMP Wed Apr 24 03:21:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux java version 1.7.0_21 Java(TM) SE Runtime Environment (build 1.7.0_21-b11) Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode) Reporter: Shawn Heisey Fix For: 4.9, 5.0 Attachments: SOLR-4852-test-failhard.txt, SOLR-4852.patch, SOLR-4852.patch I have some jars in the lib directory under solr.solr.home - DIH, ICU, and MySQL. If I set sharedLib in solr.xml to lib then the ICUTokenizer class is not found, even though the jar is loaded (twice) during Solr startup. If I set sharedLib to another location that doesn't exist, the jars are only loaded once and there is no problem. I'm using the old-style solr.xml on branch_4x revision 1485566. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter
[ https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039350#comment-14039350 ] Uwe Schindler commented on SOLR-6188: - Hi, this issue is indeed caused by SOLR-4852. The reason why it works with the absolute class name is the following: - If you use the absolute class name, the class is loaded by Class.forName from SolrResourceLoader. - If you use the shortcut, the Solr 3.x backwards layer for finding factory classes is used: the solr.XXXFactory name is rewritten to a call to TokenFilterFactory.forName(). This forName call uses the classpath it was initialized with. TokenFilterFactory is a static class and does not really know classloaders (because there is only one single instance). Every SolrResourceLoader calls an update process that scans its own classpath and adds all *new* factory instances to the forName() lookup map. What happens here: at an earlier stage, SolrResourceLoader apparently saw a factory instance loaded by SPI and cached its factory class for forName(). But later the classpath and classloader were replaced and the scanner was called again. This one scanned the classpath again and found a new instance of the factory class (the new one that should be used). Because that name was already in the forName cache, it did not replace the old entry. In the meantime, the old classloader was closed with Java 7's URLClassLoader.close() method. Because of this, a call to forName returned the factory class, but the dependent classes it refers to are no longer loadable (classloader closed). This causes the bug. The fix is not easy; I will think about it. 
solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter -- Key: SOLR-6188 URL: https://issues.apache.org/jira/browse/SOLR-6188 Project: Solr Issue Type: Bug Components: Schema and Analysis Affects Versions: 4.8.1 Reporter: Ahmet Arslan Labels: ICUFoldingFilterFactory Fix For: 4.10 When the fully qualified class name {{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}} is used in schema.xml, it works. However, as documented in Confluence and the wiki, when {{solr.ICUFoldingFilterFactory}} is used it throws the following exception. This is true for both the released 4.8.1 version and trunk r1604168. The following type works: {code:xml} <fieldType name="folded2" class="solr.TextField"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/> </analyzer> </fieldType> {code} This does not: {code:xml} <fieldType name="folded" class="solr.TextField"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.ICUFoldingFilterFactory"/> </analyzer> </fieldType> {code} {noformat} 257 [main] ERROR org.apache.solr.core.SolrCore – Error loading core:java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: org/apache/lucene/analysis/icu/ICUFoldingFilter at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301) at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137) at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719) at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265) at 
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252) at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710) at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494) at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64) at org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39) at org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186) at org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494) at org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141) at org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145) at
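Uwe's stale-cache explanation can be modeled with a toy registry. This is not Solr's actual code, only a minimal sketch of a forName()-style lookup map whose reload scan only adds names it has not seen, so an entry cached from an earlier (now-closed) classloader shadows the fresh copy:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of the stale-cache failure mode described above.
 * register() mimics the classpath rescan: a name already in the map is
 * kept, even if its class came from a classloader that was since closed.
 */
public class SpiRegistry {
    private final Map<String, Class<?>> byName = new HashMap<>();

    /** Reload scan: adds *new* names, silently keeps existing entries. */
    public void register(String name, Class<?> clazz) {
        byName.putIfAbsent(name, clazz); // the crux: never replaces a stale entry
    }

    /** Lookup by short name, like TokenFilterFactory.forName(). */
    public Class<?> forName(String name) {
        Class<?> c = byName.get(name);
        if (c == null) throw new IllegalArgumentException("unknown factory: " + name);
        return c; // may reference classes from a closed classloader
    }
}
```

In the real bug, using the returned stale class then fails with NoClassDefFoundError because its dependent classes can no longer be loaded.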
[jira] [Commented] (LUCENE-5752) Explore light weight Automaton replacement
[ https://issues.apache.org/jira/browse/LUCENE-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039401#comment-14039401 ] ASF subversion and git services commented on LUCENE-5752: - Commit 1604283 from [~mikemccand] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1604283 ] LUCENE-5752: switch to simpler, immutable Automaton implementation Explore light weight Automaton replacement -- Key: LUCENE-5752 URL: https://issues.apache.org/jira/browse/LUCENE-5752 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.10 Attachments: LUCENE-5752.patch, LUCENE-5752.patch This effort started with the patch on LUCENE-4556, to create a light weight replacement for the current object-heavy Automaton class (which creates separate State and Transition objects). I took that initial patch much further, and cutover most places in Lucene that use Automaton to LightAutomaton. Tests pass. The core idea of LightAutomaton is all states are ints, and you build up the automaton under the restriction that you add all outgoing transitions one state at a time. This worked well for most operations, but for some (e.g. UTF32ToUTF8!!) it was harder, so I also added a separate builder to add transitions in any order and then in the end they are sorted and added to the real automaton. If this is successful I think we should just replace the current Automaton with LightAutomaton; right now they both exist in my current patch... This is very much a work in progress, and I'm not sure the restrictions the API imposes are reasonable (some algos got uglier). But I think it's at least worth exploring/iterating... I'll make a branch and commit my current state. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
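The "states are ints" design described in the issue can be sketched minimally as below. The class and method names are illustrative only, not the actual LightAutomaton API; a real implementation would pack transitions into flat int arrays and binary-search them rather than scan a list.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch of an int-state automaton: no State/Transition objects,
 * just int state ids and {from, to, minLabel, maxLabel} transition rows.
 * Outgoing transitions must be added one source state at a time, in order,
 * mirroring the restriction described in the issue.
 */
public class IntAutomaton {
    private final List<int[]> transitions = new ArrayList<>(); // {from, to, min, max}
    private int numStates = 0;
    private int curState = -1;

    /** States are just ints, handed out sequentially. */
    public int createState() { return numStates++; }

    /** Add a transition; source states must be finished in ascending order. */
    public void addTransition(int from, int to, int min, int max) {
        if (from < curState) throw new IllegalStateException("states must be finished in order");
        curState = from;
        transitions.add(new int[]{from, to, min, max});
    }

    /** Follow a label from a state; returns the target state, or -1 if none. */
    public int step(int state, int label) {
        for (int[] t : transitions) {
            if (t[0] == state && label >= t[2] && label <= t[3]) return t[1];
        }
        return -1;
    }
}
```

The ordered-add restriction is what makes the flat representation possible: once a state is finished, all of its transitions are contiguous and can be sorted in place.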
[jira] [Resolved] (LUCENE-5752) Explore light weight Automaton replacement
[ https://issues.apache.org/jira/browse/LUCENE-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-5752. Resolution: Fixed Explore light weight Automaton replacement -- Key: LUCENE-5752 URL: https://issues.apache.org/jira/browse/LUCENE-5752 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.10 Attachments: LUCENE-5752.patch, LUCENE-5752.patch This effort started with the patch on LUCENE-4556, to create a light weight replacement for the current object-heavy Automaton class (which creates separate State and Transition objects). I took that initial patch much further, and cutover most places in Lucene that use Automaton to LightAutomaton. Tests pass. The core idea of LightAutomaton is all states are ints, and you build up the automaton under the restriction that you add all outgoing transitions one state at a time. This worked well for most operations, but for some (e.g. UTF32ToUTF8!!) it was harder, so I also added a separate builder to add transitions in any order and then in the end they are sorted and added to the real automaton. If this is successful I think we should just replace the current Automaton with LightAutomaton; right now they both exist in my current patch... This is very much a work in progress, and I'm not sure the restrictions the API imposes are reasonable (some algos got uglier). But I think it's at least worth exploring/iterating... I'll make a branch and commit my current state. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5752) Explore light weight Automaton replacement
[ https://issues.apache.org/jira/browse/LUCENE-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-5752: --- Fix Version/s: 4.10 Explore light weight Automaton replacement -- Key: LUCENE-5752 URL: https://issues.apache.org/jira/browse/LUCENE-5752 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0, 4.10 Attachments: LUCENE-5752.patch, LUCENE-5752.patch This effort started with the patch on LUCENE-4556, to create a light weight replacement for the current object-heavy Automaton class (which creates separate State and Transition objects). I took that initial patch much further, and cutover most places in Lucene that use Automaton to LightAutomaton. Tests pass. The core idea of LightAutomaton is all states are ints, and you build up the automaton under the restriction that you add all outgoing transitions one state at a time. This worked well for most operations, but for some (e.g. UTF32ToUTF8!!) it was harder, so I also added a separate builder to add transitions in any order and then in the end they are sorted and added to the real automaton. If this is successful I think we should just replace the current Automaton with LightAutomaton; right now they both exist in my current patch... This is very much a work in progress, and I'm not sure the restrictions the API imposes are reasonable (some algos got uglier). But I think it's at least worth exploring/iterating... I'll make a branch and commit my current state. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org