[jira] [Commented] (LUCENE-5771) Review semantics of SpatialOperation predicates

2014-06-20, Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038546#comment-14038546
 ] 

Ryan McKinley commented on LUCENE-5771:
---

Change looks good.

I *think* the reason for queryNeedsArea and targetNeedsArea was so that we 
could decide whether the input was valid simply based on the shape -- but I 
think we find that out anyway, so we could just drop these.

I don't think 'scoreIsMeaningful' is used, so it can also be dropped.




 Review semantics of SpatialOperation predicates
 ---

 Key: LUCENE-5771
 URL: https://issues.apache.org/jira/browse/LUCENE-5771
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
 Attachments: LUCENE-5771_SpatialOperation_semantics.patch


 SpatialOperation (which I wish were named SpatialPredicate) is a bunch of 
 predicates -- methods that return true/false based on a pair of shapes.  Some 
 of them don't seem to be defined in a way consistent with their definitions 
 on ESRI's site:
 http://edndoc.esri.com/arcsde/9.1/general_topics/understand_spatial_relations.htm
   (which is linked as a reference, and is in turn equivalent to the OGC spec 
 definitions, I believe).
 Problems:
 * the definitions make no mention of needing to have area or not, yet some of 
 our predicates are defined so as to require area on either the indexed or 
 query shape.
 * the definitions make a distinction about the boundary of a shape, yet in 
 Lucene-spatial there is none.  That suggests our predicates are wrongly 
 chosen, since there *are* official predicates that are boundary-neutral -- 
 namely Covers and CoveredBy in lieu of Contains and Within, respectively. 
  If we don't rename our predicates, we should at least support the correct 
 predicate names!
 * Overlaps appears totally wrong.  It should be defined as 
 indexedShape.relate(queryShape) == Intersects (and thus not Within, 
 Contains, or Disjoint).  It's presently defined the same as Intersects plus 
 the query shape needing area.
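
For illustration only (this is one reading of the proposal, not the attached patch): a minimal sketch of the proposed Overlaps and the boundary-neutral Covers on top of Spatial4j's Shape.relate(), assuming the com.spatial4j.core API of that era; equality edge cases are glossed over.

{code:java}
import com.spatial4j.core.shape.Shape;
import com.spatial4j.core.shape.SpatialRelation;

/** Sketch of the proposed predicate semantics; not the committed patch. */
public class SpatialPredicateSketch {

  /** Overlaps: a strictly partial intersection -- not Within, Contains, or Disjoint. */
  public static boolean overlaps(Shape indexedShape, Shape queryShape) {
    // relate() returns DISJOINT, INTERSECTS, WITHIN, or CONTAINS; the proposal
    // defines Overlaps as exactly the INTERSECTS case, with no area requirement.
    return indexedShape.relate(queryShape) == SpatialRelation.INTERSECTS;
  }

  /** Covers (boundary-neutral Contains): the indexed shape covers the query shape. */
  public static boolean covers(Shape indexedShape, Shape queryShape) {
    // Boundary-neutral, so no special boundary handling is needed.
    return indexedShape.relate(queryShape) == SpatialRelation.CONTAINS;
  }
}
{code}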






[jira] [Commented] (LUCENE-5779) Improve BBox AreaSimilarity algorithm to consider lines and points

2014-06-20, Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038548#comment-14038548
 ] 

Ryan McKinley commented on LUCENE-5779:
---

+1 thanks for looking at this

 Improve BBox AreaSimilarity algorithm to consider lines and points
 --

 Key: LUCENE-5779
 URL: https://issues.apache.org/jira/browse/LUCENE-5779
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
 Attachments: LUCENE-5779__Improved_bbox_AreaSimilarity_algorithm.patch


 GeoPortal's area overlap algorithm didn't consider lines and points; they end 
 up turning the score to 0.  I've thought about this for a bit and have come 
 up with an alternative scoring algorithm (already coded, tested, and 
 documented):
 New Javadocs:
 {code:java}
 /**
  * The algorithm is implemented as envelope on envelope overlays rather than
  * complex polygon on complex polygon overlays.
  * <p/>
  * <p/>
  * Spatial relevance scoring algorithm:
  * <DL>
  *   <DT>queryArea</DT> <DD>the area of the input query envelope</DD>
  *   <DT>targetArea</DT> <DD>the area of the target envelope (per Lucene
  * document)</DD>
  *   <DT>intersectionArea</DT> <DD>the area of the intersection between the
  * query and target envelopes</DD>
  *   <DT>queryTargetProportion</DT> <DD>A 0-1 factor that divides the score
  * proportion between query and target. 0.5 is evenly.</DD>
  *
  *   <DT>queryRatio</DT> <DD>intersectionArea / queryArea; (see note)</DD>
  *   <DT>targetRatio</DT> <DD>intersectionArea / targetArea; (see note)</DD>
  *   <DT>queryFactor</DT> <DD>queryRatio * queryTargetProportion;</DD>
  *   <DT>targetFactor</DT> <DD>targetRatio * (1 - queryTargetProportion);</DD>
  *   <DT>score</DT> <DD>queryFactor + targetFactor;</DD>
  * </DL>
  * Note: The actual computation of queryRatio and targetRatio is more
  * complicated so that it considers points and lines. Lines have the ratio of
  * overlap, and points are either 1.0 or 0.0 depending on whether they
  * intersect or not.
  * <p/>
  * Based on Geoportal's
  * <a href="http://geoportal.svn.sourceforge.net/svnroot/geoportal/Geoportal/trunk/src/com/esri/gpt/catalog/lucene/SpatialRankingValueSource.java">
  *   SpatialRankingValueSource</a> but modified. GeoPortal's algorithm will
  * yield a score of 0 if either a line or point is compared, and it doesn't
  * output a 0-1 normalized score (it multiplies the factors).
  *
  * @lucene.experimental
  */
 {code}
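
To make the formula concrete, here is a small, self-contained sketch (not the attached patch) that computes the score for two axis-aligned envelopes with the even split queryTargetProportion = 0.5; the Rect class is an illustrative stand-in, not Lucene's API:

{code:java}
/** Minimal stand-in for an axis-aligned envelope (illustrative only). */
class Rect {
  final double minX, minY, maxX, maxY;
  Rect(double minX, double minY, double maxX, double maxY) {
    this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
  }
  double area() { return (maxX - minX) * (maxY - minY); }
}

public class AreaSimilarityExample {
  /** score = queryRatio * p + targetRatio * (1 - p), each ratio in [0, 1]. */
  static double score(Rect query, Rect target, double p) {
    double ix = Math.max(0, Math.min(query.maxX, target.maxX) - Math.max(query.minX, target.minX));
    double iy = Math.max(0, Math.min(query.maxY, target.maxY) - Math.max(query.minY, target.minY));
    double intersectionArea = ix * iy;
    double queryRatio = intersectionArea / query.area();
    double targetRatio = intersectionArea / target.area();
    return queryRatio * p + targetRatio * (1 - p);
  }

  public static void main(String[] args) {
    Rect query = new Rect(0, 0, 10, 10);  // area 100
    Rect target = new Rect(5, 5, 10, 10); // area 25, fully inside the query
    // queryRatio = 25/100 = 0.25, targetRatio = 25/25 = 1.0
    // score = 0.25 * 0.5 + 1.0 * 0.5 = 0.625
    System.out.println(score(query, target, 0.5));
  }
}
{code}

A zero-area target divides by zero here, which is exactly the degenerate line/point case the note above special-cases.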






[jira] [Commented] (LUCENE-5714) Improve tests for BBoxStrategy then port to 4x.

2014-06-20, Ryan McKinley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038549#comment-14038549
 ] 

Ryan McKinley commented on LUCENE-5714:
---

+1 thanks

 Improve tests for BBoxStrategy then port to 4x.
 ---

 Key: LUCENE-5714
 URL: https://issues.apache.org/jira/browse/LUCENE-5714
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.9

 Attachments: 
 LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch


 BBoxStrategy needs better tests before I'm comfortable seeing it in 4x.  
 Specifically, it should use random-rectangle-based validation (with 
 rectangles that may cross the dateline), akin to the other tests.  And I 
 think I see an equals/hashCode bug to be fixed in there too.
 One particular thing I'd like to see added is how to handle a zero-area case 
 for AreaSimilarity.  I think an additional feature in which you declare a 
 minimum % area (relative to the query shape) would be good.
 It should be possible for the user to combine this with sorting by the 
 distance from the rectangle's center point to the query shape's center point 
 as well.  I think it is, but I need to make sure it's possible without 
 _having_ to index a separate center-point field.
 Another possibility (probably not to be addressed here) is a minimum 
 width/height ratio, perhaps 10%.  A long line with nearly no height should 
 not be massively disadvantaged, relevancy-wise, relative to an equivalently 
 long diagonal road that has a square bbox.
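
As a sketch of what the randomized dateline-crossing validation might look like (the helper below is hypothetical, not part of the patch; it assumes Spatial4j's convention that a rectangle with minX > maxX crosses the dateline):

{code:java}
import java.util.Random;

/** Hypothetical generator of random lon/lat rectangles, some crossing the dateline. */
public class RandomRects {
  /** Returns {minX, minY, maxX, maxY}; minX > maxX encodes a dateline crossing. */
  static double[] randomRect(Random rnd) {
    double lonA = -180 + 360 * rnd.nextDouble();
    double lonB = -180 + 360 * rnd.nextDouble();
    double latA = -90 + 180 * rnd.nextDouble();
    double latB = -90 + 180 * rnd.nextDouble();
    double minY = Math.min(latA, latB), maxY = Math.max(latA, latB);
    if (rnd.nextBoolean()) {
      // Normal rectangle: minX <= maxX.
      return new double[] { Math.min(lonA, lonB), minY, Math.max(lonA, lonB), maxY };
    } else {
      // Dateline-crossing rectangle: longitudes deliberately out of order.
      return new double[] { Math.max(lonA, lonB), minY, Math.min(lonA, lonB), maxY };
    }
  }
}
{code}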






[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038564#comment-14038564
 ] 

Uwe Schindler commented on SOLR-6178:
-

I think for now we should at least deprecate the factory. As we already 
deprecated Jaspell itself in 4.9, this should be done before the 4.9 release.

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-6178:


Priority: Blocker  (was: Major)

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
Priority: Blocker
 Fix For: 4.9, 5.0


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-6178:


  Component/s: spellchecker
Fix Version/s: 5.0
   4.9

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-6178:


Attachment: SOLR-6178.patch

Simple patch for deprecation.

Maybe we should change the default to FSTLookupFactory; it looks most similar 
to the Jaspell one, functionality-wise. I am not sure how this affects users 
who rely on the default impl (because they have not specified one in the 
solrconfig).
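
For context, this is roughly how a user would pin the lookup implementation explicitly in solrconfig.xml, so that a changed default could not affect them; the component and field names below are made up for the example, and lookupImpl is the parameter in question:

{code:xml}
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <!-- Setting lookupImpl explicitly means a change of the default
         (historically JaspellLookupFactory) cannot affect this config. -->
    <str name="lookupImpl">FSTLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>
  </lst>
</searchComponent>
{code}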

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
Priority: Blocker
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Created] (SOLR-6184) Replication fetchLatestIndex always fails, which causes a recovery error.

2014-06-20, Raintung Li (JIRA)
Raintung Li created SOLR-6184:
-

 Summary: Replication fetchLatestIndex always fails, which 
causes a recovery error.
 Key: SOLR-6184
 URL: https://issues.apache.org/jira/browse/SOLR-6184
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6.1, 4.6
 Environment: the index file size is more than 70G
Reporter: Raintung Li


Copying a full 70G index usually takes at least 20 minutes, given ~100M 
network or disk read/write throughput.  If a hard commit happens during those 
20 minutes, the full-index snap pull fails and the temp folder is removed, 
because the pull task is considered failed. 
In production, index updates happen every minute, so the redone pull task 
always fails because the index always changes. 

Constantly redoing the pull also keeps network and disk usage at a high level.

My suggestion: fetchLatestIndex could be retried at some frequency without 
removing the tmp folder, copying the largest index files first.  A redone 
fetchLatestIndex would then not download the same biggest files again; it 
would only copy the files from the most recent commit, so the task would 
easily succeed.
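
As an illustration of the proposed behavior (purely a sketch; the class, method, and parameter names below are hypothetical, not Solr's actual replication API): keep the tmp directory across attempts and skip files that were already fully downloaded, so a retried pull only transfers what the latest commit added:

{code:java}
import java.io.File;
import java.util.Map;

/** Hypothetical sketch of a resumable index pull; not Solr's actual replication code. */
public class ResumablePullSketch {
  /**
   * Downloads each file of the latest commit into tmpDir, skipping files that
   * were already fully downloaded by a previous (failed) attempt.
   *
   * @param expectedSizes file name -> size advertised by the master for this commit
   */
  static void pull(File tmpDir, Map<String, Long> expectedSizes) {
    for (Map.Entry<String, Long> e : expectedSizes.entrySet()) {
      File local = new File(tmpDir, e.getKey());
      if (local.exists() && local.length() == e.getValue().longValue()) {
        continue; // biggest files survive from earlier attempts; don't re-download
      }
      download(e.getKey(), local); // fetch only the missing or changed files
    }
    // The tmp dir is kept (not deleted) when an attempt fails, so each retry
    // has less left to copy and eventually succeeds.
  }

  static void download(String fileName, File dest) {
    // network transfer elided in this sketch
  }
}
{code}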








[jira] [Updated] (SOLR-6184) Replication fetchLatestIndex always fails, which causes a recovery error.

2014-06-20, Raintung Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raintung Li updated SOLR-6184:
--

Description: 
Copying a full 70G index usually takes at least 20 minutes, given ~100M 
network or disk read/write throughput.  If a hard commit happens during those 
20 minutes, the full-index snap pull fails and the temp folder is removed, 
because the pull task is considered failed. 
In production, index updates happen every minute, so the redone pull task 
always fails because the index always changes. 

Constantly redoing the pull also keeps network and disk usage at a high level.

My suggestion: fetchLatestIndex could be retried at some frequency without 
removing the tmp folder, copying the largest index files first.  A redone 
fetchLatestIndex would then not download the same biggest files again; it 
would only copy the files from the most recent commit, so the task would 
easily succeed.



  was:
Copying a full 70G index usually takes at least 20 minutes, given ~100M 
network or disk read/write throughput.  If a hard commit happens during those 
20 minutes, the full-index snap pull fails and the temp folder is removed, 
because the pull task is considered failed. 
In production, index updates happen every minute, so the redone pull task 
always fails because the index always changes. 

Constantly redoing the pull also keeps network and disk usage at a high level.

My suggestion: fetchLatestIndex could be retried at some frequency without 
removing the tmp folder, copying the largest index files first.  A redone 
fetchLatestIndex would then not download the same biggest files again; it 
would only copy the files from the most recent commit, so the task would 
easily succeed.




 Replication fetchLatestIndex always fails, which causes a recovery 
 error.
 ---

 Key: SOLR-6184
 URL: https://issues.apache.org/jira/browse/SOLR-6184
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6, 4.6.1
 Environment: the index file size is more than 70G
Reporter: Raintung Li

 Copying a full 70G index usually takes at least 20 minutes, given ~100M 
 network or disk read/write throughput.  If a hard commit happens during those 
 20 minutes, the full-index snap pull fails and the temp folder is removed, 
 because the pull task is considered failed. 
 In production, index updates happen every minute, so the redone pull task 
 always fails because the index always changes. 
 Constantly redoing the pull also keeps network and disk usage at a high 
 level.
 My suggestion: fetchLatestIndex could be retried at some frequency without 
 removing the tmp folder, copying the largest index files first.  A redone 
 fetchLatestIndex would then not download the same biggest files again; it 
 would only copy the files from the most recent commit, so the task would 
 easily succeed.






[jira] [Updated] (SOLR-6184) Replication fetchLatestIndex always fails, which causes a recovery error.

2014-06-20, Raintung Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raintung Li updated SOLR-6184:
--

Attachment: Solr-6184.txt

 Replication fetchLatestIndex always fails, which causes a recovery 
 error.
 ---

 Key: SOLR-6184
 URL: https://issues.apache.org/jira/browse/SOLR-6184
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6, 4.6.1
 Environment: the index file size is more than 70G
Reporter: Raintung Li
 Attachments: Solr-6184.txt


 Copying a full 70G index usually takes at least 20 minutes, given ~100M 
 network or disk read/write throughput.  If a hard commit happens during those 
 20 minutes, the full-index snap pull fails and the temp folder is removed, 
 because the pull task is considered failed. 
 In production, index updates happen every minute, so the redone pull task 
 always fails because the index always changes. 
 Constantly redoing the pull also keeps network and disk usage at a high 
 level.
 My suggestion: fetchLatestIndex could be retried at some frequency without 
 removing the tmp folder, copying the largest index files first.  A redone 
 fetchLatestIndex would then not download the same biggest files again; it 
 would only copy the files from the most recent commit, so the task would 
 easily succeed.






[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038625#comment-14038625
 ] 

Robert Muir commented on SOLR-6178:
---

Not a blocker. Sorry guys. Not even a bug!

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-6178:
--

Priority: Minor  (was: Blocker)

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038627#comment-14038627
 ] 

Robert Muir commented on SOLR-6178:
---

This is *WAY* too risky to do right before a release. Please do not commit 
anything at all to the 4.9 branch. This can definitely be done in 4.10.

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
Priority: Minor
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Created] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20, Adrien Grand (JIRA)
Adrien Grand created LUCENE-5780:


 Summary: OrdinalMap's mapping from global ords to segment ords is 
sometimes wasteful
 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 4.9, 5.0


Robert found a case where the ordinal map can be quite wasteful in terms of 
memory usage: in order to be able to resolve values given a global ordinal, it 
stores two things:
 - an identifier of the segment where the value is
 - the difference between the ordinal on the segment and the global ordinal

The issue is that OrdinalMap currently picks any of the segments that contain 
the value but we can do better: we can pick the first segment that has the 
value. This will help for two reasons:
 - it will potentially require fewer bits per value to store the segment ids if 
NRT segments don't introduce new values
 - if all values happen to appear in the first segment, then the map from 
global ords to deltas only stores zeros.

I just tested on an index where all values are in the first segment and this 
helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).
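
A schematic of the idea (illustrative only, not Lucene's OrdinalMap): for each global ordinal, store the first segment containing the value plus the delta between the global ordinal and that segment's ordinal; if everything appears in the first segment, all deltas are zero and pack into almost no bits:

{code:java}
/** Illustrative sketch of the mapping; not Lucene's OrdinalMap implementation. */
public class FirstSegmentOrdMapSketch {
  final int[] segmentIds; // per global ord: the segment to resolve the value in
  final long[] ordDeltas; // per global ord: globalOrd - segmentOrd in that segment

  /**
   * @param segmentOrds segmentOrds[seg][g] = ordinal of global ord g's value in
   *                    segment seg, or -1 if that segment lacks the value
   */
  FirstSegmentOrdMapSketch(long[][] segmentOrds, int globalOrdCount) {
    segmentIds = new int[globalOrdCount];
    ordDeltas = new long[globalOrdCount];
    for (int g = 0; g < globalOrdCount; g++) {
      // Pick the FIRST segment containing the value, not an arbitrary one: if
      // all values appear in segment 0, every delta below is zero, so a packed
      // representation needs almost no bits per value.
      for (int seg = 0; seg < segmentOrds.length; seg++) {
        if (segmentOrds[seg][g] >= 0) {
          segmentIds[g] = seg;
          ordDeltas[g] = g - segmentOrds[seg][g];
          break;
        }
      }
    }
  }

  /** Resolves a global ordinal to the ordinal within segmentIds[globalOrd]. */
  long segmentOrd(int globalOrd) {
    return globalOrd - ordDeltas[globalOrd];
  }
}
{code}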






[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20, Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038634#comment-14038634
 ] 

Robert Muir commented on LUCENE-5780:
-

I agree conceptually it's a bug, but I think this should be 4.10. It's not 
anything introduced in 4.9; it's been this way since 4.2.

I am already pushing back on other issues such as SOLR-6178, because I ran 125 
rounds of tests last night and want to keep things stable.


 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 4.9, 5.0


 Robert found a case where the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).






[jira] [Updated] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20, Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5780:
-

Fix Version/s: (was: 4.9)
   4.10

 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.10


 Robert found a case where the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).






[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20, Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038637#comment-14038637
 ] 

Adrien Grand commented on LUCENE-5780:
--

No worries, I just updated the fix version.

 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.10


 Robert found a case where the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).






[jira] [Commented] (LUCENE-5714) Improve tests for BBoxStrategy then port to 4x.

2014-06-20, Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038639#comment-14038639
 ] 

Robert Muir commented on LUCENE-5714:
-

This isn't going to make 4.9.

 Improve tests for BBoxStrategy then port to 4x.
 ---

 Key: LUCENE-5714
 URL: https://issues.apache.org/jira/browse/LUCENE-5714
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.9

 Attachments: 
 LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch


 BBoxStrategy needs better tests before I'm comfortable seeing it in 4x.  
 Specifically, it should use random-rectangle-based validation (with 
 rectangles that may cross the dateline), akin to the other tests.  And I 
 think I see an equals/hashCode bug to be fixed in there too.
 One particular thing I'd like to see added is how to handle a zero-area case 
 for AreaSimilarity.  I think an additional feature in which you declare a 
 minimum % area (relative to the query shape) would be good.
 It should be possible for the user to combine this with sorting by the 
 distance from the rectangle's center point to the query shape's center point 
 as well.  I think it is, but I need to make sure it's possible without 
 _having_ to index a separate center-point field.
 Another possibility (probably not to be addressed here) is a minimum 
 width/height ratio, perhaps 10%.  A long line with nearly no height should 
 not be massively disadvantaged, relevancy-wise, relative to an equivalently 
 long diagonal road that has a square bbox.






[jira] [Updated] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20, Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5780:
-

Attachment: LUCENE-5780.patch

Here is a patch (built on 4.x, since I did the tests on 4.x, but it should 
apply cleanly to trunk too).

 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5780.patch


 Robert found a case where the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).






[jira] [Comment Edited] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038644#comment-14038644
 ] 

Uwe Schindler edited comment on SOLR-6178 at 6/20/14 10:10 AM:
---

OK, let's split this issue: we need the deprecation in any case, not the change 
of defaults. I will commit that now.


was (Author: thetaphi):
OK, let's split this issue: we need the deprecation in any case, not the change 
of defaults.

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038644#comment-14038644
 ] 

Uwe Schindler commented on SOLR-6178:
-

OK, let's split this issue: we need the deprecation in any case, not the change 
of defaults.

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-6178:


Priority: Major  (was: Minor)

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Updated] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20, Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5780:
-

Attachment: LUCENE-5780.patch

Same patch but with a better assertion.

 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5780.patch, LUCENE-5780.patch


 Robert found a case where the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).






[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038652#comment-14038652
 ] 

ASF subversion and git services commented on SOLR-6178:
---

Commit 1604124 from [~thetaphi] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1604124 ]

Merged revision(s) 1604122 from lucene/dev/trunk:
SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038648#comment-14038648
 ] 

ASF subversion and git services commented on SOLR-6178:
---

Commit 1604122 from [~thetaphi] in branch 'dev/trunk'
[ https://svn.apache.org/r1604122 ]

SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Commented] (LUCENE-5775) JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError

2014-06-20, ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038653#comment-14038653
 ] 

ASF subversion and git services commented on LUCENE-5775:
-

Commit 1604124 from [~thetaphi] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1604124 ]

Merged revision(s) 1604122 from lucene/dev/trunk:
SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory

 JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError
 -

 Key: LUCENE-5775
 URL: https://issues.apache.org/jira/browse/LUCENE-5775
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.9, 5.0


 I hit this when trying to run LookupBenchmarkTest for LUCENE-5752:
 {noformat}
[junit4]   2> NOTE: reproduce with: ant test 
 -Dtestcase=LookupBenchmarkTest -Dtests.method=testStorageNeeds 
 -Dtests.seed=EA0FADB2EE37D385 -Dtests.locale=es_ES 
 -Dtests.timezone=Etc/Greenwich -Dtests.file.encoding=UTF-8
[junit4] ERROR   1.89s | LookupBenchmarkTest.testStorageNeeds 
[junit4] Throwable #1: java.lang.StackOverflowError
[junit4]  at 
 __randomizedtesting.SeedInfo.seed([EA0FADB2EE37D385:DF8106BCB29C472F]:0)
[junit4]  at java.lang.Class.getMethod0(Class.java:2774)
[junit4]  at 
 java.lang.Class.isCheckMemberAccessOverridden(Class.java:2214)
[junit4]  at java.lang.Class.checkMemberAccess(Class.java:2233)
[junit4]  at java.lang.Class.getDeclaredFields(Class.java:1805)
[junit4]  at 
 org.apache.lucene.util.RamUsageEstimator.shallowSizeOfInstance(RamUsageEstimator.java:351)
[junit4]  at 
 org.apache.lucene.util.RamUsageEstimator.shallowSizeOf(RamUsageEstimator.java:329)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:100)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
 {noformat}
 I think we should just remove/deprecate this suggester?  The FST-based 
 suggesters are far more RAM efficient...
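
The overflow comes from TSTNode.ramBytesUsed recursing into child nodes, so a deep, unbalanced trie exhausts the call stack. The project chose deprecation instead, but for illustration, here is a sketch of the usual fix, replacing recursion with an explicit worklist (the node fields are assumed from a typical ternary search trie; this is not the Jaspell source):

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative iterative traversal; the node shape is assumed, not Jaspell's class. */
class TstNodeSketch {
  TstNodeSketch loKid, eqKid, hiKid; // assumed ternary-search-trie children
  long shallowSizeBytes;             // assumed per-node shallow size

  /** Sums node sizes without recursion, so trie depth cannot overflow the stack. */
  long ramBytesUsed() {
    long total = 0;
    Deque<TstNodeSketch> stack = new ArrayDeque<>();
    stack.push(this);
    while (!stack.isEmpty()) {
      TstNodeSketch node = stack.pop();
      total += node.shallowSizeBytes;
      if (node.loKid != null) stack.push(node.loKid);
      if (node.eqKid != null) stack.push(node.eqKid);
      if (node.hiKid != null) stack.push(node.hiKid);
    }
    return total;
  }
}
{code}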






[jira] [Commented] (LUCENE-5775) JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError

2014-06-20, ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038649#comment-14038649
 ] 

ASF subversion and git services commented on LUCENE-5775:
-

Commit 1604122 from [~thetaphi] in branch 'dev/trunk'
[ https://svn.apache.org/r1604122 ]

SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory

 JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError
 -

 Key: LUCENE-5775
 URL: https://issues.apache.org/jira/browse/LUCENE-5775
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.9, 5.0


 I hit this when trying to run LookupBenchmarkTest for LUCENE-5752:
 {noformat}
[junit4]   2> NOTE: reproduce with: ant test 
 -Dtestcase=LookupBenchmarkTest -Dtests.method=testStorageNeeds 
 -Dtests.seed=EA0FADB2EE37D385 -Dtests.locale=es_ES 
 -Dtests.timezone=Etc/Greenwich -Dtests.file.encoding=UTF-8
[junit4] ERROR   1.89s | LookupBenchmarkTest.testStorageNeeds 
[junit4] Throwable #1: java.lang.StackOverflowError
[junit4]  at 
 __randomizedtesting.SeedInfo.seed([EA0FADB2EE37D385:DF8106BCB29C472F]:0)
[junit4]  at java.lang.Class.getMethod0(Class.java:2774)
[junit4]  at 
 java.lang.Class.isCheckMemberAccessOverridden(Class.java:2214)
[junit4]  at java.lang.Class.checkMemberAccess(Class.java:2233)
[junit4]  at java.lang.Class.getDeclaredFields(Class.java:1805)
[junit4]  at 
 org.apache.lucene.util.RamUsageEstimator.shallowSizeOfInstance(RamUsageEstimator.java:351)
[junit4]  at 
 org.apache.lucene.util.RamUsageEstimator.shallowSizeOf(RamUsageEstimator.java:329)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:100)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
 {noformat}
 I think we should just remove/deprecate this suggester?  The FST-based 
 suggesters are far more RAM efficient...






[jira] [Commented] (LUCENE-5775) JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError

2014-06-20, ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038655#comment-14038655
 ] 

ASF subversion and git services commented on LUCENE-5775:
-

Commit 1604125 from [~thetaphi] in branch 'dev/branches/lucene_solr_4_9'
[ https://svn.apache.org/r1604125 ]

Merged revision(s) 1604122 from lucene/dev/trunk:
SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory

 JaspellTernarySearchTrie.ramBytesUsed hits StackOverflowError
 -

 Key: LUCENE-5775
 URL: https://issues.apache.org/jira/browse/LUCENE-5775
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
 Fix For: 4.9, 5.0


 I hit this when trying to run LookupBenchmarkTest for LUCENE-5752:
 {noformat}
[junit4]   2> NOTE: reproduce with: ant test 
 -Dtestcase=LookupBenchmarkTest -Dtests.method=testStorageNeeds 
 -Dtests.seed=EA0FADB2EE37D385 -Dtests.locale=es_ES 
 -Dtests.timezone=Etc/Greenwich -Dtests.file.encoding=UTF-8
[junit4] ERROR   1.89s | LookupBenchmarkTest.testStorageNeeds 
[junit4] Throwable #1: java.lang.StackOverflowError
[junit4]  at 
 __randomizedtesting.SeedInfo.seed([EA0FADB2EE37D385:DF8106BCB29C472F]:0)
[junit4]  at java.lang.Class.getMethod0(Class.java:2774)
[junit4]  at 
 java.lang.Class.isCheckMemberAccessOverridden(Class.java:2214)
[junit4]  at java.lang.Class.checkMemberAccess(Class.java:2233)
[junit4]  at java.lang.Class.getDeclaredFields(Class.java:1805)
[junit4]  at 
 org.apache.lucene.util.RamUsageEstimator.shallowSizeOfInstance(RamUsageEstimator.java:351)
[junit4]  at 
 org.apache.lucene.util.RamUsageEstimator.shallowSizeOf(RamUsageEstimator.java:329)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:100)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
[junit4]  at 
 org.apache.lucene.search.suggest.jaspell.JaspellTernarySearchTrie$TSTNode.ramBytesUsed(JaspellTernarySearchTrie.java:103)
 {noformat}
 I think we should just remove/deprecate this suggester?  The FST-based 
 suggesters are far more RAM efficient...






[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20, Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038656#comment-14038656
 ] 

Robert Muir commented on SOLR-6178:
---

This didn't make 4.9.

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Updated] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20, Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-6178:


Summary: Deprecate Jaspell suggester  (was: Deprecate Jaspell suggester and 
change the default)

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Resolved] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20, Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved SOLR-6178.
-

Resolution: Fixed

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester and change the default

2014-06-20, ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038654#comment-14038654
 ] 

ASF subversion and git services commented on SOLR-6178:
---

Commit 1604125 from [~thetaphi] in branch 'dev/branches/lucene_solr_4_9'
[ https://svn.apache.org/r1604125 ]

Merged revision(s) 1604122 from lucene/dev/trunk:
SOLR-6178, LUCENE-5775: Deprecate JaspellLookupFactory

 Deprecate Jaspell suggester and change the default
 --

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.






[jira] [Updated] (SOLR-6185) Change suggester default in Solr, because Jaspell is deprecated

2014-06-20, Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-6185:


Affects Version/s: 4.9
Fix Version/s: 5.0
   Issue Type: Task  (was: Bug)

 Change suggester default in Solr, because Jaspell is deprecated
 ---

 Key: SOLR-6185
 URL: https://issues.apache.org/jira/browse/SOLR-6185
 Project: Solr
  Issue Type: Task
Affects Versions: 4.9
Reporter: Uwe Schindler
 Fix For: 5.0


 In SOLR-6178 and LUCENE-5775 we deprecated JaspellLookup, but it is still the 
 default in 4.9; Solr warns only if you explicitly use it. We should change 
 the default for 5.0.






[jira] [Created] (SOLR-6185) Change suggester default in Solr, because Jaspell is deprecated

2014-06-20, Uwe Schindler (JIRA)
Uwe Schindler created SOLR-6185:
---

 Summary: Change suggester default in Solr, because Jaspell is 
deprecated
 Key: SOLR-6185
 URL: https://issues.apache.org/jira/browse/SOLR-6185
 Project: Solr
  Issue Type: Bug
Reporter: Uwe Schindler


In SOLR-6178 and LUCENE-5775 we deprecated JaspellLookup, but it is still the 
default in 4.9; Solr warns only if you explicitly use it. We should change the 
default for 5.0.






[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20, Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038657#comment-14038657
 ] 

Uwe Schindler commented on SOLR-6178:
-

Sorry, I don't see a release.

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038659#comment-14038659
 ] 

Robert Muir commented on SOLR-6178:
---

Well, you can vote however you like. I already spun the RC and it's in progress.

It's risky as shit to add such deprecations, which will cause warning messages for 
Solr users and create a controversy. There is no need to rush such things in at 
the last minute.

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038664#comment-14038664
 ] 

Michael McCandless commented on SOLR-6178:
--

In hindsight I should have just backported the bug fix in LUCENE-5775 (stack 
overflow when calling .ramBytesUsed()), and not the deprecation, for 4.9  
next time I'll make separate issues.  Sorry for the hassle.

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038663#comment-14038663
 ] 

Uwe Schindler commented on SOLR-6178:
-

This was not last minute. Mike just missed adding the deprecation in Solr as well. 
If we go this way, please also revert Mike's commit. Sorry.

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038669#comment-14038669
 ] 

Robert Muir commented on SOLR-6178:
---

It's last minute because it can be potentially controversial. Deprecating the 
Lucene suggester has no impact on Solr users. Putting a big warning in the 
default configuration is a different story.

I didn't revert anything, your commit just didn't make it in time for the 
release candidate. I don't want such controversy: I care about keeping things 
stable and about avoiding bad bugs. I can EASILY see such a warning turning into 
"we should change the Solr default" and spinning totally out of control and 
somehow breaking the example.

I waited extra time for SOLR-6182 because it looked like a really bad bug. And 
if a similar bug like that pops up I won't even have a second thought about 
respinning for it. But by no means is this a blocker.

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038674#comment-14038674
 ] 

Robert Muir commented on LUCENE-5780:
-

This looks good (+1 to commit to trunk/4.10) but I think we can do better, by 
explicitly sorting. E.g. take a long[] sizes parameter (it can be optional, and 
all zeros would give us what we have today if we use a stable sort) that the user 
could populate with either valueCount or the number of docs in the segment 
(either is probably a fine heuristic); see the sketch below.

I know this means we will need an array to remap lookups, but this only happens 
once per segment with the new LongValues api so it won't impact performance.
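
(For illustration, a minimal sketch of that sorting idea, with made-up names such 
as orderBySize, newToOld and oldToNew; this is not the actual OrdinalMap patch. 
A stable sort over an all-zero sizes array reproduces today's segment order, and 
oldToNew is the remap array mentioned above.)

{code:java}
import java.util.Arrays;
import java.util.Comparator;

public class SegmentOrderingSketch {
  // Returns newToOld: newToOld[newId] = original segment index, ordered by
  // descending size. Sorting an Integer[] uses a stable sort, so all-zero
  // sizes keep the original segment order.
  static Integer[] orderBySize(final long[] sizes) {
    Integer[] newToOld = new Integer[sizes.length];
    for (int i = 0; i < newToOld.length; i++) {
      newToOld[i] = i;
    }
    Arrays.sort(newToOld, new Comparator<Integer>() {
      @Override
      public int compare(Integer a, Integer b) {
        return Long.compare(sizes[b], sizes[a]); // descending by size
      }
    });
    return newToOld;
  }

  public static void main(String[] args) {
    long[] sizes = {100, 5000, 42}; // e.g. per-segment valueCount or doc count
    Integer[] newToOld = orderBySize(sizes);
    // The remap array for lookups, built once per segment:
    // oldToNew[originalSegmentIndex] = position in the sorted order.
    int[] oldToNew = new int[newToOld.length];
    for (int newId = 0; newId < newToOld.length; newId++) {
      oldToNew[newToOld[newId]] = newId;
    }
    System.out.println(Arrays.toString(newToOld)); // [1, 0, 2]
    System.out.println(Arrays.toString(oldToNew)); // [1, 0, 2]
  }
}
{code}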

 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5780.patch, LUCENE-5780.patch


 Robert found a case when the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).
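
(For illustration, a toy model of the encoding described above, with made-up 
names; this is not OrdinalMap's actual code. It just shows why all-zero deltas 
are the best case.)

{code:java}
// Toy model of the (segment id, delta) encoding; illustrative names only.
public class GlobalOrdSketch {
  final int[] segmentIds;       // per global ord: the segment holding the value
  final long[] globalOrdDeltas; // per global ord: globalOrd - segmentOrd

  GlobalOrdSketch(int[] segmentIds, long[] globalOrdDeltas) {
    this.segmentIds = segmentIds;
    this.globalOrdDeltas = globalOrdDeltas;
  }

  int segment(int globalOrd) {
    return segmentIds[globalOrd];
  }

  long segmentOrd(int globalOrd) {
    // If every value appears in the first segment, all deltas are zero and
    // the delta array compresses to almost nothing.
    return globalOrd - globalOrdDeltas[globalOrd];
  }
}
{code}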



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-6186) Highlight on parent in block join query.

2014-06-20 Thread Rob Bamberg (JIRA)
Rob Bamberg created SOLR-6186:
-

 Summary: Highlight on parent in block join query.
 Key: SOLR-6186
 URL: https://issues.apache.org/jira/browse/SOLR-6186
 Project: Solr
  Issue Type: Improvement
  Components: highlighter
Affects Versions: 4.8.1
Reporter: Rob Bamberg
 Fix For: 4.9, 5.0


The issue I have is with the highlighting component during block joins, where I 
am searching on the parent document but returning the children.

As I am returning the children, the snippets also apply to the children, even 
though I am actually searching in the parent (search query example: {!child 
of=type:parent}q).

So would it be possible to return snippets for the parent, which I am actually 
searching on?



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038726#comment-14038726
 ] 

Uwe Schindler commented on SOLR-6178:
-

In my opinion, we should revert this one *and* the deprecation inside Lucene. 
To me it is also very controversial if Solr defaults to something that is 
deprecated in Lucene because it uses too much memory!

I agree to revert this issue if the deprecation in Lucene is also reverted.

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[VOTE] 4.9.0

2014-06-20 Thread Robert Muir
Artifacts here:
http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/

Here's my +1

SUCCESS! [0:35:36.654925]

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038732#comment-14038732
 ] 

Robert Muir commented on SOLR-6178:
---

Well, that's your opinion. 

There is absolutely no way I am going to let a Lucene deprecation annotation 
have a destabilizing snowball effect on this release: first it's a deprecation 
warning, then someone doesn't want the warning in the default configuration and 
wants a change of defaults, then the change of defaults breaks the example, 
then the change to the example breaks the documentation or tutorial, and on and 
on and on.

There is just no reason for such risks.


 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038734#comment-14038734
 ] 

ASF subversion and git services commented on LUCENE-5780:
-

Commit 1604157 from [~jpountz] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1604157 ]

LUCENE-5780: Make OrdinalMap more memory-efficient.

 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5780.patch, LUCENE-5780.patch


 Robert found a case when the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038733#comment-14038733
 ] 

Adrien Grand commented on LUCENE-5780:
--

+1 I had planned to open a follow-up issue about it. Will open it shortly.

 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5780.patch, LUCENE-5780.patch


 Robert found a case when the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038736#comment-14038736
 ] 

ASF subversion and git services commented on LUCENE-5780:
-

Commit 1604158 from [~jpountz] in branch 'dev/trunk'
[ https://svn.apache.org/r1604158 ]

LUCENE-5780: Make OrdinalMap more memory-efficient.

 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5780.patch, LUCENE-5780.patch


 Robert found a case when the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5714) Improve tests for BBoxStrategy then port to 4x.

2014-06-20 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-5714:
-

Fix Version/s: (was: 4.9)
   4.10

That's right [~rcmuir]; it was assigned as such prior to your release branch. I 
bumped it to 4.10.

 Improve tests for BBoxStrategy then port to 4x.
 ---

 Key: LUCENE-5714
 URL: https://issues.apache.org/jira/browse/LUCENE-5714
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.10

 Attachments: 
 LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch


 BBoxStrategy needs better tests before I'm comfortable seeing it in 4x.  
 Specifically it should use random rectangles based validation (ones that may 
 cross the dateline), akin to the other tests.  And I think I see an 
 equals/hashcode bug to be fixed in there too.
 One particular thing I'd like to see added is how to handle a zero-area case 
 for AreaSimilarity.  I think an additional feature in which you declare a 
 minimum % area (relative to the query shape) would be good.
 It should be possible for the user to combine rectangle center-point to query 
 shape center-point distance sorting as well.  I think it is but I need to 
 make sure it's possible without _having_ to index a separate center point 
 field.
 Another possibility (probably not to be addressed here) is a minimum ratio 
 between width/height, perhaps 10%.  A long but nearly no height line should 
 not be massively disadvantaged relevancy-wise to an equivalently long 
 diagonal road that has a square bbox.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6178) Deprecate Jaspell suggester

2014-06-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038739#comment-14038739
 ] 

Robert Muir commented on SOLR-6178:
---

Also, changing the default gets even more complex: it's been Jaspell for a long 
time, and changing the default could cause a ton of confusion (users think 4.9 
cannot read their index). Even though we know that suggester data is really 
not part of the index, the user just may not see it that way: to them it's 
their data and the upgrade broke them.

Such a change needs to be planned out from a docs and release notes 
perspective, not shoved in hastily at the last minute. It's too risky.

 Deprecate Jaspell suggester
 ---

 Key: SOLR-6178
 URL: https://issues.apache.org/jira/browse/SOLR-6178
 Project: Solr
  Issue Type: Improvement
  Components: spellchecker
Reporter: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: SOLR-6178.patch


 Right now Solr defaults to Jaspell, but we've deprecated it in LUCENE-5775 
 ... and in trunk I'd like to remove it.  But first we need to fix Solr to not 
 default to it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-5780) OrdinalMap's mapping from global ords to segment ords is sometimes wasteful

2014-06-20 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-5780.
--

Resolution: Fixed

 OrdinalMap's mapping from global ords to segment ords is sometimes wasteful
 ---

 Key: LUCENE-5780
 URL: https://issues.apache.org/jira/browse/LUCENE-5780
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Adrien Grand
Assignee: Adrien Grand
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5780.patch, LUCENE-5780.patch


 Robert found a case when the ordinal map can be quite wasteful in terms of 
 memory usage: in order to be able to resolve values given a global ordinal, 
 it stores two things:
  - an identifier of the segment where the value is
  - the difference between the ordinal on the segment and the global ordinal
 The issue is that OrdinalMap currently picks any of the segments that contain 
 the value but we can do better: we can pick the first segment that has the 
 value. This will help for two reasons:
  - it will potentially require fewer bits per value to store the segment ids 
 if NRT segments don't introduce new values
  - if all values happen to appear in the first segment, then the map from 
 global ords to deltas only stores zeros.
 I just tested on an index where all values are in the first segment and this 
 helped reduce memory usage of the ordinal map by 4x (from 3.5MB to 800KB).



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [VOTE] 4.9.0

2014-06-20 Thread Michael McCandless
+1

SUCCESS! [0:47:26.115239]

Mike McCandless

http://blog.mikemccandless.com


On Fri, Jun 20, 2014 at 8:13 AM, Robert Muir rcm...@gmail.com wrote:
 Artifacts here:
 http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/

 Here's my +1

 SUCCESS! [0:35:36.654925]

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-6187) facet.mincount ignored in range date faceting using distributed search

2014-06-20 Thread Zaccheo Bagnati (JIRA)
Zaccheo Bagnati created SOLR-6187:
-

 Summary: facet.mincount ignored in range date faceting using 
distributed search
 Key: SOLR-6187
 URL: https://issues.apache.org/jira/browse/SOLR-6187
 Project: Solr
  Issue Type: Bug
  Components: search
Affects Versions: 4.8.1, 4.8
Reporter: Zaccheo Bagnati


While I was trying to do range faceting with gap +1YEAR using shards, I 
noticed that the facet.mincount parameter seems to be ignored.

The issue can be reproduced in this way:
Create two cores, testshard1 and testshard2, with:

solrconfig.xml
<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_41</luceneMatchVersion>
  <lib dir="/opt/solr/dist" regex="solr-cell-.*\.jar"/>
  <directoryFactory name="DirectoryFactory" 
      class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
  <updateHandler class="solr.DirectUpdateHandler2" />
  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <int name="rows">10</int>
      <str name="df">id</str>
    </lst>
  </requestHandler>
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/admin/" 
      class="org.apache.solr.handler.admin.AdminHandlers" />
  <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
    <lst name="invariants">
      <str name="q">solrpingquery</str>
    </lst>
    <lst name="defaults">
      <str name="echoParams">all</str>
    </lst>
  </requestHandler>
</config>

schema.xml
<?xml version="1.0" ?>
<schema name="${solr.core.name}" version="1.5" 
    xmlns:xi="http://www.w3.org/2001/XInclude">
  <fieldType name="int" class="solr.TrieIntField" precisionStep="0" 
      positionIncrementGap="0"/>
  <fieldType name="long" class="solr.TrieLongField" precisionStep="0" 
      positionIncrementGap="0"/>
  <fieldType name="date" class="solr.TrieDateField" precisionStep="0" 
      positionIncrementGap="0"/>
  <field name="_version_" type="long" indexed="true" stored="true"/>
  <field name="id" type="int" indexed="true" stored="true" multiValued="false" />
  <field name="date" type="date" indexed="true" stored="true" 
      multiValued="false" />
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>id</defaultSearchField>
</schema>

Insert in testshard1:
<add>
 <doc>
  <field name="id">1</field>
  <field name="date">2014-06-20T12:51:00Z</field>
 </doc>
</add>

Insert into testshard2:
<add>
 <doc>
  <field name="id">2</field>
  <field name="date">2013-06-20T12:51:00Z</field>
 </doc>
</add>

Now if I execute:

curl 
"http://localhost:8983/solr/testshard1/select?q=id:1&facet=true&facet.mincount=1&facet.range=date&f.date.facet.range.start=1900-01-01T00:00:00Z&f.date.facet.range.end=NOW&f.date.facet.range.gap=%2B1YEAR&shards=localhost%3A8983%2Fsolr%2Ftestshard1%2Clocalhost%3A8983%2Fsolr%2Ftestshard2&shards.info=true&wt=json"

I obtain:

[jira] [Created] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter

2014-06-20 Thread Ahmet Arslan (JIRA)
Ahmet Arslan created SOLR-6188:
--

 Summary: solr.ICUFoldingFilterFactory causes NoClassDefFoundError: 
o/a/l/a/icu/ICUFoldingFilter
 Key: SOLR-6188
 URL: https://issues.apache.org/jira/browse/SOLR-6188
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.8.1
Reporter: Ahmet Arslan


When the fully qualified class name

{{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}}

is used in schema.xml, it works. However, as documented in Confluence and the 
wiki, when {{solr.ICUFoldingFilterFactory}} is used it throws the following 
exception.

This is true for both the released 4.8.1 version and trunk r1604168.
The following type works:

{code:xml}
 <fieldType name="folded2" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/>
  </analyzer>
 </fieldType>
{code}


this does not:
{code:xml}
 <fieldType name="folded" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ICUFoldingFilterFactory"/>
  </analyzer>
 </fieldType>
{code}


{noformat}
257 [main] ERROR org.apache.solr.core.SolrCore  – Error loading 
core:java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
org/apache/lucene/analysis/icu/ICUFoldingFilter
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:188)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301)
at 
org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190)
at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137)
at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at 
org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719)
at 
org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265)
at 
org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252)
at 
org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710)
at 
org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at 
org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39)
at 
org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186)
at 
org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494)
at 
org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141)
at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145)
at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:56)
at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:609)
at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:540)
at org.eclipse.jetty.util.Scanner.scan(Scanner.java:403)
at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:337)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at 
org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:121)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at 
org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:555)
at 
org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:230)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at 
org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:81)
at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:58)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:96)
at org.eclipse.jetty.server.Server.doStart(Server.java:280)
at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
at 
org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1259)
at java.security.AccessController.doPrivileged(Native Method)
at 
org.eclipse.jetty.xml.XmlConfiguration.main(XmlConfiguration.java:1182)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 

[jira] [Updated] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter

2014-06-20 Thread Ahmet Arslan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Arslan updated SOLR-6188:
---

Fix Version/s: 4.9

 solr.ICUFoldingFilterFactory causes NoClassDefFoundError: 
 o/a/l/a/icu/ICUFoldingFilter
 --

 Key: SOLR-6188
 URL: https://issues.apache.org/jira/browse/SOLR-6188
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.8.1
Reporter: Ahmet Arslan
  Labels: ICUFoldingFilterFactory
 Fix For: 4.9


 When the fully qualified class name 
 {{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}}
 is used in schema.xml, it works. However, as documented in Confluence and the wiki, when 
 {{solr.ICUFoldingFilterFactory}} is used it throws the following exception.
 This is true for both the released 4.8.1 version and trunk r1604168.
 The following type works:
 {code:xml}
  <fieldType name="folded2" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/>
   </analyzer>
  </fieldType>
 {code}
 this does not:
 {code:xml}
  <fieldType name="folded" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.ICUFoldingFilterFactory"/>
   </analyzer>
  </fieldType>
 {code}
 {noformat}
 257 [main] ERROR org.apache.solr.core.SolrCore  – Error loading 
 core:java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
 org/apache/lucene/analysis/icu/ICUFoldingFilter
   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
   at java.util.concurrent.FutureTask.get(FutureTask.java:188)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137)
   at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719)
   at 
 org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265)
   at 
 org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710)
   at 
 org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39)
   at 
 org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:56)
   at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:609)
   at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:540)
   at org.eclipse.jetty.util.Scanner.scan(Scanner.java:403)
   at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:337)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:121)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:555)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:230)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:81)
   at 
 org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:58)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:96)
   at org.eclipse.jetty.server.Server.doStart(Server.java:280)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1259)
   at java.security.AccessController.doPrivileged(Native Method)
   at 
 

[jira] [Updated] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter

2014-06-20 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated SOLR-6188:
--

Fix Version/s: (was: 4.9)
   4.10

 solr.ICUFoldingFilterFactory causes NoClassDefFoundError: 
 o/a/l/a/icu/ICUFoldingFilter
 --

 Key: SOLR-6188
 URL: https://issues.apache.org/jira/browse/SOLR-6188
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.8.1
Reporter: Ahmet Arslan
  Labels: ICUFoldingFilterFactory
 Fix For: 4.10


 When the fully qualified class name 
 {{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}}
 is used in schema.xml, it works. However, as documented in Confluence and the wiki, when 
 {{solr.ICUFoldingFilterFactory}} is used it throws the following exception.
 This is true for both the released 4.8.1 version and trunk r1604168.
 The following type works:
 {code:xml}
  <fieldType name="folded2" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/>
   </analyzer>
  </fieldType>
 {code}
 this does not:
 {code:xml}
  <fieldType name="folded" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.ICUFoldingFilterFactory"/>
   </analyzer>
  </fieldType>
 {code}
 {noformat}
 257 [main] ERROR org.apache.solr.core.SolrCore  – Error loading 
 core:java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
 org/apache/lucene/analysis/icu/ICUFoldingFilter
   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
   at java.util.concurrent.FutureTask.get(FutureTask.java:188)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137)
   at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719)
   at 
 org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265)
   at 
 org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710)
   at 
 org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39)
   at 
 org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:56)
   at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:609)
   at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:540)
   at org.eclipse.jetty.util.Scanner.scan(Scanner.java:403)
   at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:337)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:121)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:555)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:230)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:81)
   at 
 org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:58)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:96)
   at org.eclipse.jetty.server.Server.doStart(Server.java:280)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.xml.XmlConfiguration$1.run(XmlConfiguration.java:1259)
   at java.security.AccessController.doPrivileged(Native 

[jira] [Commented] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter

2014-06-20 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038783#comment-14038783
 ] 

Robert Muir commented on SOLR-6188:
---

Usually this is because you have not configured the correct classpath.
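
(For illustration: short names like {{solr.ICUFoldingFilterFactory}} are resolved 
against the core's class loader, roughly in the spirit of the sketch below; the 
package list here is invented for illustration and is not Solr's actual 
resolution code. The point is that which class loader does the probing is 
exactly where classpath setup matters.)

{code:java}
// Hypothetical illustration of short-name resolution; not Solr's actual code.
public class ShortNameResolutionSketch {
  static Class<?> resolve(String name, ClassLoader loader) throws ClassNotFoundException {
    if (!name.startsWith("solr.")) {
      return Class.forName(name, true, loader); // fully qualified: direct lookup
    }
    String suffix = name.substring("solr.".length());
    String[] candidatePackages = { // invented list, for illustration only
        "org.apache.solr.analysis.",
        "org.apache.lucene.analysis.icu.",
    };
    for (String pkg : candidatePackages) {
      try {
        return Class.forName(pkg + suffix, true, loader);
      } catch (ClassNotFoundException e) {
        // not in this package; try the next candidate
      }
    }
    throw new ClassNotFoundException(name);
  }
}
{code}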

 solr.ICUFoldingFilterFactory causes NoClassDefFoundError: 
 o/a/l/a/icu/ICUFoldingFilter
 --

 Key: SOLR-6188
 URL: https://issues.apache.org/jira/browse/SOLR-6188
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.8.1
Reporter: Ahmet Arslan
  Labels: ICUFoldingFilterFactory
 Fix For: 4.10


 When the fully qualified class name 
 {{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}}
 is used in schema.xml, it works. However, as documented in Confluence and the wiki, when 
 {{solr.ICUFoldingFilterFactory}} is used it throws the following exception.
 This is true for both the released 4.8.1 version and trunk r1604168.
 The following type works:
 {code:xml}
  <fieldType name="folded2" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/>
   </analyzer>
  </fieldType>
 {code}
 this does not:
 {code:xml}
  <fieldType name="folded" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.ICUFoldingFilterFactory"/>
   </analyzer>
  </fieldType>
 {code}
 {noformat}
 257 [main] ERROR org.apache.solr.core.SolrCore  – Error loading 
 core:java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
 org/apache/lucene/analysis/icu/ICUFoldingFilter
   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
   at java.util.concurrent.FutureTask.get(FutureTask.java:188)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137)
   at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719)
   at 
 org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265)
   at 
 org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710)
   at 
 org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39)
   at 
 org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:56)
   at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:609)
   at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:540)
   at org.eclipse.jetty.util.Scanner.scan(Scanner.java:403)
   at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:337)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:121)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:555)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:230)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:81)
   at 
 org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:58)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:96)
   at org.eclipse.jetty.server.Server.doStart(Server.java:280)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 

[jira] [Commented] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter

2014-06-20 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038790#comment-14038790
 ] 

Ahmet Arslan commented on SOLR-6188:


Hi Robert, I thought the same at first, but I have these two jars: 
{{icu4j-53.1.jar}} and {{lucene-analyzers-icu-5.0-SNAPSHOT.jar}} inside the
solr-trunk/solr/example/solr/collection1/lib directory. Besides, if the classpath 
were wrong, it shouldn't work when 
org.apache.lucene.analysis.icu.ICUFoldingFilterFactory is used either.

I am downloading your 4.9 release candidate; I will test it with that too.

 solr.ICUFoldingFilterFactory causes NoClassDefFoundError: 
 o/a/l/a/icu/ICUFoldingFilter
 --

 Key: SOLR-6188
 URL: https://issues.apache.org/jira/browse/SOLR-6188
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.8.1
Reporter: Ahmet Arslan
  Labels: ICUFoldingFilterFactory
 Fix For: 4.10


 When the fully qualified class name 
 {{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}}
 is used in schema.xml, it works. However, as documented in Confluence and the wiki, when 
 {{solr.ICUFoldingFilterFactory}} is used it throws the following exception.
 This is true for both the released 4.8.1 version and trunk r1604168.
 The following type works:
 {code:xml}
  <fieldType name="folded2" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/>
   </analyzer>
  </fieldType>
 {code}
 this does not:
 {code:xml}
  <fieldType name="folded" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.ICUFoldingFilterFactory"/>
   </analyzer>
  </fieldType>
 {code}
 {noformat}
 257 [main] ERROR org.apache.solr.core.SolrCore  – Error loading 
 core:java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
 org/apache/lucene/analysis/icu/ICUFoldingFilter
   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
   at java.util.concurrent.FutureTask.get(FutureTask.java:188)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137)
   at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719)
   at 
 org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265)
   at 
 org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710)
   at 
 org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39)
   at 
 org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:56)
   at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:609)
   at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:540)
   at org.eclipse.jetty.util.Scanner.scan(Scanner.java:403)
   at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:337)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:121)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:555)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:230)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:81)
   at 
 org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:58)
   at 
 

[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( 1MB) to Zookeeper

2014-06-20 Thread Elaine Cario (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038797#comment-14038797
 ] 

Elaine Cario commented on SOLR-4793:


I'm finding that setting jute.maxbuffer both in Solr (-D option at Tomcat 
startup) and in ZooKeeper (zoo.cfg) doesn't seem to work (at least with Solr 4.8). 
This is really becoming a blocker for us, as we are using index-time synonym 
replacement as a poor man's lemmatization, and these files get quite large. It 
would be nice to have some option to have these files managed outside of 
ZooKeeper.
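
(Possibly relevant detail: jute.maxbuffer is a Java system property, so it 
normally has to be passed as a JVM flag, with the same value, on both the 
ZooKeeper server JVMs and every client JVM; setting it in zoo.cfg alone may not 
take effect. An illustrative flag, value made up:)

{noformat}
-Djute.maxbuffer=20971520   (ZooKeeper server JVM, e.g. via JVMFLAGS, and each Solr JVM)
{noformat}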

 Solr Cloud can't upload large config files (  1MB)  to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt
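
(As a rough sketch of that splitting idea, with hypothetical names and chunk 
size; this is not ZkSolrResourceLoader's actual behavior. A large file would be 
cut into numbered znodes under the ~1MB limit and rejoined on read.)

{code:java}
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class ZnodeChunkSketch {
  static final int MAX_CHUNK = 1024 * 1024 - 1024; // stay safely under the znode cap

  // Split a large config file into chunks that would be stored as
  // __001_synonyms.txt, __002_synonyms.txt, ...
  static List<byte[]> split(byte[] data) {
    List<byte[]> chunks = new ArrayList<byte[]>();
    for (int off = 0; off < data.length; off += MAX_CHUNK) {
      int len = Math.min(MAX_CHUNK, data.length - off);
      byte[] chunk = new byte[len];
      System.arraycopy(data, off, chunk, 0, len);
      chunks.add(chunk);
    }
    return chunks;
  }

  // Rejoin the chunks, in order, when a node downloads its config from ZK.
  static byte[] join(List<byte[]> chunks) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (byte[] chunk : chunks) {
      out.write(chunk, 0, chunk.length);
    }
    return out.toByteArray();
  }
}
{code}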



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038810#comment-14038810
 ] 

Yago Riveiro commented on SOLR-4793:


I think that version 4.8 updated the ZooKeeper version to 3.4.6.

If the workaround doesn't work then it is a serious issue if you have a large 
number of collections and replicas, because all metadata about the cluster is 
in the clusterstate.json file.

[~ecario], how did you notice that the workaround doesn't work? Do you have any 
logs or something? And a last question: did you upgrade Solr from 4.7 to 4.8 or 
is it a fresh install?




 Solr Cloud can't upload large config files (  1MB)  to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [VOTE] 4.9.0

2014-06-20 Thread Ahmet Arslan
Hi,

here is what I do

* download solr-4.9.0.tgz
* add icu4j-53.1.jar and solr-analysis-extras-4.9.0.jar and 
lucene-analyzers-icu-4.9.0.jar to solr-4.9.0/example/solr/collection1/lib/

* confirm they are loaded

INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/icu4j-53.1.jar'
 to class loader

INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/lucene-analyzers-icu-4.9.0.jar'
 to classloader

INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/solr-analysis-extras-4.9.0.jar'
 to class loader

icu4j-53.1.jar loaded twice 

INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/Volumes/datadisk/Desktop/solr-4.9.0/contrib/extraction/lib/icu4j-53.1.jar'
 to classloader


* add <filter class="solr.ICUFoldingFilterFactory"/> to the example schema.xml

* java -jar start.jar yields the exception reported in SOLR-6188


When <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/> 
is used, everything works fine.

Thanks,
Ahmet



On Friday, June 20, 2014 3:55 PM, Michael McCandless 
luc...@mikemccandless.com wrote:
+1

SUCCESS! [0:47:26.115239]

Mike McCandless

http://blog.mikemccandless.com


On Fri, Jun 20, 2014 at 8:13 AM, Robert Muir rcm...@gmail.com wrote:
 Artifacts here:
 http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/

 Here's my +1

 SUCCESS! [0:35:36.654925]

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [VOTE] 4.9.0

2014-06-20 Thread Robert Muir
I don't think this issue needs to cause a respin: it's in a contrib
module, there is a workaround, and the fix is either some classpath
problem or, in case it is a bug, likely involves dangerous territory
(e.g. SolrResourceLoader)...


On Fri, Jun 20, 2014 at 10:03 AM, Ahmet Arslan
iori...@yahoo.com.invalid wrote:
 Hi,

 here is what I do

 * download solr-4.9.0.tgz
 * add icu4j-53.1.jar and solr-analysis-extras-4.9.0.jar and 
 lucene-analyzers-icu-4.9.0.jar to solr-4.9.0/example/solr/collection1/lib/

 * confirm they are loaded

 INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/icu4j-53.1.jar'
  to class loader

 INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/lucene-analyzers-icu-4.9.0.jar'
  to classloader

 INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/solr-analysis-extras-4.9.0.jar'
  to class loader

 icu4j-53.1.jar loaded twice

 INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
 'file:/Volumes/datadisk/Desktop/solr-4.9.0/contrib/extraction/lib/icu4j-53.1.jar'
  to classloader


 * add <filter class="solr.ICUFoldingFilterFactory"/> to the example schema.xml

 * java -jar start.jar yields the exception reported in SOLR-6188


 When <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/> 
 is used, everything works fine.

 Thanks,
 Ahmet



 On Friday, June 20, 2014 3:55 PM, Michael McCandless 
 luc...@mikemccandless.com wrote:
 +1

 SUCCESS! [0:47:26.115239]

 Mike McCandless

 http://blog.mikemccandless.com


 On Fri, Jun 20, 2014 at 8:13 AM, Robert Muir rcm...@gmail.com wrote:
 Artifacts here:
 http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r1604085/

 Here's my +1

 SUCCESS! [0:35:36.654925]

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files ( 1MB) to Zookeeper

2014-06-20 Thread Nicole Lacoste (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038834#comment-14038834
 ] 

Nicole Lacoste commented on SOLR-4793:
--

Elaine,
We got stuck at the same thing, and if I remember right we put the synonyms 
files in a matching folder on each of the machines and put the full path in the 
schema. It means that if there is an update you have to take care of it 
yourself. I am not sure if the REST API for synonyms works in this case; you'd 
have to test that.
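
For illustration, a minimal sketch of that workaround in schema.xml (the 
absolute path is hypothetical, and note from the later comments that 
ZkSolrResourceLoader may still intercept such paths in some setups):

{code:xml}
<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- absolute path that exists on every node's local disk, so the
         file never has to be uploaded to ZooKeeper -->
    <filter class="solr.SynonymFilterFactory"
            synonyms="/apps/solr/collections/synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
{code}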


 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Elaine Cario (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038835#comment-14038835
 ] 

Elaine Cario commented on SOLR-4793:


I got this exception in the logs; it always happens on the lemmatization 
files, which run anywhere from 2MB to 20MB in size:

ERROR - 2014-06-20 09:29:59.082; org.apache.solr.core.ZkContainer;
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = 
ConnectionLoss for /configs/dictindex_uk/en_lemmatization.txt
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
at 
org.apache.solr.common.cloud.SolrZkClient$10.execute(SolrZkClient.java:432)
at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:73)
at 
org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:429)
at 
org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:339)
at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:1318)
at 
org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:1355)
at 
org.apache.solr.cloud.ZkController.bootstrapConf(ZkController.java:1565)
at org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:188)
at org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:67)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:216)
at 
org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:189)
at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:136)
at 
org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:279)
at 
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:260)
at 
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:105)
at 
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4809)
at 
org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5485)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at 
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
at 
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:632)
at 
org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:1073)
at 
org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1857)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
ERROR - 2014-06-20 09:29:59.087; org.apache.solr.servlet.SolrDispatchFilter; 
Could not start Solr. Check solr/home property and the logs

We were working around it by splitting the files up into < 1MB chunks, but that 
sometimes worked and sometimes didn't (we'd still get errors), and for the 
larger files it's, well, a pain to list > 20 files in the schema.  We tried 
moving the files, but then it complains that relative paths aren't allowed in 
the specification.
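
For reference, a sketch of the split-file variant, assuming the files feed a 
SynonymFilterFactory: its synonyms attribute accepts a comma-separated list of 
files, so the chunks (hypothetical names below) can be declared in one filter:

{code:xml}
<filter class="solr.SynonymFilterFactory"
        synonyms="en_lemma_001.txt,en_lemma_002.txt,en_lemma_003.txt"
        ignoreCase="true" expand="true"/>
{code}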

This is a fresh install of 4.8 (technically it was an upgrade, but we removed 
the index and re-fed all our content).  But we didn't upgrade zookeeper, so I 
will try that and report back.


 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: 

[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Nicole Lacoste (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038842#comment-14038842
 ] 

Nicole Lacoste commented on SOLR-4793:
--

I also had trouble with relative paths, try full paths.  

 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Elaine Cario (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038845#comment-14038845
 ] 

Elaine Cario commented on SOLR-4793:


Scratch that, we're already at zookeeper 3.4.6.

Thanks Nicole - that was going to be my very next step.  Updates aren't an 
issue, the files are under version control and our deployment scripts can take 
care of moving them wherever we need to make it work :-).

Will report back...

 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5661) PriorityQueue has OOM (Requested array size exceeds VM limit) issue

2014-06-20 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-5661.
-

   Resolution: Fixed
Fix Version/s: (was: 4.9)
   4.7

This was resolved in 4.7

 PriorityQueue has OOM (Requested array size exceeds VM limit) issue
 ---

 Key: SOLR-5661
 URL: https://issues.apache.org/jira/browse/SOLR-5661
 Project: Solr
  Issue Type: Bug
  Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 4.3.1, 4.4, 4.5, 4.5.1, 4.6
 Environment: JDK 7 
Reporter: Raintung Li
Assignee: Michael McCandless
 Fix For: 5.0, 4.7

 Attachments: patch-5661.txt


 It looks like JDK7 changed the max_array_length logic: it isn't 
 max_jint anymore, it is now < max_jint - header_size(type).
 If you pass Integer.MAX_VALUE to create the PriorityQueue and have 
 enough memory, it is ok in JVM6 but does not work in JVM7.
  
 JVM7 will throw an OOM error while doing array range checking.
 It appears to be a compatibility issue between JVM6 and JVM7.
 Maybe we need to protect against this in the code logic; throwing OOM looks 
 like a big issue for customers.
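
For context, a sketch of the guard the JDK's own collections adopted for this; 
the constant below mirrors e.g. java.util.ArrayList and is not Solr code:

{code:java}
// JVM7 reserves a few header words per array, so the practical limit sits
// slightly below Integer.MAX_VALUE; asking for more throws the
// "Requested array size exceeds VM limit" OutOfMemoryError.
static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

static int safeArraySize(int requested) {
  // clamp instead of letting the VM throw the OOM described above
  return Math.min(requested, MAX_ARRAY_SIZE);
}
{code}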



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Elaine Cario (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038868#comment-14038868
 ] 

Elaine Cario commented on SOLR-4793:


Absolute paths didn't work either - it seems to turn it into a relative path, 
and still uses the ZkSolrResourceLoader to find it.  I may have to either split 
the files, or dig into the FSTSynonymFilterFactory to figure out some other 
workaround.

ERROR - 2014-06-20 10:45:13.522; org.apache.solr.core.CoreContainer; Unable to 
create core: dictindex_uk
org.apache.solr.common.SolrException: Could not load core configuration for 
core dictindex_uk
at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:66)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:554)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:261)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:253)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.RuntimeException: java.io.IOException: Error opening 
/configs/dictindex_uk//apps/solr/collections/en_lemmatization.txt
at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:169)
at 
org.apache.solr.schema.IndexSchemaFactory.create(IndexSchemaFactory.java:55)
at 
org.apache.solr.schema.IndexSchemaFactory.buildIndexSchema(IndexSchemaFactory.java:69)
at 
org.apache.solr.core.ConfigSetService.createIndexSchema(ConfigSetService.java:89)
at 
org.apache.solr.core.ConfigSetService.getConfig(ConfigSetService.java:62)
... 9 more
Caused by: java.io.IOException: Error opening 
/configs/dictindex_uk//apps/solr/collections/en_lemmatization.txt
at 
org.apache.solr.cloud.ZkSolrResourceLoader.openResource(ZkSolrResourceLoader.java:83)
at 
org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory.loadSynonyms(FSTSynonymFilterFactory.java:137)
at 
org.apache.lucene.analysis.synonym.FSTSynonymFilterFactory.inform(FSTSynonymFilterFactory.java:112)
at 
org.apache.lucene.analysis.synonym.SynonymFilterFactory.inform(SynonymFilterFactory.java:90)
at 
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:675)
at org.apache.solr.schema.IndexSchema.init(IndexSchema.java:167)
... 13 more


 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038874#comment-14038874
 ] 

Yago Riveiro commented on SOLR-4793:


Elaine, can you paste the tomcat and zookeeper configuration that you have 
for jute.maxbuffer?

 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Elaine Cario (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038889#comment-14038889
 ] 

Elaine Cario commented on SOLR-4793:


Here is the SOLR_OPTS for Tomcat:

SOLR_OPTS="-Dhost.port=8080 -Dhost.context=solr -Dsolr.install.dir=${SOLR_DIST} 
-Dsolr.solr.home=${SOLR_HOME} -Dsolr.base.data.dir=$SOLR_DATA 
-Dbootstrap_conf=true -Dsolr.log=${SOLR_LOGDIR} 
-Dcollection.configName=wkcontent -DnumShards=2 -DzkHost=10.208.152.230:2181 
-Djute.maxbuffer=5000 
-Dlog4j.configuration=file:///apps/solr/collections/log4j.properties"

Here is zoo.cfg

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/apps/zookeeper/zk_datadir
dataLogDir=/apps/zookeeper/zk_logs
clientPort=2181
server.1=zookeep01:2881:3881
jute.maxbuffer=5000


 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038898#comment-14038898
 ] 

Yago Riveiro commented on SOLR-4793:


About tomcat's configuration, I have the same one.

In the case of Zookeeper, I have all custom configuration in a file named 
zookeeper-env.sh located in the bin/conf folder, with this content:

{code}
#!/usr/bin/env bash

ZOO_ENV="-Djute.maxbuffer=5000"
{code}

 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Elaine Cario (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038913#comment-14038913
 ] 

Elaine Cario commented on SOLR-4793:


Yago,  I was going to ask how that gets loaded, but I just found an old post of 
yours with more info: let me try that (reposting it here):

 In my case I have:

 - Zookeeper: a file into conf folder named zookeeper-env.sh with:

 #!/usr/bin/env bash

 ZOO_ENV="-Djute.maxbuffer=10485761"

 The zookeeper-env.sh is loaded automatically by the zkEnv.sh if the file
 exists.


 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038918#comment-14038918
 ] 

Yago Riveiro commented on SOLR-4793:


Indeed, after diving into the zkEnv file, I realised that if zookeeper-env.sh 
exists, zookeeper appends the configuration to the init command.

 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Elaine Cario (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038948#comment-14038948
 ] 

Elaine Cario commented on SOLR-4793:


Yago - we have partial success!  I added the file as you did, but for some 
reason the parameter was still not being added to the zookeeper startup 
(checked the PID using ps -ww -fp <PID>, and didn't see it).

So, for the moment I modified the zkServer.sh to just add the param directly to 
the java startup, and that seemed to work - everything started up with the big 
files.  Now I just need to figure out what's going wrong with the scripting, 
but that's just some investigation on my part.  I'm sure I did something wrong 
there.
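
For anyone repeating the same hack: the change amounts to putting the flag on 
the java invocation inside zkServer.sh, roughly as below. The exact start line 
varies by ZooKeeper version, so treat this as a sketch, and the value is only 
an example:

{code}
# in zkServer.sh, the start) branch -- add -Djute.maxbuffer to the JVM args:
java "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" \
     "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
     -Djute.maxbuffer=10485761 \
     -cp "$CLASSPATH" $JVMFLAGS $ZOOMAIN "$ZOOCFG" &
{code}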

Thanks so much for your help!!  This has been bugging me for weeks.



 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5771) Review semantics of SpatialOperation predicates

2014-06-20 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-5771:
-

Attachment: LUCENE-5771_SpatialOperation_semantics.patch

The attached patch removes the "needs area" and "score is meaningful" notions 
from SpatialOperation.

I also added aliases to the various predicates to align with the standard 
names.  Ryan, why did you choose the non-standard names?  E.g. why did you 
choose "IsEqualTo" when "Equals" is the standard name?  And why the 
"BBoxIntersects" and "BBoxWithin" predicates, which I'm not aware we use, which 
are also non-standard, and which seem better addressed in other ways (e.g. some 
sort of function one adds that bbox'es a shape)?
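
To make the corrected semantics concrete, here is a sketch (mine, not code 
from the attached patch) of an Overlaps check defined purely by the shapes' 
relationship, with no area requirement; Shape and SpatialRelation are 
Spatial4J types:

{code:java}
import com.spatial4j.core.shape.Shape;
import com.spatial4j.core.shape.SpatialRelation;

// Hypothetical sketch: true only when the shapes intersect without either
// containing the other, i.e. relate() returns INTERSECTS rather than
// WITHIN, CONTAINS, or DISJOINT.
static boolean overlaps(Shape indexedShape, Shape queryShape) {
  return indexedShape.relate(queryShape) == SpatialRelation.INTERSECTS;
}
{code}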

 Review semantics of SpatialOperation predicates
 ---

 Key: LUCENE-5771
 URL: https://issues.apache.org/jira/browse/LUCENE-5771
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
 Attachments: LUCENE-5771_SpatialOperation_semantics.patch, 
 LUCENE-5771_SpatialOperation_semantics.patch


 SpatialOperation (which I wish was named SpatialPredicate) is a bunch of 
 predicates -- methods that return true/false based on a pair of shapes.  Some 
 of them don't seem to be defined in a way consistent with their definitions 
 on ESRI's site:
 http://edndoc.esri.com/arcsde/9.1/general_topics/understand_spatial_relations.htm
   (which is linked as a reference, and is in turn equivalent to OGC spec 
 definitions, I believe).
 Problems:
 * the definitions make no mention of needing to have area or not, yet some of 
 our predicates are defined as to require area on either the indexed or query 
 shape.
 * the definitions make a distinction of the boundary of a shape, yet in 
 Lucene-spatial, there is none.  That suggests our predicates are wrongly 
 chosen since there *are* official predicates that are boundary-neutral -- 
 namely Covers and CoveredBy in lieu of Contains and Within, respectively. 
  If we don't rename our predicates, we should at least support the correct 
 predicates names!
 * Overlaps appears totally wrong. It should be defined as 
 indexedShape.relate(queryShape) == Intersects  (and thus not Within or 
 Contains or Disjoint).  It's presently defined as the same as Intersects plus 
 the query shape needing area.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [VOTE] 4.9.0

2014-06-20 Thread Shawn Heisey
On 6/20/2014 8:03 AM, Ahmet Arslan wrote:
 * download solr-4.9.0.tgz
 * add icu4j-53.1.jar and solr-analysis-extras-4.9.0.jar and 
 lucene-analyzers-icu-4.9.0.jar to solr-4.9.0/example/solr/collection1/lib/

 * confirm they are loaded

 INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/icu4j-53.1.jar'
  to class loader

 INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/lucene-analyzers-icu-4.9.0.jar'
  to classloader

 INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
 'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/lib/solr-analysis-extras-4.9.0.jar'
  to class loader

 icu4j-53.1.jar loaded twice 

 INFO  org.apache.solr.core.SolrResourceLoader  – Adding 
 'file:/Volumes/datadisk/Desktop/solr-4.9.0/contrib/extraction/lib/icu4j-53.1.jar'
  to classloader


 * add <filter class="solr.ICUFoldingFilterFactory"/> to example schema.xml

 * java -jar start.jar yields the exception reported in SOLR-6188


 When <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/> 
 is used everything works fine.

One issue that can cause jars to get loaded twice is this:

https://issues.apache.org/jira/browse/SOLR-4852

This doesn't look like that specific issue, though.  I'm wondering why
you copied the jars, since the example config already has a lib line
that loads them directly from the contrib directory.

I think it's getting loaded twice because you copied it into the
core-specific lib directory and there's probably another lib config
element that loads everything in that directory, in addition to the one
that loads it directly from contrib.  My experience with SOLR-4852
combined with this problem report makes me think that it's the fact that
it gets loaded twice that causes it to fail.
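
For illustration, the two classpath sources involved look roughly like this
(the lib directive is representative of the shipped example solrconfig.xml,
not copied from it):

<!-- example solrconfig.xml: pulls in everything under contrib/extraction/lib,
     which is where the contrib copy of icu4j-53.1.jar in the log comes from -->
<lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />

Jars dropped into the core's own lib directory (collection1/lib in the steps
above) are added by SolrResourceLoader automatically, with no lib element
needed, so the copied icu4j jar lands on the classpath a second time.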

Using the fully-qualified class name is not something that I would have
thought of as a workaround to SOLR-4852.

Thanks,
Shawn


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: [VOTE] 4.9.0

2014-06-20 Thread Uwe Schindler
I agree, it's not a problem. The message of SolrResourceCorrumpter is a bit 
misleading: it does not load the JAR, it just adds it to the classpath twice, 
which is not a problem at all.

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Robert Muir [mailto:rcm...@gmail.com]
 Sent: Friday, June 20, 2014 4:10 PM
 To: dev@lucene.apache.org; Ahmet Arslan
 Subject: Re: [VOTE] 4.9.0
 
 I don't think this issue needs to cause a respin: it's in a contrib module,
 there is a workaround, and the fix is either some classpath problem or, in
 case it is a bug, likely involves dangerous territory (e.g. SolrResourceLoader)...
 
 
 On Fri, Jun 20, 2014 at 10:03 AM, Ahmet Arslan iori...@yahoo.com.invalid
 wrote:
  Hi,
 
  here is what I do
 
  * download solr-4.9.0.tgz
  * add icu4j-53.1.jar and solr-analysis-extras-4.9.0.jar and
  lucene-analyzers-icu-4.9.0.jar to
  solr-4.9.0/example/solr/collection1/lib/
 
  * confirm they are loaded
 
  INFO  org.apache.solr.core.SolrResourceLoader  – Adding
  'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/li
  b/icu4j-53.1.jar' to class loader
 
  INFO  org.apache.solr.core.SolrResourceLoader  – Adding
  'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/li
  b/lucene-analyzers-icu-4.9.0.jar' to classloader
 
  INFO  org.apache.solr.core.SolrResourceLoader  – Adding
  'file:/Volumes/datadisk/Desktop/solr-4.9.0/example/solr/collection1/li
  b/solr-analysis-extras-4.9.0.jar' to class loader
 
  icu4j-53.1.jar loaded twice
 
  INFO  org.apache.solr.core.SolrResourceLoader  – Adding
  'file:/Volumes/datadisk/Desktop/solr-4.9.0/contrib/extraction/lib/icu4
  j-53.1.jar' to classloader
 
 
  * add <filter class="solr.ICUFoldingFilterFactory"/> to example
  schema.xml
 
  * java -jar start.jar yields the exception reported in SOLR-6188
 
 
  When <filter
 class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/> is used
 everything works fine.
 
  Thanks,
  Ahmet
 
 
 
  On Friday, June 20, 2014 3:55 PM, Michael McCandless
 luc...@mikemccandless.com wrote:
  +1
 
  SUCCESS! [0:47:26.115239]
 
  Mike McCandless
 
  http://blog.mikemccandless.com
 
 
  On Fri, Jun 20, 2014 at 8:13 AM, Robert Muir rcm...@gmail.com wrote:
  Artifacts here:
 
 http://people.apache.org/~rmuir/staging_area/lucene_solr_4_9_0_r16040
  85/
 
  Here's my +1
 
  SUCCESS! [0:35:36.654925]
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
  additional commands, e-mail: dev-h...@lucene.apache.org
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
 commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14038970#comment-14038970
 ] 

Yago Riveiro commented on SOLR-4793:


Elaine, now it is easier to debug, since you know where the problem is :).

Note: I'm using version 3.4.5 of zookeeper; I don't know if zkServer.sh has 
changed.

 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5468) Option to notify client when desired replication factor not achieved for an update request.

2014-06-20 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter updated SOLR-5468:
-

Summary: Option to notify client when desired replication factor not 
achieved for an update request.  (was: Option to enforce a majority quorum 
approach to accepting updates in SolrCloud)

 Option to notify client when desired replication factor not achieved for an 
 update request.
 ---

 Key: SOLR-5468
 URL: https://issues.apache.org/jira/browse/SOLR-5468
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Affects Versions: 4.5
 Environment: All
Reporter: Timothy Potter
Assignee: Timothy Potter
Priority: Minor
 Attachments: SOLR-5468.patch, SOLR-5468.patch, SOLR-5468.patch


 I've been thinking about how SolrCloud deals with write-availability using 
 in-sync replica sets, in which writes will continue to be accepted so long as 
 there is at least one healthy node per shard.
 For a little background (and to verify my understanding of the process is 
 correct), SolrCloud only considers active/healthy replicas when acknowledging 
 a write. Specifically, when a shard leader accepts an update request, it 
 forwards the request to all active/healthy replicas and only considers the 
 write successful if all active/healthy replicas ack the write. Any down / 
 gone replicas are not considered and will sync up with the leader when they 
 come back online using peer sync or snapshot replication. For instance, if a 
 shard has 3 nodes, A, B, C with A being the current leader, then writes to 
 the shard will continue to succeed even if B & C are down.
 The issue is that if a shard leader continues to accept updates even if it 
 loses all of its replicas, then we have acknowledged updates on only 1 node. 
 If that node, call it A, then fails and one of the previous replicas, call it 
 B, comes back online before A does, then any writes that A accepted while the 
 other replicas were offline are at risk to being lost. 
 SolrCloud does provide a safe-guard mechanism for this problem with the 
 leaderVoteWait setting, which puts any replicas that come back online before 
 node A into a temporary wait state. If A comes back online within the wait 
 period, then all is well as it will become the leader again and no writes 
 will be lost. As a side note, sys admins definitely need to be made more 
 aware of this situation as when I first encountered it in my cluster, I had 
 no idea what it meant.
 My question is whether we want to consider an approach where SolrCloud will 
 not accept writes unless there is a majority of replicas available to accept 
 the write? For my example, under this approach, we wouldn't accept writes if 
 both B & C failed, but would if only C did, leaving A & B online. Admittedly, 
 this lowers the write-availability of the system, so may be something that 
 should be tunable?
 From Mark M: Yeah, this is kind of like one of many little features that we 
 have just not gotten to yet. I’ve always planned for a param that lets you 
 say how many replicas an update must be verified on before responding 
 success. Seems to make sense to fail that type of request early if you notice 
 there are not enough replicas up to satisfy the param to begin with.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 2012 - Still Failing

2014-06-20 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java7/2012/

All tests passed

Build Log:
[...truncated 29626 lines...]
check-licenses:
 [echo] License check under: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr
 [licenses] MISSING sha1 checksum file for: 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/example/lib/ext/log4j-1.2.16.jar
 [licenses] EXPECTED sha1 checksum file : 
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/licenses/log4j-1.2.16.jar.sha1

[...truncated 1 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:467:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/build.xml:70:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/solr/build.xml:254:
 The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Tests-4.x-Java7/lucene/tools/custom-tasks.xml:62:
 License check failed. Check the logs.

Total time: 122 minutes 50 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Sending artifact delta relative to Lucene-Solr-Tests-4.x-Java7 #2008
Archived 1 artifacts
Archive block size is 32768
Received 0 blocks and 464 bytes
Compression is 0.0%
Took 35 ms
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (SOLR-5468) Option to notify client when desired replication factor not achieved for an update request.

2014-06-20 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter resolved SOLR-5468.
--

   Resolution: Fixed
Fix Version/s: 4.9

 Option to notify client when desired replication factor not achieved for an 
 update request.
 ---

 Key: SOLR-5468
 URL: https://issues.apache.org/jira/browse/SOLR-5468
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Affects Versions: 4.5
 Environment: All
Reporter: Timothy Potter
Assignee: Timothy Potter
Priority: Minor
 Fix For: 4.9

 Attachments: SOLR-5468.patch, SOLR-5468.patch, SOLR-5468.patch


 I've been thinking about how SolrCloud deals with write-availability using 
 in-sync replica sets, in which writes will continue to be accepted so long as 
 there is at least one healthy node per shard.
 For a little background (and to verify my understanding of the process is 
 correct), SolrCloud only considers active/healthy replicas when acknowledging 
 a write. Specifically, when a shard leader accepts an update request, it 
 forwards the request to all active/healthy replicas and only considers the 
 write successful if all active/healthy replicas ack the write. Any down / 
 gone replicas are not considered and will sync up with the leader when they 
 come back online using peer sync or snapshot replication. For instance, if a 
 shard has 3 nodes, A, B, C with A being the current leader, then writes to 
 the shard will continue to succeed even if B & C are down.
 The issue is that if a shard leader continues to accept updates even if it 
 loses all of its replicas, then we have acknowledged updates on only 1 node. 
 If that node, call it A, then fails and one of the previous replicas, call it 
 B, comes back online before A does, then any writes that A accepted while the 
 other replicas were offline are at risk to being lost. 
 SolrCloud does provide a safe-guard mechanism for this problem with the 
 leaderVoteWait setting, which puts any replicas that come back online before 
 node A into a temporary wait state. If A comes back online within the wait 
 period, then all is well as it will become the leader again and no writes 
 will be lost. As a side note, sys admins definitely need to be made more 
 aware of this situation as when I first encountered it in my cluster, I had 
 no idea what it meant.
 My question is whether we want to consider an approach where SolrCloud will 
 not accept writes unless there is a majority of replicas available to accept 
 the write? For my example, under this approach, we wouldn't accept writes if 
 both B & C failed, but would if only C did, leaving A & B online. Admittedly, 
 this lowers the write-availability of the system, so may be something that 
 should be tunable?
 From Mark M: Yeah, this is kind of like one of many little features that we 
 have just not gotten to yet. I’ve always planned for a param that lets you 
 say how many replicas an update must be verified on before responding 
 success. Seems to make sense to fail that type of request early if you notice 
 there are not enough replicas up to satisfy the param to begin with.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5468) Option to notify client when desired replication factor not achieved for an update request.

2014-06-20 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039012#comment-14039012
 ] 

Timothy Potter commented on SOLR-5468:
--

I've renamed this ticket as the previous name was misleading as to what the 
actual solution is doing. If you're interested in this subject, I encourage you 
to read my comments above. Basically, Solr's replication mechanism isn't well 
suited for enforcing a majority quorum before accepting a write. In a nutshell, 
if you can't rollback a write that succeeds on some nodes if the majority isn't 
reached, then you can't enforce it ;-) The solution in this ticket is very 
basic. It allows a client to request that Solr return the achieved replication 
factor for a given update request (single or batch). All this allows is for the 
client application to take some additional measures, such as resending the 
updates, if so desired. Put simply, this moves the problem out to the client. 
It's useful for environments that have RF=3 and if 2 of the replicas are down, 
then there is a risk of losing writes if the leader fails and doesn't come 
back. This gives the client application the ability to know which writes were 
accepted by a degraded shard and perhaps re-try them in the future if needed. 
That's it!
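
To show what moving the problem out to the client looks like, a hedged SolrJ 
sketch follows; the min_rf request parameter and the rf entry in the response 
header reflect my reading of the patch, and the other names are illustrative:

{code:java}
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;

public class MinRfExample {
  public static void main(String[] args) throws Exception {
    CloudSolrServer server = new CloudSolrServer("localhost:2181");
    server.setDefaultCollection("collection1");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "1");

    UpdateRequest req = new UpdateRequest();
    req.setParam("min_rf", "2"); // ask Solr to report the achieved factor
    req.add(doc);
    UpdateResponse rsp = req.process(server);

    // If fewer than 2 replicas acknowledged the write, the application can
    // hold on to the batch and re-send it later, as described above.
    Object rf = rsp.getResponseHeader().get("rf");
    System.out.println("achieved replication factor: " + rf);
    server.shutdown();
  }
}
{code}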

 Option to notify client when desired replication factor not achieved for an 
 update request.
 ---

 Key: SOLR-5468
 URL: https://issues.apache.org/jira/browse/SOLR-5468
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Affects Versions: 4.5
 Environment: All
Reporter: Timothy Potter
Assignee: Timothy Potter
Priority: Minor
 Fix For: 4.9

 Attachments: SOLR-5468.patch, SOLR-5468.patch, SOLR-5468.patch


 I've been thinking about how SolrCloud deals with write-availability using 
 in-sync replica sets, in which writes will continue to be accepted so long as 
 there is at least one healthy node per shard.
 For a little background (and to verify my understanding of the process is 
 correct), SolrCloud only considers active/healthy replicas when acknowledging 
 a write. Specifically, when a shard leader accepts an update request, it 
 forwards the request to all active/healthy replicas and only considers the 
 write successful if all active/healthy replicas ack the write. Any down / 
 gone replicas are not considered and will sync up with the leader when they 
 come back online using peer sync or snapshot replication. For instance, if a 
 shard has 3 nodes, A, B, C with A being the current leader, then writes to 
 the shard will continue to succeed even if B & C are down.
 The issue is that if a shard leader continues to accept updates even if it 
 loses all of its replicas, then we have acknowledged updates on only 1 node. 
 If that node, call it A, then fails and one of the previous replicas, call it 
 B, comes back online before A does, then any writes that A accepted while the 
 other replicas were offline are at risk to being lost. 
 SolrCloud does provide a safe-guard mechanism for this problem with the 
 leaderVoteWait setting, which puts any replicas that come back online before 
 node A into a temporary wait state. If A comes back online within the wait 
 period, then all is well as it will become the leader again and no writes 
 will be lost. As a side note, sys admins definitely need to be made more 
 aware of this situation as when I first encountered it in my cluster, I had 
 no idea what it meant.
 My question is whether we want to consider an approach where SolrCloud will 
 not accept writes unless there is a majority of replicas available to accept 
 the write? For my example, under this approach, we wouldn't accept writes if 
 both B & C failed, but would if only C did, leaving A & B online. Admittedly, 
 this lowers the write-availability of the system, so may be something that 
 should be tunable?
 From Mark M: Yeah, this is kind of like one of many little features that we 
 have just not gotten to yet. I’ve always planned for a param that lets you 
 say how many replicas an update must be verified on before responding 
 success. Seems to make sense to fail that type of request early if you notice 
 there are not enough replicas up to satisfy the param to begin with.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5714) Improve tests for BBoxStrategy then port to 4x.

2014-06-20 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039061#comment-14039061
 ] 

David Smiley commented on LUCENE-5714:
--

Another change to the API: I think it's not necessary to have a BBoxSimilarity 
interface.  DistanceSimilarity can be tossed, and so could 
BBoxSimilarityValueSource.  Instead, AreaSimilarity can be ShapeAreaValueSource 
that takes a ValueSource producing shapes from its objectVal(doc).  This 
is in the same vein as DistanceToShapeValueSource, and it underscores the 
pluggability of, say, SerializedDVStrategy with ValueSources.  It's 
plausible it will be faster to decode 4 numbers from a contiguous byte array 
than to retrieve a number 4 times via DocValues.  And the code shouldn't 
have to change accordingly -- it's plug and play.

Continuing this (definitely a separate JIRA issue), looking at the TODOs: these 
two methods move to SpatialStrategy:
{code:java}

  /**
   * Provides access to each rectangle per document as a ValueSource in which
   * {@link org.apache.lucene.queries.function.FunctionValues#objectVal(int)} 
returns a {@link
   * Shape}.
   */ //TODO raise to SpatialStrategy
  public ValueSource makeShapeValueSource() {
return new BBoxValueSource(this);
  }

  @Override
  public ValueSource makeDistanceValueSource(Point queryPoint, double 
multiplier) {
//TODO if makeShapeValueSource gets lifted to the top; this could become a 
generic impl.
return new DistanceToShapeValueSource(makeShapeValueSource(), queryPoint, 
multiplier, ctx);
  }
{code}
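
A hypothetical caller's view of that generic arrangement, assuming the lift to 
SpatialStrategy happens (strategy and queryPoint are placeholders; getSortField 
is the stock ValueSource method):

{code:java}
// Sketch only: works against any strategy once makeShapeValueSource()
// is on SpatialStrategy (BBoxStrategy, SerializedDVStrategy, ...).
ValueSource shapes = strategy.makeShapeValueSource();      // objectVal(doc) -> Shape
ValueSource dist = strategy.makeDistanceValueSource(queryPoint, 1.0);
Sort byDistance = new Sort(dist.getSortField(false));      // nearest first
{code}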

 Improve tests for BBoxStrategy then port to 4x.
 ---

 Key: LUCENE-5714
 URL: https://issues.apache.org/jira/browse/LUCENE-5714
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/spatial
Reporter: David Smiley
Assignee: David Smiley
 Fix For: 4.10

 Attachments: 
 LUCENE-5714__Enhance_BBoxStrategy__more_tests,_fix_dateline_bugs,_new_AreaSimilarity_algor.patch


 BBoxStrategy needs better tests before I'm comfortable seeing it in 4x.  
 Specifically it should use random rectangles based validation (ones that may 
 cross the dateline), akin to the other tests.  And I think I see an 
 equals/hashcode bug to be fixed in there too.
 One particular thing I'd like to see added is how to handle a zero-area case 
 for AreaSimilarity.  I think an additional feature in which you declare a 
 minimum % area (relative to the query shape) would be good.
 It should be possible for the user to combine rectangle center-point to query 
 shape center-point distance sorting as well.  I think it is but I need to 
 make sure it's possible without _having_ to index a separate center point 
 field.
 Another possibility (probably not to be addressed here) is a minimum ratio 
 between width/height, perhaps 10%.  A long but nearly no height line should 
 not be massively disadvantaged relevancy-wise to an equivalently long 
 diagonal road that has a square bbox.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6157) ReplicationFactorTest hangs

2014-06-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039104#comment-14039104
 ] 

ASF subversion and git services commented on SOLR-6157:
---

Commit 1604223 from [~thelabdude] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1604223 ]

SOLR-6157: Fix hanging unit test.

 ReplicationFactorTest hangs
 ---

 Key: SOLR-6157
 URL: https://issues.apache.org/jira/browse/SOLR-6157
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Reporter: Uwe Schindler
Assignee: Timothy Potter

 See: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10517/
 You can download all logs from there.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-6157) ReplicationFactorTest hangs

2014-06-20 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter resolved SOLR-6157.
--

   Resolution: Fixed
Fix Version/s: 4.10

Haven't seen a test hang since committing the previous changes, so marking this 
one resolved for now.

 ReplicationFactorTest hangs
 ---

 Key: SOLR-6157
 URL: https://issues.apache.org/jira/browse/SOLR-6157
 Project: Solr
  Issue Type: Bug
  Components: replication (java)
Reporter: Uwe Schindler
Assignee: Timothy Potter
 Fix For: 4.10


 See: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10517/
 You can download all logs from there.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Elaine Cario (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039111#comment-14039111
 ] 

Elaine Cario commented on SOLR-4793:


Not sure either, but I couldn't find where ZOO_ENV was actually referenced in 
any of the zookeeper scripts, so I changed the zookeeper-env.sh to this:

JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=5000"

And then it all came together.
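
Summarizing the working setup as a sketch (the value is an example; 
jute.maxbuffer is read as a JVM system property, which would also explain why 
a plain zoo.cfg entry alone wasn't enough, and the same -Djute.maxbuffer has 
to reach every client JVM, e.g. via SOLR_OPTS as shown earlier):

{code}
#!/usr/bin/env bash
# conf/zookeeper-env.sh -- sourced by zkEnv.sh when the file exists
JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=10485761"
{code}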

 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper set znode size limit to 1MB by default. So we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolresourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-6182) Using rest manager endpoint to create a new (yet unused) managed resource results in corrupt file, solr can't restart

2014-06-20 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter resolved SOLR-6182.
--

   Resolution: Fixed
Fix Version/s: 4.9

 Using rest manager endpoint to create a new (yet unused) managed resource 
 results in corrupt file, solr can't restart
 -

 Key: SOLR-6182
 URL: https://issues.apache.org/jira/browse/SOLR-6182
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Timothy Potter
 Fix For: 4.9


 The ref guide documents that...
 bq. You can also create new managed resource using PUT/POST to the 
 appropriate URL – before ever configuring anything that uses these resources.
 ...however attempting to do this results in the creation of 
 _rest_managed.json file that is not in the format expected by solr, and 
 prevents solr from starting up.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5956) SnapShooter is using getRawInstanceDir, which is sometimes not a valid directory

2014-06-20 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter resolved SOLR-5956.
--

   Resolution: Fixed
Fix Version/s: 4.9

 SnapShooter is using getRawInstanceDir, which is sometimes not a valid 
 directory
 

 Key: SOLR-5956
 URL: https://issues.apache.org/jira/browse/SOLR-5956
 Project: Solr
  Issue Type: Bug
  Components: replication (java), SolrCloud
 Environment: SolrCloud
Reporter: Timothy Potter
Assignee: Timothy Potter
 Fix For: 4.9

 Attachments: SOLR-5956.patch


 Ran into an issue where the getRawInstanceDir method on CoreDescriptor 
 returns an invalid directory on the server. Need to track down where this bad 
 value comes from and fix it. I suspect this has to do with me using symlinks
 e.g.
 I have server with solr.solr.home set to
 /home/ec2-user/lucene-solr/solr/cloud87/solr, which in reality is:
 /vol0/cloud87/solr as /home/ec2-user/lucene-solr/solr/cloud87 is a symlink to 
 /vol0/cloud87
 getRawInstanceDir was returning /vol0/cloud87/demo_shard1_replica1 which is 
 missing the /solr part of the directory path, should be:
 /vol0/cloud87/solr/demo_shard1_replica1



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-6106) Sometimes all the cores on a SolrCloud node cannot find their config when intializing the ManagedResourceStorage storageIO impl

2014-06-20 Thread Timothy Potter (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-6106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Potter resolved SOLR-6106.
--

Resolution: Won't Fix

This only occurs if a full GC pause causes the ZK session to expire during server 
initialization (when there are many cores). There's not much you can do about 
this other than restart (with more memory or better GC settings).

 Sometimes all the cores on a SolrCloud node cannot find their config when 
 initializing the ManagedResourceStorage storageIO impl
 ---

 Key: SOLR-6106
 URL: https://issues.apache.org/jira/browse/SOLR-6106
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Reporter: Timothy Potter
Assignee: Timothy Potter
Priority: Minor
 Attachments: SOLR-6106_prelim.patch


 One of my many nodes had problems initializing all its cores due to the 
 following error. It was resolved by restarting the node (hence the Minor 
 classification).
 2014-05-21 20:39:17,898 [coreLoadExecutor-4-thread-27] ERROR 
 solr.core.CoreContainer  - Unable to create core: small46_shard1_replica1
 org.apache.solr.common.SolrException: Could not find config name for 
 collection:small46
   at org.apache.solr.core.SolrCore.init(SolrCore.java:858)
   at org.apache.solr.core.SolrCore.init(SolrCore.java:641)
   at org.apache.solr.core.CoreContainer.create(CoreContainer.java:556)
   at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:261)
   at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:253)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
 Caused by: org.apache.solr.common.SolrException: Could not find config name 
 for collection:small46
   at 
 org.apache.solr.rest.ManagedResourceStorage.newStorageIO(ManagedResourceStorage.java:99)
   at org.apache.solr.core.SolrCore.initRestManager(SolrCore.java:2339)
   at org.apache.solr.core.SolrCore.init(SolrCore.java:845)
   ... 10 more



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4793) Solr Cloud can't upload large config files (> 1MB) to Zookeeper

2014-06-20 Thread Yago Riveiro (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039198#comment-14039198
 ] 

Yago Riveiro commented on SOLR-4793:


it's probably because I tweaked the zkServer file a bit ... :P

 Solr Cloud can't upload large config files (> 1MB) to Zookeeper
 -

 Key: SOLR-4793
 URL: https://issues.apache.org/jira/browse/SOLR-4793
 Project: Solr
  Issue Type: Improvement
Reporter: Son Nguyen

 Zookeeper sets the znode size limit to 1MB by default, so we can't start Solr 
 Cloud with some large config files, like synonyms.txt.
 Jan Høydahl has a good idea:
 SolrCloud is designed with an assumption that you should be able to upload 
 your whole disk-based conf folder into ZK, and that you should be able to add 
 an empty Solr node to a cluster and it would download all config from ZK. So 
 immediately a splitting strategy automatically handled by ZkSolrResourceLoader 
 for large files could be one way forward, i.e. store synonyms.txt as e.g. 
 __001_synonyms.txt, __002_synonyms.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6120) zkcli.sh class not found error; /opt/solr-4.8.1/example/solr-webapp is empty

2014-06-20 Thread Cassandra Targett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039223#comment-14039223
 ] 

Cassandra Targett commented on SOLR-6120:
-

[~shalinmangar]: should this be updated to Fix Version = 4.9 and also Resolved? 
Or is Windows support blocking that?

 zkcli.sh class not found error; /opt/solr-4.8.1/example/solr-webapp is empty
 ---

 Key: SOLR-6120
 URL: https://issues.apache.org/jira/browse/SOLR-6120
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.8.1
Reporter: sebastian badea
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-6120-windows.patch, SOLR-6120-windows.patch, 
 SOLR-6120-windows.patch, SOLR-6120.patch


 When calling /solr-4.8.1/example/scripts/cloud-scripts/zkcli.sh, the 
 org.apache.solr.cloud.ZkCLI class is not found.
 The cause is that /opt/solr-4.8.1/example/solr-webapp is empty.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4852) If sharedLib is set to lib, classloader fails to find classes in lib

2014-06-20 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039276#comment-14039276
 ] 

Ahmet Arslan commented on SOLR-4852:


This is not all about loading the same jar twice. Here is an interesting finding.

I removed all lib directives in the example solrconfig.xml and put icu4j-53.1.jar 
and lucene-analyzers-icu-4.8.1.jar into the collection1/lib folder.

solr.ICUFoldingFilterFactory works fine.

Just add the following line to solrconfig.xml 
{code:xml}
  <lib dir="../../../dist/" regex="solr-velocity-\d.*\.jar" />
{code}

and boom, it fails. I have a feeling that this has nothing to do with loading 
twice. It looks like the order in which the lib directives are processed is 
causing something. 

 If sharedLib is set to lib, classloader fails to find classes in lib
 

 Key: SOLR-4852
 URL: https://issues.apache.org/jira/browse/SOLR-4852
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Linux bigindy5 2.6.32-358.6.1.el6.centos.plus.x86_64 #1 
 SMP Wed Apr 24 03:21:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.7.0_21
 Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
 Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
Reporter: Shawn Heisey
 Fix For: 4.9, 5.0

 Attachments: SOLR-4852-test-failhard.txt, SOLR-4852.patch, 
 SOLR-4852.patch


 I have some jars in the lib directory under solr.solr.home - DIH, ICU, and 
 MySQL.  If I set sharedLib in solr.xml to lib then the ICUTokenizer class 
 is not found, even though the jar is loaded (twice) during Solr startup.  If 
 I set sharedLib to another location that doesn't exist, the jars are only 
 loaded once and there is no problem.
 I'm using the old-style solr.xml on branch_4x revision 1485566.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter

2014-06-20 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039280#comment-14039280
 ] 

Ahmet Arslan commented on SOLR-6188:


Same symptom: solr.ICU*Factory in schema.xml causes a class-not-found exception 
for an org.apache.lucene.* class. 

 solr.ICUFoldingFilterFactory causes NoClassDefFoundError: 
 o/a/l/a/icu/ICUFoldingFilter
 --

 Key: SOLR-6188
 URL: https://issues.apache.org/jira/browse/SOLR-6188
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.8.1
Reporter: Ahmet Arslan
  Labels: ICUFoldingFilterFactory
 Fix For: 4.10


 When the fully qualified class name 
 {{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}} is used in 
 schema.xml, it works. However, as documented in Confluence and the wiki, when 
 {{solr.ICUFoldingFilterFactory}} is used, it throws the following exception.
 This is true for both the released 4.8.1 version and trunk r1604168.
 The following type works:
 {code:xml}
 <fieldType name="folded2" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/>
   </analyzer>
 </fieldType>
 {code}
 This one does not:
 {code:xml}
 <fieldType name="folded" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.ICUFoldingFilterFactory"/>
   </analyzer>
 </fieldType>
 {code}
 {noformat}
 257 [main] ERROR org.apache.solr.core.SolrCore  – Error loading 
 core:java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
 org/apache/lucene/analysis/icu/ICUFoldingFilter
   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
   at java.util.concurrent.FutureTask.get(FutureTask.java:188)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137)
   at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719)
   at 
 org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265)
   at 
 org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710)
   at 
 org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39)
   at 
 org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider$1.fileAdded(ScanningAppProvider.java:56)
   at org.eclipse.jetty.util.Scanner.reportAddition(Scanner.java:609)
   at org.eclipse.jetty.util.Scanner.reportDifferences(Scanner.java:540)
   at org.eclipse.jetty.util.Scanner.scan(Scanner.java:403)
   at org.eclipse.jetty.util.Scanner.doStart(Scanner.java:337)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.doStart(ScanningAppProvider.java:121)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.startAppProvider(DeploymentManager.java:555)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.doStart(DeploymentManager.java:230)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.util.component.AggregateLifeCycle.doStart(AggregateLifeCycle.java:81)
   at 
 org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:58)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.doStart(HandlerWrapper.java:96)
   at org.eclipse.jetty.server.Server.doStart(Server.java:280)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 

[jira] [Commented] (SOLR-4852) If sharedLib is set to lib, classloader fails to find classes in lib

2014-06-20 Thread Ahmet Arslan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039312#comment-14039312
 ] 

Ahmet Arslan commented on SOLR-4852:


In the above setting, no duplicate jars are loaded. Only one explicit lib 
directive is defined in solrconfig.xml, along with the implicit one (core/lib). 
The following lines are printed during startup.

{noformat}
1751 [coreLoadExecutor-4-thread-1] INFO  
org.apache.solr.core.SolrResourceLoader  – new SolrResourceLoader for 
directory: '/Users/iorixxx/Desktop/solr-4.8.1/example/solr/collection1/'
1752 [coreLoadExecutor-4-thread-1] INFO  
org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/Users/iorixxx/Desktop/solr-4.8.1/example/solr/collection1/lib/.DS_Store' 
to classloader
1752 [coreLoadExecutor-4-thread-1] INFO  
org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/Users/iorixxx/Desktop/solr-4.8.1/example/solr/collection1/lib/icu4j-53.1.jar'
 to classloader
1752 [coreLoadExecutor-4-thread-1] INFO  
org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/Users/iorixxx/Desktop/solr-4.8.1/example/solr/collection1/lib/lucene-analyzers-icu-4.8.1.jar'
 to classloader
1833 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.core.SolrConfig  – 
Adding specified lib dirs to ClassLoader
1835 [coreLoadExecutor-4-thread-1] INFO  
org.apache.solr.core.SolrResourceLoader  – Adding 
'file:/Users/iorixxx/Desktop/solr-4.8.1/dist/solr-velocity-4.8.1.jar' to 
classloader

{noformat}

This fails for no apparent reason.  By the way, why does it load the non-jar 
hidden .DS_Store file 
('/Users/iorixxx/Desktop/solr-4.8.1/example/solr/collection1/lib/.DS_Store') to 
the classloader?
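
For what it's worth, picking up only real archives from a lib directory would 
avoid the .DS_Store oddity. A minimal, hypothetical filter sketch (not 
SolrResourceLoader's actual code; JarOnlyFilter is an invented name):

{code:java}
import java.io.File;
import java.io.FileFilter;

// hypothetical: accept only jar/zip archives, skipping hidden files like .DS_Store
public class JarOnlyFilter implements FileFilter {
  @Override
  public boolean accept(File f) {
    String name = f.getName();
    return f.isFile() && !name.startsWith(".")
        && (name.endsWith(".jar") || name.endsWith(".zip"));
  }
}
{code}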

 If sharedLib is set to lib, classloader fails to find classes in lib
 

 Key: SOLR-4852
 URL: https://issues.apache.org/jira/browse/SOLR-4852
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Linux bigindy5 2.6.32-358.6.1.el6.centos.plus.x86_64 #1 
 SMP Wed Apr 24 03:21:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.7.0_21
 Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
 Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
Reporter: Shawn Heisey
 Fix For: 4.9, 5.0

 Attachments: SOLR-4852-test-failhard.txt, SOLR-4852.patch, 
 SOLR-4852.patch


 I have some jars in the lib directory under solr.solr.home - DIH, ICU, and 
 MySQL.  If I set sharedLib in solr.xml to lib then the ICUTokenizer class 
 is not found, even though the jar is loaded (twice) during Solr startup.  If 
 I set sharedLib to another location that doesn't exist, the jars are only 
 loaded once and there is no problem.
 I'm using the old-style solr.xml on branch_4x revision 1485566.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6184) Replication fetchLatestIndex always fails, which causes the recovery error

2014-06-20 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039320#comment-14039320
 ] 

Shawn Heisey commented on SOLR-6184:


Would the commitReserveDuration parameter on the replication handler be useful 
in keeping Solr from deleting the commit point that is being replicated until 
after the replication is complete?  Normally it's not recommended to have any 
config parameters for replication, but if a very large index is having problems 
recovering when there is a lot of update activity, perhaps that would be an 
exception.
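
For reference, a hedged sketch of what that might look like in solrconfig.xml; 
the 30-minute value is only an illustration chosen for a 70G copy, not a 
recommendation:

{code:xml}
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- reserve the commit point being replicated for up to 30 minutes -->
    <str name="commitReserveDuration">00:30:00</str>
  </lst>
</requestHandler>
{code}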


 Replication fetchLatestIndex always fails, which causes the recovery 
 error.
 ---

 Key: SOLR-6184
 URL: https://issues.apache.org/jira/browse/SOLR-6184
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.6, 4.6.1
 Environment: the index file size is more than 70G
Reporter: Raintung Li
 Attachments: Solr-6184.txt


 Copying a full 70G index usually needs at least 20 minutes on a 100M 
 read/write network or disk r/w.  If one hard commit happens within those 20 
 minutes, the full-index snap pull will fail, and the temp folder will be 
 removed because the pull task failed.
 In production, index updates happen every minute, so the redone pull task 
 always fails because the index always changes.
 Constantly redoing the pull also keeps network and disk usage at a high 
 level.
 My suggestion: fetchLatestIndex could be retried at some frequency. Don't 
 remove the tmp folder, and copy the largest index files first. A redone 
 fetchLatestIndex then wouldn't download the same biggest files again; it would 
 only copy what was just committed, so the task would easily succeed at last.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4852) If sharedLib is set to lib, classloader fails to find classes in lib

2014-06-20 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039331#comment-14039331
 ] 

Shawn Heisey commented on SOLR-4852:


Another theory I've considered is that the problem is caused by the resource 
loader object being replaced, which from my research apparently has to be done 
anytime you need to add another directory or list of jars.  This theory would 
be disproved by a setup that has more than one lib directive and uses classes 
in jars from both locations ... and there are probably some of those out there.

I willingly admit that I do not understand what causes the problems that I 
documented here.  I looked into the Solr code and poked around the Java 7 API 
javadocs, but I wasn't able to make sense of it.


 If sharedLib is set to lib, classloader fails to find classes in lib
 

 Key: SOLR-4852
 URL: https://issues.apache.org/jira/browse/SOLR-4852
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Linux bigindy5 2.6.32-358.6.1.el6.centos.plus.x86_64 #1 
 SMP Wed Apr 24 03:21:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.7.0_21
 Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
 Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
Reporter: Shawn Heisey
 Fix For: 4.9, 5.0

 Attachments: SOLR-4852-test-failhard.txt, SOLR-4852.patch, 
 SOLR-4852.patch


 I have some jars in the lib directory under solr.solr.home - DIH, ICU, and 
 MySQL.  If I set sharedLib in solr.xml to lib then the ICUTokenizer class 
 is not found, even though the jar is loaded (twice) during Solr startup.  If 
 I set sharedLib to another location that doesn't exist, the jars are only 
 loaded once and there is no problem.
 I'm using the old-style solr.xml on branch_4x revision 1485566.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4852) If sharedLib is set to lib, classloader fails to find classes in lib

2014-06-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039336#comment-14039336
 ] 

Uwe Schindler commented on SOLR-4852:
-

Hi Shawn, this is the issue here: I explain the problem in the linked issue: 
SOLR-6188


 If sharedLib is set to lib, classloader fails to find classes in lib
 

 Key: SOLR-4852
 URL: https://issues.apache.org/jira/browse/SOLR-4852
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Linux bigindy5 2.6.32-358.6.1.el6.centos.plus.x86_64 #1 
 SMP Wed Apr 24 03:21:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
 java version 1.7.0_21
 Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
 Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
Reporter: Shawn Heisey
 Fix For: 4.9, 5.0

 Attachments: SOLR-4852-test-failhard.txt, SOLR-4852.patch, 
 SOLR-4852.patch


 I have some jars in the lib directory under solr.solr.home - DIH, ICU, and 
 MySQL.  If I set sharedLib in solr.xml to lib then the ICUTokenizer class 
 is not found, even though the jar is loaded (twice) during Solr startup.  If 
 I set sharedLib to another location that doesn't exist, the jars are only 
 loaded once and there is no problem.
 I'm using the old-style solr.xml on branch_4x revision 1485566.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-6188) solr.ICUFoldingFilterFactory causes NoClassDefFoundError: o/a/l/a/icu/ICUFoldingFilter

2014-06-20 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-6188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039350#comment-14039350
 ] 

Uwe Schindler commented on SOLR-6188:
-

Hi, the reason for this issue is indeed SOLR-4852. The reason why it works with 
the absolute class name is the following:
- If you use the absolute class name, the class is loaded by Class.forName from 
the SolrResourceLoader.
- If you use the shortcut, the Solr 3.x backwards layer for finding factory 
classes is used. The solr.XXXFactory name is rewritten to a call to 
TokenFilterFactory.forName(). This forName call uses the classpath it was 
initialized with. TokenFilterFactory is a static class and does not really know 
about classloaders (because there is only one single instance). Every 
SolrResourceLoader triggers an update process that scans its own classpath and 
adds all *new* factory instances to the forName() lookup map.

What happens here: at an earlier stage, it looks like SolrResourceLoader saw a 
factory instance loaded by SPI and cached its factory class for forName(). But 
later the classpath and classloader were replaced and the scanner was called 
again. This one scanned the classpath again and found a new instance of the 
factory class (the new one that should be used). Because that class was already 
in the forName cache, it did not replace the existing entry. In the meantime, 
the old classloader was closed with Java 7's URLClassLoader.close() method. 
Because of this, a call to forName returned the factory class, but the 
dependent classes it refers to are no longer loadable (classloader closed). 
This causes the bug.

The fix is not easily possible; I will think about it.
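
To make the failure mode concrete, here is a deliberately simplified, 
hypothetical sketch of the kind of add-only static cache described above (not 
Lucene's actual TokenFilterFactory code; FactoryRegistry, reload, and lookup 
are invented names):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FactoryRegistry {
  // single static name -> Class map shared by every resource loader
  private static final Map<String, Class<?>> CACHE = new ConcurrentHashMap<>();

  /** Scans factory names against a (possibly new) classloader. */
  static void reload(ClassLoader loader, Iterable<String> factoryNames)
      throws ClassNotFoundException {
    for (String name : factoryNames) {
      // putIfAbsent: an entry from an earlier scan wins, even if the
      // classloader it came from has since been replaced and closed
      CACHE.putIfAbsent(name, Class.forName(name, true, loader));
    }
  }

  static Class<?> lookup(String name) {
    Class<?> c = CACHE.get(name);
    if (c == null) {
      throw new IllegalArgumentException("Unknown factory: " + name);
    }
    // may still reference a closed URLClassLoader; using it can then fail
    // with NoClassDefFoundError for its dependent classes
    return c;
  }
}
{code}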

 solr.ICUFoldingFilterFactory causes NoClassDefFoundError: 
 o/a/l/a/icu/ICUFoldingFilter
 --

 Key: SOLR-6188
 URL: https://issues.apache.org/jira/browse/SOLR-6188
 Project: Solr
  Issue Type: Bug
  Components: Schema and Analysis
Affects Versions: 4.8.1
Reporter: Ahmet Arslan
  Labels: ICUFoldingFilterFactory
 Fix For: 4.10


 When the fully qualified class name 
 {{org.apache.lucene.analysis.icu.ICUFoldingFilterFactory}} is used in 
 schema.xml, it works. However, as documented in Confluence and the wiki, when 
 {{solr.ICUFoldingFilterFactory}} is used, it throws the following exception.
 This is true for both the released 4.8.1 version and trunk r1604168.
 The following type works:
 {code:xml}
 <fieldType name="folded2" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="org.apache.lucene.analysis.icu.ICUFoldingFilterFactory"/>
   </analyzer>
 </fieldType>
 {code}
 This one does not:
 {code:xml}
 <fieldType name="folded" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.ICUFoldingFilterFactory"/>
   </analyzer>
 </fieldType>
 {code}
 {noformat}
 257 [main] ERROR org.apache.solr.core.SolrCore  – Error loading 
 core:java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: 
 org/apache/lucene/analysis/icu/ICUFoldingFilter
   at java.util.concurrent.FutureTask.report(FutureTask.java:122)
   at java.util.concurrent.FutureTask.get(FutureTask.java:188)
   at org.apache.solr.core.CoreContainer.load(CoreContainer.java:301)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:190)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:137)
   at org.eclipse.jetty.servlet.FilterHolder.doStart(FilterHolder.java:119)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:719)
   at 
 org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:265)
   at 
 org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1252)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:710)
   at 
 org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:494)
   at 
 org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:64)
   at 
 org.eclipse.jetty.deploy.bindings.StandardStarter.processBinding(StandardStarter.java:39)
   at 
 org.eclipse.jetty.deploy.AppLifeCycle.runBindings(AppLifeCycle.java:186)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.requestAppGoal(DeploymentManager.java:494)
   at 
 org.eclipse.jetty.deploy.DeploymentManager.addApp(DeploymentManager.java:141)
   at 
 org.eclipse.jetty.deploy.providers.ScanningAppProvider.fileAdded(ScanningAppProvider.java:145)
   at 
 

[jira] [Commented] (LUCENE-5752) Explore light weight Automaton replacement

2014-06-20 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039401#comment-14039401
 ] 

ASF subversion and git services commented on LUCENE-5752:
-

Commit 1604283 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1604283 ]

LUCENE-5752: switch to simpler, immutable Automaton implementation

 Explore light weight Automaton replacement
 --

 Key: LUCENE-5752
 URL: https://issues.apache.org/jira/browse/LUCENE-5752
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5752.patch, LUCENE-5752.patch


 This effort started with the patch on LUCENE-4556, to create a light
 weight replacement for the current object-heavy Automaton class
 (which creates separate State and Transition objects).
 I took that initial patch much further, and cutover most places in
 Lucene that use Automaton to LightAutomaton.  Tests pass.
 The core idea of LightAutomaton is all states are ints, and you build
 up the automaton under the restriction that you add all outgoing
 transitions one state at a time.  This worked well for most
 operations, but for some (e.g. UTF32ToUTF8!!) it was harder, so I also
 added a separate builder to add transitions in any order and then in
 the end they are sorted and added to the real automaton.
 If this is successful I think we should just replace the current
 Automaton with LightAutomaton; right now they both exist in my current
 patch...
 This is very much a work in progress, and I'm not sure the
 restrictions the API imposes are reasonable (some algos got uglier).
 But I think it's at least worth exploring/iterating... I'll make a branch and
 commit my current state.
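
To illustrate the "states are ints" idea, here is a much-simplified, 
hypothetical sketch (not the actual LightAutomaton code; IntAutomaton and its 
method names are invented). All transitions live in one flat list, under the 
restriction that each state's outgoing transitions are added in one batch:

{code:java}
import java.util.ArrayList;
import java.util.List;

public class IntAutomaton {
  // each transition is {dest, minLabel, maxLabel}
  private final List<int[]> transitions = new ArrayList<>();
  // transitionStart.get(s) .. transitionStart.get(s+1)-1 are state s's transitions
  private final List<Integer> transitionStart = new ArrayList<>();

  /** Starts a new state; earlier states can no longer gain transitions. */
  public int createState() {
    transitionStart.add(transitions.size());
    return transitionStart.size() - 1;
  }

  /** Adds an outgoing transition for the most recently created state. */
  public void addTransition(int dest, int minLabel, int maxLabel) {
    if (transitionStart.isEmpty()) {
      throw new IllegalStateException("call createState() first");
    }
    transitions.add(new int[] {dest, minLabel, maxLabel});
  }

  /** Number of outgoing transitions of state s. */
  public int numTransitions(int s) {
    int end = (s + 1 < transitionStart.size())
        ? transitionStart.get(s + 1) : transitions.size();
    return end - transitionStart.get(s);
  }
}
{code}

The separate builder mentioned in the description would relax the 
one-state-at-a-time restriction by buffering transitions, sorting them by 
source state, and then feeding them into an automaton like this in order.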



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-5752) Explore light weight Automaton replacement

2014-06-20 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-5752.


Resolution: Fixed

 Explore light weight Automaton replacement
 --

 Key: LUCENE-5752
 URL: https://issues.apache.org/jira/browse/LUCENE-5752
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5752.patch, LUCENE-5752.patch


 This effort started with the patch on LUCENE-4556, to create a light
 weight replacement for the current object-heavy Automaton class
 (which creates separate State and Transition objects).
 I took that initial patch much further, and cutover most places in
 Lucene that use Automaton to LightAutomaton.  Tests pass.
 The core idea of LightAutomaton is all states are ints, and you build
 up the automaton under the restriction that you add all outgoing
 transitions one state at a time.  This worked well for most
 operations, but for some (e.g. UTF32ToUTF8!!) it was harder, so I also
 added a separate builder to add transitions in any order and then in
 the end they are sorted and added to the real automaton.
 If this is successful I think we should just replace the current
 Automaton with LightAutomaton; right now they both exist in my current
 patch...
 This is very much a work in progress, and I'm not sure the
 restrictions the API imposes are reasonable (some algos got uglier).
 But I think it's at least worth exploring/iterating... I'll make a branch and
 commit my current state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5752) Explore light weight Automaton replacement

2014-06-20 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5752:
---

Fix Version/s: 4.10

 Explore light weight Automaton replacement
 --

 Key: LUCENE-5752
 URL: https://issues.apache.org/jira/browse/LUCENE-5752
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0, 4.10

 Attachments: LUCENE-5752.patch, LUCENE-5752.patch


 This effort started with the patch on LUCENE-4556, to create a light
 weight replacement for the current object-heavy Automaton class
 (which creates separate State and Transition objects).
 I took that initial patch much further, and cutover most places in
 Lucene that use Automaton to LightAutomaton.  Tests pass.
 The core idea of LightAutomaton is all states are ints, and you build
 up the automaton under the restriction that you add all outgoing
 transitions one state at a time.  This worked well for most
 operations, but for some (e.g. UTF32ToUTF8!!) it was harder, so I also
 added a separate builder to add transitions in any order and then in
 the end they are sorted and added to the real automaton.
 If this is successful I think we should just replace the current
 Automaton with LightAutomaton; right now they both exist in my current
 patch...
 This is very much a work in progress, and I'm not sure the
 restrictions the API imposes are reasonable (some algos got uglier).
 But I think it's at least worth exploring/iterating... I'll make a branch and
 commit my current state.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


