Re: inconsistency/performance trap of empty terms

2010-10-28 Thread Andi Vajda
On Oct 28, 2010, at 22:32, Robert Muir wrote: On Thu, Oct 28, 2010 at 10:28 PM, Andi Vajda wrote: I've used this in a URL index. I needed to be able to distinguish between searching URLs that had, say, no path, from searching URLs without matching the path component. The absence of pat

Solr-3.x - Build # 150 - Still Failing

2010-10-28 Thread Apache Hudson Server
Build: http://hudson.zones.apache.org/hudson/job/Solr-3.x/150/ 1 tests failed. REGRESSION: org.apache.solr.client.solrj.embedded.SolrExampleStreamingTest.testCommitWithin Error Message: expected:<0> but was:<1> Stack Trace: junit.framework.AssertionFailedError: expected:<0> but was:<1>

Re: inconsistency/performance trap of empty terms

2010-10-28 Thread Robert Muir
On Thu, Oct 28, 2010 at 10:28 PM, Andi Vajda wrote: > > I've used this in a URL index. I needed to be able to distinguish between > searching URLs that had, say, no path, from searching URLs without matching > the path component. The absence of path was represented with an empty token > in the pat

[jira] Updated: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley updated SOLR-2205: --- Attachment: SOLR-2205.patch I've committed the first part of this patch that compares with the smalles

Lucene-Solr-tests-only-trunk - Build # 710 - Failure

2010-10-28 Thread Apache Hudson Server
Build: http://hudson.zones.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/710/ 11 tests failed. REGRESSION: org.apache.solr.cloud.CloudStateUpdateTest.testCoreRegistration Error Message: null Stack Trace: org.apache.solr.common.cloud.ZooKeeperException: at org.apache.solr.core.Cor

Re: inconsistency/performance trap of empty terms

2010-10-28 Thread Andi Vajda
On Thu, 28 Oct 2010, Robert Muir wrote: On Thu, Oct 28, 2010 at 8:35 PM, wrote: In database queries, it is often useful to treat an empty value specially, and be able to search explicitly for records that have (for instance) no field X, or no value for field X.  I can't regurgitate offhand

[jira] Commented: (SOLR-2206) DIH MailEntityProcessor has mispelled words

2010-10-28 Thread Lance Norskog (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926038#action_12926038 ] Lance Norskog commented on SOLR-2206: - Also, please change the date range parameter to u

[jira] Created: (SOLR-2206) DIH MailEntityProcessor has mispelled words

2010-10-28 Thread Lance Norskog (JIRA)
DIH MailEntityProcessor has mispelled words --- Key: SOLR-2206 URL: https://issues.apache.org/jira/browse/SOLR-2206 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandl

Re: inconsistency/performance trap of empty terms

2010-10-28 Thread Robert Muir
On Thu, Oct 28, 2010 at 8:35 PM, wrote: > In database queries, it is often useful to treat an empty value specially, > and be able to search explicitly for records that have (for instance) no > field X, or no value for field X.  I can't regurgitate offhand all the > precise situations that I'v

RE: inconsistency/performance trap of empty terms

2010-10-28 Thread karl.wright
In database queries, it is often useful to treat an empty value specially, and be able to search explicitly for records that have (for instance) no field X, or no value for field X. I can't regurgitate offhand all the precise situations that I've used this and claim that they would apply to a s

Re: inconsistency/performance trap of empty terms

2010-10-28 Thread Robert Muir
On Thu, Oct 28, 2010 at 7:59 PM, Chris Hostetter wrote: > > : Anyway, I think it's possible other users might be in this same > : situation, with slow performance, and not even realizing it yet... > : Obviously they can fix this if they go and add LengthFilter, but > : should we be doing something

Re: inconsistency/performance trap of empty terms

2010-10-28 Thread Chris Hostetter
: Anyway, I think it's possible other users might be in this same : situation, with slow performance, and not even realizing it yet... : Obviously they can fix this if they go and add LengthFilter, but : should we be doing something different? On one level, I think a big improvement might just be

[jira] Updated: (SOLR-2202) Money FieldType

2010-10-28 Thread Greg Fodor (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Fodor updated SOLR-2202: - Attachment: SOLR-2202-solr-5.patch Fix for error when computing converted values. > Money FieldType > ---

Re: Lucene 3.0.3 Release Date

2010-10-28 Thread Simon Willnauer
On Fri, Oct 29, 2010 at 12:17 AM, Robert Muir wrote: > Shay, I forwarded your idea to the -dev list. > > Personally I think this is a great idea, really whenever there is > even a remote possibility of an index corruption, and we fix it, I > think we should aggressively issue a release. > +1 >

Fwd: Lucene 3.0.3 Release Date

2010-10-28 Thread Robert Muir
Shay, I forwarded your idea to the -dev list. Personally I think this is a great idea, really whenever there is even a remote possibility of an index corruption, and we fix it, I think we should aggressively issue a release. We also have some important bugfixes to things like NumericRangeQuery.

[jira] Issue Comment Edited: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Martijn van Groningen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925928#action_12925928 ] Martijn van Groningen edited comment on SOLR-2205 at 10/28/10 4:11 PM: ---

[jira] Commented: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Martijn van Groningen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925928#action_12925928 ] Martijn van Groningen commented on SOLR-2205: - bq. .. values). Something lik

[jira] Commented: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925921#action_12925921 ] Yonik Seeley commented on SOLR-2205: Oh, nice - I hadn't thought of checking if a doc is

[jira] Commented: (LUCENE-2728) EnwikiContentSource does not properly identify the name/id of the Wikipedia article

2010-10-28 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925916#action_12925916 ] Grant Ingersoll commented on LUCENE-2728: - Yes. I'm trying at the moment, but my

[jira] Updated: (LUCENE-2728) EnwikiContentSource does not properly identify the name/id of the Wikipedia article

2010-10-28 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-2728: --- Fix Version/s: 4.0 3.1 > EnwikiContentSource does not properly id

[jira] Reopened: (LUCENE-2691) Consolidate Near Real Time and Reopen API semantics

2010-10-28 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reopened LUCENE-2691: Reopen until we settle the API question... > Consolidate Near Real Time and Reopen AP

[jira] Commented: (LUCENE-2728) EnwikiContentSource does not properly identify the name/id of the Wikipedia article

2010-10-28 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925911#action_12925911 ] Michael McCandless commented on LUCENE-2728: Grant are you going to backport t

[jira] Commented: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Martijn van Groningen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925900#action_12925900 ] Martijn van Groningen commented on SOLR-2205: - bq. the searchtime was around 300

[jira] Issue Comment Edited: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Martijn van Groningen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925900#action_12925900 ] Martijn van Groningen edited comment on SOLR-2205 at 10/28/10 2:23 PM: ---

[jira] Commented: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Martijn van Groningen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925899#action_12925899 ] Martijn van Groningen commented on SOLR-2205: - bq. Are these a different set of

[jira] Updated: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Martijn van Groningen (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Martijn van Groningen updated SOLR-2205: Attachment: SOLR-2205.patch The code I initially wrote was on the pre-flex code base

Re: Solr Clustering example

2010-10-28 Thread Chris Hostetter
: Does anyone recall why the Carrot2 stuff is disabled by default in the : Solr example? If my memory serves, it was due to the licensing issues : that required the user to download certain libs. Was there any other? : In other words, I'd like to enable it by default, as I have hooked it :

[jira] Commented: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925890#action_12925890 ] Yonik Seeley commented on SOLR-2205: Cool! Are these a different set of optimizations th

[jira] Created: (SOLR-2205) Grouping performance improvements

2010-10-28 Thread Martijn van Groningen (JIRA)
Grouping performance improvements - Key: SOLR-2205 URL: https://issues.apache.org/jira/browse/SOLR-2205 Project: Solr Issue Type: Sub-task Components: search Affects Versions: 4.0 Rep

[jira] Updated: (LUCENE-2727) simulate out of open files in MockDirectoryWrapper

2010-10-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2727: Attachment: LUCENE-2727.patch ok here's another start, using the idea of maybeThrowIOException too

[jira] Commented: (LUCENE-2728) EnwikiContentSource does not properly identify the name/id of the Wikipedia article

2010-10-28 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925872#action_12925872 ] Grant Ingersoll commented on LUCENE-2728: - Trunk: Committed revision 1028386. > E

[jira] Updated: (LUCENE-2728) EnwikiContentSource does not properly identify the name/id of the Wikipedia article

2010-10-28 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated LUCENE-2728: Attachment: LUCENE-2728.patch Patch applies from the contrib/benchmark directory. > Enwik

[jira] Commented: (LUCENE-2727) simulate out of open files in MockDirectoryWrapper

2010-10-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925866#action_12925866 ] Robert Muir commented on LUCENE-2727: - bq. Hmm... should we also call maybeThrowIOExce

[jira] Commented: (LUCENE-2727) simulate out of open files in MockDirectoryWrapper

2010-10-28 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925850#action_12925850 ] Michael McCandless commented on LUCENE-2727: This looks like a great start! S

[jira] Commented: (LUCENE-2728) EnwikiContentSource does not properly identify the name/id of the Wikipedia article

2010-10-28 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925848#action_12925848 ] Grant Ingersoll commented on LUCENE-2728: - Also, the doc id is also guaranteed to

[jira] Created: (LUCENE-2728) EnwikiContentSource does not properly identify the name/id of the Wikipedia article

2010-10-28 Thread Grant Ingersoll (JIRA)
EnwikiContentSource does not properly identify the name/id of the Wikipedia article --- Key: LUCENE-2728 URL: https://issues.apache.org/jira/browse/LUCENE-2728 Project: L

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
Glad to be of service. ;-) Karl -Original Message- From: ext Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, October 28, 2010 11:48 AM To: dev@lucene.apache.org; simon.willna...@gmail.com Subject: Re: ArrayIndexOutOfBounds exception using FieldCache On Thu, Oct 28,

Re: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread Michael McCandless
On Thu, Oct 28, 2010 at 11:05 AM, Simon Willnauer wrote: > On Thu, Oct 28, 2010 at 4:59 PM, Walter Underwood > wrote: >> How big is it? The Internet works pretty well for large files. > > Mike, pick the USB stick up during your next run :) Heh, next time :) Karl, this is one big field cache ent

Re: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread Michael McCandless
Nice find Yonik! Mike On Thu, Oct 28, 2010 at 10:16 AM, Yonik Seeley wrote: > On Thu, Oct 28, 2010 at 6:15 AM,   wrote: >> Synched to trunk, blew away old indexes, reindexed, same behavior.  So I >> think we've got a problem, Houston. ;-) > > Hey Karl, can you try the following patch on trunk:

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
The internet is not the bottleneck ;-). It's the intranet here. Index is 14GB. Besides, it looks like Yonik found the problem. Karl -Original Message- From: ext Walter Underwood [mailto:wun...@wunderwood.org] Sent: Thursday, October 28, 2010 11:00 AM To: dev@lucene.apache.org Subject:

Re: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread Simon Willnauer
On Thu, Oct 28, 2010 at 4:59 PM, Walter Underwood wrote: > How big is it? The Internet works pretty well for large files. Mike, pick the USB stick up during your next run :) simon > > You can send a USB drive by snail mail. > > wunder > > On Oct 28, 2010, at 6:11 AM, wrote: > >> Talked with IT h

Re: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread Walter Underwood
How big is it? The Internet works pretty well for large files. You can send a USB drive by snail mail. wunder On Oct 28, 2010, at 6:11 AM, wrote: > Talked with IT here - they don't recommend external transfers of this size. > So I think we'd best try the "instrument and repeat" approach inst

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
Yep, that fixed it. ;-) Everything seems happy now. Karl -Original Message- From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of ext Yonik Seeley Sent: Thursday, October 28, 2010 10:17 AM To: dev@lucene.apache.org Subject: Re: ArrayIndexOutOfBounds exception using FieldCache On

Lucene-Solr-tests-only-trunk - Build # 686 - Failure

2010-10-28 Thread Apache Hudson Server
Build: http://hudson.zones.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/686/ 1 tests failed. REGRESSION: org.apache.solr.handler.component.DistributedTermsComponentTest.testDistribSearch Error Message: Some threads threw uncaught exceptions! Stack Trace: junit.framework.AssertionFailedE

Re: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread Yonik Seeley
On Thu, Oct 28, 2010 at 6:15 AM, wrote: > Synched to trunk, blew away old indexes, reindexed, same behavior.  So I > think we've got a problem, Houston. ;-) Hey Karl, can you try the following patch on trunk: Index: lucene/src/java/org/apache/lucene/search/cache/DocTermsCreator.java ==

[jira] Updated: (LUCENE-2727) simulate out of open files in MockDirectoryWrapper

2010-10-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2727: Attachment: LUCENE-2727.patch this is likely not the best way to do it, maybe we should actually t

[jira] Created: (LUCENE-2727) simulate out of open files in MockDirectoryWrapper

2010-10-28 Thread Robert Muir (JIRA)
simulate out of open files in MockDirectoryWrapper -- Key: LUCENE-2727 URL: https://issues.apache.org/jira/browse/LUCENE-2727 Project: Lucene - Java Issue Type: Test Components: Build

[jira] Commented: (LUCENE-2726) simulate disk fulls in copyBytes

2010-10-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925806#action_12925806 ] Robert Muir commented on LUCENE-2726: - So the question here, is how to account for the

[jira] Updated: (LUCENE-2726) simulate disk fulls in copyBytes

2010-10-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2726: Attachment: LUCENE-2726.patch here's a patch, uses the same logic to check as writeBytes does. >

[jira] Created: (LUCENE-2726) simulate disk fulls in copyBytes

2010-10-28 Thread Robert Muir (JIRA)
simulate disk fulls in copyBytes Key: LUCENE-2726 URL: https://issues.apache.org/jira/browse/LUCENE-2726 Project: Lucene - Java Issue Type: Test Components: Build Reporter: Robert Muir

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
Talked with IT here - they don't recommend external transfers of this size. So I think we'd best try the "instrument and repeat" approach instead. Karl -Original Message- From: ext karl.wri...@nokia.com [mailto:karl.wri...@nokia.com] Sent: Thursday, October 28, 2010 8:16 AM To: dev@lu

[jira] Commented: (SOLR-2204) Cross-version replication broken by new javabin format

2010-10-28 Thread Shawn Heisey (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925794#action_12925794 ] Shawn Heisey commented on SOLR-2204: All of the Solr documentation I've read says that y

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
It's on an internal Nokia machine, unfortunately, so the only way I can transfer it out is with my credentials, or by email, which is definitely not going to work ;-). But if you can provide me with an account on a machine I'd be transferring it to, I may be able to scp it from here. Karl -

Re: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread Michael McCandless
Fun fun :) Is there any way I can rsync/scp/ftp a copy of this index over? Failing that I can make some patches that we can iterate on... Mike On Thu, Oct 28, 2010 at 6:15 AM, wrote: > Not good indeed. > > Synched to trunk, blew away old indexes, reindexed, same behavior.  So I > think we

RE: ArrayIndexOutOfBounds exception using FieldCache

2010-10-28 Thread karl.wright
Not good indeed. Synched to trunk, blew away old indexes, reindexed, same behavior. So I think we've got a problem, Houston. ;-) Karl -Original Message- From: ext Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Wednesday, October 27, 2010 11:08 AM To: dev@lucene.apache.org

[jira] Commented: (SOLR-1395) Integrate Katta

2010-10-28 Thread tom liu (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925765#action_12925765 ] tom liu commented on SOLR-1395: --- Walter, thanks. I reviewed the code and found that org.apache.hadoop.

[jira] Commented: (SOLR-2204) Cross-version replication broken by new javabin format

2010-10-28 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/SOLR-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12925752#action_12925752 ] Robert Muir commented on SOLR-2204: --- I don't think this is a bug. Solr 3.1 is a new *majo

Solr-trunk - Build # 1295 - Still Failing

2010-10-28 Thread Apache Hudson Server
Build: http://hudson.zones.apache.org/hudson/job/Solr-trunk/1295/ All tests passed Build Log (for compile errors): [...truncated 16287 lines...]

Re: question about SolrCore

2010-10-28 Thread Li Li
Is there anyone who could help me? 2010/10/11 Li Li : > hi all, >    I want to know the details of IndexReader in SolrCore. I read a > little of the SolrCore code. Here is my understanding; is it correct? >    Each SolrCore has many SolrIndexSearcher and keeps them in > _searchers. and _searcher keep t