Re: Compound File Default

2010-01-13 Thread John Wang
+1. -John On Tue, Jan 12, 2010 at 8:16 PM, Otis Gospodnetic wrote: > Heh, yeah, I forgot about that. Pick the lesser evil? I like speedier > defaults. > > Otis > -- > Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch > > > > - Original Message >

[jira] Created: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] to be more memory efficient.

2010-01-13 Thread Aaron McCurry (JIRA)
Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] to be more memory efficient. - Key: LUCENE-2205

[jira] Updated: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] to be more memory efficient.

2010-01-13 Thread Aaron McCurry (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron McCurry updated LUCENE-2205: -- Attachment: patch-final.txt All unit tests passed. > Rework of the TermInfosReader class to r

[jira] Updated: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] to be more memory efficient.

2010-01-13 Thread Aaron McCurry (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron McCurry updated LUCENE-2205: -- Description: Basically packing those three arrays into a byte array with an int array as an i
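
The description above is cut off, so the following is only a rough sketch of the general idea it names: storing many variable-length entries back to back in one byte array, with an int array locating each entry. It is not the attached patch. The class name `PackedTerms`, the use of UTF-8 strings, and the reading of the int array as a table of start offsets are all assumptions made for illustration, and only the terms are packed here, not the TermInfos or index pointers.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not Lucene code: one byte[] plus one int[]
// instead of one object per term.
final class PackedTerms {
    private final byte[] data;    // all term bytes, stored back to back
    private final int[] offsets;  // offsets[i] = start of term i; offsets[n] = data.length

    PackedTerms(String[] terms) {
        byte[][] encoded = new byte[terms.length][];
        offsets = new int[terms.length + 1];
        int total = 0;
        for (int i = 0; i < terms.length; i++) {
            encoded[i] = terms[i].getBytes(StandardCharsets.UTF_8);
            offsets[i] = total;
            total += encoded[i].length;
        }
        offsets[terms.length] = total;
        data = new byte[total];
        for (int i = 0; i < terms.length; i++) {
            System.arraycopy(encoded[i], 0, data, offsets[i], encoded[i].length);
        }
    }

    // Decodes term i from its slice of the shared byte array.
    String get(int i) {
        return new String(data, offsets[i], offsets[i + 1] - offsets[i], StandardCharsets.UTF_8);
    }

    int size() {
        return offsets.length - 1;
    }

    public static void main(String[] args) {
        PackedTerms packed = new PackedTerms(new String[] {"apple", "banana", "cherry"});
        for (int i = 0; i < packed.size(); i++) {
            System.out.println(i + ": " + packed.get(i));
        }
    }
}
```

The saving in a layout like this comes from replacing per-entry objects and references with two flat arrays; whether the real patch is organised this way cannot be told from the truncated description.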

[jira] Updated: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] to be more memory efficient.

2010-01-13 Thread Aaron McCurry (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron McCurry updated LUCENE-2205: -- Attachment: rawoutput.txt RandomAccessTest.java This is the program that used

[jira] Updated: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.

2010-01-13 Thread Aaron McCurry (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron McCurry updated LUCENE-2205: -- Summary: Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index

[jira] Commented: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.

2010-01-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799676#action_12799676 ] Michael McCandless commented on LUCENE-2205: We've done something very similar

Re: Lucene 2.9.0 Near Real Time Indexing and Service Crashes/restarts

2010-01-13 Thread Michael McCandless
On Tue, Jan 12, 2010 at 6:10 PM, jchang wrote: > Does anybody know how this works out with service restarts (both orderly > shutdown and a crash)?  If the service goes down while indexed items are in > RAMDir but not on disk, are they lost?  Or is there some kind of log > recovery? Lucene expose

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Michael McCandless
Phew, good svn sleuthing Marvin! Responses below... On Tue, Jan 12, 2010 at 6:27 PM, Marvin Humphrey wrote: > Greets, > > I've been trying to understand this comment regarding ArrayUtil.getNextSize(): > >     * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... > > Maybe I'm missin

Re: Max Segmentation Size when Optimizing Index

2010-01-13 Thread Michael McCandless
Could you re-ask this question on java-user instead? Thanks. This list is used for discussing changes to Lucene's internal source code, whereas your question is more about how to use Lucene's API, externally. Mike On Tue, Jan 12, 2010 at 9:15 PM, Trin Chavalittumrong wrote: > Hi, > > > > I am

[jira] Commented: (LUCENE-1990) Add unsigned packed int impls in oal.util

2010-01-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799701#action_12799701 ] Michael McCandless commented on LUCENE-1990: bq. The first section is for 1M v

[jira] Commented: (LUCENE-2203) improved snowball testing

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799703#action_12799703 ] Robert Muir commented on LUCENE-2203: - i would like to commit this one at the end of t

[jira] Assigned: (LUCENE-2203) improved snowball testing

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-2203: --- Assignee: Robert Muir > improved snowball testing > - > >

[jira] Commented: (LUCENE-2201) more performance improvements for snowball

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799706#action_12799706 ] Robert Muir commented on LUCENE-2201: - all tests pass (and also LUCENE-2203 tests) wit

[jira] Commented: (LUCENE-1990) Add unsigned packed int impls in oal.util

2010-01-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799707#action_12799707 ] Michael McCandless commented on LUCENE-1990: How about something like this API

[jira] Updated: (LUCENE-1990) Add unsigned packed int impls in oal.util

2010-01-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1990: --- Attachment: LUCENE-1990.patch Attached patch with my current roughed up approach for

Re: Lucene 2.9.0 Near Real Time Indexing and Service Crashes/restarts

2010-01-13 Thread John Wang
"NRT reader "simply" lets you search the full index, including un-committed changes." I am not sure I understand: I think the context of the discussion is for when the indexer crashes before IW.commit. At which point, does not really matter if you are using NRT, e.g. IW.getReader, or IndexReader.

Re: Lucene 2.9.0 Near Real Time Indexing and Service Crashes/restarts

2010-01-13 Thread Michael McCandless
On Wed, Jan 13, 2010 at 7:33 AM, John Wang wrote: > "NRT reader "simply" lets you search the full index, including > un-committed changes." > > I am not sure I understand: > > I think the context of the discussion is for when the indexer crashes before > IW.commit. At which point, does not really

Re: Dynamic array reallocation algorithms

2010-01-13 Thread DM Smith
On Jan 13, 2010, at 1:00 AM, Marvin Humphrey wrote: > On Tue, Jan 12, 2010 at 10:46:29PM -0500, DM Smith wrote: > >> So starting at 0, the size is 0. >> 0 => 0 >> 0 + 1 => 4 >> 4 + 1 => 8 >> 8 + 1 => 16 >> 16 + 1 => 25 >> 25 + 1 => 35 >> ... >> >> So I think the copied python comment is correct

[jira] Created: (LUCENE-2206) integrate snowball stopword lists

2010-01-13 Thread Robert Muir (JIRA)
integrate snowball stopword lists - Key: LUCENE-2206 URL: https://issues.apache.org/jira/browse/LUCENE-2206 Project: Lucene - Java Issue Type: New Feature Components: contrib/analyzers Re

[jira] Updated: (LUCENE-2206) integrate snowball stopword lists

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2206: Attachment: LUCENE-2206.patch patch with mod to wordlistloader, test, and snowball stoplists for d

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Yonik Seeley
On Tue, Jan 12, 2010 at 6:27 PM, Marvin Humphrey wrote: >    return (targetSize >> 3) + (targetSize < 9 ? 3 : 6) + targetSize; > > For input values of 9 or greater, all that formula does is multiply by 1.125 > and add 6. (Values enumerated below my sig.) > > The comment appears to have originated
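
As a quick check of the quoted claim, the sketch below evaluates the expression from the message for a few inputs of 9 or greater and prints it next to 1.125 * n + 6. Only the return expression comes from the thread; the class name `NextSizeFormula` and the sample inputs are chosen here for illustration.

```java
// Hypothetical wrapper class; only the return expression is taken from the thread.
final class NextSizeFormula {
    static int getNextSize(int targetSize) {
        return (targetSize >> 3) + (targetSize < 9 ? 3 : 6) + targetSize;
    }

    public static void main(String[] args) {
        for (int n : new int[] {9, 16, 100, 1000, 1_000_000}) {
            int next = getNextSize(n);
            // (n >> 3) is floor(n / 8), so next differs from 1.125 * n + 6
            // only by the fraction dropped by the integer shift.
            double approx = 1.125 * n + 6;
            System.out.println(n + " -> " + next + " (1.125 * n + 6 = " + approx + ")");
        }
    }
}
```

For inputs below 9 the constant is 3 rather than 6 and the shift contributes at most 1, which is why the first few sizes do not follow the same line.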

[jira] Commented: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.

2010-01-13 Thread Aaron McCurry (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799772#action_12799772 ] Aaron McCurry commented on LUCENE-2205: --- I took a look at that class, and it does lo

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Marvin Humphrey
On Wed, Jan 13, 2010 at 05:48:08AM -0500, Michael McCandless wrote: > Have you notified python-dev? No, not yet. Is it kosher with the Python license to have copied-and-pasted that comment? It's not credited from what I can see. Small, but we should probably fix that. > Right, and also to stri

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Michael McCandless
On Wed, Jan 13, 2010 at 10:04 AM, Marvin Humphrey wrote: > On Wed, Jan 13, 2010 at 05:48:08AM -0500, Michael McCandless wrote: >> Have you notified python-dev? > > No, not yet.  Is it kosher with the Python license to have copied-and-pasted > that comment?  It's not credited from what I can see.  

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Michael McCandless
On Wed, Jan 13, 2010 at 8:53 AM, DM Smith wrote: > The pattern that is stated in the comment only occurs under a very > specific situation: The growing of an array one element at a time > and reallocation only when the current is exceeded. Ahh -- that works -- thanks for clarifying. So if you st
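
The situation DM Smith describes can be simulated directly: start with capacity 0, append one element at a time, and call the growth formula only when the current capacity is exceeded. The sketch below does that with the expression quoted earlier in this thread; the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation; getNextSize() reuses the expression quoted in the thread.
final class GrowthPattern {
    static int getNextSize(int targetSize) {
        return (targetSize >> 3) + (targetSize < 9 ? 3 : 6) + targetSize;
    }

    // Returns every capacity seen while appending `appends` elements one at a
    // time, reallocating only when the needed size exceeds the capacity.
    static List<Integer> capacities(int appends) {
        List<Integer> seen = new ArrayList<>();
        int capacity = 0;
        seen.add(capacity);
        for (int size = 1; size <= appends; size++) {
            if (size > capacity) {
                capacity = getNextSize(size);
                seen.add(capacity);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Prints [0, 4, 8, 16, 25, 35, 46, 58, 72, 88]
        System.out.println(capacities(73));
    }
}
```

Feeding the formula its own previous output instead of the needed size (capacity + 1) gives a different sequence, which is consistent with the comment's pattern holding only in the one-at-a-time case.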

[jira] Created: (LUCENE-2207) CJKTokenizer generates tokens with incorrect offsets

2010-01-13 Thread Koji Sekiguchi (JIRA)
CJKTokenizer generates tokens with incorrect offsets Key: LUCENE-2207 URL: https://issues.apache.org/jira/browse/LUCENE-2207 Project: Lucene - Java Issue Type: Bug Components: co

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Michael McCandless
On Wed, Jan 13, 2010 at 10:04 AM, Marvin Humphrey wrote: > Both mild and aggressive over-allocation solve that problem, but aggressive > over-allocation also involves significant RAM costs. Where the best balance > lies depends on how bad the reallocation performance is in relation to the > cost

[jira] Commented: (LUCENE-2198) support protected words in Stemming TokenFilters

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799819#action_12799819 ] Robert Muir commented on LUCENE-2198: - hi Simon, the more i think about it, the more i

[jira] Commented: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.

2010-01-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799825#action_12799825 ] Michael McCandless commented on LUCENE-2205: I don't think anyone's run specif

[jira] Updated: (LUCENE-2207) CJKTokenizer generates tokens with incorrect offsets

2010-01-13 Thread Koji Sekiguchi (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Koji Sekiguchi updated LUCENE-2207: --- Attachment: TestCJKOffset.java Attached the program that reproduces the problem. In the prog

[jira] Commented: (LUCENE-2207) CJKTokenizer generates tokens with incorrect offsets

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799831#action_12799831 ] Robert Muir commented on LUCENE-2207: - Hi Koji, this looks like a bug in CJK offset ca

[jira] Commented: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.

2010-01-13 Thread Aaron McCurry (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799835#action_12799835 ] Aaron McCurry commented on LUCENE-2205: --- My in memory implementation uses Vints and

[jira] Commented: (LUCENE-2198) support protected words in Stemming TokenFilters

2010-01-13 Thread Erik Hatcher (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799839#action_12799839 ] Erik Hatcher commented on LUCENE-2198: -- +1 on the StemAttribute approach. I've just

[jira] Updated: (LUCENE-2207) CJKTokenizer generates tokens with incorrect offsets

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2207: Attachment: LUCENE-2207.patch ok i found the bug. the problem is incrementToken() unconditionally

[jira] Updated: (LUCENE-2207) CJKTokenizer generates tokens with incorrect offsets

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2207: Attachment: LUCENE-2207.patch i added a testcase for end() to my patch that fails on trunk, but pa

[jira] Resolved: (LUCENE-2197) StopFilter should not create a new CharArraySet if the given set is already an instance of CharArraySet

2010-01-13 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yonik Seeley resolved LUCENE-2197. -- Resolution: Fixed committed. > StopFilter should not create a new CharArraySet if the given s

[jira] Updated: (LUCENE-2207) CJKTokenizer generates tokens with incorrect offsets

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-2207: Attachment: LUCENE-2207.patch hello, this tokenizer has more serious offset/end problems than I or

[jira] Commented: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.

2010-01-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799862#action_12799862 ] Michael McCandless commented on LUCENE-2205: That's very interesting -- I like

[jira] Created: (LUCENE-2208) Token div exceeds length of provided text sized 4114

2010-01-13 Thread Ramazan VARLIKLI (JIRA)
Token div exceeds length of provided text sized 4114 Key: LUCENE-2208 URL: https://issues.apache.org/jira/browse/LUCENE-2208 Project: Lucene - Java Issue Type: Bug Components: co

[jira] Updated: (LUCENE-2198) support protected words in Stemming TokenFilters

2010-01-13 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-2198: Attachment: LUCENE-2198.patch This patch contains an initial design proposal. I tried to na

[jira] Commented: (LUCENE-2203) improved snowball testing

2010-01-13 Thread Simon Willnauer (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799868#action_12799868 ] Simon Willnauer commented on LUCENE-2203: - looks good to me. I haven't applied it

[jira] Commented: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.

2010-01-13 Thread Aaron McCurry (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799869#action_12799869 ] Aaron McCurry commented on LUCENE-2205: --- Sure I can rework things for that. Not sur

[jira] Commented: (LUCENE-2198) support protected words in Stemming TokenFilters

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799873#action_12799873 ] Robert Muir commented on LUCENE-2198: - the patch looks great to me Simon. I especially

Re: Lucene 2.9.0 Near Real Time Indexing and Service Crashes/restarts

2010-01-13 Thread jchang
I don't specifically need a cluster of servers writing indexes. Actually, at the moment, I only have one server, but multiple message consuming threads, so I still land back at the same problem of contention for the index lock. Why do I have multiple message consumers? Speed...I wanted to dequeu

Re: Lucene 2.9.0 Near Real Time Indexing and Service Crashes/restarts

2010-01-13 Thread jchang
Actually, unless IW.commit is called, all changes after the last commit will be lost (because the segment infos file will not have been written). On Tue, Jan 12, 2010 at 3:37 PM, Jason Rutherglen wrote: > Greetin's John, > > 2.9 and 3.0 don't use a RAMDir... Deletes are held in RAM however so >

[jira] Commented: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.

2010-01-13 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799906#action_12799906 ] Michael McCandless commented on LUCENE-2205: Another benefit doing this with f

Re: Lucene 2.9.0 Near Real Time Indexing and Service Crashes/restarts

2010-01-13 Thread Michael McCandless
IndexWriter should show good concurrency, ie, as you add threads you should see indexing speedup, assuming you have no external synchronization, your hardware has free concurrency and you use a large enough RAM buffer, and don't commit too frequently. But you should use a single IndexWriter, which

Re: Lucene 2.9.0 Near Real Time Indexing and Service Crashes/restarts

2010-01-13 Thread Michael McCandless
For Lucene, everything (adds & deletes) done after the last successful commit, is lost on crash/power loss/etc. Mike On Wed, Jan 13, 2010 at 2:16 PM, jchang wrote: > > > Actually, unless IW.commit is called, all changes after the last > commit will be lost (because the segment infos file will no

[jira] Commented: (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.

2010-01-13 Thread Aaron McCurry (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799923#action_12799923 ] Aaron McCurry commented on LUCENE-2205: --- Well to be honest, I spent a lot of time ma

[jira] Resolved: (LUCENE-2203) improved snowball testing

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-2203. - Resolution: Fixed Fix Version/s: 3.1 Committed revision 898950. > improved snowball test

[jira] Resolved: (LUCENE-2201) more performance improvements for snowball

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-2201. - Resolution: Fixed Fix Version/s: 3.1 Committed revision 898976. > more performance impro

[jira] Updated: (LUCENE-2193) Get rid of backwards tags

2010-01-13 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2193: -- Attachment: LUCENE-2193.patch Optimized version of the patch: - The backwards checkout is now

[jira] Updated: (LUCENE-2193) Get rid of backwards tags

2010-01-13 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-2193: -- Attachment: LUCENE-2193.patch One small optimization to have only one update if the checkout i

[jira] Resolved: (LUCENE-2193) Get rid of backwards tags

2010-01-13 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-2193. --- Resolution: Fixed Committed revision: 899001 > Get rid of backwards tags >

[jira] Assigned: (LUCENE-2206) integrate snowball stopword lists

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir reassigned LUCENE-2206: --- Assignee: Robert Muir > integrate snowball stopword lists >

[jira] Commented: (LUCENE-2206) integrate snowball stopword lists

2010-01-13 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800021#action_12800021 ] Robert Muir commented on LUCENE-2206: - I will commit this in a few days if no one obje

New trunk backwards branch edit instructions

2010-01-13 Thread Uwe Schindler
After https://issues.apache.org/jira/browse/LUCENE-2193 is committed I wanted to inform all developers that the workflow for updating the backwards branch changed a little bit and is now much simpler: http://wiki.apache.org/lucene-java/UpdatingBackCompatTests I will merge flex soon, so this al

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Marvin Humphrey
On Wed, Jan 13, 2010 at 09:43:12AM -0500, Yonik Seeley wrote: > Yeah, something highly optimized for python in C may not be optimal for Java. It looks like that algo was tuned to address poor reallocation performance under Windows 9x. http://svn.python.org/view/python/trunk/Objects/listobjec

Re: Dynamic array reallocation algorithms

2010-01-13 Thread Marvin Humphrey
On Wed, Jan 13, 2010 at 11:46:50AM -0500, Michael McCandless wrote: > If forced to pick, in general, I tend to prefer burning CPU not RAM, > because the CPU is often a one-time burn, whereas RAM ties up storage > for indefinite amounts of time. With our dependence on indexes being RAM-resident for

Build failed in Hudson: Lucene-trunk #1061

2010-01-13 Thread Apache Hudson Server
See Changes: [uschindler] LUCENE-2193: Replace the backwards tags by revision numbers. Please consult wiki for a howto about updating the backwards-branch now! [rmuir] LUCENE-2201: use char[] for snowball, don't create interm

RE: Build failed in Hudson: Lucene-trunk #1061

2010-01-13 Thread Uwe Schindler
As Robert and I suspected, SVN on lucene.zones is a hundred years old. Mike: Is there any newer version maybe in $PATH? --depth is needed for sparse checkouts and available since svn 1.5: http://subversion.tigris.org/svn_1.5_releasenotes.html#sparse-checkouts What should we do? Instead checkout