Re: FST and FieldCache?

2011-05-19 Thread Earwin Burrfoot
You cannot get a string out of an automaton by its ordinal without storing additional data. The string is stored there not as a single arc, but as a sequence of them (basically.. err.. as a string), so referencing them is basically writing the string as is. Space savings here come from sharing arcs
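
To illustrate the arc-sharing point in this message, here is a minimal sketch using a plain prefix trie rather than Lucene's actual FST. The class and method names (ArcSharingSketch, add, countArcs) are hypothetical, and a real FST also shares suffixes, which this sketch does not attempt. It only shows that each string is a path of arcs and that common prefixes are stored once.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not Lucene's FST implementation.
final class ArcSharingSketch {
    /** One trie node; each map entry is an outgoing arc labelled with a char. */
    static final class Node {
        final Map<Character, Node> arcs = new HashMap<>();
    }

    /** Adds a word as a sequence of arcs, reusing arcs that already exist. */
    static void add(Node root, String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.arcs.computeIfAbsent(c, k -> new Node());
        }
    }

    /** Counts every arc reachable from the given node. */
    static int countArcs(Node node) {
        int total = node.arcs.size();
        for (Node child : node.arcs.values()) {
            total += countArcs(child);
        }
        return total;
    }

    public static void main(String[] args) {
        Node root = new Node();
        add(root, "lucene");
        add(root, "lucid");
        // The two words hold 11 characters, but the shared "luc" prefix is stored once.
        System.out.println(countArcs(root)); // prints 8
    }
}
```

Nothing in this structure maps an ordinal back to a string: recovering the n-th string would need extra data, which is the limitation the message describes.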

Re: FST and FieldCache?

2011-05-19 Thread Earwin Burrfoot
I think, if we add ord as an output to the FST, then it builds everything we need?  Ie no further data structures should be needed? Maybe I'm confused :) If you put the ord as an output the common part will be shifted towards the front of the tree. This will work if you want to look up a

Re: FST and FieldCache?

2011-05-19 Thread Earwin Burrfoot
On Thu, May 19, 2011 at 16:45, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: That's what I invented, and yes, it was invented by countless people before :) You know I didn't mean to sound rude, right? I'm really admiring your ability to come up with these solutions by yourself, I'm merely

Re: FST and FieldCache?

2011-05-19 Thread Earwin Burrfoot
This is more about compressing strings in TermsIndex, I think. And ability to use said TermsIndex directly in some cases that required FieldCache before. (Maybe FC is still needed, but it can be degraded to docId-ord map, storing actual strings in TI). This yields fat space savings when we, eg,

Re: FST and FieldCache?

2011-05-19 Thread Earwin Burrfoot
On Thu, May 19, 2011 at 20:43, Michael McCandless luc...@mikemccandless.com wrote: On Thu, May 19, 2011 at 12:35 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: And I do agree there are times when mmap is appropriate, eg if query latency is unimportant to you, but it's not a panacea and

Re: Moving towards Lucene 4.0

2011-05-19 Thread Earwin Burrfoot
On Thu, May 19, 2011 at 21:44, Chris Hostetter hossman_luc...@fucit.org wrote: : I think we should focus on everything that's *infrastructure* in 4.0, so : that we can develop additional features in subsequent 4.x releases. If we : end up releasing 4.0 just to discover many things will need to

Re: Fuzzy search always returning docs sorted by the highest match

2011-05-18 Thread Earwin Burrfoot
You aren't likely to encounter strings like "abc company inc" in a Lucene index, as it will be tokenized into three tokens ("abc", "company", "inc") under most Analyzers. So, for this exact example you don't even need fuzzy matching. Also, maybe you should try the 'user' mailing list for questions regarding the
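
A rough sketch of the tokenization point, assuming a simple lowercase-and-split-on-whitespace step as a stand-in for a Lucene Analyzer. It uses no Lucene classes, and the names TokenizeSketch and tokenize are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for an Analyzer; real Analyzers do considerably more.
final class TokenizeSketch {
    /** Lowercases the text and splits it on runs of whitespace. */
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        // The phrase is indexed as three separate terms, not as one string.
        System.out.println(tokenize("abc company inc")); // prints [abc, company, inc]
    }
}
```

Because the index holds the three terms separately, a query for any one of them can match the document without fuzzy matching.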

Re: Lucene/Solr JIRA

2011-05-18 Thread Earwin Burrfoot
+1 to Chris. Even if the code is partially shared and project is the same, the end products are completely different. Merging lists/jira will force niche developers/users to manually sift through heaps of irrelevant emails/issues. On Thu, May 19, 2011 at 00:53, Chris Hostetter

Re: Fuzzy search always returning docs sorted by the highest match

2011-05-18 Thread Earwin Burrfoot
, 2011 at 6:32 PM, Earwin Burrfoot ear...@gmail.com wrote: You aren't likely to encounter strings like "abc company inc" in a Lucene index, as it will be tokenized into three tokens ("abc", "company", "inc") under most Analyzers. So, for this exact example you don't even need fuzzy matching. Also, maybe

[jira] [Commented] (LUCENE-3105) String.intern() calls slow down IndexWriter.close() and IndexReader.open() for index with large number of unique field names

2011-05-17 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13034639#comment-13034639 ] Earwin Burrfoot commented on LUCENE-3105: - StringInterner is in fact faster than

[jira] [Commented] (LUCENE-3105) String.intern() calls slow down IndexWriter.close() and IndexReader.open() for index with large number of unique field names

2011-05-17 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13034640#comment-13034640 ] Earwin Burrfoot commented on LUCENE-3105: - Hmm.. Ok, it *is* still used

[jira] [Commented] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir

2011-05-13 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032936#comment-13032936 ] Earwin Burrfoot commented on LUCENE-3092: - Chris, I don't like the idea

[jira] [Commented] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir

2011-05-13 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032989#comment-13032989 ] Earwin Burrfoot commented on LUCENE-3092: - bq. but I couldn't disagree more

[jira] [Commented] (LUCENE-3092) NRTCachingDirectory, to buffer small segments in a RAMDir

2011-05-13 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032997#comment-13032997 ] Earwin Burrfoot commented on LUCENE-3092: - bq. The IOCtx should reference

[jira] [Commented] (LUCENE-3084) MergePolicy.OneMerge.segments should be List<SegmentInfo> not SegmentInfos

2011-05-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032046#comment-13032046 ] Earwin Burrfoot commented on LUCENE-3084: - * Speaking logically, merges operate

[jira] [Commented] (LUCENE-3084) MergePolicy.OneMerge.segments should be List<SegmentInfo> not SegmentInfos

2011-05-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13032099#comment-13032099 ] Earwin Burrfoot commented on LUCENE-3084: - bq. Merges are ordered Hmm.. Why

[jira] [Commented] (LUCENE-3077) DWPT doesn't see changes to DW#infoStream

2011-05-06 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13029881#comment-13029881 ] Earwin Burrfoot commented on LUCENE-3077: - We should just make it final

Re: I was accepted in GSoC!!!

2011-05-05 Thread Earwin Burrfoot
By the way, guys. LuSolr SVN repository is mirrored @ git://git.apache.org/lucene-solr.git , which is in turn mirrored @ https://github.com/apache/lucene-solr . Working with git (maybe with stgit) is easier than juggling patches by hand. On Wed, May 4, 2011 at 15:00, David Nemeskey

[jira] [Commented] (LUCENE-2904) non-contiguous LogMergePolicy should be careful to not select merges already running

2011-05-05 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13029403#comment-13029403 ] Earwin Burrfoot commented on LUCENE-2904: - I think we should simply change

[jira] [Commented] (LUCENE-2904) non-contiguous LogMergePolicy should be careful to not select merges already running

2011-05-05 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13029408#comment-13029408 ] Earwin Burrfoot commented on LUCENE-2904: - Ok, I'm wrong. We need both a list

[jira] [Commented] (LUCENE-3065) NumericField should be stored in binary format in index (matching Solr's format)

2011-05-05 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13029421#comment-13029421 ] Earwin Burrfoot commented on LUCENE-3065: - It's sad NumericFields are hardbaked

[jira] [Commented] (LUCENE-3041) Support Query Visting / Walking

2011-05-02 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027612#comment-13027612 ] Earwin Burrfoot commented on LUCENE-3041: - The static cache is now not threadsafe

[jira] [Issue Comment Edited] (LUCENE-3041) Support Query Visting / Walking

2011-05-02 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027612#comment-13027612 ] Earwin Burrfoot edited comment on LUCENE-3041 at 5/2/11 10:30 AM

[jira] [Commented] (LUCENE-3061) Open IndexWriter API to allow custom MergeScheduler implementation

2011-05-02 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027626#comment-13027626 ] Earwin Burrfoot commented on LUCENE-3061: - Mark these as @experimental? Open

Re: MergePolicy Thresholds

2011-05-02 Thread Earwin Burrfoot
Have you checked BalancedSegmentMergePolicy? It has some more knobs :) On Mon, May 2, 2011 at 17:03, Shai Erera ser...@gmail.com wrote: Hi Today, LogMP allows you to set different thresholds for segments sizes, thereby allowing you to control the largest segment that will be considered for

Re: MergePolicy Thresholds

2011-05-02 Thread Earwin Burrfoot
only need two thresholds (size + mergeFactor), and we can reuse BalancedMP's findBalancedMerges logic (perhaps w/ some adaptations) to derive a merge plan. Shai On Mon, May 2, 2011 at 4:42 PM, Earwin Burrfoot ear...@gmail.com wrote: Have you checked BalancedSegmentMergePolicy? It has some

Re: MergePolicy Thresholds

2011-05-02 Thread Earwin Burrfoot
the same set of knobs can be intuitive and meaningful for one person, and useless for another. And you can't pick the best one. Will BalancedMP stop merging such segments (if all segments are of that order of magnitude)? Shai On Mon, May 2, 2011 at 5:23 PM, Earwin Burrfoot ear...@gmail.com wrote

Re: Setting the max number of merge threads across IndexWriters

2011-05-01 Thread Earwin Burrfoot
will be required. Then, instead of trying to factor out IW members from this MS, you could share the same ES with all MS instances, each will keep a reference to a different IW member. This is just a thought though, I haven't tried it. Shai On Thu, Apr 14, 2011 at 8:23 PM, Earwin Burrfoot ear

Re: Setting the max number of merge threads across IndexWriters

2011-05-01 Thread Earwin Burrfoot
instances, each will keep a reference to a different IW member. This is just a thought though, I haven't tried it. Shai On Thu, Apr 14, 2011 at 8:23 PM, Earwin Burrfoot ear...@gmail.com wrote: Can't remember. Probably no. I started an experimental MS api rewrite (incorporating ability to share

[jira] [Commented] (LUCENE-3055) LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers

2011-04-30 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027361#comment-13027361 ] Earwin Burrfoot commented on LUCENE-3055: - Could anyone remind me, why the hell

[jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch

2011-04-15 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13020217#comment-13020217 ] Earwin Burrfoot commented on LUCENE-2571: - bq. Merges are NOT blocking indexing

Re: Setting the max number of merge threads across IndexWriters

2011-04-14 Thread Earwin Burrfoot
I proposed to decouple MergeScheduler from IW (stop keeping a reference to it). Then you can create a single CMS and pass it to all your IWs. On Thu, Apr 14, 2011 at 19:40, Jason Rutherglen jason.rutherg...@gmail.com wrote: I think the proposal involved using a ThreadPoolExecutor, which seemed

Re: Setting the max number of merge threads across IndexWriters

2011-04-14 Thread Earwin Burrfoot
Can't remember. Probably no. I started an experimental MS api rewrite (incorporating ability to share MSs between IWs) some time ago, but never had the time to finish it. On Thu, Apr 14, 2011 at 19:56, Simon Willnauer simon.willna...@googlemail.com wrote: On Thu, Apr 14, 2011 at 5:52 PM, Earwin

Re: Numerical ids for terms?

2011-04-12 Thread Earwin Burrfoot
On Tue, Apr 12, 2011 at 13:41, Gregor Heinrich gre...@arbylon.net wrote: Hi -- has there been any effort to create a numerical representation of Lucene indices. That is, to use the Lucene Directory backend as a large term-document matrix at index level. As this would require bijective mapping

An IDF variation with penalty for very rare terms

2011-04-12 Thread Earwin Burrfoot
Excuse me for being somewhat off-topic, but has anybody ever seen/used -subj- ? Something that looks like http://dl.dropbox.com/u/920413/IDFplusplus.png Traditional log(N/x) tail, but when nearing zero freq, instead of going to +inf you do a nice round bump (with controlled
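
One possible shape for such a curve, as a sketch only: the message is cut off before it gives a formula, so the damping factor x / (x + k) and the parameter k below are assumptions made for illustration, not the author's definition. Multiplying log(N/x) by that factor leaves the usual tail almost unchanged for frequent terms and pulls the weight back toward zero as x approaches zero.

```java
// Hypothetical illustration; the damping term is an assumption, not the formula from the message.
final class IdfBumpSketch {
    /** Traditional IDF: grows without bound as docFreq approaches zero. */
    static double plainIdf(long numDocs, long docFreq) {
        return Math.log((double) numDocs / docFreq);
    }

    /** IDF scaled by docFreq / (docFreq + k), so very rare terms are penalised. */
    static double dampedIdf(long numDocs, long docFreq, double k) {
        if (docFreq <= 0) {
            return 0.0; // a term that occurs nowhere gets no weight
        }
        double damping = docFreq / (docFreq + k);
        return plainIdf(numDocs, docFreq) * damping;
    }

    public static void main(String[] args) {
        long numDocs = 1_000_000L;
        double k = 5.0; // assumed tuning knob: larger k moves the peak toward higher frequencies
        for (long docFreq : new long[] {1, 5, 50, 5_000, 500_000}) {
            System.out.printf("x=%d plain=%.3f damped=%.3f%n",
                    docFreq, plainIdf(numDocs, docFreq), dampedIdf(numDocs, docFreq, k));
        }
    }
}
```

With these assumed values a term seen in one document scores lower than a term seen in fifty, while the plain formula ranks them the other way round.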

Re: character escapes in source? ... was: Re: Eclipse: Invalid character constant

2011-04-08 Thread Earwin Burrfoot
On Fri, Apr 8, 2011 at 03:01, Robert Muir rcm...@gmail.com wrote: On Thu, Apr 7, 2011 at 6:48 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : -1. These files should be readable, for maintaining, debugging and : knowing whats going on. Readability is my main concern ... i don't know

Re: [POLL] JTS compile/test dependency

2011-04-06 Thread Earwin Burrfoot
On Wed, Apr 6, 2011 at 22:43, Robert Muir rcm...@gmail.com wrote: On Wed, Apr 6, 2011 at 2:12 PM, Ryan McKinley ryan...@gmail.com wrote: Some may be following the thread on spatial development...  here is a quick summary, and a poll to help decide what may be the best next move. I'm hoping

Re: [POLL] JTS compile/test dependency

2011-04-06 Thread Earwin Burrfoot
On Thu, Apr 7, 2011 at 01:11, Robert Muir rcm...@gmail.com wrote: On Wed, Apr 6, 2011 at 5:07 PM, Earwin Burrfoot ear...@gmail.com wrote: Handling Unicode code points outside of BMP is highly expert stuff as well. And is totally unneeded by 80% of the users for any other reason except

[jira] [Commented] (LUCENE-2981) Review and potentially remove unused/unsupported Contribs

2011-03-31 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014108#comment-13014108 ] Earwin Burrfoot commented on LUCENE-2981: - Bye-bye, DB. Few things can compete

Re: Urgent! Forgot to close IndexWriter after adding Documents to the index.

2011-03-22 Thread Earwin Burrfoot
On Tue, Mar 22, 2011 at 06:21, Chris Hostetter hossman_luc...@fucit.org wrote: (replying to the dev list, see context below) : Unfortunately, you can't easily recover from this (except by : reindexing your docs again). : : Failing to call IW.commit() or IW.close() means no segments file was

Re: IndexReader.indexExists declares throwing IOE, but never does

2011-03-21 Thread Earwin Burrfoot
Technically, there's a big difference between "I checked, and there was no index" and "I was unable to check the disk because the file system went BANG!". So the proper behaviour is to return false or throw IOE (on the proper occasion)? On Mon, Mar 21, 2011 at 13:53, Michael McCandless luc...@mikemccandless.com
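
The distinction drawn in this message can be sketched with java.nio.file alone. This is not the Lucene IndexReader.indexExists implementation: the class name IndexExistsSketch and the "segments*" file-name check are assumptions made for illustration. The method returns false only when it could actually look and found nothing, and lets an IOException escape when the check itself failed.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not Lucene's implementation.
final class IndexExistsSketch {
    /**
     * Returns true if the directory holds a file whose name starts with "segments",
     * false if the check succeeded and found none, and throws IOException if the
     * directory could not be read at all.
     */
    static boolean indexExists(Path dir) throws IOException {
        if (Files.notExists(dir)) {
            return false; // checked, and there is definitely nothing there
        }
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "segments*")) {
            return files.iterator().hasNext();
        } catch (DirectoryIteratorException e) {
            throw e.getCause(); // iteration failure: report it instead of answering "false"
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index-sketch");
        System.out.println(indexExists(dir)); // empty directory
        Files.createFile(dir.resolve("segments_1"));
        System.out.println(indexExists(dir)); // now contains a segments file
    }
}
```

A caller can then treat false as "no index here" and handle the exception as "could not tell", which is the difference the message argues the throws declaration should preserve.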

Re: IndexReader.indexExists declares throwing IOE, but never does

2011-03-21 Thread Earwin Burrfoot
if this changes implementation. Removing the throws declaration doesn't break apps. In the worst case, they'll have a catch block which is redundant? Shai On Mon, Mar 21, 2011 at 4:12 PM, Sanne Grinovero sanne.grinov...@gmail.com wrote: 2011/3/21 Earwin Burrfoot ear...@gmail.com: Technically

[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter

2011-03-15 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13007048#comment-13007048 ] Earwin Burrfoot commented on LUCENE-2960: - bq. Oh yeah. But then we'd clone

[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter

2011-03-15 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13007136#comment-13007136 ] Earwin Burrfoot commented on LUCENE-2960: - You avoid deprecation/undeprecation

[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter

2011-03-14 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13006759#comment-13006759 ] Earwin Burrfoot commented on LUCENE-2960: - bq. infoStream is a PrintStream, which

Re: GPU acceleration

2011-03-13 Thread Earwin Burrfoot
On Sun, Mar 13, 2011 at 00:15, Ken O'Brien k...@kenobrien.org wrote: To clarify, I've not yet written any code. I aim to bring a large speedup to any functionality that is computationally expensive. I'm wondering which components are candidates for this. I'll be looking through the code but

[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter

2011-03-13 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13006227#comment-13006227 ] Earwin Burrfoot commented on LUCENE-2960: - {quote} Why such purity? What do we

Re: IndexWriter#setRAMBufferSizeMB removed in trunk

2011-03-11 Thread Earwin Burrfoot
Is it really that hard to recreate IndexWriter if you have to change the settings?? Yeah, yeah, you lose all your precious reused buffers, and maybe there's a small indexing latency spike, when switching from old IW to new one, but people aren't changing their IW configs several times a second?

[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter

2011-03-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13005617#comment-13005617 ] Earwin Burrfoot commented on LUCENE-2960: - As I said on the list - if one needs

Re: IndexWriter#setRAMBufferSizeMB removed in trunk

2011-03-11 Thread Earwin Burrfoot
this issue should block 3.1? We can anyway add other runtime settings following 3.1, and we won't undeprecate anything. So maybe mark that issue as a non-blocker? Shai On Fri, Mar 11, 2011 at 2:20 PM, Earwin Burrfoot ear...@gmail.com wrote: Is it really that hard to recreate IndexWriter if you

[jira] Commented: (LUCENE-2960) Allow (or bring back) the ability to setRAMBufferSizeMB on an open IndexWriter

2011-03-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13005891#comment-13005891 ] Earwin Burrfoot commented on LUCENE-2960: - bq. Furthermore, closing the IW also

[jira] Commented: (LUCENE-2908) clean up serialization in the codebase

2011-02-15 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12994769#comment-12994769 ] Earwin Burrfoot commented on LUCENE-2908: - Oh, damn :) On my project, we

Re: [REINDEX] Note: re-indexing required !

2011-02-07 Thread Earwin Burrfoot
Lucene maintains compatibility with earlier stable release index versions, and to some extent transparently upgrades them. But there is no guaranteed compatibility between different in-development indexes. E.g. 3.2 reads 3.1 indexes and upgrades them, but 3.2-dev-snapshot-10 (while happily

[jira] Commented: (LUCENE-2871) Use FileChannel in FSDirectory

2011-01-20 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984222#action_12984222 ] Earwin Burrfoot commented on LUCENE-2871: - Before arguing where to put this new

Re: Let's drop Maven Artifacts !

2011-01-18 Thread Earwin Burrfoot
Somehow, they were made available since 2.0 - http://repo2.maven.org/maven2/org/apache/lucene/lucene-core/ The POMs are minimal, sans dependencies, so e.g. if your project depends on lucene-spellchecker, lucene-core won't be transitively included and your build is gonna fail (you therefore had to

[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983160#action_12983160 ] Earwin Burrfoot commented on LUCENE-2657: - bq. we need to be very clear

[jira] Commented: (LUCENE-2657) Replace Maven POM templates with full POMs, and change documentation accordingly

2011-01-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983162#action_12983162 ] Earwin Burrfoot commented on LUCENE-2657: - Thanks, but I'm not the one confused

Re: Let's drop Maven Artifacts !

2011-01-18 Thread Earwin Burrfoot
On Tue, Jan 18, 2011 at 17:00, Robert Muir rcm...@gmail.com wrote: On Tue, Jan 18, 2011 at 8:54 AM, Grant Ingersoll gsing...@apache.org wrote: It seems to me that if we have a fix for the things that ail our Maven support (Steve's work), that it isn't then the reason for holding up a release

Re: Let's drop Maven Artifacts !

2011-01-18 Thread Earwin Burrfoot
On Tue, Jan 18, 2011 at 20:13, Robert Muir rcm...@gmail.com wrote: Unfortunately there is a very loud minority that care about maven I would wager that there is a sizable silent *majority* of users who literally depend on Lucene's Maven artifacts. I can't help but remind myself, this is the

[jira] Commented: (LUCENE-2755) Some improvements to CMS

2011-01-17 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982564#action_12982564 ] Earwin Burrfoot commented on LUCENE-2755: - bq. if you still want to work

Re: Let's drop Maven Artifacts !

2011-01-17 Thread Earwin Burrfoot
You're not alone. :) But, I bet, many more people would like to skip that step and have their artifacts downloaded from central. On Mon, Jan 17, 2011 at 19:06, Steven A Rowe sar...@syr.edu wrote: On 1/17/2011 at 1:53 AM, Michael Busch wrote: I don't think any user needs the ability to run an

Re: Let's drop Maven Artifacts !

2011-01-16 Thread Earwin Burrfoot
Maven is a de facto package/dependency manager for Java. Like it or not. All better tools out there, like Ant+Ivy, or SBT - support Maven repositories. Lots of people rely on Maven or better tools for their builds and as soon as you're on declarative dependency management train, it's a bother to

[jira] Commented: (LUCENE-2374) Add introspection API to AttributeSource/AttributeImpl

2011-01-16 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982437#action_12982437 ] Earwin Burrfoot commented on LUCENE-2374: - Nice. Except maybe introduce a simple

[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders

2011-01-15 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982126#action_12982126 ] Earwin Burrfoot commented on LUCENE-2858: - bq. Any comments about removing write

[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders

2011-01-15 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982132#action_12982132 ] Earwin Burrfoot commented on LUCENE-2858: - bq. Still, i think we would need

[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders

2011-01-15 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982166#action_12982166 ] Earwin Burrfoot commented on LUCENE-2858: - APIs have to be there still. All

[jira] Commented: (LUCENE-2868) It should be easy to make use of TermState; rewritten queries should be shared automatically

2011-01-14 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12981774#action_12981774 ] Earwin Burrfoot commented on LUCENE-2868: - We here use an intermediate query AST

[jira] Commented: (LUCENE-2324) Per thread DocumentsWriters that write their own private segments

2011-01-13 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12981388#action_12981388 ] Earwin Burrfoot commented on LUCENE-2324: - Maan, this comment list is infinite

[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980649#action_12980649 ] Earwin Burrfoot commented on LUCENE-2793: - What's with ongoing crazyness? :) bq

[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980732#action_12980732 ] Earwin Burrfoot commented on LUCENE-2793: - bq. Because in your example code above

[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-12 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980736#action_12980736 ] Earwin Burrfoot commented on LUCENE-2793: - {quote} As I said before though, i

[jira] Commented: (LUCENE-2858) Separate SegmentReaders (and other atomic readers) from composite IndexReaders

2011-01-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980388#action_12980388 ] Earwin Burrfoot commented on LUCENE-2858: - bq. On the other side, atomic readers

[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges

2011-01-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980390#action_12980390 ] Earwin Burrfoot commented on LUCENE-2856: - A CompositeSegmentListener niftily

[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980400#action_12980400 ] Earwin Burrfoot commented on LUCENE-2793: - Looks crazy. In a -bad- tangled way

[jira] Commented: (LUCENE-2856) Create IndexWriter event listener, specifically for merges

2011-01-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980448#action_12980448 ] Earwin Burrfoot commented on LUCENE-2856: - A SegmentListener that has a number

[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980454#action_12980454 ] Earwin Burrfoot commented on LUCENE-2793: - {quote} bq. You get IOFactory from

[jira] Commented: (LUCENE-2793) Directory createOutput and openInput should take an IOContext

2011-01-11 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12980458#action_12980458 ] Earwin Burrfoot commented on LUCENE-2793: - In fact, I suggest dropping bufferSize

[jira] Commented: (LUCENE-2312) Search on IndexWriter's RAM Buffer

2011-01-10 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979522#action_12979522 ] Earwin Burrfoot commented on LUCENE-2312: - Some questions to align myself

[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)

2011-01-10 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979888#action_12979888 ] Earwin Burrfoot commented on LUCENE-2474: - bq. Earwin's working on improving

[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2011-01-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979276#action_12979276 ] Earwin Burrfoot commented on LUCENE-2840: - bq. But doesn't that mean that an app w

[jira] Commented: (LUCENE-2843) Add variable-gap terms index impl.

2011-01-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979277#action_12979277 ] Earwin Burrfoot commented on LUCENE-2843: - And we're nearing a day when we keep

[jira] Commented: (LUCENE-2843) Add variable-gap terms index impl.

2011-01-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979305#action_12979305 ] Earwin Burrfoot commented on LUCENE-2843: - As I said, there's already a search

[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2011-01-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979306#action_12979306 ] Earwin Burrfoot commented on LUCENE-2840: - A lot of fork-join type frameworks

[jira] Commented: (LUCENE-2843) Add variable-gap terms index impl.

2011-01-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979346#action_12979346 ] Earwin Burrfoot commented on LUCENE-2843: - bq. I don't like the reasoning

[jira] Commented: (LUCENE-2843) Add variable-gap terms index impl.

2011-01-09 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979366#action_12979366 ] Earwin Burrfoot commented on LUCENE-2843: - bq. Nope, havent looked at their code

Re: [jira] Commented: (SOLR-2218) Performance of start= and rows= parameters are exponentially slow with large data sets

2011-01-08 Thread Earwin Burrfoot
On Mon, Jan 3, 2011 at 18:18, Yonik Seeley yo...@lucidimagination.com wrote: On Thu, Nov 11, 2010 at 3:22 PM, Jan Høydahl / Cominvent jan@cominvent.com wrote: The problem with large start is probably worse when sharding is involved. Anyone know how the shard component goes about fetching

[jira] Commented: (LUCENE-2840) Multi-Threading in IndexSearcher (after removal of MultiSearcher and ParallelMultiSearcher)

2010-12-30 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12976027#action_12976027 ] Earwin Burrfoot commented on LUCENE-2840: - I use the following scheme

Re: strange problem of PForDelta decoder

2010-12-30 Thread Earwin Burrfoot
until we fix Lucene to run a single search concurrently (which we badly need to do). I am interested in this idea. (I have posted about it before.) Do you have some resources such as papers or tech articles about it? I have tried, but it needs to modify the index format dramatically and we use solr

Re: is the classes ended with PerThread(*PerThread) multithread

2010-12-28 Thread Earwin Burrfoot
There is a single indexchain, with a single instance of each chain component, except those ending in -PerThread. Though that's gonna change with https://issues.apache.org/jira/browse/LUCENE-2324 On Tue, Dec 28, 2010 at 13:10, Simon Willnauer simon.willna...@googlemail.com wrote: On Tue, Dec 28,

[jira] Commented: (LUCENE-2829) improve termquery pk lookup performance

2010-12-22 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12974274#action_12974274 ] Earwin Burrfoot commented on LUCENE-2829: - Term lookup misses can be alleviated

[jira] Commented: (LUCENE-2829) improve termquery pk lookup performance

2010-12-22 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12974350#action_12974350 ] Earwin Burrfoot commented on LUCENE-2829: - Nobody halts your progress, we're

Re: RT branch status

2010-12-22 Thread Earwin Burrfoot
Cool! I'm getting to this on a weekend. On Tue, Dec 21, 2010 at 11:44, Michael Busch busch...@gmail.com wrote: After merging trunk into the RT branch it's finally compiling again and up-to-date. Several tests are failing now after the merge (43 out of 1427 are failing), which is not too

Re: Do we want 'nocommit' to fail the commit?

2010-12-18 Thread Earwin Burrfoot
But. Er. What if we happen to have nocommit in a string, or in some docs, or as a name of variable? On Sat, Dec 18, 2010 at 12:47, Michael McCandless luc...@mikemccandless.com wrote: +1 this would be great :) Mike On Fri, Dec 17, 2010 at 10:45 PM, Shai Erera ser...@gmail.com wrote: Hi Out

[jira] Commented: (LUCENE-2818) abort() method for IndexOutput

2010-12-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972764#action_12972764 ] Earwin Burrfoot commented on LUCENE-2818: - bq. Can abort() have a default impl

[jira] Commented: (LUCENE-2818) abort() method for IndexOutput

2010-12-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972765#action_12972765 ] Earwin Burrfoot commented on LUCENE-2818: - bq. I think we can make a default impl

[jira] Updated: (LUCENE-2818) abort() method for IndexOutput

2010-12-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Earwin Burrfoot updated LUCENE-2818: Priority: Minor (was: Major) This change is really minor, but I think, convenient. You

[jira] Updated: (LUCENE-2814) stop writing shared doc stores across segments

2010-12-18 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Earwin Burrfoot updated LUCENE-2814: Attachment: LUCENE-2814.patch Synced to trunk. bq. Also, on the nocommit on exc

[jira] Created: (LUCENE-2818) abort() method for IndexOutput

2010-12-17 Thread Earwin Burrfoot (JIRA)
abort() method for IndexOutput -- Key: LUCENE-2818 URL: https://issues.apache.org/jira/browse/LUCENE-2818 Project: Lucene - Java Issue Type: Improvement Reporter: Earwin Burrfoot I'd like to see

[jira] Updated: (LUCENE-2814) stop writing shared doc stores across segments

2010-12-17 Thread Earwin Burrfoot (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-2814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Earwin Burrfoot updated LUCENE-2814: Attachment: LUCENE-2814.patch New patch. Now with even more lines removed! DocStore

Re: LogMergePolicy.setUseCompoundFile/DocStore

2010-12-16 Thread Earwin Burrfoot
Incoming LUCENE-2814 drops setUseCompoundDocStore() On Thu, Dec 16, 2010 at 12:04, Shai Erera ser...@gmail.com wrote: Hi I find it very annoying that I need to set true/false on these methods whenever I want to control compound files creation. Is it really necessary to allow writing doc
