[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699558#action_12699558 ] Uwe Schindler commented on LUCENE-1536: --- How about DocIdSet adds a {code} boolean is

[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699571#action_12699571 ] Uwe Schindler commented on LUCENE-1536: --- The empty docidset instance should *not* be

[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699573#action_12699573 ] Uwe Schindler commented on LUCENE-1536: --- And the switch for different densities: Ope

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1606: Attachment: automaton.patch patch > Automaton Query/Filter (scalable regex) > ---

[jira] Created: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Robert Muir (JIRA)
Automaton Query/Filter (scalable regex) --- Key: LUCENE-1606 URL: https://issues.apache.org/jira/browse/LUCENE-1606 Project: Lucene - Java Issue Type: New Feature Components: contrib/*

[jira] Issue Comment Edited: (LUCENE-1536) if a filter can support random access API, we should use it

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699571#action_12699571 ] Uwe Schindler edited comment on LUCENE-1536 at 4/16/09 2:27 AM:

Re: Filtering documents out of IndexReader

2009-04-16 Thread Michael McCandless
On Tue, Apr 14, 2009 at 9:25 PM, Jeremy Volkman wrote: > Implementing this way allows me to write RAM indexes out to disk without > blocking readers, and only block readers when I need to remap any filtered > docs that may have been updated or deleted during the flushing process. I > think this m

[jira] Commented: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699604#action_12699604 ] Michael McCandless commented on LUCENE-1591: All tests pass! And patch looks

[jira] Commented: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-16 Thread Shai Erera (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699620#action_12699620 ] Shai Erera commented on LUCENE-1591: Mike, did you commit the commons-compress jar too

[jira] Commented: (LUCENE-1604) Stop creating huge arrays to represent the absense of field norms

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699607#action_12699607 ] Michael McCandless commented on LUCENE-1604: OK, patch looks good. All tests

[jira] Resolved: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1591. Resolution: Fixed > Enable bzip compression in benchmark > ---

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1606: Attachment: automatonWithWildCard.patch Here is an updated patch with AutomatonWildCardQuery. Thi

[jira] Commented: (LUCENE-1591) Enable bzip compression in benchmark

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699643#action_12699643 ] Michael McCandless commented on LUCENE-1591: bq. Mike, did you commit the comm

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699644#action_12699644 ] Mark Miller commented on LUCENE-831: Right, you really want to use CacheByReaderValueSo

[jira] Commented: (LUCENE-1593) Optimizations to TopScoreDocCollector and TopFieldCollector

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699656#action_12699656 ] Michael McCandless commented on LUCENE-1593: bq. if so, can we agree on the ne

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699657#action_12699657 ] Robert Muir commented on LUCENE-1606: - Mark, yeah, the enumeration helps a lot, it mean

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699659#action_12699659 ] Michael McCandless commented on LUCENE-1606: Can this do everything that Regex

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1606: --- Fix Version/s: 2.9 > Automaton Query/Filter (scalable regex) > -

[jira] Commented: (LUCENE-1603) Changes for TrieRange in FilteredTermEnum and MultiTermQuery improvement

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699660#action_12699660 ] Michael McCandless commented on LUCENE-1603: I think the name is good, so it's

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699649#action_12699649 ] Uwe Schindler commented on LUCENE-831: -- This was the idea behind the "FieldType": You r

[jira] Commented: (LUCENE-1603) Changes for TrieRange in FilteredTermEnum and MultiTermQuery improvement

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699647#action_12699647 ] Michael McCandless commented on LUCENE-1603: Patch looks good -- I'll commit s

[jira] Commented: (LUCENE-1603) Changes for TrieRange in FilteredTermEnum and MultiTermQuery improvement

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699654#action_12699654 ] Uwe Schindler commented on LUCENE-1603: --- Do you think the name is good? MultiTermQue

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699663#action_12699663 ] Mark Miller commented on LUCENE-831: That's somewhat possible now (with the exception t

[jira] Updated: (LUCENE-1602) Rewrite TrieRange to use MultiTermQuery

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1602: -- Attachment: LUCENE-1602.patch This is the final patch, with the changes for LUCENE-1603. I als

[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699669#action_12699669 ] Michael McCandless commented on LUCENE-1536: I like this approach! But should

[jira] Updated: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-1606: Attachment: automatonWithWildCard2.patch oops I did say in javadocs score is constant / boost only

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699662#action_12699662 ] Robert Muir commented on LUCENE-1606: - Mike, the thing it can't do is stuff that cannot

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699650#action_12699650 ] Mark Miller commented on LUCENE-1606: - Very nice Robert. This looks like it would make

[jira] Resolved: (LUCENE-1603) Changes for TrieRange in FilteredTermEnum and MultiTermQuery improvement

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1603. Resolution: Fixed > Changes for TrieRange in FilteredTermEnum and MultiTermQuery i

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699672#action_12699672 ] Uwe Schindler commented on LUCENE-1606: --- I looked into the patch, looks good. Maybe

TermEnum.skipTo()

2009-04-16 Thread Robert Muir
While I was mucking with term enumeration I found that TermEnum.skipTo() has a very simple implementation and has in javadocs that 'some implementations are considerably more efficient', yet SegmentTermEnum definitely doesn't reimplement it in a more efficient way. For my purposes to skip around i

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699673#action_12699673 ] Robert Muir commented on LUCENE-1606: - Uwe, I agree with you, with one caveat: for thi

[jira] Commented: (LUCENE-1604) Stop creating huge arrays to represent the absense of field norms

2009-04-16 Thread Shon Vella (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699674#action_12699674 ] Shon Vella commented on LUCENE-1604: Working on an update to the patch - MultiSegmentR

Re: TermEnum.skipTo()

2009-04-16 Thread Mark Miller
Robert Muir wrote: while I was mucking with term enumeration I found that TermEnum.skipTo() has a very simple implementation and has in javadocs that 'some implementations are considerably more efficient', yet SegmentTermEnum definitely doesn't reimplement it in a more efficient way. For my pu

[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699675#action_12699675 ] Uwe Schindler commented on LUCENE-1536: --- I coupled the density check inside the Open

Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Michael McCandless
Hi George, There's been a sudden burst of activity lately on 2.9 development... I know there are some biggish remaining features we may want to get into 2.9: * The new field cache (LUCENE-831; still being iterated/mulled), * Possible major rework of Field / Document & index-time vs sear

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699676#action_12699676 ] Uwe Schindler commented on LUCENE-1606: --- It will work, that was what I said. For Mul

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699678#action_12699678 ] Mark Miller commented on LUCENE-831: So I'm flopping around on this, but I guess my lat

[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699680#action_12699680 ] Michael McCandless commented on LUCENE-1536: OK, if we do choose to couple, ma

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699685#action_12699685 ] Robert Muir commented on LUCENE-1606: - Uwe, i'll look and see how you do it for TrieRa

[jira] Resolved: (LUCENE-1602) Rewrite TrieRange to use MultiTermQuery

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler resolved LUCENE-1602. --- Resolution: Fixed Committed revision 765618. > Rewrite TrieRange to use MultiTermQuery > --

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699690#action_12699690 ] Uwe Schindler commented on LUCENE-1606: --- I committed TrieRange revision 765618. You

Re: TermEnum.skipTo()

2009-04-16 Thread Mark Miller
Mark Miller wrote: Robert Muir wrote: while I was mucking with term enumeration I found that TermEnum.skipTo() has a very simple implementation and has in javadocs that 'some implementations are considerably more efficient', yet SegmentTermEnum definitely doesn't reimplement it in a more effic

[jira] Updated: (LUCENE-1592) fix or deprecate TermsEnum.skipTo

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-1592: -- Summary: fix or deprecate TermsEnum.skipTo (was: fix or deprecate TermsEnum.seek) > fix or d

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Robert Muir (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699693#action_12699693 ] Robert Muir commented on LUCENE-1606: - Uwe, thanks. I'll think on this and on other im

[jira] Commented: (LUCENE-1592) fix or deprecate TermsEnum.skipTo

2009-04-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699696#action_12699696 ] Mark Miller commented on LUCENE-1592: - I made a quick update to the javadoc so it's a b

[jira] Commented: (LUCENE-1606) Automaton Query/Filter (scalable regex)

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699697#action_12699697 ] Uwe Schindler commented on LUCENE-1606: --- Let's stay with this issue! > Automaton Qu

Re: TermEnum.skipTo()

2009-04-16 Thread Michael McCandless
Maybe we should deprecate it? Mike On Thu, Apr 16, 2009 at 9:04 AM, Mark Miller wrote: > Mark Miller wrote: >> >> Robert Muir wrote: >>> >>> while I was mucking with term enumeration i found that TermEnum.skipTo() >>> has a very simple implementation and has in javadocs that 'some >>> implementa

Re: TermEnum.skipTo()

2009-04-16 Thread Shai Erera
I think it's a convenient method. Even if not performing, it's still more convenient than forcing everyone who wants to use it to implement it by himself. Perhaps a better implementation will exist in the future, and thus everyone who'll use this method will be silently upgraded. Maybe such a bette

Re: TermEnum.skipTo()

2009-04-16 Thread Michael McCandless
That would be great... we need someone to pull a patch together (for SegmentReader & Multi*Reader to implement it efficiently). Mike On Thu, Apr 16, 2009 at 9:50 AM, Shai Erera wrote: > I think it's a convenient method. Even if not performing, it's still more > convenient than forcing everyone w

I wanna contribute a Chinese analyzer to lucene

2009-04-16 Thread Gao Pinker
Hi All! I wrote an Analyzer for Apache Lucene for analyzing sentences in the *Chinese* language. It's called *imdict-chinese-analyzer* as it is a subproject of *imdict*, which is an intelligent online dictionary. The project on Google Code is here: http://code.google.com/p/imdic

[jira] Issue Comment Edited: (LUCENE-1604) Stop creating huge arrays to represent the absense of field norms

2009-04-16 Thread Shon Vella (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699714#action_12699714 ] Shon Vella edited comment on LUCENE-1604 at 4/16/09 7:16 AM: -

[jira] Commented: (LUCENE-1604) Stop creating huge arrays to represent the absense of field norms

2009-04-16 Thread Shon Vella (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699714#action_12699714 ] Shon Vella commented on LUCENE-1604: Setting disableFakeNorms transitively isn't reall

Re: I wanna contribute a Chinese analyzer to lucene

2009-04-16 Thread Ken Krugler
I wrote an Analyzer for Apache Lucene for analyzing sentences in the Chinese language. It's called imdict-chinese-analyzer as it is a subproject of imdict, which is an intelligent online dictionary. The project on Google Code is here:

Re: TermEnum.skipTo()

2009-04-16 Thread Mark Miller
+1 on further handling (LUCENE-1592). I just wanted to get a doc change in now rather than wait for that to complete. The statement that some implementations provide more efficient impls is very misleading (it's almost an assertion that one exists) when no impls that ship with Lucene in fact do. On

[jira] Assigned: (LUCENE-1605) Add subset method to BitVector

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-1605: -- Assignee: Michael McCandless > Add subset method to BitVector > --

[jira] Resolved: (LUCENE-1605) Add subset method to BitVector

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-1605. Resolution: Fixed > Add subset method to BitVector > -

[jira] Commented: (LUCENE-1605) Add subset method to BitVector

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699718#action_12699718 ] Michael McCandless commented on LUCENE-1605: Patch looks good; I'll commit sho

[jira] Commented: (LUCENE-1604) Stop creating huge arrays to represent the absense of field norms

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699720#action_12699720 ] Michael McCandless commented on LUCENE-1604: bq. Setting disableFakeNorms tran

vacation

2009-04-16 Thread Michael McCandless
Just as a heads up, since we have so many neat Lucene improvements "in flight": tomorrow I leave for a week long vacation, in a nice warm place that may or may not have internet access. So if suddenly I stop answering things, now you know why! Keep hacking away ;) Mike -

Re: vacation

2009-04-16 Thread Shai Erera
If it's "nice and warm" I hope for you that it doesn't have internet access, so you won't be tempted to be dragged away from it ;) On Thu, Apr 16, 2009 at 5:45 PM, Michael McCandless < luc...@mikemccandless.com> wrote: > Just as a heads up, since we have so many neat Lucene improvements "in > fli

Re: I wanna contribute a Chinese analyzer to lucene

2009-04-16 Thread Earwin Burrfoot
On Thu, Apr 16, 2009 at 18:16, Ken Krugler wrote: > I wrote a Analyzer for apache lucene for analyzing sentences in Chinese > language, it's called imdict-chinese-analyzer as it is a subproject of > imdict, which is an intelligent online dictionary. > > The project on google code is here: > http:/

Re: vacation

2009-04-16 Thread Michael McCandless
Yes I suppose that would be "best" ;) Mike On Thu, Apr 16, 2009 at 10:48 AM, Shai Erera wrote: > If it's "nice and warm" I hope for you that it doesn't have internet access, > so you won't be tempted to be dragged away from it ;) > > On Thu, Apr 16, 2009 at 5:45 PM, Michael McCandless > wrote:

RE: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread George Aroush
Thanks Mike. A quick follow-up question. What's the status of http://issues.apache.org/jira/browse/LUCENE-1313? Can this work be applied to Lucene 2.4.1 and still get its benefit, or are there other dependencies / issues with it that prevent us from doing so? If anyone else knows, I welcome your

Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Mark Miller
I wouldn't be surprised if it didn't depend on a couple other little issues - Jason or Mike would probably have to tell you that. It does count a bit on LUCENE-1483 if you want to use it with FieldCaches or cached Filters though. It would still work with 1483, but would be much slower in those

Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Mark Miller
Whoops - should read: It should still work *without* 1483 but would be much slower in those cases (reloading the filter/fieldcache per reader rather than per segment). Mark Miller wrote: I wouldn't be surprised if it didn't depend on a couple other little issues - Jason or Mike would probably h

RE: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Uwe Schindler
These issues all depend so much on each other that I would suggest simply trying Lucene-2.9-dev trunk (e.g. downloaded from Hudson). We have this running here without any problems. The problem with unreleased Lucene is more that, if you try new features, there may be non-compatible changes until t

RE: I wanna contribute a Chinese analyzer to lucene

2009-04-16 Thread Steven A Rowe
In addition to Ken's suggestions, check out http://wiki.apache.org/lucene-java/HowToContribute for some help on getting set up. - Steve From: Ken Krugler [mailto:kkrugler_li...@transpac.com] Sent: Thursday, April 16, 2009 10:16 AM To: java-dev@lucene.apache.org Subject: Re: I wanna contribute a

[jira] Commented: (LUCENE-1600) Reduce usage of String.intern(), performance is terrible

2009-04-16 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699857#action_12699857 ] Jason Rutherglen commented on LUCENE-1600: -- contrib/MemoryIndex has a bunch of no

[jira] Commented: (LUCENE-1600) Reduce usage of String.intern(), performance is terrible

2009-04-16 Thread Patrick Eger (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699864#action_12699864 ] Patrick Eger commented on LUCENE-1600: -- Hashmaps would work also, but then they eithe

[jira] Commented: (LUCENE-1600) Reduce usage of String.intern(), performance is terrible

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699865#action_12699865 ] Uwe Schindler commented on LUCENE-1600: --- In addition to Mikes fixes, there are more

Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Jason Rutherglen
LUCENE-1313 relies on LUCENE-1516 which is in trunk. If you have other questions George, feel free to ask. On Thu, Apr 16, 2009 at 8:04 AM, George Aroush wrote: > Thanks Mike. > > A quick follow up question. What's the status of > http://issues.apache.org/jira/browse/LUCENE-1313? Can this wor

Re: vacation

2009-04-16 Thread Jason Rutherglen
Enjoy, I just got back from mine, tropical Minneapolis. On Thu, Apr 16, 2009 at 7:45 AM, Michael McCandless < luc...@mikemccandless.com> wrote: > Just as a heads up, since we have so many neat Lucene improvements "in > flight": tomorrow I leave for a week long vacation, in a nice warm > place tha

[jira] Issue Comment Edited: (LUCENE-1600) Reduce usage of String.intern(), performance is terrible

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699865#action_12699865 ] Uwe Schindler edited comment on LUCENE-1600 at 4/16/09 2:13 PM:

Re: I wanna contribute a Chinese analyzer to lucene

2009-04-16 Thread Otis Gospodnetic
-- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch From: Gao Pinker To: java-dev@lucene.apache.org Sent: Thursday, April 16, 2009 9:58:51 AM Subject: I wanna contribute a Chinese analyzer to lucene Hi All! I wrote a Analyzer for apache lucene for

RE: vacation

2009-04-16 Thread Uwe Schindler
Have fun and relax! My next holiday will be after a meeting in Japan, I will visit Kyoto (end of May). It will be hot there, too...! - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de > -Original Message- > From: Michael McCandless [mai

Re: I wanna contribute a Chinese analyzer to lucene

2009-04-16 Thread Otis Gospodnetic
This would be a great contribution. I took a quick look at the ZIP file and noticed it depends on, say, net.imdict.wordsegment.WordSegmenter, but I didn't see that class anywhere. I assume you will patch and polish things, but I thought I'd point this out. Thanks! Otis -- Sematext -- http://se

[jira] Commented: (LUCENE-1604) Stop creating huge arrays to represent the absense of field norms

2009-04-16 Thread Shon Vella (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699872#action_12699872 ] Shon Vella commented on LUCENE-1604: What should the transitive behavior of MultiReade

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699880#action_12699880 ] Mark Miller commented on LUCENE-831: Okay, now that I halfway understand this issue, I

Re: vacation

2009-04-16 Thread Marvin Humphrey
On Thu, Apr 16, 2009 at 10:45:49AM -0400, Michael McCandless wrote: > Just as a heads up, since we have so many neat Lucene improvements "in > flight": tomorrow I leave for a week long vacation, in a nice warm > place that may or may not have internet access. So if suddenly I stop > answering thin

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-16 Thread Uwe Schindler (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699893#action_12699893 ] Uwe Schindler commented on LUCENE-831: -- We have the problem with the ValueSource-overr

[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2009-04-16 Thread Jason Rutherglen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699892#action_12699892 ] Jason Rutherglen commented on LUCENE-1536: -- I thought we are going to get LUCENE-

[jira] Updated: (LUCENE-1518) Merge Query and Filter classes

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1518: --- Fix Version/s: 2.9 > Merge Query and Filter classes > --

[jira] Commented: (LUCENE-1536) if a filter can support random access API, we should use it

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699939#action_12699939 ] Michael McCandless commented on LUCENE-1536: Ahh right, we should re-test perf

[jira] Commented: (LUCENE-1604) Stop creating huge arrays to represent the absense of field norms

2009-04-16 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12699941#action_12699941 ] Michael McCandless commented on LUCENE-1604: bq. I'm inclined to say they shou

[jira] Commented: (LUCENE-831) Complete overhaul of FieldCache API/Implementation

2009-04-16 Thread Mark Miller (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1263#action_1263 ] Mark Miller commented on LUCENE-831: I think we don't want to expose Uninverter though?