Re: FuzzyQuery vs SlowFuzzyQuery docs? -- was: Re: [jira] [Commented] (LUCENE-2667) Fix FuzzyQuery's defaults, so its fast.

2012-11-10 Thread Simon Willnauer
On Sat, Nov 10, 2012 at 10:18 PM, Mark Bennett  wrote:
> Hi guys,
>
> Not expecting to change minds, but found Robert's last email helpful, so
> wanted to try one more round.
>
> On Fri, Nov 9, 2012 at 5:32 PM, Robert Muir  wrote:
>>
>> ...
>>
>> This is some analysis chain configuration issue.
>
>
> Interesting, so you would expect that the seed term *would* go through
> analysis before it finds the variants in the index?  If it's supposed to
> work that way then I can recheck my config.  (it wasn't just lowercase, that
> was just an example)
>
>>
>> If it doesn't work with 100M documents, i don't want it in lucene.
>
>
> Ah, this is very illuminating.  For scalability, big data, etc, that
> certainly makes sense.
>
> But there are many important Intranet search applications that have far less
> than 100M docs, but still need the fine-grained control of solr/lucene.
> Intranet projects in the 35k to 2M doc range often have even more precise
> indexing, filtering and faceting requirements, and solr/lucene provides that
> fine blade.
>
> Wouldn't it be more constructive to pick some number, say 100M, and give
> that the "big data" moniker?  Then, perhaps for things that are not that
> scalable, have some separate area/label but still retain them.  Discarding
> all use cases < 100M seems draconian.


Mark, this was really an arbitrary number though. The problem with the old
fuzzy is that it's already unusable once you cross 32k unique terms in a
field, and that is far below "big data". We should not give the impression
that this stuff can work for you when it can shoot you in the foot at any
time.

just my $0.02

simon
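The scaling cliff Simon describes is the old FuzzyQuery's brute-force term expansion. A minimal, self-contained sketch of that pattern (illustrative only — the class and method names here are made up, and this is not Lucene's actual code):

```java
import java.util.List;

// Illustrative only -- NOT Lucene's actual code; all names here are made up.
// The point: the old FuzzyQuery pattern computes an edit distance against
// EVERY unique term in the field, so cost grows linearly with the dictionary.
public class BruteForceFuzzySketch {

    // Classic dynamic-programming Levenshtein distance, O(|a| * |b|).
    static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    // Visits every term whether or not it can possibly match; with 32k+
    // unique terms this linear scan dominates query time.
    static long countMatches(List<String> termDictionary, String query, int maxEdits) {
        return termDictionary.stream()
                             .filter(t -> levenshtein(query, t) <= maxEdits)
                             .count();
    }

    public static void main(String[] args) {
        List<String> dict = List.of("lucene", "lucent", "lumen", "solr");
        System.out.println(countMatches(dict, "lucene", 1)); // prints 2
    }
}
```

The 4.0 FuzzyQuery avoids this linear scan by intersecting a Levenshtein automaton with the term dictionary, which is why the defaults were changed in LUCENE-2667.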
>
>
>>
>>
>> I would have the same opinion if someone wanted unscalable solutions
>> for scoring w/ language models (e.g. not happy with smoothing for
>> unknown probabilities), or if someone claimed that spatial queries
>> should do slow things because they don't currently support
>> interplanetary distances, and so on.
>>
>> On Fri, Nov 9, 2012 at 7:52 PM, Mark Bennett  wrote:
>> > Hi Robert,
>> >
> > I acknowledge your "-1" vote, and I'm guessing that your objection is maybe
> > 70% "scalability", and only 30% use-case?
>> >
> > The older Levenshtein stuff has been around for a long time, scalable or not,
> > and already in real systems.
>> >
> > You seem to have a very "binary" view of code being "in" or "out".  Is there
> > any room in your world-view of code for "gray code", unsupported, incubator,
> > what-have-you?  Maybe analogous to people who jailbreak their iPhones or
> > something?
>> >
> > You're an important part of the community, and working at Lucid, etc., and
> > clearly concerned about software quality.  When smart folks like you have
> > such sharp opinions I do try to ponder them against my own circumstances.
>> >
>> > And on the quality of the old code, was it just the scalability, or were
>> > there other concerns such as stability, coding style, or possibly
>> > inconsistent results?
>> >
> > Aren't the sandbox and an admonishing reference in the Javadocs sufficient?
>> >
> > I'm harping on this because I'm really between a rock and a hard place, and
> > also posted another question.
>> >
> > Just trying to understand your very strong opinions, and I thank you for
> > your patience in this matter.  This issue is either going to fix or break my
> > weekend / next-deliverable.
>> >
>> > Sincere thanks,
>> > Mark
>> >
>> >
>> > --
>> > Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com
>> > Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513
>> >
>> >
>> > On Fri, Nov 9, 2012 at 4:37 PM, Robert Muir  wrote:
>> >>
>> >> I'm -1 for having unscalable shit in lucene's core. This query should
>> >> have never been added.
>> >>
>> >> I don't care if a few people complain because they aren't using
>> >> lowercasefilter or some other insanity. Fix your analysis chain. I
>> >> don't have any sympathy.
>> >>
>> >> On Fri, Nov 9, 2012 at 7:35 PM, Jack Krupansky
>> >> 
>> >> wrote:
>> >> > +1 for permitting a choice of fuzzy query implementation.
>> >> >
>> >> > I agree that we want a super-fast fuzzy query for simple variations, but I
>> >> > also agree that we should have the option to trade off speed for function.
>> >> >
>> >> > But I am also sympathetic to assuring that any core Lucene features be as
>> >> > performant as possible.
>> >> >
>> >> > Ultimately, if there was a single fuzzy query implementation that did
>> >> > everything for everybody all of the time, that would be the way to go, but
>> >> > if choices need to be made to satisfy competing goals, we should support
>> >> > going that route.
>> >> >
>> >> > -- Jack Krupansky
>> >> >
>> >> > From: Mark Bennett
>> >> > Sent: Friday, November 09, 2012 3:48 PM
>> >> > To: dev@lucene.apache.org
>> >> > Subject: Re: FuzzyQuery vs SlowFuzzyQuery docs? -- was: Re: [jira]
>> >> > [Commented] (LUCENE-2667) Fix Fuzz

Re: Apache Git mirror

2012-11-11 Thread Simon Willnauer
Hey Mark,

my workflow is as follows:

1. cloned lucene/solr on github
2. added apache mirror as upstream remote
3. pull into trunk only from upstream
4. branch for each feature and push to github

I never modify trunk or pull from github, and my updates are usually
pretty fast, ~1-5 sec.

Yet I don't use git-svn, so for commits I usually go through svn with
a git diff against trunk.
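Simon's four steps might look like this on the command line (a sketch under assumptions: the repository URLs and remote names are illustrative, not verified):

```shell
# 1. clone your own GitHub fork of lucene/solr
git clone git@github.com:<you>/lucene-solr.git && cd lucene-solr

# 2. add the Apache mirror as the upstream remote
git remote add upstream git://git.apache.org/lucene-solr.git

# 3. only ever pull trunk from upstream
git checkout trunk && git pull upstream trunk

# 4. branch per feature, push feature branches to GitHub
git checkout -b my-feature
git push origin my-feature
```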

On Sun, Nov 11, 2012 at 5:20 PM, Mark Miller  wrote:
> Anyone else use the Apache git mirror much (not the GitHub one, but hosted by 
> Apache)? Since I've been trying to use it over the past couple days, it's 
> been super slow - cloning or updating does a whole lot of "counting objects" 
> and just generally takes a real long time. It didn't use to be this way for 
> me, and I'm not seeing this on any other git projects I work with. Anyone 
> else notice this?
>
> One odd thing that showed up in the mirror a while back is a merge commit - 
> that's never come in from the svn bridge before. No clue if it's related, but 
> it was odd to see.
>
> - Mark
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




Re: Apache Git mirror

2012-11-12 Thread Simon Willnauer
On Mon, Nov 12, 2012 at 2:04 AM, David Smiley (@MITRE.org)
 wrote:
> Simon,
> Your response confuses me.
>
> "3. pull into trunk only from upstream"
>
> Do you mean a local branch "trunk", and coming from apache?

yeah
>
> "... and push to github"
>
> What; you can push to github's mirror?  I thought it was read-only?

I push to my own repo there, which I cloned earlier from lucene/solr.

>
> And do you mind telling me/us how you take a change you commit to trunk and
> back-port to 4x (assuming no conflicts) with git?  With svn it involves an
> "svn merge" on 4x referencing my commit revision on trunk.  I know my
> question is academic for normal git but this git svn double-mirror setup is
> odd.

I have a small tool that does all this for me. It's manual but quick:
git diff trunk > p.patch && patchSvnTrunk < p.patch && ant precommit test
Sorry, but that is what I do and it's quick. :)

I get all the goodness from git and for commits I move to SVN.

>
> Anyway... seeing you guys use git as committers is encouraging, as I thought
> the git side was simply too much of a second-class citizen for Lucene/Solr to
> be effective for committers.  Guess not.  I'll have to do what Simon's doing.  Woohoo!
> Hopefully it's okay to post git patches to JIRA instead of subversion's;

I post git patches all the time!

> I've heard of incompatibility issues, but if it's my issue then I'm the one
> committing anyway, so I guess it won't usually be a problem.
Incompatible with what?

>
> ~ David
>
>
>
> -
>  Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Apache-Git-mirror-tp4019552p4019646.html
> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
>
>




Re: Updateable DocValues

2012-07-25 Thread Simon Willnauer
Hey Alan,

On Wed, Jul 25, 2012 at 11:15 AM, Alan Woodward
 wrote:
> Hi all,
>
> I'm looking at implementing some fairly hairy ACLs using Solr post-filtering 
> functionality, storing the ACL information in DocValues for quick lookup.  
> Some of this information will need to change frequently.  When DocValues were 
> first announced a while back, it was suggested that they could be 
> 'independently updateable' - i.e. you could make changes to them without 
> reindexing the whole document.  I've been nosing around the code, and haven't 
> found anything that looks like an update API though.  Has this actually been 
> implemented?  Or if not, how tricky would it be to implement, given the 
> existing implementation?  I'm guessing it would need to be done via Codecs.

This has not been implemented so far. I believe that the best way of
implementing this is a stacked segment that we can write on top of an
existing one and that would be merged once the segment is merged. We
have talked about this for a long time but never actually got to the
point of implementing it. I think I will get to this once the position
iterators are merged into trunk, but if you wanna give it a shot ahead
of time I am totally up for pairing and branching etc. There are a lot
of interesting problems to solve, like how to deal with merges and
deleted docs while indexing etc. I was thinking of an update tool as a
start that would work just like updating norms in 3.6, where you close
the IW, lock the index and update your docvalues, then close the updater
and open a writer again. That could be a good start with limited
functionality, but we can sketch out the underlying stuff before we
deal with concurrent updates etc.

what do you think, would that work for you?

simon
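The limited "offline updater" Simon sketches might work roughly like this (purely hypothetical pseudocode; none of these classes exist in Lucene at this point):

```text
// Hypothetical, never-implemented API -- for discussion only.
writer.close();                              // close the IndexWriter first
DocValuesUpdater updater =                   // takes the index write lock
    new DocValuesUpdater(directory);
updater.update("acl", docId, newValue);      // rewrite a value in a stacked file
updater.close();                             // stacked data is flushed; it would be
                                             // folded in at the next segment merge
writer = new IndexWriter(directory, config); // reopen for normal indexing
```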
>
> Thanks,
>
> Alan Woodward
> http://www.romseysoftware.co.uk
>




Re: revisit payloads API in DocsAndPositionsEnum

2012-08-11 Thread Simon Willnauer
+1 this makes lots of sense

simon

On Sat, Aug 11, 2012 at 7:28 PM, Michael McCandless
 wrote:
> +1
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sat, Aug 11, 2012 at 10:08 AM, Robert Muir  wrote:
>> The payloads api is really confusing:
>>
>>   /** Returns the payload at this position, or null if no
>>*  payload was indexed.  Only call this once per
>>*  position. You should not modify anything (neither
>>*  members of the returned BytesRef nor bytes in the
>>*  byte[]). */
>>   public abstract BytesRef getPayload() throws IOException;
>>
>>   public abstract boolean hasPayload();
>>
>> 1. is it ok for the consumer to call getPayload() [checking for null],
>> and never call hasPayload? It seems so, so why do we have hasPayload?
>> can we remove it?
>> The current situation requires impls to handle 'hasPayload' logic
>> twice: in hasPayload itself and also in getPayload.
>>
>> 2. You should not modify anything (neither members of the returned
>> BytesRef nor bytes in the byte[]). Then why do we have this in
>> TestPayloads.java:
>>// Just to ensure all codecs can
>>// handle a caller that mucks with the
>>// returned payload:
>>if (rarely()) {
>>  br.bytes = new byte[random().nextInt(5)];
>>}
>>br.length = 0;
>>br.offset = 0;
>>
>>Testing for this totally disagrees with the javadocs.
>>
>> 3. 'Only call this once per position'. This is totally different than
>> any of our other 'attributes' on the enums (freq(), startOffset(),
>> endOffset(), etc). I think we should
>> remove this.
>>
>> So I want to propose we remove hasPayload(), remove 'only call once
>> per position', and remove this test in TestPayloads.java. If we want
>> to make sure none
>> of lucene's code itself is mucking with the returned BytesRef, we can
>> add such a check in AssertingCodec.
>>
>>  /** Returns the payload at this position, or null if no
>>*  payload was indexed. You should not modify anything (neither
>>*  members of the returned BytesRef nor bytes in the
>>*  byte[]). */
>> public abstract BytesRef getPayload() throws IOException;
>>
>> --
>> lucidimagination.com
>>
>>
>
>

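The contract Robert proposes reduces the consumer side to a single null check. A self-contained sketch of that pattern (the interface below is a mock for illustration, not Lucene's DocsAndPositionsEnum):

```java
// Mock interface for illustration -- not Lucene's DocsAndPositionsEnum.
// It models the proposed contract: getPayload() alone, with null meaning
// "no payload indexed", callable any number of times per position.
interface PositionsSketch {
    byte[] getPayload();
}

public class PayloadConsumerSketch {

    // After the proposal there is no separate hasPayload(): the consumer
    // just checks getPayload() for null at each position.
    static int countPayloadBytes(PositionsSketch positions, int positionCount) {
        int total = 0;
        for (int i = 0; i < positionCount; i++) {
            byte[] payload = positions.getPayload();
            if (payload != null) {          // single null check replaces hasPayload()
                total += payload.length;    // read-only: never modify the returned bytes
            }
        }
        return total;
    }

    public static void main(String[] args) {
        PositionsSketch withPayload = () -> new byte[] {1, 2, 3};
        PositionsSketch without = () -> null;
        System.out.println(countPayloadBytes(withPayload, 2)); // prints 6
        System.out.println(countPayloadBytes(without, 5));     // prints 0
    }
}
```

This also shows why the separate hasPayload() duplicates work: any implementation already has to decide "payload or null" inside getPayload() itself.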



Re: Build failed in Jenkins: Lucene-trunk-Linux-Java7-64-test-only #3430

2012-08-28 Thread Simon Willnauer
Thanks, Steve!

On Wed, Aug 29, 2012 at 8:02 AM, Steven A Rowe  wrote:
> Simon has granted me access to the flonkings Jenkins, and I've set up the 
> hugerriblific build-failure regex in the notification email configuration for 
> the jobs hosted there. - Steve
>
> -Original Message-
> From: Steven A Rowe [mailto:sar...@syr.edu]
> Sent: Saturday, August 25, 2012 8:13 PM
> To: dev@lucene.apache.org
> Subject: RE: Build failed in Jenkins: Lucene-trunk-Linux-Java7-64-test-only 
> #3430
>
> I'm pretty sure the AMBFR isn't being used on flonkings.  I don't have 
> access, so I haven't put it there. - Steve
>
> -Original Message-
> From: Michael McCandless [mailto:luc...@mikemccandless.com]
> Sent: Saturday, August 25, 2012 6:12 PM
> To: dev@lucene.apache.org
> Cc: buil...@flonkings.com; sim...@apache.org
> Subject: Re: Build failed in Jenkins: Lucene-trunk-Linux-Java7-64-test-only 
> #3430
>
> Also, it looks like that awesome, massive build failure regexp is
> failing to properly splice out this failure?
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Sat, Aug 25, 2012 at 6:11 PM, Michael McCandless
>  wrote:
>> +1 to remove it.
>>
>> But why doesn't it fail all the time...?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Sat, Aug 25, 2012 at 5:28 PM, Robert Muir  wrote:
>>> I think this test method is useless and we should remove it.
>>>
>>> On Sat, Aug 25, 2012 at 5:17 PM,   wrote:
 See 

 --
 [...truncated 1008 lines...]
 [junit4:junit4] Completed on J5 in 0.43s, 4 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestRegexpRandom
 [junit4:junit4] Completed on J2 in 1.07s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestWildcard
 [junit4:junit4] Completed on J0 in 0.85s, 8 tests
 [junit4:junit4]
 [junit4:junit4] Suite: 
 org.apache.lucene.search.TestSimpleExplanationsOfNonMatches
 [junit4:junit4] Completed on J4 in 1.42s, 69 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestUniqueTermCount
 [junit4:junit4] Completed on J2 in 0.37s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: 
 org.apache.lucene.search.TestComplexExplanationsOfNonMatches
 [junit4:junit4] Completed on J7 in 0.59s, 22 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestSumDocFreq
 [junit4:junit4] Completed on J5 in 0.60s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestBinaryTerms
 [junit4:junit4] Completed on J2 in 0.18s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestParallelReaderEmptyIndex
 [junit4:junit4] Completed on J7 in 0.19s, 2 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestPerSegmentDeletes
 [junit4:junit4] Completed on J4 in 0.36s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestIndexWriterConfig
 [junit4:junit4] Completed on J5 in 0.22s, 9 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.TestSearchForDuplicates
 [junit4:junit4] Completed on J0 in 0.53s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestFilteredSearch
 [junit4:junit4] Completed on J2 in 0.20s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestForTooMuchCloning
 [junit4:junit4] Completed on J3 in 3.60s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.Test2BPostings
 [junit4:junit4] IGNOR/A 0.14s J0 | Test2BPostings.test
 [junit4:junit4]> Assumption #1: 'nightly' test group is disabled 
 (@Nightly)
 [junit4:junit4] Completed on J0 in 0.22s, 1 test, 1 skipped
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.index.TestFilterAtomicReader
 [junit4:junit4] Completed on J2 in 0.23s, 2 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestCachingWrapperFilter
 [junit4:junit4] Completed on J4 in 0.38s, 5 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestDocValuesScoring
 [junit4:junit4] Completed on J7 in 0.43s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.search.TestAutomatonQuery
 [junit4:junit4] Completed on J6 in 1.93s, 6 tests
 [junit4:junit4]
 [junit4:junit4] Suite: 
 org.apache.lucene.util.junitcompat.TestSeedFromUncaught
 [junit4:junit4] Completed on J0 in 0.14s, 1 test
 [junit4:junit4]
 [junit4:junit4] Suite: 
 org.apache.lucene.codecs.appending.TestAppendingCodec
 [junit4:junit4] Completed on J2 in 0.12s, 2 tests
 [junit4:junit4]
 [junit4:junit4] Suite: org.apache.lucene.store.TestFileSwitchDirectory
 [junit4:junit4] Completed on J3 in 0.26s, 4 tests
 [junit4:junit4]
 [junit4:junit4] Suite: or

Re: Welcome Adrien Grand as a new Lucene/Solr committer

2012-06-07 Thread Simon Willnauer
Welcome Adrien!

On Thu, Jun 7, 2012 at 8:49 PM, Christian Moen  wrote:
> Welcome, Adrien! :)
>
>
> Christian Moen
> http://atilika.com
>
> On Jun 7, 2012, at 8:11 PM, Michael McCandless wrote:
>
>> I'm pleased to announce that Adrien Grand has joined our ranks as a
>> committer.
>>
>> He has been contributing various patches to Lucene/Solr, recently to
>> Lucene's packed ints implementation, giving a nice performance gain in
>> some cases.  For example check out
>> http://people.apache.org/~mikemccand/lucenebench/TermTitleSort.html
>> (look for annotation U).
>>
>> Adrien, it's tradition that you introduce yourself with a brief bio.
>>
>> As soon as your SVN access is setup, you should then be able to add
>> yourself to the committers list on the website as well.
>>
>> Congratulations!
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>
>
>




Re: lucene-highlighter 3.6 No highlight for 3 letter words

2012-06-08 Thread Simon Willnauer
You might ask this question on the user list to get a better response.
Can you also provide the text & query you want to highlight?

simon

On Fri, Jun 8, 2012 at 1:17 PM, gerryjun  wrote:
> Hi,
>
>    How can I highlight 3-letter words? Everything is working except for
> this; what setting do I need to change?
>
>    I'm using lucene-highlighter-3.6.0.jar & lucene-core-3.6.0.jar.
>
>
>        Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_30);
>        QueryParser parser = new QueryParser(Version.LUCENE_30, "",
> analyzer);
>        parser.setAllowLeadingWildcard(true);
>        SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter("","");
>        Highlighter highlighter = new Highlighter(htmlFormatter,new
> QueryScorer(parser.parse(pQuery)));
>        highlighter.setTextFragmenter(new NullFragmenter());
>        highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);
>        String text = highlighter.getBestFragment(analyzer, "", pText);
>
> Thanks
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/lucene-highlighter-3-6-No-highlight-for-3-letter-words-tp3988464.html
> Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
>
>




Re: svn commit: r1348623 - in /lucene/dev/branches/branch_4x: ./ dev-tools/ lucene/ lucene/analysis/ lucene/analysis/common/ lucene/analysis/common/src/java/org/apache/lucene/analysis/standard/std31/

2012-06-11 Thread Simon Willnauer
Fixed, thanks Dawid!

On Sun, Jun 10, 2012 at 7:12 PM, Dawid Weiss  wrote:
> Synchonizer -> Synchronizer?
>
> D.
>
> On Sun, Jun 10, 2012 at 6:42 PM,   wrote:
>> Author: simonw
>> Date: Sun Jun 10 16:42:55 2012
>> New Revision: 1348623
>>
>> URL: http://svn.apache.org/viewvc?rev=1348623&view=rev
>> Log:
>> LUCENE-4116: fix concurrency test for DWPTStallControl
>>
>> Modified:
>>    lucene/dev/branches/branch_4x/   (props changed)
>>    lucene/dev/branches/branch_4x/dev-tools/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/BUILD.txt   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/CHANGES.txt   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/JRE_VERSION_MIGRATION.txt   (props 
>> changed)
>>    lucene/dev/branches/branch_4x/lucene/LICENSE.txt   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/MIGRATE.txt   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/NOTICE.txt   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/README.txt   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/analysis/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/analysis/common/   (props changed)
>>    
>> lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/standard/std31/package.html
>>    (props changed)
>>    
>> lucene/dev/branches/branch_4x/lucene/analysis/common/src/java/org/apache/lucene/analysis/standard/std34/package.html
>>    (props changed)
>>    lucene/dev/branches/branch_4x/lucene/backwards/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/benchmark/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/build.xml   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/common-build.xml   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/core/   (props changed)
>>    
>> lucene/dev/branches/branch_4x/lucene/core/src/java/org/apache/lucene/index/DocumentsWriterStallControl.java
>>    
>> lucene/dev/branches/branch_4x/lucene/core/src/test/org/apache/lucene/index/TestDocumentsWriterStallControl.java
>>    lucene/dev/branches/branch_4x/lucene/demo/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/facet/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/grouping/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/highlighter/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/ivy-settings.xml   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/join/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/memory/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/misc/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/module-build.xml   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/queries/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/queryparser/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/sandbox/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/site/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/spatial/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/suggest/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/test-framework/   (props changed)
>>    lucene/dev/branches/branch_4x/lucene/tools/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/CHANGES.txt   (props changed)
>>    lucene/dev/branches/branch_4x/solr/LICENSE.txt   (props changed)
>>    lucene/dev/branches/branch_4x/solr/NOTICE.txt   (props changed)
>>    lucene/dev/branches/branch_4x/solr/README.txt   (props changed)
>>    lucene/dev/branches/branch_4x/solr/build.xml   (props changed)
>>    lucene/dev/branches/branch_4x/solr/cloud-dev/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/common-build.xml   (props changed)
>>    lucene/dev/branches/branch_4x/solr/contrib/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/core/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/dev-tools/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/example/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/lib/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/lib/httpclient-LICENSE-ASL.txt   
>> (props changed)
>>    lucene/dev/branches/branch_4x/solr/lib/httpclient-NOTICE.txt   (props 
>> changed)
>>    lucene/dev/branches/branch_4x/solr/lib/httpcore-LICENSE-ASL.txt   (props 
>> changed)
>>    lucene/dev/branches/branch_4x/solr/lib/httpcore-NOTICE.txt   (props 
>> changed)
>>    lucene/dev/branches/branch_4x/solr/lib/httpmime-LICENSE-ASL.txt   (props 
>> changed)
>>    lucene/dev/branches/branch_4x/solr/lib/httpmime-NOTICE.txt   (props 
>> changed)
>>    lucene/dev/branches/branch_4x/solr/scripts/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/solrj/   (props changed)
>>    lucene/dev/branches/branch_4x/solr/test-framework/   (props changed)
>>    lucene/dev/bra

Re: [JENKINS] Lucene-Solr-4.x-Linux-Java6-64 - Build # 101 - Failure!

2012-06-13 Thread Simon Willnauer
On Wed, Jun 13, 2012 at 6:11 PM, Michael McCandless
 wrote:
> It was a thread scheduling issue.
>
> The test opened an IndexWriter, launched indexing threads, then opened
> the first NRT reader.
>
> The failure would happen (as best I can figure out...) if a merge
> completed before the NRT reader was opened.

Sneaky, thanks for digging!

simon
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Wed, Jun 13, 2012 at 11:17 AM, Dawid Weiss
>  wrote:
>> Was it a race condition of some sort? Why do you think it wasn't
>> reproducible from the same seed?
>>
>> Dawid
>>
>> On Wed, Jun 13, 2012 at 5:04 PM, Michael McCandless
>>  wrote:
>>> I could never repro this but I think I found the cause (just committed
>>> the fix)...
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>> On Wed, Jun 13, 2012 at 6:45 AM,   wrote:
 Build: 
 http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Linux-Java6-64/101/

 1 tests failed.
 REGRESSION:  org.apache.lucene.index.TestNRTThreads.testNRTThreads

 Error Message:
 Thread threw an uncaught exception, thread: Thread[Thread-923,5,]

 Stack Trace:
 java.lang.RuntimeException: Thread threw an uncaught exception, thread: 
 Thread[Thread-923,5,]
        at 
 com.carrotsearch.randomizedtesting.RunnerThreadGroup.processUncaught(RunnerThreadGroup.java:96)
        at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:857)
        at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
        at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
        at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
        at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
        at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
        at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
        at 
 org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
        at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
        at 
 org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
        at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
        at 
 org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
        at 
 org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
        at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
        at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
        at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
        at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
        at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$400(RandomizedRunner.java:132)
        at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$2.run(RandomizedRunner.java:551)
 Caused by: java.lang.RuntimeException: java.lang.AssertionError: sub 
 reader _5(4.0):c6 wasn't warmed: 
 {SegmentCoreReader(owner=_j(4.0):C17)=true}
        at __randomizedtesting.SeedInfo.seed([E1A5D8EF6222FB8E]:0)
        at 
 org.apache.lucene.index.ThreadedIndexingAndSearchingTestCase$2.run(ThreadedIndexingAndSearchingTestCase.java:395)
 Caused by: java.lang.AssertionError: sub reader _5(4.0):c6 wasn't warmed: 
 {SegmentCoreReader(owner=_j(4.0):C17)=true}
        at org.junit.Assert.fail(Assert.java:93)
        at org.junit.Assert.assertTrue(Assert.java:43)
        at 
 org.apache.lucene.index.ThreadedIndexingAndSearchingTestCase$2.run(ThreadedIndexingAndSearchingTestCase.java:345)




 Build Log:
 [...truncated 1297 lines...]
   [junit4]    >        at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
   [junit4]    >        at 
 org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
   [junit4]    >        at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
   [junit4]    >        at 
 org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
   [junit4]    >        at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evalua

Re: [JENKINS] Lucene-Solr-4.x-Windows-Java7-64 - Build # 70 - Failure!

2012-06-15 Thread Simon Willnauer
This seems like a bug, but the window is super small: you need to be
in startCommit just before you add the new "pendingCommit", and be in
rollback with another thread already leaving the critical section. In
addition, the committing thread needs to have passed the ensureOpen()
call before we entered rollbackInternal. I think it would make sense to
do an ensureOpen() call before we add the new pending commit in
startCommit to fix the problem. The unsynced reads Robert mentions are
fine, though, since pendingCommit is volatile.

simon
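The fix Simon suggests amounts to re-checking "still open" inside the synchronized section before publishing a new pendingCommit. A self-contained sketch of that ordering argument (the class below is made up; only the names mirror IndexWriter's):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Made-up class for illustration -- not IndexWriter. It demonstrates the
// proposed fix: fail fast via ensureOpen() inside the synchronized section
// before publishing a new pendingCommit, so a concurrent rollback cannot
// have a late commit slipped in behind it.
public class CommitRaceSketch {
    private volatile Object pendingCommit;        // volatile: unsynced reads are safe
    private final AtomicBoolean closed = new AtomicBoolean(false);

    void ensureOpen() {
        if (closed.get()) throw new IllegalStateException("this writer is closed");
    }

    void startCommit(Object commit) {
        synchronized (this) {
            ensureOpen();            // the proposed extra check, done under the lock
            pendingCommit = commit;  // publish only while provably still open
        }
    }

    void rollback() {
        synchronized (this) {
            pendingCommit = null;
            closed.set(true);
        }
        // In the real code, closeInternal() runs outside the lock; with the
        // extra ensureOpen() above, a late startCommit can no longer race in.
    }

    boolean hasPendingCommit() { return pendingCommit != null; }

    public static void main(String[] args) {
        CommitRaceSketch w = new CommitRaceSketch();
        w.rollback();
        try {
            w.startCommit(new Object());
        } catch (IllegalStateException expected) {
            System.out.println("startCommit rejected after rollback");
        }
        System.out.println(w.hasPendingCommit()); // prints false
    }
}
```

This only demonstrates the ordering argument; a real fix would also have to preserve closeInternal's existing behavior.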

On Fri, Jun 15, 2012 at 3:37 PM, Robert Muir  wrote:
> This seems like a thread hazard in rollbackInternal? It clears
> pendingCommit, but the closeInternal is outside of the synchronized
> block... so some other thread must be sneaking a pendingCommit in at
> that exact moment in this test?
>
> On Fri, Jun 15, 2012 at 9:24 AM, Robert Muir  wrote:
>> I can't reproduce this on Linux or Windows, but I think it's a bug.
>>
>> rollback()'s javadocs say "This also clears a previous call to {@link
>> #prepareCommit}."
>>
>> On Fri, Jun 15, 2012 at 9:09 AM,   wrote:
>>> Build: 
>>> http://jenkins.sd-datasolutions.de/job/Lucene-Solr-4.x-Windows-Java7-64/70/
>>>
>>> 1 tests failed.
>>> REGRESSION:  
>>> org.apache.lucene.index.TestAddIndexes.testAddIndexesWithRollback
>>>
>>> Error Message:
>>> cannot close: prepareCommit was already called with no corresponding call 
>>> to commit
>>>
>>> Stack Trace:
>>> java.lang.IllegalStateException: cannot close: prepareCommit was already 
>>> called with no corresponding call to commit
>>>        at 
>>> __randomizedtesting.SeedInfo.seed([6DC405A18E10711B:8BE3EA4E2ABF1A65]:0)
>>>        at 
>>> org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:894)
>>>        at 
>>> org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:1870)
>>>        at 
>>> org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:1792)
>>>        at 
>>> org.apache.lucene.index.TestAddIndexes.testAddIndexesWithRollback(TestAddIndexes.java:959)
>>>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>        at 
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>        at 
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>        at java.lang.reflect.Method.invoke(Method.java:601)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
>>>        at 
>>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>>>        at 
>>> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
>>>        at 
>>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>>>        at 
>>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>>>        at 
>>> org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
>>>        at 
>>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>>>        at 
>>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
>>>        at 
>>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
>>>        at 
>>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>>>        at 
>>> org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
>>>        at 
>>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
>>>        at 
>>> org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
>>>        at 
>>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.jav

Re: Continuous stream indexing and time-based segment management

2012-06-19 Thread Simon Willnauer
On Tue, Jun 19, 2012 at 6:42 PM, mark harwood  wrote:
> There are a number of scenarios where Lucene might be used to index a fixed 
> time range on a continuous stream of data e.g. a news feed.
>
> In these scenarios I imagine the following facilities would be useful:
>
> a) A MergePolicy that organized content into segments on the basis of 
> increasing time units e.g. 5min->10 min->1 hour->1 day
> b) The ability to drop entire segments e.g. the day-level segment from 
> exactly a week ago

you can do that by subclassing IW and calling some package-private APIs /
members. We can certainly make that easier, but I personally don't want
to open this up as a public API. I can certainly imagine having a
protected API that allows dropping an entire segment.
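For illustration only (as the reply notes, Lucene exposes no public segment-dropping API; doing it today means subclassing IndexWriter and touching package-private members), here is a stdlib-only sketch of the *policy* side of the idea: tracking segments by creation time and selecting whole segments that have fallen out of the retention window. All names here are invented.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Models time-based retention over whole segments: register each segment
// with its creation time, then drop every segment older than the window.
// In a real implementation the it.remove() step is where the segment's
// files would actually be deleted from the index.
class TimeRetentionSketch {
    private final Map<String, Long> segmentCreatedAt = new LinkedHashMap<>();

    void register(String segmentName, long createdAtMillis) {
        segmentCreatedAt.put(segmentName, createdAtMillis);
    }

    // Removes segments older than the retention window; returns how many
    // were dropped.
    int dropOlderThan(long nowMillis, long retentionMillis) {
        int dropped = 0;
        for (Iterator<Map.Entry<String, Long>> it =
                segmentCreatedAt.entrySet().iterator(); it.hasNext();) {
            if (nowMillis - it.next().getValue() > retentionMillis) {
                it.remove();
                dropped++;
            }
        }
        return dropped;
    }

    int size() {
        return segmentCreatedAt.size();
    }
}
```

The appeal of segment-level dropping is that expiring a whole time bucket becomes a file-delete rather than a delete-by-query plus merge.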

simon

> c) Various new analysis functions comparing term frequencies across time e.g 
> discovery of "trending" topics.
>
> I can see that a) could be implemented using a custom MergePolicy and c) can 
> be done via existing APIs but I'm not sure if there is way to simply drop 
> entire segments currently?
>
> Anyone else had thoughts in this area?
>
> Cheers
> Mark
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




Re: Continuous stream indexing and time-based segment management

2012-06-19 Thread Simon Willnauer
On Tue, Jun 19, 2012 at 9:44 PM, Simon Willnauer
 wrote:
> On Tue, Jun 19, 2012 at 6:42 PM, mark harwood  wrote:
>> There are a number of scenarios where Lucene might be used to index a fixed 
>> time range on a continuous stream of data e.g. a news feed.
>>
>> In these scenarios I imagine the following facilities would be useful:
>>
>> a) A MergePolicy that organized content into segments on the basis of 
>> increasing time units e.g. 5min->10 min->1 hour->1 day
>> b) The ability to drop entire segments e.g. the day-level segment from 
>> exactly a week ago
>
> you can do that by subclassing IW and call some package private APIs /
> members. We can certainly make that easier but I personally don't want
> to open this as a public API. I can certainly imagine to have a
> protected API that allows dropping entire segment.
>
> simon
>
>> c) Various new analysis functions comparing term frequencies across time e.g 
>> discovery of "trending" topics.
>>
>> I can see that a) could be implemented using a custom MergePolicy and c) can 
>> be done via existing APIs but I'm not sure if there is way to simply drop 
>> entire segments currently?
>>
>> Anyone else had thoughts in this area?

I had some ideas to add statistics to DocValues that get created
during index time. You can already do that and expose it via
Attributes; maybe we can add some API to DocValues you can hook into so
that you don't need to write your own DV impl.
>>
>> Cheers
>> Mark
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>




control test temp dir used in common-build.xml

2012-06-21 Thread Simon Willnauer
hey folks,

I'd like to control the tempDir used to execute junit tests from the
outside. Any objections to the patch below?


diff --git a/lucene/common-build.xml b/lucene/common-build.xml
index 47d5013..2c8f144 100644
--- a/lucene/common-build.xml
+++ b/lucene/common-build.xml
@@ -95,6 +95,7 @@
   
   
   
+   

   
   
@@ -705,7 +706,7 @@
 
 
 
-
+
 
 
 




Re: control test temp dir used in common-build.xml

2012-06-21 Thread Simon Willnauer
On Thu, Jun 21, 2012 at 1:12 PM, Dawid Weiss
 wrote:
> I don't mind but why do we need a specific property (other than Java's
> default temporary file location property)? Perhaps it'd be better to
> just use that?

tests currently create directories under that one, something like
${tempDir}/S0/someindex. I am running multiple instances of a test on the
same checkout, and that sometimes fails there for some reason.
I am controlling java.io.tmpdir too.

simon
>
> Dawid
>
> On Thu, Jun 21, 2012 at 1:08 PM, Simon Willnauer
>  wrote:
>> hey folks,
>>
>> I'd like to control the tempDir used to execute junit tests from the
>> outside. Any objections to the patch below?
>>
>>
>> diff --git a/lucene/common-build.xml b/lucene/common-build.xml
>> index 47d5013..2c8f144 100644
>> --- a/lucene/common-build.xml
>> +++ b/lucene/common-build.xml
>> @@ -95,6 +95,7 @@
>>   
>>   
>>   
>> +       
>>
>>   
>>   
>> @@ -705,7 +706,7 @@
>>     
>>     
>>     
>> -    
>> +    
>>     
>>     
>>     
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>




Re: Welcome Greg Bowyer

2012-06-21 Thread Simon Willnauer
welcome!

On Thu, Jun 21, 2012 at 2:41 PM, Stefan Matheis
 wrote:
> Welcome Greg :)
>
>
>
> On Thursday, June 21, 2012 at 12:56 PM, Erick Erickson wrote:
>
>> I'm pleased to announce that Greg Bowyer has been added as a
>> Lucene/Solr committer.
>>
>> Greg:
>> It's a tradition that you reply with a brief bio.
>>
>> Your SVN access should be set up and ready to go.
>>
>> Congratulations!
>>
>> Erick Erickson
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org 
>> (mailto:dev-unsubscr...@lucene.apache.org)
>> For additional commands, e-mail: dev-h...@lucene.apache.org 
>> (mailto:dev-h...@lucene.apache.org)
>
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




Re: Re: suggester/autocomplete locks file preventing replication

2012-06-22 Thread Simon Willnauer
On Fri, Jun 22, 2012 at 10:37 AM, tom  wrote:

>  cross posting this issue to the dev list in the hope to get a response
> here...
>

I think you are right. Closing the Stream/Reader is the responsibility of
the caller, not the FileDictionary, IMO, but Solr doesn't close it, so that
might be causing your problems. Are you running on Windows by any chance?
I will create an issue and fix it.
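A small sketch of the caller-owns-the-stream contract described here (the class and method names are invented; this is not FileDictionary's actual code): the consumer reads the stream once and leaves closing to the caller, e.g. via try-with-resources. On Windows an unreleased handle keeps the file locked, which would explain the failed rename during replication.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class DictionaryLoadSketch {
    // One-shot consumer: reads entries from the stream but deliberately
    // does NOT close it -- releasing the stream stays with the caller.
    static int countEntries(InputStream in) throws IOException {
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        int n = 0;
        while (reader.readLine() != null) {
            n++;
        }
        return n;
    }

    // Caller side: owns the stream and guarantees it is closed, even if
    // reading fails, so no file handle (and no Windows lock) can leak.
    static int loadDictionary(byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            return countEntries(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```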

simon

>
>
> -------- Original Message --------
> Subject: Re: suggester/autocomplete locks file preventing replication
> Date: Thu, 21 Jun 2012 17:11:40 +0200
> From: tom
> Reply-To: solr-u...@lucene.apache.org
> To: solr-u...@lucene.apache.org
>
>
> pocking into the code i think the FileDictionary class is the culprit:
> It takes an InputStream as a ctor argument but never releases the
> stream. what puzzles me is that the class seems to allow a one-time
> iteration and then the stream is useless, unless i'm missing smth. here.
>
> is there a good reason for this or rather a bug?
> should i move the topic to the dev list?
>
>
> On 21.06.2012 14:49, tom wrote:
> > BTW: a core unload doesnt release the lock either ;(
> >
> >
> > On 21.06.2012 14:39, tom wrote:
> >> hi,
> >>
> >> i'm using the suggester with a file like so:
> >>
> >>   
> >> 
> >>   suggest
> >>>> name="classname">org.apache.solr.spelling.suggest.Suggester
> >>>> name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup
> >>   
> >>   
> >>   
> >>   content
> >>   0.05
> >>   true
> >>   100
> >>   autocomplete.dictionary
> >> 
> >>   
> >>
> >> when trying to replicate i get the following error message on the
> >> slave side:
> >>
> >>  2012-06-21 14:34:50,781 ERROR
> >> [pool-3-thread-1  ]
> >> handler.ReplicationHandler- SnapPull failed
> >> org.apache.solr.common.SolrException: Unable to rename: 
> >> autocomplete.dictionary.20120620120611
> >> at
> >> org.apache.solr.handler.SnapPuller.copyTmpConfFiles2Conf(SnapPuller.java:642)
> >> at
> >> org.apache.solr.handler.SnapPuller.downloadConfFiles(SnapPuller.java:526)
> >> at
> >> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:299)
> >> at
> >> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
> >> at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
> >> at
> >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> >> at
> >> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
> >> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
> >> at
> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
> >> at
> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
> >> at
> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
> >> at
> >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> >> at
> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> >> at java.lang.Thread.run(Thread.java:619)
> >>
> >> so i dug around it and found out that the solr's java process holds a
> >> lock on the autocomplete.dictionary file. any reason why this is so?
> >>
> >> thx,
> >>
> >> running:
> >> solr 3.5
> >> win7
> >>
> >
> >
> >
>
>
>
>
>
>
>
>


Re: Re: suggester/autocomplete locks file preventing replication

2012-06-22 Thread Simon Willnauer
On Fri, Jun 22, 2012 at 11:47 AM, Simon Willnauer <
simon.willna...@googlemail.com> wrote:

>
>
> On Fri, Jun 22, 2012 at 10:37 AM, tom  wrote:
>
>>  cross posting this issue to the dev list in the hope to get a response
>> here...
>>
>
> I think you are right. Closing the Stream / Reader is the responsibility
> of the caller not the FileDictionary IMO but solr doesn't close it so that
> might cause your problems. Are you running on windows by any chance?
> I will create an issue and fix it.
>

Hmm, I just looked at it and I see an IOUtils.close call in FileDictionary:

https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/contrib/spellchecker/src/java/org/apache/lucene/search/suggest/FileDictionary.java

are you using solr 3.6?

>
> simon
>
>>
>>
>>  Original Message   Subject: Re: suggester/autocomplete
>> locks file preventing replication  Date: Thu, 21 Jun 2012 17:11:40 +0200  
>> From:
>> tomReply-To:
>> solr-u...@lucene.apache.org  To: solr-u...@lucene.apache.org
>>
>>
>> pocking into the code i think the FileDictionary class is the culprit:
>> It takes an InputStream as a ctor argument but never releases the
>> stream. what puzzles me is that the class seems to allow a one-time
>> iteration and then the stream is useless, unless i'm missing smth. here.
>>
>> is there a good reason for this or rather a bug?
>> should i move the topic to the dev list?
>>
>>
>> On 21.06.2012 14:49, tom wrote:
>> > BTW: a core unload doesnt release the lock either ;(
>> >
>> >
>> > On 21.06.2012 14:39, tom wrote:
>> >> hi,
>> >>
>> >> i'm using the suggester with a file like so:
>> >>
>> >>   
>> >> 
>> >>   suggest
>> >>   > >> name="classname">org.apache.solr.spelling.suggest.Suggester
>> >>   > >> name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup
>> >>   
>> >>   
>> >>   
>> >>   content
>> >>   0.05
>> >>   true
>> >>   100
>> >>   autocomplete.dictionary
>> >> 
>> >>   
>> >>
>> >> when trying to replicate i get the following error message on the
>> >> slave side:
>> >>
>> >>  2012-06-21 14:34:50,781 ERROR
>> >> [pool-3-thread-1  ]
>> >> handler.ReplicationHandler- SnapPull failed
>> >> org.apache.solr.common.SolrException: Unable to rename: 
>> >> autocomplete.dictionary.20120620120611
>> >> at
>> >> org.apache.solr.handler.SnapPuller.copyTmpConfFiles2Conf(SnapPuller.java:642)
>> >> at
>> >> org.apache.solr.handler.SnapPuller.downloadConfFiles(SnapPuller.java:526)
>> >> at
>> >> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:299)
>> >> at
>> >> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
>> >> at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>> >> at
>> >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>> >> at
>> >> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>> >> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>> >> at
>> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>> >> at
>> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>> >> at
>> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>> >> at
>> >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>> >> at
>> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>> >> at java.lang.Thread.run(Thread.java:619)
>> >>
>> >> so i dug around it and found out that the solr's java process holds a
>> >> lock on the autocomplete.dictionary file. any reason why this is so?
>> >>
>> >> thx,
>> >>
>> >> running:
>> >> solr 3.5
>> >> win7
>> >>
>> >
>> >
>> >
>>
>>
>>
>>
>>
>>
>>
>>
>


Re: Re: suggester/autocomplete locks file preventing replication

2012-06-22 Thread Simon Willnauer
here is the issue https://issues.apache.org/jira/browse/SOLR-3570

On Fri, Jun 22, 2012 at 11:55 AM, Simon Willnauer <
simon.willna...@googlemail.com> wrote:

>
>
> On Fri, Jun 22, 2012 at 11:47 AM, Simon Willnauer <
> simon.willna...@googlemail.com> wrote:
>
>>
>>
>> On Fri, Jun 22, 2012 at 10:37 AM, tom  wrote:
>>
>>>  cross posting this issue to the dev list in the hope to get a response
>>> here...
>>>
>>
>> I think you are right. Closing the Stream / Reader is the responsibility
>> of the caller not the FileDictionary IMO but solr doesn't close it so that
>> might cause your problems. Are you running on windows by any chance?
>> I will create an issue and fix it.
>>
>
> hmm I just looked at it and I see a IOUtils.close call in FileDictionary
>
>
> https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/contrib/spellchecker/src/java/org/apache/lucene/search/suggest/FileDictionary.java
>
> are you using solr 3.6?
>
>>
>> simon
>>
>>>
>>>
>>>  Original Message   Subject: Re: suggester/autocomplete
>>> locks file preventing replication  Date: Thu, 21 Jun 2012 17:11:40 +0200  
>>> From:
>>> tomReply-To:
>>> solr-u...@lucene.apache.org  To: solr-u...@lucene.apache.org
>>>
>>>
>>> pocking into the code i think the FileDictionary class is the culprit:
>>> It takes an InputStream as a ctor argument but never releases the
>>> stream. what puzzles me is that the class seems to allow a one-time
>>> iteration and then the stream is useless, unless i'm missing smth. here.
>>>
>>> is there a good reason for this or rather a bug?
>>> should i move the topic to the dev list?
>>>
>>>
>>> On 21.06.2012 14:49, tom wrote:
>>> > BTW: a core unload doesnt release the lock either ;(
>>> >
>>> >
>>> > On 21.06.2012 14:39, tom wrote:
>>> >> hi,
>>> >>
>>> >> i'm using the suggester with a file like so:
>>> >>
>>> >>   
>>> >> 
>>> >>   suggest
>>> >>   >> >> name="classname">org.apache.solr.spelling.suggest.Suggester
>>> >>   >> >> name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup
>>> >>   
>>> >>   
>>> >>   
>>> >>   content
>>> >>   0.05
>>> >>   true
>>> >>   100
>>> >>   autocomplete.dictionary
>>> >> 
>>> >>   
>>> >>
>>> >> when trying to replicate i get the following error message on the
>>> >> slave side:
>>> >>
>>> >>  2012-06-21 14:34:50,781 ERROR
>>> >> [pool-3-thread-1  ]
>>> >> handler.ReplicationHandler- SnapPull failed
>>> >> org.apache.solr.common.SolrException: Unable to rename: 
>>> >> autocomplete.dictionary.20120620120611
>>> >> at
>>> >> org.apache.solr.handler.SnapPuller.copyTmpConfFiles2Conf(SnapPuller.java:642)
>>> >> at
>>> >> org.apache.solr.handler.SnapPuller.downloadConfFiles(SnapPuller.java:526)
>>> >> at
>>> >> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:299)
>>> >> at
>>> >> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
>>> >> at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>>> >> at
>>> >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>> >> at
>>> >> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>>> >> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>>> >> at
>>> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>>> >> at
>>> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>>> >> at
>>> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>>> >> at
>>> >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>> >> at
>>> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>> >> at java.lang.Thread.run(Thread.java:619)
>>> >>
>>> >> so i dug around it and found out that the solr's java process holds a
>>> >> lock on the autocomplete.dictionary file. any reason why this is so?
>>> >>
>>> >> thx,
>>> >>
>>> >> running:
>>> >> solr 3.5
>>> >> win7
>>> >>
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>


Re: suggester/autocomplete locks file preventing replication

2012-06-22 Thread Simon Willnauer
On Fri, Jun 22, 2012 at 1:06 PM, tom  wrote:

>  ah thx and to answer ur question:
>
> no we are still on 3.5 and i had tested it on win7 (see the very end of
> this email thread)
> and after having briefly looked @ the 3.6 code the class has changed quite
> a bit since 3.5...
>

Yeah, that is right! I rewrote most of the suggest code; I think this
should be fine now. So you might want to upgrade to 3.6 to get this
fixed.

simon

>
>
> On 22.06.2012 12:16, Simon Willnauer wrote:
>
> here is the issue https://issues.apache.org/jira/browse/SOLR-3570
>
> On Fri, Jun 22, 2012 at 11:55 AM, Simon Willnauer <
> simon.willna...@googlemail.com> wrote:
>
>>
>>
>>  On Fri, Jun 22, 2012 at 11:47 AM, Simon Willnauer <
>> simon.willna...@googlemail.com> wrote:
>>
>>>
>>>
>>>  On Fri, Jun 22, 2012 at 10:37 AM, tom  wrote:
>>>
>>>>  cross posting this issue to the dev list in the hope to get a response
>>>> here...
>>>>
>>>
>>>  I think you are right. Closing the Stream / Reader is the
>>> responsibility of the caller not the FileDictionary IMO but solr doesn't
>>> close it so that might cause your problems. Are you running on windows by
>>> any chance?
>>> I will create an issue and fix it.
>>>
>>
>>  hmm I just looked at it and I see a IOUtils.close call in FileDictionary
>>
>>
>> https://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/contrib/spellchecker/src/java/org/apache/lucene/search/suggest/FileDictionary.java
>>
>>  are you using solr 3.6?
>>
>>>
>>>  simon
>>>
>>>>
>>>>
>>>>  Original Message   Subject: Re:
>>>> suggester/autocomplete locks file preventing replication  Date: Thu,
>>>> 21 Jun 2012 17:11:40 +0200  From: tom 
>>>>   Reply-To:
>>>> solr-u...@lucene.apache.org  To: solr-u...@lucene.apache.org
>>>>
>>>>
>>>> pocking into the code i think the FileDictionary class is the culprit:
>>>> It takes an InputStream as a ctor argument but never releases the
>>>> stream. what puzzles me is that the class seems to allow a one-time
>>>> iteration and then the stream is useless, unless i'm missing smth. here.
>>>>
>>>> is there a good reason for this or rather a bug?
>>>> should i move the topic to the dev list?
>>>>
>>>>
>>>> On 21.06.2012 14:49, tom wrote:
>>>> > BTW: a core unload doesnt release the lock either ;(
>>>> >
>>>> >
>>>> > On 21.06.2012 14:39, tom wrote:
>>>> >> hi,
>>>> >>
>>>> >> i'm using the suggester with a file like so:
>>>> >>
>>>> >>   
>>>> >> 
>>>> >>   suggest
>>>> >>   >>> >> name="classname">org.apache.solr.spelling.suggest.Suggester
>>>> >>   >>> >> name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookup
>>>> >>   
>>>> >>   
>>>> >>   
>>>> >>   content
>>>> >>   0.05
>>>> >>   true
>>>> >>   100
>>>> >>   autocomplete.dictionary
>>>> >> 
>>>> >>   
>>>> >>
>>>> >> when trying to replicate i get the following error message on the
>>>> >> slave side:
>>>> >>
>>>> >>  2012-06-21 14:34:50,781 ERROR
>>>> >> [pool-3-thread-1  ]
>>>> >> handler.ReplicationHandler- SnapPull failed
>>>> >> org.apache.solr.common.SolrException: Unable to rename: 
>>>> >> autocomplete.dictionary.20120620120611
>>>> >> at
>>>> >> org.apache.solr.handler.SnapPuller.copyTmpConfFiles2Conf(SnapPuller.java:642)
>>>> >> at
>>>> >> org.apache.solr.handler.SnapPuller.downloadConfFiles(SnapPuller.java:526)
>>>> >> at
>>>> >> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:299)
>>>> >> at
>>>> >> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:268)
>>>> >> at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>>>> >> at
>>>> >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>>> >> at
>>>> >> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
>>>> >> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
>>>> >> at
>>>> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
>>>> >> at
>>>> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
>>>> >> at
>>>> >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
>>>> >> at
>>>> >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>> >> at
>>>> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>> >> at java.lang.Thread.run(Thread.java:619)
>>>> >>
>>>> >> so i dug around it and found out that the solr's java process holds a
>>>> >> lock on the autocomplete.dictionary file. any reason why this is so?
>>>> >>
>>>> >> thx,
>>>> >>
>>>> >> running:
>>>> >> solr 3.5
>>>> >> win7
>>>> >>
>>>> >
>>>> >
>>>> >
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>
>


Re: Build failed in Jenkins: Lucene-Core-4x-Beasting #6934

2012-06-25 Thread Simon Willnauer
I committed a fix in r1353443 I posted on
https://issues.apache.org/jira/browse/LUCENE-4158

simon

On Mon, Jun 25, 2012 at 8:04 PM, Robert Muir  wrote:
> I killed this after it was hung for 3 days. I'm not sure if its
> related to the other hang Uwe opened an issue for, or perhaps even
> fixed already.
>
> On Mon, Jun 25, 2012 at 2:03 PM,   wrote:
>> See 
>>
>> --
>> [...truncated 937 lines...]
>>   [junit4] Completed on J0 in 0.01s, 2 tests
>>   [junit4]
>>   [junit4] Suite: org.apache.lucene.analysis.TestMockCharFilter
>>   [junit4] Completed on J1 in 0.01s, 1 test
>>   [junit4]
>>   [junit4] Suite: 
>> org.apache.lucene.analysis.tokenattributes.TestSimpleAttributeImpl
>>   [junit4] Completed on J3 in 0.01s, 1 test
>>   [junit4]
>>   [junit4] Suite: org.apache.lucene.util.TestNamedSPILoader
>>   [junit4] Completed on J0 in 0.01s, 3 tests
>>   [junit4]
>>   [junit4] Suite: org.apache.lucene.TestAssertions
>>   [junit4] Completed on J1 in 0.01s, 1 test
>>   [junit4]
>>   [junit4] Suite: org.apache.lucene.index.TestIndexWriterOnJRECrash
>>   [junit4] IGNOR/A 0.00s J3 | TestIndexWriterOnJRECrash.testNRTThreads
>>   [junit4]    > Assumption #1: 'nightly' test group is disabled (@Nightly)
>>   [junit4] Completed on J3 in 0.01s, 1 test, 1 skipped
>>   [junit4]
>>   [junit4] Suite: org.apache.lucene.index.TestTerm
>>   [junit4] Completed on J0 in 0.01s, 1 test
>>   [junit4]
>>   [junit4] Suite: org.apache.lucene.store.TestByteArrayDataInput
>>   [junit4] Completed on J1 in 0.01s, 1 test
>>   [junit4]
>>   [junit4] Suite: org.apache.lucene.index.Test2BPostings
>>   [junit4] IGNOR/A 0.00s J3 | Test2BPostings.test
>>   [junit4]    > Assumption #1: 'nightly' test group is disabled (@Nightly)
>>   [junit4] Completed on J3 in 0.01s, 1 test, 1 skipped
>>   [junit4]
>>   [junit4] Suite: org.apache.lucene.util.TestIntsRef
>>   [junit4] Completed on J1 in 0.08s, 2 tests
>>   [junit4]
>>   [junit4] Suite: 
>> org.apache.lucene.codecs.lucene40.TestAllFilesHaveCodecHeader
>>   [junit4] Completed on J0 in 0.58s, 1 test
>>   [junit4]
>>   [junit4] Unhandled exception in thread: Thread[pumper-events,5,main]
>>   [junit4] 
>> com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.JsonIOException:
>>  java.io.IOException: java.lang.InterruptedException: sleep interrupted
>>   [junit4] JVM J2: stdout was not empty, see: 
>> 
>>   [junit4] >>> JVM J2: stdout (verbatim) 
>>   [junit4]     at 
>> com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.internal.Streams.parse(Streams.java:57)
>>   [junit4]     at 
>> com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.GsonToMiniGsonTypeAdapterFactory$3.read(GsonToMiniGsonTypeAdapterFactory.java:81)
>>   [junit4]     at 
>> com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.read(ReflectiveTypeAdapterFactory.java:86)
>>   [junit4]     at 
>> com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:170)
>>   [junit4]     at 
>> com.carrotsearch.ant.tasks.junit4.dependencies.com.google.gson.Gson.fromJson(Gson.java:720)
>>   [junit4] 2012-06-25 14:03:31
>>   [junit4] Full thread dump Java HotSpot(TM) 64-Bit Server VM (19.1-b02 
>> mixed mode):
>>   [junit4]
>>   [junit4] "Thread-65" daemon prio=10 tid=0x7f4754192000 nid=0xeb5 in 
>> Object.wait() [0x7f4794df6000]
>>   [junit4]    java.lang.Thread.State: WAITING (on object monitor)
>>   [junit4]     at java.lang.Object.wait(Native Method)
>>   [junit4]     - waiting on <0xe04eab68> (a 
>> org.apache.lucene.index.DocumentsWriterStallControl)
>>   [junit4]     at java.lang.Object.wait(Object.java:485)
>>   [junit4]     at 
>> org.apache.lucene.index.DocumentsWriterStallControl.waitIfStalled(DocumentsWriterStallControl.java:75)
>>   [junit4]     - locked <0xe04eab68> (a 
>> org.apache.lucene.index.DocumentsWriterStallControl)
>>   [junit4]     at 
>> org.apache.lucene.index.DocumentsWriterFlushControl.waitIfStalled(DocumentsWriterFlushControl.java:636)
>>   [junit4]     at 
>> org.apache.lucene.index.DocumentsWriter.preUpdate(DocumentsWriter.java:301)
>>   [junit4]     at 
>> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:361)
>>   [junit4]     at 
>> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1334)
>>   [junit4]     at 
>> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1085)
>>   [junit4]     at 
>> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1066)
>>   [junit4]     at 
>> org.apache.lucene.index.TestIndexWriterReader$2.run(TestIndexWriterReader.java:823)
>>   [junit4]
>>   [junit4] "Thread-64" daemon prio=10 tid=0x7f47540

Re: VOTE: 4.0 alpha (take two)

2012-06-26 Thread Simon Willnauer
Ran smoke tester and pushed the jars into my 4.0 apps everything looks
good to me.

here is my +1 (binding)

On Mon, Jun 25, 2012 at 11:27 PM, Robert Muir  wrote:
> artifacts are here:
>
> http://people.apache.org/~rmuir/staging_area/lucene-solr-4.0aRC1-rev1353699/
>
> Here is my +1
>
> --
> lucidimagination.com
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




Re: VOTE: 4.0 alpha (take two)

2012-06-26 Thread Simon Willnauer
On Tue, Jun 26, 2012 at 8:17 PM, Antoine LE FLOC'H  wrote:
> Hello
>
> Thanks a lot for creating this alpha version. It is very helpful. Everything
> builds fine from the src, but I get this error when putting the war on
> tomcat:
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201203.mbox/%3cca+gxezhy0+nroxmgamkasql8i3zvbzuwbegftcmn8t9kdf1...@mail.gmail.com%3E
> Are you aware of this error ? Thanks again.

that seems worth an issue. is there one already?

simon
>
> Antoine LE FLOC'H
>
>
>
> On Tue, Jun 26, 2012 at 6:31 PM, Ryan McKinley  wrote:
>>
>> +1
>>
>> I dropped the .jar files into my app and everything looks good.
>>
>> Thanks Robert!
>>
>>
>>
>> On Mon, Jun 25, 2012 at 2:27 PM, Robert Muir  wrote:
>> > artifacts are here:
>> >
>> >
>> > http://people.apache.org/~rmuir/staging_area/lucene-solr-4.0aRC1-rev1353699/
>> >
>> > Here is my +1
>> >
>> > --
>> > lucidimagination.com
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>




Re: regular but not individually reproducible failure: TestIndexWriterWithThreads.testRollbackAndCommitWithThreads

2012-06-29 Thread Simon Willnauer
This seems like a Windows file-system issue: we cannot obtain the write
lock quickly enough. Maybe we should do a retry loop in the test if we
are on Windows?

simon
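
The retry-loop idea can be sketched generically; this is a hypothetical helper (not Lucene's actual Lock API), assuming a lock attempt that reports success or failure:

```java
import java.util.function.BooleanSupplier;

public class RetryLock {
    /**
     * Retry an acquisition attempt a few times with a short pause, as a
     * workaround for file locks that are released lazily (as can happen
     * on Windows file systems).
     */
    static boolean obtainWithRetry(BooleanSupplier tryObtain,
                                   int maxAttempts, long pauseMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (tryObtain.getAsBoolean()) {
                return true; // lock acquired
            }
            try {
                Thread.sleep(pauseMillis); // give the OS time to release the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false; // still locked after all attempts
    }

    public static void main(String[] args) {
        // Simulate a lock that only becomes free on the third attempt.
        final int[] calls = {0};
        boolean ok = obtainWithRetry(() -> ++calls[0] >= 3, 5, 10);
        System.out.println(ok + " after " + calls[0] + " attempts"); // true after 3 attempts
    }
}
```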

On Thu, Jun 28, 2012 at 6:25 PM, Steven A Rowe  wrote:
> I see this roughly 50% of the time I run all Lucene core tests (Win7 64, Sun 
> JDK 1.6.0_21 64-bit), but the seeds don't reproduce the error when it's run 
> individually.
>
> The only unusual thing I'm doing on this machine is running some web crawls 
> in a VirtualBox Linux VM, hosted on and writing to a different physical drive 
> than the one I ran Lucene tests on.
>
> Here's the most recent failure:
>
> Suite: org.apache.lucene.index.TestIndexWriterWithThreads
> OK      0.04s J0 | 
> TestIndexWriterWithThreads.testIOExceptionDuringAbortOnlyOnce
> OK      1.01s J0 | 
> TestIndexWriterWithThreads.testOpenTwoIndexWritersOnDifferentThreads
> OK      0.97s J0 | 
> TestIndexWriterWithThreads.testIOExceptionDuringAbortWithThreads
> ERROR   7.07s J0 | TestIndexWriterWithThreads.testRollbackAndCommitWithThreads
>   > Throwable #1: java.lang.AssertionError
>   >    at 
> __randomizedtesting.SeedInfo.seed([299FEFE00B1E1931:5BC9FB3E94E8E02C]:0)
>   >    at org.junit.Assert.fail(Assert.java:92)
>   >    at org.junit.Assert.assertTrue(Assert.java:43)
>   >    at org.junit.Assert.assertTrue(Assert.java:54)
>   >    at 
> org.apache.lucene.index.TestIndexWriterWithThreads.testRollbackAndCommitWithThreads(TestIndexWriterWithThreads.java:588)
>   >    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   >    at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   >    at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   >    at java.lang.reflect.Method.invoke(Method.java:597)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1969)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$1100(RandomizedRunner.java:132)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:814)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:875)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:889)
>   >    at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>   >    at 
> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:32)
>   >    at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>   >    at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>   >    at 
> org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
>   >    at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
>   >    at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:821)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$700(RandomizedRunner.java:132)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$3$1.run(RandomizedRunner.java:669)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:695)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:734)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:745)
>   >    at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
>   >    at 
> org.apache.lucene.util.TestRuleReportUncaughtExceptions$1.evaluate(TestRuleReportUncaughtExceptions.java:68)
>   >    at 
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
>   >    at 
> org.apache.lucene.util.TestRuleIcuHack$1.evaluate(TestRuleIcuHack.java:51)
>   >    at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>   >    at 
> org.apache.lucene.util.TestRuleNoInstanceHooksOverrides$1.evaluate(TestRuleNoInstanceHooksOverrides.java:53)
>   >    at 
> org.apache.lucene.util.TestRuleNoStaticHooksShadowing$1.evaluate(TestRuleNoStaticHooksShadowing.java:52)
>   >    at 
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:36)
>   >    at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>   >    at 
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:56)
>   >    at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSuite(RandomizedRunner.java:605)
>   >    at 
> com.carro

Re: buildbot failure in ASF Buildbot on lucene-site-staging

2012-07-03 Thread Simon Willnauer
thanks for trying so hard robert!! I am sorry that it is such a pain!

simon

On Tue, Jul 3, 2012 at 12:27 AM, Robert Muir  wrote:
> I'm throwing in the towel. I've been trying to publish javadocs to the
> website for almost 10 hours now.
>
> On Mon, Jul 2, 2012 at 6:26 PM,   wrote:
>> The Buildbot has detected a new failure on builder lucene-site-staging while 
>> building ASF Buildbot.
>> Full details are available at:
>>  http://ci.apache.org/builders/lucene-site-staging/builds/209
>>
>> Buildbot URL: http://ci.apache.org/
>>
>> Buildslave for this Build: bb-cms-slave
>>
>> Build Reason: scheduler
>> Build Source Stamp: [branch lucene/cms] 1356506
>> Blamelist: rmuir
>>
>> BUILD FAILED: failed shell
>>
>> sincerely,
>>  -The Buildbot
>>
>>
>>
>
>
>
> --
> lucidimagination.com
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




FieldCache goes insane in JoinUtils

2012-07-16 Thread Simon Willnauer
Just want to forward this to the list in case this is serious:

simon

Changes:

[sarowe] LUCENE-4199: IntelliJ configuration: add lucene tools library
to allow compilation where the asm jar is a dependency

--
[...truncated 16387 lines...]
[junit4:junit4] Suite: org.apache.lucene.search.highlight.HighlighterTest
[junit4:junit4] Completed on J0 in 9.35s, 45 tests
[junit4:junit4]
[junit4:junit4] Suite:
org.apache.lucene.search.vectorhighlight.IndexTimeSynonymTest
[junit4:junit4] Completed on J1 in 4.23s, 17 tests
[junit4:junit4]
[junit4:junit4] Suite:
org.apache.lucene.search.vectorhighlight.SingleFragListBuilderTest
[junit4:junit4] Completed on J1 in 0.53s, 3 tests
[junit4:junit4]
[junit4:junit4] Suite:
org.apache.lucene.search.vectorhighlight.WeightedFragListBuilderTest
[junit4:junit4] Completed on J1 in 0.43s, 1 test
[junit4:junit4]
[junit4:junit4] Suite:
org.apache.lucene.search.vectorhighlight.ScoreOrderFragmentsBuilderTest
[junit4:junit4] Completed on J1 in 0.25s, 1 test
[junit4:junit4]
[junit4:junit4] Suite:
org.apache.lucene.search.vectorhighlight.SimpleBoundaryScannerTest
[junit4:junit4] Completed on J1 in 0.28s, 2 tests
[junit4:junit4]
[junit4:junit4] Suite: org.apache.lucene.search.vectorhighlight.FieldQueryTest
[junit4:junit4] Completed on J0 in 3.99s, 27 tests
[junit4:junit4]
[junit4:junit4] Suite:
org.apache.lucene.search.highlight.custom.HighlightCustomQueryTest
[junit4:junit4] Completed on J1 in 0.24s, 1 test
[junit4:junit4]
[junit4:junit4] JVM J0: 0.68 ..21.82 =21.14s
[junit4:junit4] JVM J1: 0.68 ..21.83 =21.15s
[junit4:junit4] Execution time total: 22 seconds
[junit4:junit4] Tests summary: 16 suites, 157 tests
 [echo] 5 slowest tests:
[junit4:tophints]   9.35s | org.apache.lucene.search.highlight.HighlighterTest
[junit4:tophints]   5.66s |
org.apache.lucene.search.vectorhighlight.FieldPhraseListTest
[junit4:tophints]   5.01s |
org.apache.lucene.search.vectorhighlight.SimpleFragListBuilderTest
[junit4:tophints]   4.23s |
org.apache.lucene.search.vectorhighlight.IndexTimeSynonymTest
[junit4:tophints]   3.99s |
org.apache.lucene.search.vectorhighlight.FieldQueryTest
 [echo] Building join...

ivy-availability-check:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file =


resolve:

common.init:

compile-lucene-core:

jflex-uptodate-check:

jflex-notice:

javacc-uptodate-check:

javacc-notice:

ivy-availability-check:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file =


resolve:

init:

-clover.disable:

-clover.setup:

clover:

common.compile-core:
[javac] Compiling 1 source file to


compile-core:

module-build.init:

check-grouping-uptodate:

jar-grouping:

init:

test:
 [echo] Building join...

ivy-availability-check:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file =


resolve:

common.init:

compile-lucene-core:

module-build.init:

check-grouping-uptodate:

jar-grouping:

init:

compile-test:
 [echo] Building join...

ivy-availability-check:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file =


resolve:

common.init:

compile-lucene-core:

module-build.init:

check-grouping-uptodate:

jar-grouping:

init:

-clover.disable:

-clover.setup:

clover:

compile-core:

compile-test-framework:

ivy-availability-check:

ivy-fail:

ivy-configure:
[ivy:configure] :: loading settings :: file =


resolve:

init:

compile-lucene-core:

compile-core:

common.compile-test:

install-junit4-taskdef:

-clover.disable:

-clover.setup:

clover:

validate:

common.test:
[mkdir] Created dir:

[junit4:junit4]  says g'day! Master seed: 4613CDE1DF5391F
[junit4:junit4] Executing 2 suites with 2 JVMs.
[junit4:junit4] Suite: org.apache.lucene.search.join.TestBlockJoin
[junit4:junit4] Completed on J1 in 19.86s, 7 tests
[junit4:junit4]
[junit4:junit4] HEARTBEAT J0: 2012-07-15T15:30:03, no events in:
69.7s, approx. at: TestJoinUtil.testSingleValueRandomJoin
[junit4:junit4] Suite: org.apache.lucene.search.join.TestJoinUtil
[junit4:junit4] FAILURE 93.1s J0 | TestJoinUtil.testSingleValueRandomJoin
[junit4:junit4]> Throwable #1: java.lang.AssertionError:
testSingleValueRandomJoin(org.apache.lucene.search.join.TestJoinUtil):
Insane FieldCache usage(s) found expected:<0> but was:<1>
[junit4:junit4]>at
__randomizedtesting.SeedInfo.seed([4613CDE1DF5391F:E8D92D94953625C0]:0)
[junit4:junit4]>at org.junit.Assert.fail(Assert.java:93)
[junit4:junit4]>at org.junit.Assert.failNotEquals(Assert.java:647)
[junit4:junit4]>at org.junit.Assert.assertEquals(Assert.java:128)
[junit4:junit4]>at org.junit.Assert.assertEquals(Assert.java:472)
[junit4:junit4]>at
org.apache.lucene.util.LuceneTestCase.assertSaneFieldCaches(LuceneTestCase.java:515)
[junit4:junit4]>at
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:48)
[junit4:junit4]>at
org.apache.lucene.util.AbstractBef

Re: VOTE: Lucene/Solr 3.6.1

2012-07-17 Thread Simon Willnauer
+1 thanks uwe

On Tue, Jul 17, 2012 at 8:16 PM, Erick Erickson  wrote:
> +1 fire away
>
> On Tue, Jul 17, 2012 at 11:47 AM, Robert Muir  wrote:
>> +1.
>>
>> Smoketester passes, looks good.
>>
>> Thanks for doing this Uwe!
>>
>> On Tue, Jul 17, 2012 at 11:01 AM, Uwe Schindler  wrote:
>>> Please vote to release these artifacts for Apache Lucene and Solr 3.6.1:
>>> http://s.apache.org/lucene361
>>>
>>> I tested with dev-tools/scripts/smokeTestRelease.py, ran rat-sources on both
>>> source releases, tested solr example, and reviewed packaging contents.
>>> There was only minor issue in the SmokeTester: It did not test Solr with
>>> Java 5, but I did that manually so Solr example + tests works with Java 5
>>> (as the release on itsself was built with Java 5).
>>>
>>> Here's my +1.
>>>
>>> -
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>>
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>
>>
>>
>> --
>> lucidimagination.com
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




Re: BytesRef comparable

2010-05-03 Thread Simon Willnauer
+1 sounds good

On Sun, May 2, 2010 at 7:59 PM, Shai Erera  wrote:

> +1
>
> Shai
>
>
> On Sun, May 2, 2010 at 8:58 PM, Uwe Schindler  wrote:
>
>> When we do the refactoring in flex to remove the comparator, this would be
>> the first that comes to my mind.
>>
>> +1
>>
>> -
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: u...@thetaphi.de
>>
>> > -Original Message-
>> > From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
>> > Seeley
>> > Sent: Sunday, May 02, 2010 7:33 PM
>> > To: java-...@lucene.apache.org
>> > Subject: BytesRef comparable
>> >
>> > Any objections to making BytesRef comparable?  It would make it much
>> > easier to use with containers that don't take comparators as
>> > parameters.
>> >
>> > -Yonik
>> > Apache Lucene Eurocon 2010
>> > 18-21 May 2010 | Prague
>> >
>> > -
>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>


Speakers and Schedule for Berlin Buzzwords 2010 - Search, Store and Scale 7th/8th 2010

2010-05-07 Thread Simon Willnauer
Hi folks,

Today we proudly present the Berlin Buzzwords talks and presentations.
As promised there are tracks specific to the three tags search, store and scale.
We have a fantastic mixture of developers and users of open source software
projects that make scaling data processing possible today.

There is Steve Loughran, Aaron Kimball and Stefan Groschupf from the
Apache Hadoop community. We have Grant Ingersoll, Robert Muir and the
"Generics Policeman" Uwe Schindler from the Lucene community.

For those interested in NoSQL databases there is Mathias Stearn from MongoDB,
Jan Lehnardt from CouchDB and Eric Evans, the guy who coined the term
NoSQL one year ago.

We have just published the initial version of the schedule here:

http://berlinbuzzwords.de/content/schedule-published

It seems like we are having a fantastic set of talks and speakers for Buzzwords.

Visit us at http://berlinbuzzwords.de - looking forward to seeing you there!

Simon




Re: questions on CommonsHttpSolrServer and MultiThreadedHttpConnectionManager

2010-05-23 Thread Simon Willnauer
Hey there,

check out Ryan's link for an explanation of your TooManyOpenFiles problem.
For the query time, make sure the Nagle algorithm is disabled - it should
be by default, though.

MultiThreadedHttpConnectionManager mgr = new MultiThreadedHttpConnectionManager();
mgr.getParams().setTcpNoDelay(true);

You should also make sure you keep your HttpClient instance around and
share it between requests; that will most likely improve your
performance. I don't know if you already do this - I can't tell from
the snippet you sent around.

I also think your values for setDefaultMaxConnectionsPerHost and
setMaxTotalConnections are kind of high. I guess you can reduce
them dramatically, to about 100 or 1000.

simon
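
The `setTcpNoDelay(true)` call above ultimately flips the standard TCP_NODELAY socket option on each pooled connection. A self-contained JDK illustration of that same option (plain java.net.Socket, not the Commons HttpClient API itself):

```java
import java.net.Socket;
import java.net.SocketException;

public class NagleDemo {
    public static void main(String[] args) throws SocketException {
        // An unconnected socket is enough to demonstrate the option;
        // no network access is needed.
        Socket socket = new Socket();
        socket.setTcpNoDelay(true); // disable Nagle's algorithm
        System.out.println("TCP_NODELAY = " + socket.getTcpNoDelay());
    }
}
```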

On Sat, May 22, 2010 at 2:05 AM, Liu, Chang  wrote:
> Hi All,
>
>
>
> I have a question about using CommonsHttpSolrServer with
> MultiThreadedHttpConnectionManager. We have a single CommonsHttpSolrServer
> instance that handles all the solr requests. Once we deployed our
> application to linux servers, we kept getting too many files opened
> exception. After did some internet researches, we found that use
> CommonsHttpSolrServer instead of SolrServer should the solution that we are
> looking for. I was also hoping to get the overhead down. For some queries
> that we are running, Actual qtime is 2 to 3 ms, but the the call
> SOLRSERVER.query(query, SolrRequest.METHOD.POST) took more than 80 ms if not
> more. After we updated our application, we didn’t see much improvement on
> overhead issues. I am wondering if the changes we put in place is functional
> at all. Any suggests/advices/tutorials will be highly appreciated.
>
>
>
> Here is the snippet of our codes that does initialization and query. If we
> keep the code as it is, is the multithreadconnectionmanager active at all or
> we need to call CommonsHttpSolrServer.request() to take advantages of the
> connection pooling
>
>
>
>
>
>     HttpClient client = new  HttpClient();
>
>         MultiThreadedHttpConnectionManager mgr = new
> MultiThreadedHttpConnectionManager();
>
>         client.setHttpConnectionManager(mgr);
>
>     SOLRSERVER = new
> CommonsHttpSolrServer(Constants.SOLRSERVERURL, client);
>
>     SOLRSERVER.setDefaultMaxConnectionsPerHost(1);
>
>     SOLRSERVER.setMaxTotalConnections(1);
>
>     SOLRSERVER.setMaxRetries(1);
>
>     SOLRSERVER.setSoTimeout(1000);
>
>
>
>
>
>
>
>     SolrQuery query = new SolrQuery(queryString);
>
>        QueryResponse response = SOLRSERVER.query(query,
> SolrRequest.METHOD.POST);
>
>
>
>
>
>
>
> Thanks
>
> Chang
>
>




Re: Welcome Andrzej Bialecki as Lucene/Solr committer

2010-05-24 Thread Simon Willnauer
Welcome Andrzej!

simon

On Mon, May 24, 2010 at 11:36 AM, Uwe Schindler  wrote:
> Welcome Andrzej! I am glad to have you finally on the Team :-)
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -Original Message-
>> From: Michael McCandless [mailto:luc...@mikemccandless.com]
>> Sent: Monday, May 24, 2010 11:34 AM
>> To: dev@lucene.apache.org
>> Subject: Welcome Andrzej Bialecki as Lucene/Solr committer
>>
>> I'm happy to announce that the PMC has accepted Andrzej Bialecki as
>> Lucene/Solr committer!
>>
>> Welcome aboard Andrzej,
>>
>> Mike
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
>> commands, e-mail: dev-h...@lucene.apache.org
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>




Re: Solr updateRequestHandler and performance vs. atomicity

2010-05-24 Thread Simon Willnauer
Hi Karl,

What you are describing seems to be a good use case for something like
a message queue, where you push a document or record to a queue that
guarantees persistence. I look at this from a slightly different
perspective: in a distributed environment you would have to guarantee
delivery not just to a single Solr instance but to several, or at
least n, instances - but that is a different story.

From a Solr point of view this sounds like a need for a write-ahead
log that guarantees durability and atomicity. I like this idea as it
might also solve lots of problems in distributed environments (solr
cloud) etc.

Very interesting topic - should investigate more in this direction


simon
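
A minimal sketch of that write-ahead-log idea - illustrative names and one-record-per-line format, not Solr's actual implementation: append each incoming record to a durable, fsync'ed log before acknowledging it, and replay the log into the index on startup:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class DocLog {
    private final Path logFile;

    DocLog(Path logFile) { this.logFile = logFile; }

    /** Append one record and force it to disk before returning. */
    void append(String record) throws IOException {
        try (FileChannel ch = FileChannel.open(logFile,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            ch.write(StandardCharsets.UTF_8.encode(record + "\n"));
            ch.force(true); // fsync: the record survives a process crash
        }
    }

    /** Replay all logged records, e.g. into the index on startup. */
    List<String> replay() throws IOException {
        return Files.exists(logFile) ? Files.readAllLines(logFile) : List.of();
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("doclog", ".wal");
        DocLog log = new DocLog(p);
        log.append("doc-1");
        log.append("doc-2");
        System.out.println(log.replay()); // [doc-1, doc-2]
    }
}
```

A real implementation would also truncate the log after a successful commit; this sketch only shows the durability half.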


On Mon, May 24, 2010 at 10:03 PM,   wrote:
> Hi Mark,
>
> Unfortunately, indexing performance *is* of concern, otherwise I'd already be 
> committing on every post.
>
> If your guess is correct, you are basically saying that adding a document to 
> an index in Solr/Lucene is just as fast as writing that file directly to the 
> disk.  Because, obviously, if we want guaranteed delivery, that's what we'd 
> have to do.  But I think this is worth the experiment - Solr/Lucene may be 
> fast, but I have doubts that it can perform as well as raw disk I/O and still 
> manage to do anything in the way of document analysis or (heaven forbid) text 
> extraction.
>
>
>
> -Original Message-
> From: ext Mark Miller [mailto:markrmil...@gmail.com]
> Sent: Monday, May 24, 2010 3:33 PM
> To: dev@lucene.apache.org
> Subject: Re: Solr updateRequestHandler and performance vs. atomicity
>
> On 5/24/10 3:10 PM, karl.wri...@nokia.com wrote:
>> Hi all,
>> It seems to me that the "commit" logic in the Solr updateRequestHandler
>> (or wherever the logic is actually located) conflates two different
>> semantics. One semantic is what you need to do to make the index process
>> perform well. The other semantic is guaranteed atomicity of document
>> reception by Solr.
>> In particular, it would be nice to be able to post documents in such a
>> way that you can guarantee that the document is permanently in Solr's
>> queue, safe in the event of a Solr restart, etc., even if the document
>> has not yet been "committed".
>> This issue came up in the LCF talk that I gave, and I initially thought
>> that separating the two kinds of events would necessarily be an LCF
>> change, but the more I thought about it the more I realized that other
>> Solr indexing clients may also benefit from such a separation.
>> Does anyone agree? Where should this logic properly live?
>> Thanks,
>> Karl
>
> Its an interesting idea - but I think you would likely pay a similar
> cost to guarantee reception as you would to commit (also, I'm not sure
> Lucene guarantees it - it works for consistency, but I'm not so sure it
> achieves durability).
>
> I can think of two things offhand -
>
> Perhaps store the text and use fsync to quasi guarantee acceptance -
> then index from the store on the commit.
>
> Another simpler idea if only the separation is important and not the
> performance - index to another side index, taking advantage of Lucene's
> current commit functionality, and then use addIndex to merge to the main
> index on commit.
>
> Just spit balling though.
>
> I think this would obviously need to be an optional mode.
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>




Re: SolrCloud integration roadmap

2010-05-30 Thread Simon Willnauer
On Sun, May 30, 2010 at 6:03 PM, olivier sallou
 wrote:
> Hi,
> I'd like to know when SolrCloud feature will be released in Solr ? I saw a
> Jira track about this to integrate in trunk but I cannot see related
> roadmap.
The patch might be integrated into trunk shortly, I assume - a release
isn't that near right now, for various reasons.
If you really need this feature you should probably join in and help
push it forward; there is lots of work to do on the indexing side
of things.

simon
> I definitly need this feature I was going to develop myself as an additional
> layer above Solr (at least partially for my needs) just before reading a
> wiki article about it.
>
> Regards
>
> Olivier
>




Re: SolrCloud integration roadmap

2010-06-01 Thread Simon Willnauer
Oliver, you can already get a snapshot from the branch

https://svn.apache.org/repos/asf/lucene/solr/branches/cloud/

and see the wiki for details: http://wiki.apache.org/solr/SolrCloud

simon

On Tue, Jun 1, 2010 at 9:06 AM, olivier sallou  wrote:
> Well,
> I would be glad to help on this feature. In very short term, I must first go
> on my prototype without Cloud features, but after that I would enjoy to help
> (could also help for testing).
> I think I will first take a snapshot from trunk when available and test it
> on my platform where I can easily set-up multiple virtual servers to test.
>
> Olivier
>
> 2010/5/30 Simon Willnauer 
>>
>> On Sun, May 30, 2010 at 6:03 PM, olivier sallou
>>  wrote:
>> > Hi,
>> > I'd like to know when SolrCloud feature will be released in Solr ? I saw
>> > a
>> > Jira track about this to integrate in trunk but I cannot see related
>> > roadmap.
>> The patch might be integrated into trunk shortly I assume - a release
>> isn't that near right now for various reasons.
>> If you really need this feature you should probably join and help
>> pushing it forwards, there is lots of work to do on the indexing side
>> of things.
>>
>> simon
>> > I definitly need this feature I was going to develop myself as an
>> > additional
>> > layer above Solr (at least partially for my needs) just before reading a
>> > wiki article about it.
>> >
>> > Regards
>> >
>> > Olivier
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
>




Re: lucene-dev.jar vs lucene-SNAPSHOT.jar

2010-06-06 Thread Simon Willnauer
Can't you simply build it with ant -Ddev.version=4.0-SNAPSHOT?
That should build a jar with the corresponding artifacts.

simon

On Sun, Jun 6, 2010 at 9:31 PM, Ryan McKinley  wrote:
> (hoping to avoid an anti-maven flame war)
>
> I'm looking at getting the ant tasks to play nice with maven again.
> In solr, there is a way to have the .jar files be "-dev.jar" but have
> the artifacts still map to -SNAPSHOT.jar.
>
> Getting the lucene ant scripts to do this is beyond my ant mojo, so it
> leads me to ask another question...
>
> How do people feel about renaming the lucene (and solr) dev .jar files
> from "-dev.jar" to "-SNAPSHOT.jar"?
>
> thoughts?
> ryan
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>




Re: lucene-dev.jar vs lucene-SNAPSHOT.jar

2010-06-06 Thread Simon Willnauer
I personally don't really care whether it is called dev or SNAPSHOT. If it
helps other people integrate current trunk code more easily, and in
turn use the latest code for development, I think we should do it.
Maybe I am missing something about why it is so important for it to be
called "-dev" - if so, please let me know. If it is only about somebody's
preference, they could use -Ddev.version=4.0-dev too.

simon

On Mon, Jun 7, 2010 at 2:41 AM, Ryan McKinley  wrote:
> ah yes, that would work... still not great for integrating with solr
> and keeping the versions in sync though.  As a one-off just for me,
> that works fine, but I wonder more about getting snapshot builds as
> part of the hudson build scripts.
>
> perhaps 3.x, 4.x would be a good time to change some file names so
> they play nicer with maven (and does not really impact how things are
> currently done)
>
> ryan
>
>
>
> On Sun, Jun 6, 2010 at 5:51 PM, Simon Willnauer
>  wrote:
>> Can't you simply build it with ant -Ddev.version=4.0-SNAPSHOT
>> that should build a jar with the corresponding artifacts.
>>
>> simon
>>
>> On Sun, Jun 6, 2010 at 9:31 PM, Ryan McKinley  wrote:
>>> (hoping to avoid an anti-maven flame war)
>>>
>>> I'm looking at getting the ant tasks to play nice with maven again.
>>> In solr, there is a way to have the .jar files be "-dev.jar" but have
>>> the artifacts still map to -SNAPSHOT.jar.
>>>
>>> Getting the lucene ant scripts to do this is beyond my ant mojo, so it
>>> leads me to ask another question...
>>>
>>> How do people feel about renaming the lucene (and solr) dev .jar files
>>> from "-dev.jar" to "-SNAPSHOT.jar"?
>>>
>>> thoughts?
>>> ryan
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>




Re: Solr spewage and dropped documents, while indexing

2010-06-07 Thread Simon Willnauer
Karl, the HTTP error lines are produced by your code, right? Can you
provide what was returned by Solr?
If this were related to any server-side problem described above, such
as running out of sockets, you would not see a 400! I could also imagine
that the documents you are sending are empty - is that something that
could have happened?

simon

On Mon, Jun 7, 2010 at 5:05 PM,   wrote:
> Perhaps - although missing_content_stream seems to imply that it had at least 
> partly read 4 requests which later failed.  Also, wouldn't there be something 
> in the output log which would give us a clue as to what happened?
>
> Is there any post-hiccup spelunking I can reasonably do?  Or should I try to 
> reproduce the problem with more diagnostics on?
>
> Karl
>
>
> -Original Message-
> From: ext Bernd Fondermann [mailto:bf_...@brainlounge.de]
> Sent: Monday, June 07, 2010 10:54 AM
> To: dev@lucene.apache.org
> Subject: Re: Solr spewage and dropped documents, while indexing
>
> Looks like a server-side problem to me.
> Maybe the server ran out of sockets or other resources and just replied
> with a 400 error?
>
>  Bernd
>
> karl.wri...@nokia.com wrote:
>> Hi folks,
>>
>> This morning I was experimenting with using multiple threads while indexing 
>> some 20,000,000 records worth of content.  In fact, my test spun up some 50 
>> threads, and happily chugged away for a couple of hours before I saw the 
>> following output from my test code:
>>
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to 
>> index record 6469124
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to 
>> index record 6469551
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to 
>> index record 6470592
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to 
>> index record 6472454
>> java.net.SocketException: Connection reset
>>         at java.net.SocketInputStream.read(SocketInputStream.java:168)
>>         at HttpPoster.getResponse(HttpPoster.java:280)
>>         at HttpPoster.indexPost(HttpPoster.java:191)
>>         at ParseAndLoad$PostThread.run(ParseAndLoad.java:638)
>> <<
>>
>> Looking at the solr-side output, I see nothing interesting at all:
>>
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract 
>> params={literal.nokia_longitude=9.78518981933594&literal.nokia_phone=%2B497971910474&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Münster&literal.nokia_placerating=0&literal.id=6472724&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=1&literal.nokia_ppid=276u0wyw-c8cb7f4d6cd84a639a4e7d3570bf8814&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9985514322917&literal.nokia_postalcode=74405&literal.nokia_street=WeinhaldenstraÃe&literal.nokia_title=Dorfgemeinschaft+Münster+e.V.&literal.nokia_category=261}
>>  status=0 QTime=1
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract 
>> params={literal.nokia_longitude=9.76717020670573&literal.nokia_phone=%2B497971950725&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472737&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=13&literal.nokia_ppid=276u0wyw-d3bed6449fcb41b0adc50ae08e041f8d&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9974405924479&literal.nokia_fax=%2B497971950712&literal.nokia_postalcode=74405&literal.nokia_street=KochstraÃe&literal.nokia_title=BayWa+AG+Bau-+%26+Gartenmarkt&literal.nokia_category=194}
>>  status=0 QTime=0
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract 
>> params={literal.nokia_longitude=9.77591044108073&literal.nokia_phone=%2B49797124009&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Unterrot&literal.nokia_placerating=0&literal.id=6472739&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=28&literal.nokia_ppid=276u0wyw-d534d7a9235a4edf878d5e32a34bad8b&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9791788736979&literal.nokia_fax=%2B49797123431&literal.nokia_postalcode=74405&literal.nokia_street=HauptstraÃe&literal.nokia_title=Gastel+R.&literal.nokia_category=5}
>>  status=0 QTime=1
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract 
>> params={literal.nokia_longitude=9.76935&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=5&literal.id=6472698&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=15&literal.nokia_ppid=276u0wyw-9544100e68d74162aff54783b9376134&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9981&literal.nokia_postalcode=74405&literal.nokia_street=KanzleistraÃe&literal.nokia_tag=Steuerberat

Re: Branches for large patches?

2010-06-10 Thread Simon Willnauer
On Thu, Jun 10, 2010 at 6:15 PM, Mark Miller  wrote:
> +1 to option 2 - I like feature branches more than an issue/patch
> branch approach - large features often span many issues/patches.
>
> Option 2 can also really be a superset of option 1.

+1 to option 2 - I guess we should start with more feature branches
and see if that makes things easier.

I agree with Mark that this can be a superset of option 1.

simon
>
> - Mark
>
> http://www.lucidimagination.com (mobile)
>
> On Jun 10, 2010, at 8:17 AM, Michael Busch  wrote:
>
>> Hi All,
>>
>> When working on large patches, such as LUCENE-2324, I find it always
>> troublesome to use patch files only to track progress.  Since branching in
>> svn works fine now (since 1.5) I'd like to create a branch for 2324.  The
>> big advantage is that everyone can track progress much more easily because
>> we get a full history on that branch.  And people who commit patches to
>> trunk can help merging, because it's sometimes very difficult if you haven't
>> followed that other change closely.
>>
>> I talked about this with Robert, Uwe and Simon in Berlin, and they all
>> like this proposal.
>>
>> So two different approaches come to my mind:
>> 1) branches/patches/LUCENE-2324/
>> 2) branches/lucene-realtime/
>>
>>
>> 1) We would have a dedicated place for branches that are used for
>> individual patches.  Every committer who thinks it makes sense for a certain
>> patch to create a branch can do in the branches/patches location.
>>
>> 2) Like with flexible indexing we create a branch for bigger features.
>>  E.g. for realtime search there are several open issues in JIRA and we could
>> just use this single branch for all of them until we're ready to merge a
>> stable realtime version to trunk.
>>
>> I like both 1) and 2) and don't have a strong preference.  What do others
>> think?
>>
>> Michael
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Branches for large patches?

2010-06-10 Thread Simon Willnauer
:)

On Thu, Jun 10, 2010 at 6:41 PM, Michael Busch  wrote:
> OK cool.  I'll start the new realtime branch soon!
>
>  Michael
>
> On 6/10/10 9:30 AM, karl.wri...@nokia.com wrote:
>>
>> I technically can't vote, but if I could I would say +1 to option 2 as
>> well.
>> Karl
>>
>> -Original Message-
>> From: ext Mark Miller [mailto:markrmil...@gmail.com]
>> Sent: Thursday, June 10, 2010 12:16 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Branches for large patches?
>>
>> +1 to option 2 - I like feature branches more than an issue/patch
>> branch approach - large features often span many issues/patches.
>>
>> Option 2 can also really be a superset of option 1.
>>
>> - Mark
>>
>> http://www.lucidimagination.com (mobile)
>>
>> On Jun 10, 2010, at 8:17 AM, Michael Busch  wrote:
>>
>>
>>>
>>> Hi All,
>>>
>>> When working on large patches, such as LUCENE-2324, I find it always
>>> troublesome to use patch files only to track progress.  Since
>>> branching in svn works fine now (since 1.5) I'd like to create a
>>> branch for 2324.  The big advantage is that everyone can track
>>> progress much more easily because we get a full history on that
>>> branch.  And people who commit patches to trunk can help merging,
>>> because it's sometimes very difficult if you haven't followed that
>>> other change closely.
>>>
>>> I talked about this with Robert, Uwe and Simon in Berlin, and they
>>> all like this proposal.
>>>
>>> So two different approaches come to my mind:
>>> 1) branches/patches/LUCENE-2324/
>>> 2) branches/lucene-realtime/
>>>
>>>
>>> 1) We would have a dedicated place for branches that are used for
>>> individual patches.  Every committer who thinks it makes sense for a
>>> certain patch to create a branch can do in the branches/patches
>>> location.
>>>
>>> 2) Like with flexible indexing we create a branch for bigger
>>> features.  E.g. for realtime search there are several open issues in
>>> JIRA and we could just use this single branch for all of them until
>>> we're ready to merge a stable realtime version to trunk.
>>>
>>> I like both 1) and 2) and don't have a strong preference.  What do
>>> others think?
>>>
>>> Michael
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Build failed in Hudson: Lucene-trunk #1213

2010-06-10 Thread Simon Willnauer
That seems to be caused by very high load on the hudson box - at
least that's what this ticket suggests:
http://issues.hudson-ci.org/browse/HUDSON-6531

simon

On Fri, Jun 11, 2010 at 5:31 AM, Apache Hudson Server
 wrote:
> See 
>
> Changes:
>
> [simonw] LUCENE-2494: use CompletionService in ParallelMultiSearcher instead 
> of simple polling
>
> --
> [...truncated 11233 lines...]
>    [junit] Testsuite: org.apache.lucene.search.TestPhrasePrefixQuery
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.011 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestPhraseQuery
>    [junit] Tests run: 16, Failures: 0, Errors: 0, Time elapsed: 2.033 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestPositionIncrement
>    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.151 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestPositiveScoresOnlyCollector
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.032 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestPrefixFilter
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.039 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestPrefixInBooleanQuery
>    [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.878 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestPrefixQuery
>    [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.024 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestQueryTermVector
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.022 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestQueryWrapperFilter
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.092 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestRegexpQuery
>    [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.205 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestRegexpRandom
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 49.188 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestRegexpRandom2
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 101.927 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestScoreCachingWrappingScorer
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.048 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestScorerPerf
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 16.547 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestSetNorm
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.011 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestSimilarity
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.15 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestSimpleExplanations
>    [junit] Tests run: 53, Failures: 0, Errors: 0, Time elapsed: 5.492 sec
>    [junit]
>    [junit] Testsuite: 
> org.apache.lucene.search.TestSimpleExplanationsOfNonMatches
>    [junit] Tests run: 53, Failures: 0, Errors: 0, Time elapsed: 0.438 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestSloppyPhraseQuery
>    [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.941 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestSort
>    [junit] Tests run: 23, Failures: 0, Errors: 0, Time elapsed: 16.573 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestSpanQueryFilter
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.065 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestTermRangeFilter
>    [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 8.307 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestTermRangeQuery
>    [junit] Tests run: 12, Failures: 0, Errors: 0, Time elapsed: 0.16 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestTermScorer
>    [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.028 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestTermVectors
>    [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 1.174 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestThreadSafe
>    [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 12.837 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestTimeLimitingCollector
>    [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 4.126 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestTopDocsCollector
>    [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.19 sec
>    [junit]
>    [junit] Testsuite: org.apache.lucene.search.TestTopScoreDocCollector
>    [junit] Tests run: 1, Failur

Re: Hudson build is back to normal : Lucene-trunk #1214

2010-06-13 Thread Simon Willnauer
woohoo! hudson made it again :) - thanks uwe for restarting

On Sun, Jun 13, 2010 at 11:48 AM, Apache Hudson Server
 wrote:
> See 
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Synchronized blocks?

2010-06-13 Thread Simon Willnauer
Hi Edward,

I don't think we should take any action on replacing all synchronized
blocks with ReentrantLock.
In my opinion synchronized should still be preferred to ReentrantLocks
as intrinsic locks still have many advantages over explicit locks.
If you don't need any features not provided by intrinsic locking you
should stick to synchronized blocks. Almost everybody is familiar with
the notation, the concept is well known and they are less error-prone.

While explicit locks showed significant throughput improvements over
intrinsic locks in Java 5, those more or less disappeared in Java 6.
It actually seems more likely that future improvements will target
intrinsic locks rather than explicit locks, due to their built-in
nature. One more thing about synchronized blocks: they do appear in
thread dumps, while explicit locks don't (in Java 6 they actually do),
which is a very important property in my eyes.

I still think that in certain locations we could use explicit locks so
keep your eyes open - replacing all synchronized blocks might not be a
good idea!

simon
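The tradeoff Simon describes can be sketched in a small Java example. The class and method names below are illustrative only (not taken from Lucene); the sketch just contrasts the familiar synchronized notation with an explicit ReentrantReadWriteLock, whose manual unlock-in-finally discipline is one reason intrinsic locks are considered less error-prone.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockStyles {
    private int value;

    // Intrinsic locking: compact notation, and the monitor shows up
    // in thread dumps, which helps when diagnosing contention.
    public synchronized int incrementIntrinsic() {
        return ++value;
    }

    // Explicit locking: allows multiple concurrent readers, but every
    // lock() must be paired with unlock() in a finally block by hand.
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    public int read() {
        rwLock.readLock().lock();
        try {
            return value;
        } finally {
            rwLock.readLock().unlock();
        }
    }

    public int incrementExplicit() {
        rwLock.writeLock().lock();
        try {
            return ++value;
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        LockStyles l = new LockStyles();
        l.incrementIntrinsic();
        l.incrementExplicit();
        System.out.println(l.read()); // prints 2
    }
}
```

The read path is where ReentrantReadWriteLock can pay off (many readers proceed in parallel); for simple mutual exclusion the synchronized version does the same job with less ceremony.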

On Tue, Jun 8, 2010 at 4:37 AM, Edward Drapkin  wrote:
> Hello all,
>
> I've been getting more acquainted with the source for lucene lately and have
> noticed that all contention points (that I've seen) are handled by
> synchronized blocks.
>
> Since Lucene now requires Java 5, my question is this: is there any
> compelling reason to not dig through the code and replace uses of
> synchronized with ReentrantReadWriteLocks, especially as the performance of
> that locking mechanism (where behavior is the same) is much better in Java
> 5? While the sheer throughput difference in Java 6 may be lower, the ability
> to have multiple concurrent readers without contention ought to yield across
> the board performance improvements of some significance.
>
> As I am digging through (most of) the code anyway, I would have no problem
> doing this myself and "upgrading" the locking mechanisms everywhere, but
> seeing as it is an enormous venture, I wanted to make sure that it would be
> okay to do before investing the time and that an enormous patch wouldn't
> immediately be rejected.  Also, if this is okay to do, how should I present
> the patch?  I would think that an enormous patch touching dozens (if not
> well over a hundred?) of files isn't preferable at all, but I can do that
> too!
>
> Thanks,
> Eddie
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Repeat to the right list: Solr spewage and possible re-entrancy problem?

2010-06-14 Thread Simon Willnauer
Hey Karl,

the TIME_WAIT states you see are ok from the TCP perspective. The end
that sends the first FIN goes into the TIME_WAIT state, because that
is the end that sends the final ACK. If the other end's FIN is lost,
or if the final ACK is lost, having the end that sends the first FIN
maintain state about the connection guarantees that it has enough
information to retransmit the final ACK.
The socket will stay in TIME_WAIT for 2*packet lifetime (2* because of
the roundtrip).

As long as SO_LINGER is enabled, the close operation on a socket will
wait until all queued messages are sent. See this:

“When enabled, a close(2) or shutdown(2) will not return until all
queued messages for the socket have been successfully sent or the
linger timeout has been reached. Otherwise, the call returns
immediately and the closing is done in the background. When the socket
is closed as part of exit(2), it always lingers in the background.”

By default I think this is enabled, and in the Tomcat case it is set to 25 seconds.

I am not sure if that helps with your problem, but you could try
setting it to a lower value or disabling it completely.

simon
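As a minimal sketch of what adjusting this option looks like from Java (the socket and the 5-second value are illustrative only, not taken from Karl's client or Tomcat's configuration):

```java
import java.net.Socket;
import java.net.SocketException;

public class LingerDemo {
    public static void main(String[] args) throws SocketException {
        // An unconnected socket is enough to demonstrate the option;
        // a real client would set it before connecting.
        Socket socket = new Socket();

        // Enable SO_LINGER with a 5-second timeout: close() blocks
        // until queued data is sent or 5 seconds elapse.
        socket.setSoLinger(true, 5);
        System.out.println("linger after: " + socket.getSoLinger());

        // Disable it entirely: close() returns immediately and any
        // remaining send happens in the background; getSoLinger()
        // reports -1 when the option is off.
        socket.setSoLinger(false, 0);
        System.out.println("linger disabled: " + socket.getSoLinger());
    }
}
```

Whether tweaking SO_LINGER actually shortens the TIME_WAIT pile-up depends on which side closes first, as discussed below.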

On Mon, Jun 14, 2010 at 2:20 PM,   wrote:
> Good catch!
>
> r...@duck6:~# netstat -an | fgrep :8983 | wc
>  28223  169338 2257840
> r...@duck6:~#
>
> ... and here's an example:
>
> tcp6       0      0 127.0.0.1:8983          127.0.0.1:44058         TIME_WAIT
>
> So, once again, what causes this behavior?  How can I wind up with 28,000 
> socket connections hanging around, if both my client and Solr are behaving 
> properly and are closing connections properly?
>
> (I suspect that the answer to my somewhat rhetorical question is, "this 
> should not happen".  But then the question becomes, "why IS it happening?")
>
> Karl
>
> -Original Message-
> From: Wright Karl (Nokia-S/Cambridge)
> Sent: Sunday, June 13, 2010 7:52 AM
> To: dev@lucene.apache.org
> Subject: RE: Repeat to the right list: Solr spewage and possible re-entrancy 
> problem?
>
> Good idea.
>
> How would you prevent such a thing from occurring on the server?  Or would 
> this be the result of the client not doing something properly?
>
> Karl
>
> 
> From: ext Lance Norskog [goks...@gmail.com]
> Sent: Saturday, June 12, 2010 11:55 PM
> To: dev@lucene.apache.org
> Subject: Re: Repeat to the right list: Solr spewage and possible re-entrancy  
>   problem?
>
> There are situations where zombie sockets pile up at the server and
> keep zombie threads open. When this happens, check the total number of
> threads in the server JVM, and the total number of open or TIME_WAIT
> sockets. 'netstat -an | fgrep :8983' may find 2000 entries.
>
> Lance
>
> On Mon, Jun 7, 2010 at 7:35 AM,   wrote:
>> Hi folks,
>>
>> This morning I was experimenting with using multiple threads while indexing
>> some 20,000,000 records worth of content.  In fact, my test spun up some 50
>> threads, and happily chugged away for a couple of hours before I saw the
>> following output from my test code:
>>

>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to
>> index record 6469124
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to
>> index record 6469551
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to
>> index record 6470592
>> Http protocol error: HTTP/1.1 400 missing_content_stream, while trying to
>> index record 6472454
>> java.net.SocketException: Connection reset
>>         at java.net.SocketInputStream.read(SocketInputStream.java:168)
>>         at HttpPoster.getResponse(HttpPoster.java:280)
>>         at HttpPoster.indexPost(HttpPoster.java:191)
>>         at ParseAndLoad$PostThread.run(ParseAndLoad.java:638)
>> <<
>>
>> Looking at the solr-side output, I see nothing interesting at all:
>>

>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract
>> params={literal.nokia_longitude=9.78518981933594&literal.nokia_phone=%2B497971910474&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_district=Münster&literal.nokia_placerating=0&literal.id=6472724&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=1&literal.nokia_ppid=276u0wyw-c8cb7f4d6cd84a639a4e7d3570bf8814&literal.nokia_language=de&literal.nokia_city=Gaildorf&literal.nokia_latitude=48.9985514322917&literal.nokia_postalcode=74405&literal.nokia_street=WeinhaldenstraÃe&literal.nokia_title=Dorfgemeinschaft+Münster+e.V.&literal.nokia_category=261}
>> status=0 QTime=1
>> Jun 7, 2010 9:57:48 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/update/extract
>> params={literal.nokia_longitude=9.76717020670573&literal.nokia_phone=%2B497971950725&literal.nokia_type=0&literal.nokia_boost=1&literal.nokia_placerating=0&literal.id=6472737&literal.nokia_visitcount=0&literal.nokia_country=DEU&literal.nokia_housenumber=13&literal.nokia_ppid=276u0wyw-d3bed6449fcb41b0adc

Re: Repeat to the right list: Solr spewage and possible re-entrancy problem?

2010-06-14 Thread Simon Willnauer
Not sure if this is a client problem at all. It seems that your server
closes the connection first (left-hand side of your netstat output)
and then sticks in TIME_WAIT. Even if you are on localhost that could
be an issue. Many applications have had problems with TIME_WAIT; I can
remember mod_proxy having strange problems with it.

I am not saying this is the reason; I just thought I'd mention it, as
it could help figure out what's going wrong here.

simon

On Mon, Jun 14, 2010 at 3:09 PM,   wrote:
> Hi Simon,
>
> I have no doubt that TCP is working ok. ;-)  I have doubts that Solr is 
> working reasonably, however.
>
> Since this is all localhost-localhost interaction, I doubt we are losing 
> packets in my test case.  So I think we can eliminate that possibility as a 
> cause.
>
> If the claim is that the server is delaying its socket close somehow, that's 
> a problem worth trying to prevent.  The number of sockets created eventually 
> causes the process to run out of file handles, and that could well cause the 
> 400 errors, because commons-fileupload will not be able to write the content 
> to a temp file at that point.
>
> It's actually impossible for my client to be leaking in this way, so I don't 
> think that's the issue.  There's a fixed set of threads, and each thread MUST 
> close the socket it opens before it can go on to the next request:
>
>    Socket socket = createSocket();
>    try
>    {
>        ...
>    }
>    finally
>    {
>      socket.close();
>    }
>
> So, if there's a close delay, it's got to be server-side.
>
> Karl
>
>
> -Original Message-
> From: ext Simon Willnauer [mailto:simon.willna...@googlemail.com]
> Sent: Monday, June 14, 2010 8:47 AM
> To: dev@lucene.apache.org
> Subject: Re: Repeat to the right list: Solr spewage and possible re-entrancy 
> problem?
>
> Hey Karl,
>
> the TIME_WAIT states you see are ok from the TCP perspective. The end
> that sends the first FIN goes into the TIME_WAIT state, because that
> is the end that sends the final ACK. If the other end's FIN is lost,
> or if the final ACK is lost, having the end that sends the first FIN
> maintain state about the connection guarantees that it has enough
> information to retransmit the final ACK.
> The socket will stay in TIME_WAIT for 2*packet lifetime (2* because of
> the roundtrip).
>
> As long as SO_LINGER is enabled the close operation on a socket will
> wait until all queued messages are send. See this:
>
> “When enabled, a close(2) or shutdown(2) will not return until all
> queued messages for the socket have been successfully sent or the
> linger timeout has been reached. Otherwise, the call returns
> immediately and the closing is done in the background. When the socket
> is closed as part of exit(2), it always lingers in the background.”
>
> By defautl I think this is enabled and in the tomcat case set to 25 seconds.
>
> I am not sure if that helps you with your problem but you could try
> setting it to a lower value or disable it completely.
>
> simon
>
> On Mon, Jun 14, 2010 at 2:20 PM,   wrote:
>> Good catch!
>>
>> r...@duck6:~# netstat -an | fgrep :8983 | wc
>>  28223  169338 2257840
>> r...@duck6:~#
>>
>> ... and here's an example:
>>
>> tcp6       0      0 127.0.0.1:8983          127.0.0.1:44058         TIME_WAIT
>>
>> So, once again, what causes this behavior?  How can I wind up with 28,000 
>> socket connections hanging around, if both my client and Solr are behaving 
>> properly and are closing connections properly?
>>
>> (I suspect that the answer to my somewhat rhetorical question is, "this 
>> should not happen".  But then the question becomes, "why IS it happening?")
>>
>> Karl
>>
>> -Original Message-
>> From: Wright Karl (Nokia-S/Cambridge)
>> Sent: Sunday, June 13, 2010 7:52 AM
>> To: dev@lucene.apache.org
>> Subject: RE: Repeat to the right list: Solr spewage and possible re-entrancy 
>> problem?
>>
>> Good idea.
>>
>> How would you prevent such a thing from occurring on the server?  Or would 
>> this be the result of the client not doing something properly?
>>
>> Karl
>>
>> 
>> From: ext Lance Norskog [goks...@gmail.com]
>> Sent: Saturday, June 12, 2010 11:55 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Repeat to the right list: Solr spewage and possible re-entrancy 
>>    problem?
>>
>> There are situations where zombie sockets pile up at the server and
>> keep zombie threads open. When this happens, check the tot

Re: maven - LUCENE-2490

2010-06-15 Thread Simon Willnauer
From my perspective, functionality wins over convenience. I like the
shorter -dev much better, but I agree with Mark!

+1 to making maven work better, as long as I don't have to fiddle around with it ;)

simon

On Tue, Jun 15, 2010 at 9:54 PM, Mark Miller  wrote:
> On 6/15/10 3:52 PM, Ryan McKinley wrote:
>>
>> Any objections if i checkin:
>> https://issues.apache.org/jira/browse/LUCENE-2490
>>
>> This changes the dev output files end in -SNAPSHOT.jar and keeps the
>> solr and lucene .pom versions in sync.  With this patch, we could
>> easily start publishing snapshot builds to maven via hudson.
>>
>> ryan
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
> I'd say I prefer -dev over -SNAPSHOT by a long shot, but that's not a great
> argument against making this maven stuff work better is it ;)
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Trouble updating Solr website

2010-06-25 Thread Simon Willnauer
On Fri, Jun 25, 2010 at 11:18 PM, Uwe Schindler  wrote:
> +1, die, forrest, die!
+1 - while my mac runs java 5 :)

simon
>
>
>
> -
>
> Uwe Schindler
>
> H.-H.-Meier-Allee 63, D-28213 Bremen
>
> http://www.thetaphi.de
>
> eMail: u...@thetaphi.de
>
>
>
> From: Robert Muir [mailto:rcm...@gmail.com]
> Sent: Friday, June 25, 2010 11:16 PM
> To: dev@lucene.apache.org; yo...@lucidimagination.com
> Subject: Re: Trouble updating Solr website
>
>
>
>
>
> On Fri, Jun 25, 2010 at 5:07 PM, Yonik Seeley 
> wrote:
>
>
>
> Yep, it definitely looks dead.
> If we move to something else, I'd lean toward a confluence export.
> I'm also leaning toward confluence for our future wiki (we're badly in
> need of versioned docs).
>
>
>
> this is interesting, what is the advantage of forrest? it seems from its
> description that the whole point is being able to go to multiple output
> formats, but is it really helpful to have the PDF files, especially for solr
> where so much documentation is in the wiki anyway?
>
>
>
> i just got a new mac too, and it doesn't support java 5, so lets [lucene,
> too!] please move away from forrest!!!
>
> --
> Robert Muir
> rcm...@gmail.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Build failed in Hudson: Lucene-trunk #1229

2010-06-29 Thread Simon Willnauer
On Wed, Jun 30, 2010 at 4:39 AM, Apache Hudson Server
 wrote:
> See 
>
> --
> [...truncated 3861 lines...]
> clover.setup:
>
> clover.info:
>
> clover:
>
> compile-core:
>
> common.compile-test:
>    [mkdir] Created dir: 
> 
>    [javac] Compiling 13 source files to 
> 
>     [copy] Copying 4 files to 
> 
>
> build-artifacts-and-tests:
>
> bdb:
>     [echo] Building bdb...
>
> common.init:
>
> build-lucene:
>
> contrib-build.init:
>
> get-db-jar:
>    [mkdir] Created dir: 
> 
>      [get] Getting: http://downloads.osafoundation.org/db/db-4.7.25.jar
>      [get] To: 
> 
>
> check-and-get-db-jar:
>
> init:
>
> clover.setup:
>
> clover.info:
>
> clover:
>
> compile-core:
>    [mkdir] Created dir: 
> 
>    [javac] Compiling 7 source files to 
> 
>
> jar-core:
>      [jar] Building jar: 
> 
>
> default:
>
> bdb-je:
>     [echo] Building bdb-je...
>
> common.init:
>
> build-lucene:
>
> contrib-build.init:
>
> get-je-jar:
>    [mkdir] Created dir: 
> 
>      [get] Getting: 
> http://download.oracle.com/maven/com/sleepycat/je/3.3.93/je-3.3.93.jar
>      [get] To: 
> 
>
> check-and-get-je-jar:
>
> init:
>
> clover.setup:
>
> clover.info:
>
> clover:
>
> compile-core:
>    [mkdir] Created dir: 
> 
>    [javac] Compiling 6 source files to 
> 
>
> jar-core:
>      [jar] Building jar: 
> 
>
> default:
>
> default:
>
> compile-test:
>     [echo] Building bdb...
>
> common.init:
>
> build-lucene:
>
> contrib-build.init:
>
> get-db-jar:
>
> check-and-get-db-jar:
>
> init:
>
> compile-test:
>     [echo] Building bdb...
>
> common.init:
>
> build-lucene:
>
> contrib-build.init:
>
> get-db-jar:
>
> check-and-get-db-jar:
>
> init:
>
> clover.setup:
>
> clover.info:
>
> clover:
>
> compile-core:
>
> common.compile-test:
>    [mkdir] Created dir: 
> 
>    [javac] Compiling 2 source files to 
> 
>     [echo] Building bdb-je...
>
> common.init:
>
> build-lucene:
>
> contrib-build.init:
>
> get-je-jar:
>
> check-and-get-je-jar:
>
> init:
>
> compile-test:
>     [echo] Building bdb-je...
>
> common.init:
>
> build-lucene:
>
> contrib-build.init:
>
> get-je-jar:
>
> check-and-get-je-jar:
>
> init:
>
> clover.setup:
>
> clover.info:
>
> clover:
>
> compile-core:
>
> common.compile-test:
>    [mkdir] Created dir: 
> 
>    [javac] Compiling 1 source file to 
> 
>
> build-artifacts-and-tests:
>     [echo] Building demo...
>
> compile-analyzers-common:
>
> common.init:
>
> build-lucene:
>
> init:
>
> clover.setup:
>
> clover.info:
>
> clover:
>
> common.compile-core:FATAL: command execution failed
> hudson.util.IOException2: Failed to join the process
>        at hudson.Proc$RemoteProc.join(Proc.java:312)
>        at hudson.Launcher$ProcStarter.join(Launcher.java:275)
>        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:83)
>        at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
>        at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
>        at 
> hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:582)
>        at hudson.model.Build$RunnerImpl.build(Build.java:165)
>        at hudson.model.Build$RunnerImpl.doRun(Build.java:132)
>   

Re: Welcome Robert Muir to the Lucene PMC

2010-07-07 Thread Simon Willnauer
Congrats Robert! :)

Its a pleasure!


On Wed, Jul 7, 2010 at 10:01 PM, Erick Erickson  wrote:
> Congrats!
> Erick
>
> On Wed, Jul 7, 2010 at 3:49 PM, Shai Erera  wrote:
>>
>> Congratulations !
>>
>> Shai
>>
>> On Wed, Jul 7, 2010 at 10:30 PM, Uwe Schindler  wrote:
>>>
>>> Congratulations!
>>>
>>> -
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: u...@thetaphi.de
>>>
>>>
>>> > -Original Message-
>>> > From: Grant Ingersoll [mailto:gsing...@apache.org]
>>> > Sent: Wednesday, July 07, 2010 8:12 PM
>>> > To: Lucene mailing list
>>> > Cc: dev@lucene.apache.org
>>> > Subject: Welcome Robert Muir to the Lucene PMC
>>> >
>>> > In recognition of Robert's continuing contributions to Lucene and Solr,
>>> I'm
>>> > happy to announce Robert has accepted our invitation to join the Lucene
>>> > PMC.
>>> >
>>> > Cheers,
>>> > Grant Ingersoll
>>> > Lucene PMC Chair
>>> > -
>>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
>>> > additional
>>> > commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: busywait hang using extracting update handler on trunk

2010-08-02 Thread Simon Willnauer
On Mon, Aug 2, 2010 at 10:56 AM,   wrote:
>>>
> And can you run CheckIndex and post that output, and also the AIOOBE
> you hit for certain searches?
> <<
>
> Where can I find CheckIndex?

You'll find it in the Lucene JAR:

java -cp lucene.jar -ea:org.apache.lucene... \
    org.apache.lucene.index.CheckIndex pathToIndex [-fix] [-segment X] [-segment Y]

see  
http://lucene.apache.org/java/3_0_2/api/core/org/apache/lucene/index/CheckIndex.html#main%28java.lang.String[]%29
for details

simon

> Karl
>
> 
> From: ext Michael McCandless [luc...@mikemccandless.com]
> Sent: Thursday, July 29, 2010 11:56 AM
> To: dev@lucene.apache.org
> Subject: Re: busywait hang using extracting update handler on trunk
>
> Hi Karl,
>
> Can you post the original merge failure?  A merge failure should not
> corrupt the index and shouldn't cause spinning.
>
> And can you run CheckIndex and post that output, and also the AIOOBE
> you hit for certain searches?
>
> Your first thread dumps shows one thread waiting the for the commit to
> finish so it can add a doc, and another thread closing the IW for
> commit (aside: Solr seems to close the IW whenever it autoCommits?  Is
> that true?) with the IW.close waiting for all outstanding merges to
> finish (though I see no actual merge thread running).  Your 2nd
> [truncated] dump does show a merge thread actively running.
>
> Is it possible there's just a big merge running, and this is why Solr
> appears hung?  (Ie because it closes the writer -- why not just call
> IW.commit instead?)
>
> Mike
>
> On Wed, Jul 28, 2010 at 5:57 AM,   wrote:
>> It appears that whenever I see a merge failure, I also apparently have a 
>> corrupt index (I get arrayindexoutofbounds exceptions when searching for 
>> certain things).  So that may be the underlying cause of the merge infinite 
>> loop.
>>
>> I've blown away the indexes repeatedly and tried to rebuild.  I am now 
>> committing every 1000 records, and I can make this happen utterly reliably.  
>> The data contains quite a lot of unicode, and I've noted recent posts about 
>> tests failing in this area.  Perhaps this is related?
>>
>> If this guess is correct, then there are two bugs.  First bug is that index 
>> corruption causes merge to spin indefinitely, rather than error out.  Second 
>> bug is that certain characters cause index corruption.
>>
>> With a bit of work I should be able to isolate the record that is the 
>> proximate cause of the index corruption.  I will post it when I have it.
>>
>> Karl
>>
>> --- original message ---
>> From: "Wright Karl (Nokia-MS/Cambridge)" 
>> Subject: RE: busywait hang using extracting update handler on trunk
>> Date: July 27, 2010
>> Time: 11:57:49  AM
>>
>>
>> Happened again.  The thing that caused it seems to have been an autocommit.  
>> Here's part of the first thread dump:
>>
>>        at java.lang.Object.wait(Native Method)
>>        - waiting on <0x29642f00> (a java.util.TaskQueue)
>>        at java.util.TimerThread.mainLoop(Timer.java:509)
>>        - locked <0x29642f00> (a java.util.TaskQueue)
>>        at java.util.TimerThread.run(Timer.java:462)
>>
>> "25615...@qtp-20051738-9 - Acceptor0 socketconnec...@0.0.0.0:8983" prio=6 
>> tid=0x
>> 03076800 nid=0x19ac runnable [0x034df000]
>>   java.lang.Thread.State: RUNNABLE
>>        at java.net.PlainSocketImpl.socketAccept(Native Method)
>>        at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
>>        - locked <0x29770448> (a java.net.SocksSocketImpl)
>>        at java.net.ServerSocket.implAccept(ServerSocket.java:453)
>>        at java.net.ServerSocket.accept(ServerSocket.java:421)
>>        at 
>> org.mortbay.jetty.bio.SocketConnector.accept(SocketConnector.java:99)
>>
>>        at 
>> org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.ja
>> va:707)
>>        at 
>> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.j
>> ava:582)
>>
>> "11245...@qtp-20051738-8" prio=6 tid=0x03075000 nid=0x10b4 waiting on 
>> condition
>> [0x0348e000]
>>   java.lang.Thread.State: WAITING (parking)
>>        at sun.misc.Unsafe.park(Native Method)
>>        - parking to wait for  <0x2970bad8> (a 
>> java.util.concurrent.locks.Reentr
>> antReadWriteLock$NonfairSync)
>>        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>>        at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInt
>> errupt(AbstractQueuedSynchronizer.java:747)
>>        at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared
>> (AbstractQueuedSynchronizer.java:877)
>>        at 
>> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(A
>> bstractQueuedSynchronizer.java:1197)
>>        at 
>> java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(Reent
>> rantReadWriteLock.java:594)
>>        at 
>> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandle
>> r2.java:211)
>>        at 
>> org.apache.solr.update.pr
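Mike's aside above (why does Solr close the IndexWriter on autoCommit instead of calling IW.commit()?) can be illustrated with a toy model. The class below is a hypothetical stand-in, not Lucene's IndexWriter; it only models the observable difference between the two calls:

```java
// Toy model: commit() makes changes durable while the writer (and its
// background merges) stay alive; close() must first wait for every
// outstanding merge, so a large merge makes close(), and any autoCommit
// that closes the writer, look like a hang.
final class ToyWriter {
    private int pendingMerges;
    private boolean open = true;

    ToyWriter(int pendingMerges) { this.pendingMerges = pendingMerges; }

    void commit() {
        // fsync segment files and write a new segments_N; no merge wait
    }

    void close() {
        while (pendingMerges > 0) {
            pendingMerges--;   // stand-in for joining merge threads
        }
        open = false;
    }

    boolean isOpen() { return open; }
}
```

In this model a commit during a big merge returns immediately, while a close blocks until the merge finishes, matching the "apparently hung" behavior described above.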

Re: Combined Lucene/Solr Issues

2010-08-18 Thread Simon Willnauer
>> I think I would prefer to just be able to refer to Lucene issues from time 
>> to time in Solr's CHANGES.txt file and obviously, the patch would contain 
>> the fix across the source.  Thoughts?

If the changes are large, I'd prefer creating two issues: use one only
for reference, and link it to the one that contains the patches and
discussion. Whether to use SOLR- vs. LUCENE- I'd decide on a case-by-case
basis, depending on which "project" / "codebase" undergoes the
most significant changes. Generally, referencing the issues in
CHANGES.txt sounds like a good idea.

simon

On 8/18/10, Yonik Seeley  wrote:
> On Wed, Aug 18, 2010 at 10:48 AM, Grant Ingersoll 
> wrote:
>> Anyone have opinions on dealing with "combined issues" in JIRA.  By this,
>> I mean, issues that involve both changes to Lucene and Solr.
>
> It often makes sense - we've had a lot of patches that go across both
> already.
>
> -Yonik
> http://www.lucidimagination.com
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Combined Lucene/Solr Issues

2010-08-19 Thread Simon Willnauer
> Worst case scenario: if it starts out as a SOLR issue and then the scope
> gets bigger, creating a new LUCENE issue to track it (and linking the two)
> seems trivial to me.

Thanks, Hoss, for expressing what I was trying to say :) That all makes perfect sense!

simon

On 8/19/10, Grant Ingersoll  wrote:
>
> On Aug 19, 2010, at 2:14 PM, Chris Hostetter wrote:
>
>> : For me it does not matter, but when I open new issues, I do it against
>> : the project where the “bug” is visible. If there is also code committed
>> : to Solr, but the main task is Lucene this is fine.
>>
>> Right ... i think it's handy to still have the "SOLR" bug queue for people
>>
>> to file bugs against Solr, if they wind up requiring fixes further down
>> the tree then so be it.
>
> +1
>
>>
>> : Personally, i don't waste any time thinking about whether the issue is
>> : SOLR or LUCENE, and I think two JIRAs is actually confusing.
>>
>> If you know from the outset when you create an issue (ie: tracking an
>> improvement, or a new feature) that it requires updating "the whole tree"
>> then it should definitely be a LUCENE issue.  even if you aren't sure it
>> makes sense to start using LUCENE, but having SOLR arround for Solr users
>> to file bugs is handy.
>
> This is what I did for LUCENE-2608.
>
>>
>> Worst case scenario: if it starts out as a SOLR issue and then the scope
>> gets bigger, creating a new LUCENE issue to track it (and linking the two)
>>
>> seems trivial to me.
>>
>> As far as refrencing LUCENE-* issues directly in Solr's CHANGES.txt --
>> sure, why not?
>
> Again, I did.
>
> -Grant
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Nested Document support in Lucene

2011-03-19 Thread Simon Willnauer
On Sat, Mar 19, 2011 at 9:39 AM, Kapil Charania
 wrote:
> Hi,
>
> I am a newbie to Lucene. I have already created indexes for my project. But
> now the requirement is to go with nested documents. I googled a lot but cannot
> find much implementation of nested documents.
>
> May I know if it's already implemented in any release of Lucene?
>
> Thanks in advance!!!

AFAIK this is still under heavy development and doesn't seem likely to
be ready in the near future. It has not yet been released.

simon
>
> --
> Kapil Charania.
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



HEADS UP - Rebuild any trunk indices

2011-03-22 Thread Simon Willnauer
Heads up -- LUCENE-2881, which I committed earlier today, changes the
SegmentInfos file format and adds a new ${number}.fnx file storing
global field numbers.

If you have any indexes lying around built with revs of trunk before
this commit, after you update you should completely reindex.

Indexes built prior to trunk (< 4.0) will work fine and don't require any reindexing.

Simon

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [GSoC] Apache Lucene @ Google Summer of Code 2011 [STUDENTS READ THIS]

2011-03-23 Thread Simon Willnauer
On Wed, Mar 23, 2011 at 9:37 AM, David Nemeskey
 wrote:
> Hey Simon and all,
>
> May we get an update on this? I understand that Google has published the list
> of accepted organizations, which -- not surprisingly -- includes the ASF. Is
> there any information on how many slots Apache got, and which issues will be
> selected?
>
> The student application period opens on the 28th, so I'm just wondering if I
> should go ahead and apply or wait for the decision.

David,

you should go ahead and apply via the GSoC website and reference the
issue there; that is how I understand it works.
We will later rate the proposals on the GSoC website and decide
which ones we choose. This is also when slots get assigned.

simon
>
> Thanks,
> David
>
> On 2011 March 11, Friday 17:23:58 Simon Willnauer wrote:
>> Hey folks,
>>
>> Google Summer of Code 2011 is very close and the Project Applications
>> Period has started recently. Now it's time to get some excited students
>> on board for this year's GSoC.
>>
>> I encourage students to submit an application to the Google Summer of Code
>> web-application. Lucene & Solr are amazing projects and GSoC is an
>> incredible opportunity to join the community and push the project
>> forward.
>>
>> If you are a student and you are interested in spending some time on a
>> great open source project while getting paid for it, you should submit
>> your application from March 28 - April 8, 2011. There are only 3
>> weeks until this process starts!
>>
>> Quote from the GSoC website: "We hear almost universally from our
>> mentoring organizations that the best applications they receive are
>> from students who took the time to interact and discuss their ideas
>> before submitting an application, so make sure to check out each
>> organization's Ideas list to get to know a particular open source
>> organization better."
>>
>> So if you have any ideas what Lucene & Solr should have, or if you
>> find any of the GSoC pre-selected projects [1] interesting, please
>> join us on dev@lucene.apache.org [2].  Since you as a student must
>> apply for a certain project via the GSoC website [3], it's a good idea
>> to work on it ahead of time and include the community and possible
>> mentors as soon as possible.
>>
>> Open source development here at the Apache Software
>> Foundation happens almost exclusively in the public and I encourage you to
>> follow this. Don't mail folks privately; please use the mailing list to
>> get the best possible visibility and attract interested community
>> members and push your idea forward. As always, it's the idea that
>> counts not the person!
>>
>> That said, please do not underestimate the complexity of even small
>> "GSoC - Projects". Don't try to rewrite Lucene or Solr!  A project
>> usually gains more from a smaller, well discussed and carefully
>> crafted & tested feature than from a half baked monster change that's
>> too large to work with.
>>
>> Once your proposal has been accepted and you begin work, you should
>> give the community the opportunity to iterate with you.  We prefer
>> "progress over perfection" so don't hesitate to describe your overall
>> vision, but when the rubber meets the road let's take it in small
>> steps.  A code patch of 20 KB is likely to be reviewed very quickly, so you
>> get fast feedback, while a patch even 60 KB in size can take very
>> long. So try to break up your vision and the community will work with
>> you to get things done!
>>
>> On behalf of the Lucene & Solr community,
>>
>> Go! join the mailing list and apply for GSoC 2011,
>>
>> Simon
>>
>> [1]
>> https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&jqlQu
>> ery=labels+%3D+lucene-gsoc-11 [2]
>> http://lucene.apache.org/java/docs/mailinglists.html
>> [3] http://www.google-melange.com
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



BerlinBuzzwords 2011 Early Bird Ticket Period ends on April 7th.

2011-03-24 Thread Simon Willnauer
Hey folks,

just a short notice for those who haven't noticed: we have only a
limited number of Early-Bird tickets left, and the Early-Bird period
ends on April 7th. If you want to get one of the 30 remaining tickets,
go get one now here: http://berlinbuzzwords.de/content/tickets

While we are still working on the schedule and selecting speakers, we
haven't sent out any rejection mails yet. So if you have submitted a
talk for BerlinBuzzwords 2011 you don't need to get an Early-Bird ticket
now. All potential speakers will be eligible for the Early-Bird discount
even after April 7th.


regards,

Simon

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: boosting with standard search handler

2011-03-24 Thread Simon Willnauer
Please do not cross-post to the dev list unless it's a dev-related question.

simon

On Thu, Mar 24, 2011 at 10:13 AM, Gastone Penzo  wrote:
> Hi,
> is it possible to boost fields, like the bf parameter of dismax, in the standard request
> handler?
> with or without functions?
> thanx
>
> --
> Gastone Penzo
>
> www.solr-italia.it
> The first italian blog about Apache Solr

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Interested in GSOC

2011-03-25 Thread Simon Willnauer
Hey there,

welcome to Lucene :), good to hear you are interested in Lucene and GSoC!

On Fri, Mar 25, 2011 at 4:49 AM, Vinicius Paes de barros
 wrote:
>
> Hi there,
> I heard about GSOC from a friend of mine at college and I decided I want to 
> participate this year. I already used Lucene before, so Lucene sounds like a 
> good place to start.
> I went through the JIRA projects, but I couldn't find something I feel like 
> writing a proposal to, maybe I don't have enough knowledge yet about how 
> Lucene is implemented internally. So I started looking at the wiki, but I'm 
> not sure whether it contains all the info I need.
> Is there any other place I should be looking at to learn more about Lucene's 
> internal design?
We don't have a lot of design documents, and any that exist are most
likely outdated. I think the best documentation is the code and the
people who have written it. If you want to dive into Lucene you should
ask as many questions as you need and get all the info out of us. We
are usually around every day, depending on time zones, so you can
either write emails or join our IRC channel #lucene on freenode
(http://lucene.apache.org/java/docs/irc.html).
Is there anything in particular that you are interested in, like
indexing, search, or analysis?

simon

> Thanks in advance,
> Vinicius Barros
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [VOTE] Lucene 3.1.0 RC3

2011-03-29 Thread Simon Willnauer
+1

I ran the tests & checked signatures. Looks good to me.


On Tue, Mar 29, 2011 at 9:27 PM, Dawid Weiss
 wrote:
> +1
>
> Checked the clustering stuff again, works fine.
>
> Dawid
>
> On Tue, Mar 29, 2011 at 8:30 PM, Chris Hostetter
>  wrote:
>>
>> : Artifacts are at http://people.apache.org/~gsingers/staging_area/rc3/.
>> : Please vote as you see appropriate.  Vote closes on March 29th.
>>
>> +1
>>
>> -Hoss
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [HUDSON] Lucene-Solr-tests-only-trunk - Build # 6565 - Failure

2011-03-31 Thread Simon Willnauer
This one is weird; it seems like there is a synchronized missing on
FieldInfoBiMap#containsConsistent.

I'll try to reproduce it first.

simon
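The kind of fix implied here might look like the following. BiMap is a hypothetical stand-in for FieldInfoBiMap, not Lucene's actual code; the point is that the mutation and the consistency check must hold the same lock:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a bi-directional map like FieldInfoBiMap:
// name -> number and number -> name must stay consistent, so both the
// mutation and the consistency check synchronize on the same monitor.
final class BiMap {
    private final Map<String, Integer> nameToNumber = new HashMap<>();
    private final Map<Integer, String> numberToName = new HashMap<>();

    synchronized void put(String name, int number) {
        nameToNumber.put(name, number);
        numberToName.put(number, name);
    }

    // Without 'synchronized' a concurrent reader could observe one map
    // updated but not the other; that is the class of race suspected in
    // the test failure above.
    synchronized boolean containsConsistent(String name, int number) {
        return Integer.valueOf(number).equals(nameToNumber.get(name))
            && name.equals(numberToName.get(number));
    }
}
```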
On Thu, Mar 31, 2011 at 11:37 AM, Apache Hudson Server
 wrote:
> Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6565/
>
> 3 tests failed.
> REGRESSION:  org.apache.lucene.index.TestNRTThreads.testNRTThreads
>
> Error Message:
> null
>
> Stack Trace:
> junit.framework.AssertionFailedError
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1221)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1149)
>        at org.apache.lucene.index.FieldInfos.putInternal(FieldInfos.java:280)
>        at org.apache.lucene.index.FieldInfos.clone(FieldInfos.java:302)
>        at org.apache.lucene.index.SegmentInfo.clone(SegmentInfo.java:345)
>        at org.apache.lucene.index.SegmentInfos.clone(SegmentInfos.java:374)
>        at 
> org.apache.lucene.index.DirectoryReader.(DirectoryReader.java:165)
>        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:360)
>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>        at 
> org.apache.lucene.index.TestNRTThreads.testNRTThreads(TestNRTThreads.java:244)
>
>
> REGRESSION:  org.apache.lucene.index.TestSegmentTermDocs.test
>
> Error Message:
> Some threads threw uncaught exceptions!
>
> Stack Trace:
> junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1221)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1149)
>        at 
> org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:521)
>        at 
> org.apache.lucene.index.TestSegmentTermDocs.tearDown(TestSegmentTermDocs.java:45)
>
>
> REGRESSION:  
> org.apache.lucene.index.codecs.preflex.TestSurrogates.testSurrogatesOrder
>
> Error Message:
> Some threads threw uncaught exceptions!
>
> Stack Trace:
> junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1221)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1149)
>        at 
> org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:521)
>
>
>
>
> Build Log (for compile errors):
> [...truncated 3276 lines...]
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [HUDSON] Lucene-Solr-tests-only-trunk - Build # 6565 - Failure

2011-03-31 Thread Simon Willnauer
I just committed a fix for this

simon

On Thu, Mar 31, 2011 at 5:28 PM, Simon Willnauer
 wrote:
> This one is weird; it seems like there is a synchronized missing on
> FieldInfoBiMap#containsConsistent.
>
> I'll try to reproduce it first.
>
> simon
> On Thu, Mar 31, 2011 at 11:37 AM, Apache Hudson Server
>  wrote:
>> Build: 
>> https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/6565/
>>
>> 3 tests failed.
>> REGRESSION:  org.apache.lucene.index.TestNRTThreads.testNRTThreads
>>
>> Error Message:
>> null
>>
>> Stack Trace:
>> junit.framework.AssertionFailedError
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1221)
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1149)
>>        at org.apache.lucene.index.FieldInfos.putInternal(FieldInfos.java:280)
>>        at org.apache.lucene.index.FieldInfos.clone(FieldInfos.java:302)
>>        at org.apache.lucene.index.SegmentInfo.clone(SegmentInfo.java:345)
>>        at org.apache.lucene.index.SegmentInfos.clone(SegmentInfos.java:374)
>>        at 
>> org.apache.lucene.index.DirectoryReader.(DirectoryReader.java:165)
>>        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:360)
>>        at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
>>        at 
>> org.apache.lucene.index.TestNRTThreads.testNRTThreads(TestNRTThreads.java:244)
>>
>>
>> REGRESSION:  org.apache.lucene.index.TestSegmentTermDocs.test
>>
>> Error Message:
>> Some threads threw uncaught exceptions!
>>
>> Stack Trace:
>> junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1221)
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1149)
>>        at 
>> org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:521)
>>        at 
>> org.apache.lucene.index.TestSegmentTermDocs.tearDown(TestSegmentTermDocs.java:45)
>>
>>
>> REGRESSION:  
>> org.apache.lucene.index.codecs.preflex.TestSurrogates.testSurrogatesOrder
>>
>> Error Message:
>> Some threads threw uncaught exceptions!
>>
>> Stack Trace:
>> junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1221)
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1149)
>>        at 
>> org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:521)
>>
>>
>>
>>
>> Build Log (for compile errors):
>> [...truncated 3276 lines...]
>>
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: My GSOC proposal

2011-04-05 Thread Simon Willnauer
Hey Varun,
On Tue, Apr 5, 2011 at 11:07 PM, Michael McCandless
 wrote:
> Hi Varun,
>
> Those two issues would make a great GSoC!  Comments below...
+1
>
> On Tue, Apr 5, 2011 at 1:56 PM, Varun Thacker
>  wrote:
>
>> I would like to combine two tasks as part of my project
>> namely-Directory createOutput and openInput should take an IOContext
>> (Lucene-2793) and compliment it by Generalize DirectIOLinuxDir to
>> UnixDir (Lucene-2795).
>>
>> The first part of the project is aimed at significantly reducing time
>> taken to search during indexing by adding an IOContext which would
>> store buffer size and have options to bypass the OS’s buffer cache
>> (This is what causes the slowdown in search ) and other hints. Once
>> completed I would move on to Lucene-2795 and generalize the Directory
>> implementation to make a UnixDirectory .
>
> So, the first part (LUCENE-2793) should cause no change at all to
> performance, functionality, etc., because it's "merely" installing the
> plumbing (IOContext threaded throughout the low-level store APIs in
> Lucene) so that higher levels can send important details down to the
> Directory.  We'd fix IndexWriter/IndexReader to fill out this
> IOContext with the details (merging, flushing, new reader, etc.).
>
> There's some fun/freedom here in figuring out just what details should
> be included in IOContext... (eg: is it low level "set buffer size to 4 KB"
> or is it high level "I am opening a new near-real-time reader").
>
> This first step is a rote cutover, just changing APIs but in no way
> taking advantage of the new APIs.
>
> The 2nd step (LUCENE-2795) would then take advantage of this plumbing,
> by creating a UnixDir impl that, using JNI (C code), passes advanced
> flags when opening files, based on the incoming IOContext.
>
> The goal is a single UnixDir that has ifdefs so that it's usable
> across multiple Unices, and eg would use direct IO if the context is
> merging.  If we are ambitious we could rope Windows into the mix, too,
> and then this would be NativeDir...
>
> We can measure success by validating that a big merge while searching
> does not hurt search performance?  (Ie we should be able to reproduce
> the results from
> http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html).

Thanks for the summary mike!
>
>> I have spoken to Michael McCandless and Simon Willnauer about
>> undertaking these tasks. Michael McCandless has agreed to mentor me.
>> I would love to be able to contribute and learn from Apache Lucene
>> community this summer. Also I would love suggestions on how to make my
>> application proposal stronger.
>
> I think either Simon or I can be the "official" mentor, and then the
> other one of us (and other Lucene committers) will support/chime
> in...

I will take the official responsibility here once we are there!
simon
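To make the plan above concrete, here is a rough sketch of the kind of plumbing being discussed; all names and fields are illustrative assumptions, not the API that eventually landed:

```java
// Illustrative sketch of the LUCENE-2793 idea: thread an IOContext
// through Directory.createOutput/openInput so implementations can act
// on it. Names and fields here are hypothetical.
enum UsageHint { MERGE, FLUSH, NRT_READER, DEFAULT }

final class IOContext {
    final UsageHint hint;   // high-level hint: "I am merging"
    final int bufferSize;   // low-level hint, e.g. 4096 bytes
    IOContext(UsageHint hint, int bufferSize) {
        this.hint = hint;
        this.bufferSize = bufferSize;
    }
}

interface Directory {
    // A native UnixDirectory (LUCENE-2795) could inspect ctx.hint and
    // open the file with O_DIRECT for MERGE, so a big merge does not
    // evict the OS buffer cache that concurrent searches depend on.
    java.io.OutputStream createOutput(String name, IOContext ctx)
            throws java.io.IOException;
}
```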
>
> This is an important change for Lucene!
>
> Mike
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Google Summer Code 2011 participation

2011-04-08 Thread Simon Willnauer
On Fri, Apr 8, 2011 at 12:11 PM, Michael McCandless
 wrote:
> Anyone can participate in Lucene/Solr!  You don't need to be GSoC
> student to do so...
>
> Browse the issues in Jira (maybe focusing on the ones marked for GSoC
> and not already "taken"), or open your own issues, discuss, post
> patches, iterate, etc.
>
> Find your itch and scratch it ;)

+1, we are all around and will jump on the issue to guide you. Find one,
ask any questions you have, and start discussions / coding!

simon
>
> And there are a great many itches out there that need scratching...
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Thu, Apr 7, 2011 at 9:34 PM, Minh Doan  wrote:
>> Hi forks,
>> Receiving a bunch of emails recently about GSOC, I really want to join but
>> it seems like I'm not eligible to do even though I used to be a PhD student,
>> and currently on leave (I will be probably back soon). I really want to
>> contribute to lucene to implement some of my ideas. Can I have a lucene
>> "mentor" like those mentor experts who are excited to GSOC ?
>>
>> Best,
>> Minh
>> On Tue, Apr 5, 2011 at 7:06 AM, Steven A Rowe  wrote:
>>>
>>> Hi Jayendra,
>>>
>>> From
>>> :
>>>
>>> In order to participate in the program, you must be a student. Google
>>> defines a student as an individual enrolled in or accepted into an
>>> accredited institution including (but not necessarily limited to) colleges,
>>> universities, masters programs, PhD programs and undergraduate programs. You
>>> are eligible to apply if you are enrolled in an accredited university
>>> educational program provided you meet all of the other eligibility
>>> requirements. You should be prepared, upon request, to provide Google with
>>> transcripts or other documentation from your accredited institution as proof
>>> of enrollment or admission status. Computer Science does not need to be your
>>> field of study in order to participate in the program.
>>>
>>> You may be enrolled as a full-time or part-time student. You must also be
>>> eligible to work in the country in which you'll reside throughout the
>>> duration of the program, e.g. if you are in the United States on an F-1
>>> visa, you are welcome to apply to Google Summer of Code as long as you have
>>> U.S. work authorization. For F-1 students applying for CPT, Google will
>>> furnish you with a letter you can provide to your university to get CPT
>>> established once your application to the program has been accepted.
>>>
>>> > -Original Message-
>>> > From: Jayendra Patil [mailto:jayendra.patil@gmail.com]
>>> > Sent: Tuesday, April 05, 2011 9:56 AM
>>> > To: dev@lucene.apache.org
>>> > Subject: Google Summer Code 2011 participation
>>> >
>>> > Hi,
>>> >
>>> > Does the Google Summer Code 2011 apply only to students ??
>>> > I have been working on Solr for quite some time now and would like to
>>> > start contributing back.
>>> > Have been using it to index structured and unstructured data and have
>>> > a fair bit of knowledge of the internals as well. (Have a few jiras
>>> > and patches submitted)
>>> > I don't have a specific proposal in mind yet, but would like to start
>>> > with any specific area or issues.
>>> >
>>> > Let me know if and how can i participate.
>>> >
>>> > Regards,
>>> > Jayendra
>>> >
>>> > -
>>> > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> > For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>
>>
>>
>> --
>> ---
>> Minh
>>
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



TestIndexWriterDelete#testUpdatesOnDiskFull can false fail

2011-04-13 Thread Simon Willnauer
In TestIndexWriterDelete#testUpdatesOnDiskFull, especially between lines
538 and 553, we could get a random exception from the
MockDirectoryWrapper, which makes the test fail since we are not
catching / expecting those exceptions.
I can't make this fail on trunk even in 1000 runs, but on the realtime
branch it fails quickly after I merged this morning. I think we should
just disable the random exceptions for this part and re-enable them
after we are done; see the patch below. Thoughts?


Index: lucene/src/test/org/apache/lucene/index/TestIndexWriterDelete.java
===
--- lucene/src/test/org/apache/lucene/index/TestIndexWriterDelete.java (revision 1091721)
+++ lucene/src/test/org/apache/lucene/index/TestIndexWriterDelete.java (working copy)
@@ -536,7 +536,9 @@
             fail(testName + " hit IOException after disk space was freed up");
           }
         }
-
+        // prevent throwing a random exception here!!
+        final double randomIOExceptionRate = dir.getRandomIOExceptionRate();
+        dir.setRandomIOExceptionRate(0.0);
         if (!success) {
           // Must force the close else the writer can have
           // open files which cause exc in MockRAMDir.close
@@ -549,6 +551,7 @@
           _TestUtil.checkIndex(dir);
           TestIndexWriter.assertNoUnreferencedFiles(dir, "after writer.close");
         }
+        dir.setRandomIOExceptionRate(randomIOExceptionRate);

         // Finally, verify index is not corrupt, and, if
         // we succeeded, we see all docs changed, and if




Re: TestIndexWriterDelete#testUpdatesOnDiskFull can false fail

2011-04-14 Thread Simon Willnauer
just committed to trunk

simon

On Wed, Apr 13, 2011 at 5:06 PM, Michael McCandless
 wrote:
> +1
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Wed, Apr 13, 2011 at 5:58 AM, Simon Willnauer
>  wrote:
>> In TestIndexWriterDelete#testUpdatesOnDiskFull, especially between lines
>> 538 and 553, we can get a random exception from the
>> MockDirectoryWrapper, which makes the test fail since we are not
>> catching / expecting those exceptions.
>> I can't make this fail on trunk even in 1000 runs, but on realtime it
>> fails quickly after I merged this morning. I think we should just
>> disable the random exceptions for this part and re-enable them after we
>> are done; see the patch below. Thoughts?
>>
>>
>> Index: lucene/src/test/org/apache/lucene/index/TestIndexWriterDelete.java
>> ===
>> --- lucene/src/test/org/apache/lucene/index/TestIndexWriterDelete.java (revision 1091721)
>> +++ lucene/src/test/org/apache/lucene/index/TestIndexWriterDelete.java (working copy)
>> @@ -536,7 +536,9 @@
>>             fail(testName + " hit IOException after disk space was freed up");
>>           }
>>         }
>> -
>> +        // prevent throwing a random exception here!!
>> +        final double randomIOExceptionRate = dir.getRandomIOExceptionRate();
>> +        dir.setRandomIOExceptionRate(0.0);
>>         if (!success) {
>>           // Must force the close else the writer can have
>>           // open files which cause exc in MockRAMDir.close
>> @@ -549,6 +551,7 @@
>>           _TestUtil.checkIndex(dir);
>>           TestIndexWriter.assertNoUnreferencedFiles(dir, "after writer.close");
>>         }
>> +        dir.setRandomIOExceptionRate(randomIOExceptionRate);
>>
>>         // Finally, verify index is not corrupt, and, if
>>         // we succeeded, we see all docs changed, and if
>>




Re: [HUDSON] Lucene-Solr-tests-only-realtime_search-branch - Build # 2 - Still Failing

2011-04-14 Thread Simon Willnauer
I just committed a fix for this

On Thu, Apr 14, 2011 at 4:47 PM, Apache Hudson Server
 wrote:
> Build: 
> https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-realtime_search-branch/2/
>
> 1 tests failed.
> REGRESSION:  
> org.apache.lucene.index.TestIndexWriterDelete.testUpdatesOnDiskFull
>
> Error Message:
> fake disk full at 13517 bytes when writing _0_1.del (file length=0; wrote 10 
> of 20 bytes)
>
> Stack Trace:
> java.io.IOException: fake disk full at 13517 bytes when writing _0_1.del 
> (file length=0; wrote 10 of 20 bytes)
>        at 
> org.apache.lucene.store.MockIndexOutputWrapper.writeBytes(MockIndexOutputWrapper.java:111)
>        at org.apache.lucene.store.DataOutput.writeBytes(DataOutput.java:43)
>        at org.apache.lucene.util.BitVector.writeBits(BitVector.java:182)
>        at org.apache.lucene.util.BitVector.write(BitVector.java:171)
>        at 
> org.apache.lucene.index.SegmentReader.commitChanges(SegmentReader.java:718)
>        at 
> org.apache.lucene.index.SegmentReader.doCommit(SegmentReader.java:696)
>        at 
> org.apache.lucene.index.IndexWriter$ReaderPool.commit(IndexWriter.java:572)
>        at 
> org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:3597)
>        at 
> org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2466)
>        at 
> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2537)
>        at 
> org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1067)
>        at 
> org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:1923)
>        at org.apache.lucene.index.IndexWriter.rollback(IndexWriter.java:1848)
>        at 
> org.apache.lucene.index.TestIndexWriterDelete.doTestOperationsOnDiskFull(TestIndexWriterDelete.java:545)
>        at 
> org.apache.lucene.index.TestIndexWriterDelete.testUpdatesOnDiskFull(TestIndexWriterDelete.java:409)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1226)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1154)
>
>
>
>
> Build Log (for compile errors):
> [...truncated 3190 lines...]
>
>
>




Re: Setting the max number of merge threads across IndexWriters

2011-04-14 Thread Simon Willnauer
On Thu, Apr 14, 2011 at 5:20 PM, Jason Rutherglen
 wrote:
> Today the ConcurrentMergeScheduler allows setting the max thread
> count and is bound to a single IndexWriter.
>
> However in the [common] case of multiple IndexWriters running in
> the same process, this disallows one from managing the aggregate
> number of merge threads executing at any given time.
>
> I think this can be fixed, shall I open an issue?

go ahead! I think I have seen this suggestion somewhere; maybe you
should check whether there is an issue already.

simon
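For reference, one way to implement such an aggregate cap is a process-wide permit pool that every per-writer scheduler draws from. This is only an illustrative sketch, not Lucene API; the class and method names are invented:

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch: a process-wide cap on concurrent merge threads,
// shared by several per-writer merge schedulers. Names are illustrative
// and not part of Lucene's API.
public class SharedMergeBudget {
    private final Semaphore permits;

    public SharedMergeBudget(int maxMergeThreads) {
        this.permits = new Semaphore(maxMergeThreads);
    }

    /** Runs the merge task only after acquiring a global permit. */
    public void runMerge(Runnable merge) {
        permits.acquireUninterruptibly();
        try {
            merge.run();
        } finally {
            // always return the permit so other writers' merges can proceed
            permits.release();
        }
    }

    public int availablePermits() {
        return permits.availablePermits();
    }
}
```

Each IndexWriter's scheduler would route its merge threads through one shared instance, so the total across all writers never exceeds the cap.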
>




Re: Setting the max number of merge threads across IndexWriters

2011-04-14 Thread Simon Willnauer
On Thu, Apr 14, 2011 at 5:52 PM, Earwin Burrfoot  wrote:
> I proposed to decouple MergeScheduler from IW (stop keeping a
> reference to it). Then you can create a single CMS and pass it to all
> your IWs.
Yep that was it... is there an issue for this?

simon
>
> On Thu, Apr 14, 2011 at 19:40, Jason Rutherglen
>  wrote:
>> I think the proposal involved using a ThreadPoolExecutor, which seemed
>> to not quite work as well as what we have.  I think it'll be easier to
>> simply pass a global context that keeps a counter of the actively
>> running threads, and pass that into each IW's CMS?
>>
>> On Thu, Apr 14, 2011 at 8:25 AM, Simon Willnauer
>>  wrote:
>>> On Thu, Apr 14, 2011 at 5:20 PM, Jason Rutherglen
>>>  wrote:
>>>> Today the ConcurrentMergeScheduler allows setting the max thread
>>>> count and is bound to a single IndexWriter.
>>>>
>>>> However in the [common] case of multiple IndexWriters running in
>>>> the same process, this disallows one from managing the aggregate
>>>> number of merge threads executing at any given time.
>>>>
>>>> I think this can be fixed, shall I open an issue?
>>>
>>> go ahead! I think I have seen this suggestion somewhere; maybe you
>>> should check whether there is an issue already.
>>>
>>> simon
>>>>
>
>
>
> --
> Kirill Zakharenko/Кирилл Захаренко
> E-Mail/Jabber: ear...@gmail.com
> Phone: +7 (495) 683-567-4
> ICQ: 104465785
>




Re: Lucene Merge failing on Open Files

2011-04-14 Thread Simon Willnauer
On Wed, Apr 6, 2011 at 8:44 PM, Grant Ingersoll  wrote:
>
>
> Begin forwarded message:
>
> From: Michael McCandless 
> Date: April 5, 2011 5:46:13 AM EDT
> To: simon.willna...@gmail.com
> Cc: Simon Willnauer ,
> java-u...@lucene.apache.org, paul_t...@fastmail.fm
> Subject: Re: Lucene Merge failing on Open Files
> Reply-To: java-u...@lucene.apache.org
>
> Yeah, that mergeFactor is way too high and will cause
> too-many-open-files (if the index has enough segments).
>
> This is one of the things that has always bothered me about Merge Factor.
>  We state what the lower bound is, but we don't doc the upper bound.
> Should we even allow higher values?  Of course, how does one pick the
> cutoff?  I've seen up to about 100 be effective.  But 3000 is a bit high
> (although, who knows what the future will hold)

Grant, we can at least add some documentation, no?

simon




Re: [HUDSON] Lucene-Solr-tests-only-trunk - Build # 7260 - Still Failing

2011-04-19 Thread Simon Willnauer
uwe, go ahead and disable it... :( any idea when we'll get more hardware for our
tests? It's a shame that we don't have enough hardware to run our continuous
tests all the time as we'd like.

simon

On Tue, Apr 19, 2011 at 9:01 AM, Uwe Schindler  wrote:
> Disk full again, deleting
>
> Maybe we added again some stuff that fills up the file system. We should 
> disable one of the jobs (e.g. realtime) for now.
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -----Original Message-----
>> From: Apache Hudson Server [mailto:hud...@hudson.apache.org]
>> Sent: Tuesday, April 19, 2011 8:26 AM
>> To: dev@lucene.apache.org
>> Subject: [HUDSON] Lucene-Solr-tests-only-trunk - Build # 7260 - Still Failing
>>
>> Build: https://hudson.apache.org/hudson/job/Lucene-Solr-tests-only-
>> trunk/7260/
>>
>> No tests ran.
>>
>> Build Log (for compile errors):
>> [...truncated 52 lines...]
>>
>>
>>




Re: [HUDSON] Lucene-trunk - Build # 1537 - Failure

2011-04-22 Thread Simon Willnauer
hey folks, this seems to happen more often lately... can we
raise the limit for Jenkins so we don't get these failures again? Once
we land DWPT this is likely to happen even more often, since we'll be
writing more / smaller segments with DWPT.

Uwe do you have karma to fix that?

simon

On Fri, Apr 22, 2011 at 4:38 AM, Apache Jenkins Server
 wrote:
> Build: https://builds.apache.org/hudson/job/Lucene-trunk/1537/
>
> 21 tests failed.
> REGRESSION:  org.apache.lucene.index.TestNRTThreads.testNRTThreads
>
> Error Message:
> /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/build/test/1/nrtopenfiles.4311211294863747903/_bx.tvd
>  (Too many open files in system)
>
> Stack Trace:
> java.io.FileNotFoundException: 
> /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/build/test/1/nrtopenfiles.4311211294863747903/_bx.tvd
>  (Too many open files in system)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
>        at 
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput$Descriptor.<init>(SimpleFSDirectory.java:69)
>        at 
> org.apache.lucene.store.SimpleFSDirectory$SimpleFSIndexInput.<init>(SimpleFSDirectory.java:90)
>        at 
> org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:91)
>        at 
> org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:78)
>        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:345)
>        at 
> org.apache.lucene.store.MockDirectoryWrapper.openInput(MockDirectoryWrapper.java:374)
>        at org.apache.lucene.store.Directory.openInput(Directory.java:122)
>        at 
> org.apache.lucene.index.TermVectorsReader.<init>(TermVectorsReader.java:83)
>        at 
> org.apache.lucene.index.SegmentReader$CoreReaders.openDocStores(SegmentReader.java:236)
>        at 
> org.apache.lucene.index.SegmentReader.openDocStores(SegmentReader.java:515)
>        at 
> org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:611)
>        at 
> org.apache.lucene.index.IndexWriter$ReaderPool.getReadOnlyClone(IndexWriter.java:560)
>        at 
> org.apache.lucene.index.DirectoryReader.<init>(DirectoryReader.java:172)
>        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:360)
>        at 
> org.apache.lucene.index.DirectoryReader.doReopenFromWriter(DirectoryReader.java:419)
>        at 
> org.apache.lucene.index.DirectoryReader.doReopen(DirectoryReader.java:432)
>        at 
> org.apache.lucene.index.DirectoryReader.reopen(DirectoryReader.java:392)
>        at 
> org.apache.lucene.index.TestNRTThreads.testNRTThreads(TestNRTThreads.java:213)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1232)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1160)
>
>
> REGRESSION:  org.apache.lucene.index.TestOmitNorms.testOmitNormsCombos
>
> Error Message:
> /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/build/test/1/test8730544420518378026tmp/_i_0.skp
>  (Too many open files in system)
>
> Stack Trace:
> java.io.FileNotFoundException: 
> /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/build/test/1/test8730544420518378026tmp/_i_0.skp
>  (Too many open files in system)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
>        at 
> org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:448)
>        at 
> org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:312)
>        at 
> org.apache.lucene.store.MockDirectoryWrapper.createOutput(MockDirectoryWrapper.java:348)
>        at 
> org.apache.lucene.index.codecs.sep.SepPostingsWriterImpl.<init>(SepPostingsWriterImpl.java:139)
>        at 
> org.apache.lucene.index.codecs.sep.SepPostingsWriterImpl.<init>(SepPostingsWriterImpl.java:106)
>        at 
> org.apache.lucene.index.codecs.mockintblock.MockFixedIntBlockCodec.fieldsConsumer(MockFixedIntBlockCodec.java:114)
>        at 
> org.apache.lucene.index.PerFieldCodecWrapper$FieldsWriter.<init>(PerFieldCodecWrapper.java:64)
>        at 
> org.apache.lucene.index.PerFieldCodecWrapper.fieldsConsumer(PerFieldCodecWrapper.java:54)
>        at 
> org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:78)
>        at org.apache.lucene.index.TermsHash.flush(TermsHash.java:103)
>        at org.apache.lucene.index.DocInverter.flush(DocInverter.java:65)
>        at 
> org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:55)
>        at 
> org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:567)
>        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2497)
>        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2462)
>        at 
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1211)
>        at 
> org.apache.luc

Re: [HUDSON] Lucene-trunk - Build # 1537 - Failure

2011-04-22 Thread Simon Willnauer
On Fri, Apr 22, 2011 at 12:20 PM, Uwe Schindler  wrote:
> I will look into it ASAP. I am not sure what limitations there currently are,
> but I have root access to the VM itself.

cool thanks uwe!

simon
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -----Original Message-----
>> From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
>> Sent: Friday, April 22, 2011 9:55 AM
>> To: dev@lucene.apache.org
>> Subject: Re: [HUDSON] Lucene-trunk - Build # 1537 - Failure
>>
>> hey folks, this seems to happen more often lately... can we raise
>> the limit for Jenkins so we don't get these failures again? Once we land DWPT
>> this is likely to happen even more often, since we'll be writing more / smaller
>> segments with DWPT.
>>
>> Uwe do you have karma to fix that?
>>
>> simon
>>
>> On Fri, Apr 22, 2011 at 4:38 AM, Apache Jenkins Server
>>  wrote:
>> > Build: https://builds.apache.org/hudson/job/Lucene-trunk/1537/
>> >
>> > 21 tests failed.
>> > REGRESSION:  org.apache.lucene.index.TestNRTThreads.testNRTThreads
>> >
>> > Error Message:
>> > /usr/home/hudson/hudson-slave/workspace/Lucene-trunk/checkout/lucene/build/test/1/nrtopenfiles.4311211294863747903/_bx.tvd
>> > (Too many open files in system)
>> >
>> > [...truncated...]

Re: Lucene Jenkins slave out of disk

2011-04-23 Thread Simon Willnauer
On Fri, Apr 22, 2011 at 5:13 PM, Robert Muir  wrote:
> On Fri, Apr 22, 2011 at 9:13 AM, Uwe Schindler  wrote:
>> Hi Robert,
>>
>> Thanks for pointing to that issue. Indeed the leftover test files in Lucene 
>> take approx. 3 GB per build. With our 9 builds that’s 30 GB - useless. If 
>> the tests clean up the thing successfully after running, we should be fine.
>>
>
> I resolved this for trunk, branch_3x, and backwards.
>
> any other branches (realtime? docvalues?) currently being tested by
> hudson should merge up as soon as we can

Thanks Robert, I will merge RT now and commit... the DocValues build is
currently disabled; I will make sure that I merge before re-enabling
it...

simon
>




Re: [HUDSON] Lucene-trunk - Build # 1537 - Failure

2011-04-23 Thread Simon Willnauer
On Fri, Apr 22, 2011 at 2:44 PM, Robert Muir  wrote:
> On Fri, Apr 22, 2011 at 8:41 AM, Thomas Matthijs
>  wrote:
>>
>>
>> On Fri, Apr 22, 2011 at 14:15, Uwe Schindler  wrote:
>>>
>>> Hi Simon,
>>>
>>> I had no success to change anything. As root I can at least call ulimit
>>> -n, but the limit does not raise. Lowering is easily possible:
>>>
>>> [root@lucene ~]# ulimit -n 32768
>>
>> Probably kernel level enforced max, try raising it with sysctl, i think
>> there are options named "kern.maxfilesperproc" or "kern.maxfiles" you can
>> list them with # sysctl -a
>>
>
> Are you sure we should do this? I've had this discussion with
> mikemccand before, the concern is that if we have too many open files
> this is definitely a realistic problem (it comes up on the userlist
> quite often).

--> open files  (-n) 11095

that's quite an OK setting though... the problem I see here is that
there are some tests around that can produce tons of files due to
settings like maxBufferedDocs = 2; if no merge policy is used, we get
pretty close to those limits. The problems on the userlist have been
coming up forever; not sure what to do here then...

simon

>




Re: Lucene Jenkins slave out of disk

2011-04-23 Thread Simon Willnauer
On Sat, Apr 23, 2011 at 9:42 AM, Uwe Schindler  wrote:
>> > any other branches (realtime? docvalues?) currently being tested by
>> > hudson should merge up as soon as we can
>>
>> Thanks robert, I will merge RT now and commit... DocValues build is disabled
>> currently I will make sure that I merge before reenabling it...
>
> Don't hurry, the FreeBSD machine hosting the jail has been down for about 18 hrs.
> Major problems, as it seems - or are they upgrading hard disks? *g*

heh... tests are already running... so I will commit in a bit... maybe they'll
give us a stronger machine :D

simon
>
> Uwe
>
>




Re: Lucene Jenkins slave out of disk

2011-04-23 Thread Simon Willnauer
On Sat, Apr 23, 2011 at 10:11 AM, Uwe Schindler  wrote:
>> Hi,
>>
>>
>> > On Sat, Apr 23, 2011 at 9:47 AM, Uwe Schindler  wrote:
>> > > Hi,
>> > >
>> > > Can you also check that all new tests in realtime use the new
>> > > _TestUtils API
>> > for getting an index dir? That would be nice.
>> >
>> > This only applies if we are getting an explicit index dir right?
>>
>> Yes!
>>
>
> Addition:
> I meant such code to be replaced:
>
> -        indexDir = new File(workDir, "testIndex");
> +        indexDir = _TestUtil.getTempDir("testIndex");

ok will do!

simon
>
> Uwe
>
>




Re: Lucene Jenkins slave out of disk

2011-04-24 Thread Simon Willnauer
On Sun, Apr 24, 2011 at 9:53 AM, Uwe Schindler  wrote:
> Hi,
>
> JUHU! - Thanks. Now the test folder(s) (build/test, build/backwards/test) 
> after running the builds only contain the test results and some empty dirs.
>
> Simon, if you merge that one we should be fine now!

not until Tuesday, so feel free to merge!

simon
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -----Original Message-----
>> From: Robert Muir [mailto:rcm...@gmail.com]
>> Sent: Sunday, April 24, 2011 12:49 AM
>> To: dev@lucene.apache.org
>> Subject: Re: Lucene Jenkins slave out of disk
>>
>> On Sat, Apr 23, 2011 at 6:40 PM, Robert Muir  wrote:
>> > I thought i did this already, but maybe i screwed it up
>> >
>>
>> Sorry, silly test bug... fixed in Revision: 1096249
>>




Re: Lucene Jenkins slave out of disk

2011-04-24 Thread Simon Willnauer
On Sun, Apr 24, 2011 at 12:21 PM, Uwe Schindler  wrote:
> I am merging realtime up to trunk now...

thanks uwe!

simon
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -----Original Message-----
>> From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
>> Sent: Sunday, April 24, 2011 12:06 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Lucene Jenkins slave out of disk
>>
>> On Sun, Apr 24, 2011 at 9:53 AM, Uwe Schindler  wrote:
>> > Hi,
>> >
>> > JUHU! - Thanks. Now the test folder(s) (build/test, build/backwards/test)
>> after running the builds only contain the test results and some empty dirs.
>> >
>> > Simon, if you merge that one we should be fine now!
>>
>> not until tuesday so feel free to merge!
>>
>> simon
>> >
>> > Uwe
>> >
>> > -
>> > Uwe Schindler
>> > H.-H.-Meier-Allee 63, D-28213 Bremen
>> > http://www.thetaphi.de
>> > eMail: u...@thetaphi.de
>> >
>> >
>> >> -----Original Message-----
>> >> From: Robert Muir [mailto:rcm...@gmail.com]
>> >> Sent: Sunday, April 24, 2011 12:49 AM
>> >> To: dev@lucene.apache.org
>> >> Subject: Re: Lucene Jenkins slave out of disk
>> >>
>> >> On Sat, Apr 23, 2011 at 6:40 PM, Robert Muir  wrote:
>> >> > I thought i did this already, but maybe i screwed it up
>> >> >
>> >>
>> >> Sorry, silly test bug... fixed in Revision: 1096249
>> >>




bug in LuceneTestCase#TEST_MIN_ITER

2011-04-26 Thread Simon Willnauer
hey, I wonder how this TEST_MIN_ITER feature works though...
I expect that if I set -Dtests.iter.min=1 -Dtests.iter=10 and I fail
in any of those iterations, the runner stops immediately and
prints a failure. Is that correct?

if so I don't understand this code:

if (testsFailed) {
  lastIterFailed = i;
  if (i == TEST_ITER_MIN - 1) {
    if (verbose) {
      System.out.println("\nNOTE: iteration " + lastIterFailed + " failed !");
    }
    break;
  }
}

this only stops if the failure happens exactly at tests.iter.min, but
not if it fails at tests.iter.min+1 or later.
This should rather be something like if (i >= TEST_ITER_MIN - 1), right?

simon
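The proposed `>=` fix can be demonstrated in isolation. A minimal sketch in plain Java; `testIterMin`, `testIter`, and `runIterations` are illustrative stand-ins for the LuceneTestCase settings, not the real fields:

```java
// Minimal sketch of the proposed fix: stop at the first failing
// iteration once at least testIterMin iterations have run.
public class IterStopDemo {

    /**
     * Simulates the iteration loop. Iterations at or after failingIter
     * "fail". Returns the index of the last iteration that ran.
     */
    static int runIterations(int testIterMin, int testIter, int failingIter) {
        int lastIter = -1;
        for (int i = 0; i < testIter; i++) {
            lastIter = i;
            boolean testsFailed = (i >= failingIter);
            // proposed: >= instead of ==, so a failure that first occurs
            // after the minimum iteration count also stops the runner
            if (testsFailed && i >= testIterMin - 1) {
                break;
            }
        }
        return lastIter;
    }
}
```

With the original `==` test, a run that first fails after iteration `testIterMin - 1` would never break out of the loop; with `>=` it stops at the first failure past the minimum.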




Re: bug in LuceneTestCase#TEST_MIN_ITER

2011-04-27 Thread Simon Willnauer
Fixed the behavior in Revision: 1097097


simon

On Tue, Apr 26, 2011 at 6:14 PM, Shai Erera  wrote:
> I think you're right Simon !
>
> Obviously I didn't test it with that scenario in mind :).
>
> Shai
>
> On Tue, Apr 26, 2011 at 6:15 PM, Simon Willnauer
>  wrote:
>>
>> hey, I wonder how this TEST_MIN_ITER feature works though...
>> I expect that if I set -Dtests.iter.min=1 -Dtests.iter=10 and I fail
>> in any of those iterations, the runner stops immediately and
>> prints a failure. Is that correct?
>>
>> if so I don't understand this code:
>>
>> if (testsFailed) {
>>   lastIterFailed = i;
>>   if (i == TEST_ITER_MIN - 1) {
>>     if (verbose) {
>>       System.out.println("\nNOTE: iteration " + lastIterFailed + " failed !");
>>     }
>>     break;
>>   }
>> }
>>
>> this only stops if the failure happens exactly at tests.iter.min, but
>> not if it fails at tests.iter.min+1 or later.
>> This should rather be something like if (i >= TEST_ITER_MIN - 1), right?
>>
>> simon
>>




Code Freeze on realtime_search branch

2011-04-29 Thread Simon Willnauer
Hey folks,

LUCENE-3023 aims to land the considerably large
DocumentsWriterPerThread (DWPT) refactoring on trunk.
During the last weeks we have put lots of effort into cleaning up the
code, fixing javadocs, and running tests locally
as well as on Jenkins. We reached the point where we are able to
create a final patch for review and land this
exciting refactoring on trunk very soon. I committed the CHANGES.TXT
entry (also appended below) a couple of minutes ago, so from now on
we freeze the branch for final review (Robert, can you create a new
"final" patch and upload it to LUCENE-3023?).
Any comments should go to [1] or as a reply to this email. If there is
no blocker coming up we plan to reintegrate the
branch and commit it to trunk early next week. For those who want some
background what DWPT does read: [2]

Note: this change will not change the index file format, so there is no
need to reindex for trunk users. Yet, I will send a heads-up next week
with an overview of what has changed.

Simon

[1] https://issues.apache.org/jira/browse/LUCENE-3023
[2] http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/


* LUCENE-2956, LUCENE-2573, LUCENE-2324, LUCENE-2555: Changes from
  DocumentsWriterPerThread:

  - IndexWriter now uses a DocumentsWriter per thread when indexing documents.
Each DocumentsWriterPerThread indexes documents in its own private segment,
and the in memory segments are no longer merged on flush.  Instead, each
segment is separately flushed to disk and subsequently merged with normal
segment merging.

  - DocumentsWriterPerThread (DWPT) is now flushed concurrently based on a
FlushPolicy.  When a DWPT is flushed, a fresh DWPT is swapped in so that
indexing may continue concurrently with flushing.  The selected
DWPT flushes all its RAM resident documents to disk.  Note: Segment flushes
don't flush all RAM resident documents but only the documents private to
the DWPT selected for flushing.

  - Flushing is now controlled by FlushPolicy that is called for every add,
update or delete on IndexWriter. By default DWPTs are flushed either on
maxBufferedDocs per DWPT or the global active used memory. Once the active
memory exceeds ramBufferSizeMB only the largest DWPT is selected for
flushing and the memory used by this DWPT is subtracted from the active
memory and added to a flushing memory pool, which can lead to temporarily
higher memory usage due to ongoing indexing.

  - IndexWriter now can utilize ramBufferSize > 2048 MB. Each DWPT can address
up to 2048 MB memory such that the ramBufferSize is now bounded by the max
number of DWPTs available in the used DocumentsWriterPerThreadPool.
IndexWriter's net memory consumption can grow far beyond the 2048 MB limit if
the application can use all available DWPTs. To prevent a DWPT from
exhausting its address space IndexWriter will forcefully flush a DWPT if its
hard memory limit is exceeded. The RAMPerThreadHardLimitMB can be controlled
via IndexWriterConfig and defaults to 1945 MB.
Since IndexWriter flushes DWPT concurrently not all memory is released
immediately. Applications should still use a ramBufferSize significantly
lower than the JVM's available heap memory since under high load multiple
flushing DWPT can consume substantial transient memory when IO performance
is slow relative to indexing rate.

  - IndexWriter#commit now doesn't block concurrent indexing while flushing all
'currently' RAM resident documents to disk. Yet, flushes that occur while
a full flush is running are queued and will happen after all DWPTs involved
in the full flush are done flushing. Applications that use multiple threads
during indexing and trigger a full flush (e.g. call commit() or open a new
NRT reader) can use significantly more transient memory.

  - IndexWriter#addDocument and IndexWriter.updateDocument can block indexing
threads if the number of active plus the number of flushing DWPTs exceeds a
safety limit. By default this happens when 2 * the max number of available
thread states (DWPTPool) is exceeded. This safety limit prevents applications from
exhausting their available memory if flushing can't keep up with
concurrently indexing threads.

  - IndexWriter only applies and flushes deletes if the maxBufferedDelTerms
limit is reached during indexing. No segment flushes will be triggered
due to this setting.

  - IndexWriter#flush(boolean, boolean) doesn't synchronize on IndexWriter
anymore. A dedicated flushLock has been introduced to prevent multiple full-
flushes happening concurrently.

  - DocumentsWriter doesn't write shared doc stores anymore.

  (Mike McCandless, Michael Busch, Simon Willnauer)
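The flush-by-RAM rule described above (once the global active memory exceeds ramBufferSizeMB, only the single largest DWPT is selected for flushing) can be sketched as a minimal, self-contained simulation. Note this is an illustrative sketch, not Lucene's actual FlushPolicy API; the class and method names are invented for this example:

```java
// Sketch of the "flush largest DWPT" selection rule from the CHANGES entry
// above. Hypothetical names; Lucene's real FlushPolicy works per add/update/
// delete call and tracks flushing memory in a separate pool.
class FlushPolicySketch {

    // Given the RAM used by each active DWPT, return the index of the DWPT
    // to flush (the largest one) once total active memory exceeds the
    // configured ramBufferSize, or -1 if no flush is needed yet.
    static int selectFlushTarget(long[] dwptBytes, long ramBufferBytes) {
        long active = 0;
        for (long b : dwptBytes) {
            active += b; // global active memory is the sum over all DWPTs
        }
        if (active <= ramBufferBytes) {
            return -1; // under budget: indexing continues, nothing flushes
        }
        int largest = 0;
        for (int i = 1; i < dwptBytes.length; i++) {
            if (dwptBytes[i] > dwptBytes[largest]) {
                largest = i;
            }
        }
        return largest;
    }

    public static void main(String[] args) {
        long[] dwpts = {10_000_000L, 40_000_000L, 25_000_000L};
        // 75 MB active > 64 MB budget -> flush DWPT 1 (the largest).
        System.out.println(selectFlushTarget(dwpts, 64_000_000L)); // prints 1
        // 75 MB active <= 100 MB budget -> no flush.
        System.out.println(selectFlushTarget(dwpts, 100_000_000L)); // prints -1
    }
}
```

Per the entry above, the selected DWPT's memory then moves from the active pool to a flushing pool rather than being released immediately, which is why transient memory can temporarily exceed ramBufferSizeMB while flushes are in flight.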

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Link to nightly build test reports on main Lucene site needs updating

2011-04-30 Thread Simon Willnauer
thanks tom,
I cced dev@l.a.o

simon

On Fri, Apr 29, 2011 at 11:14 PM, Burton-West, Tom  wrote:
> Hello,
>
> I went to look at the "Hudson nightly builds" and tried to follow the link 
> from the main Lucene page
> http://lucene.apache.org/java/docs/developer-resources.html#Nightly
>
>
> The links  to the Clover Test Coverage Reports  point to 
> http://hudson.zones.apache.org/hudson/view/Lucene/job/Lucene-trunk/lastSuccessfulBuild/clover/
>   but apparently hudson.zones.apache.org is no longer being used.  I think 
> the link should point to somewhere on  
> https://builds.apache.org/hudson/job/Lucene-trunk/.
> Is this the right list to alert whoever maintains the main Lucene pages on 
> lucene.apache.org?
> Tom
>
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Link to nightly build test reports on main Lucene site needs updating

2011-05-01 Thread Simon Willnauer
Thanks for fixing, Uwe!

On Sun, May 1, 2011 at 12:04 PM, Uwe Schindler  wrote:
> I fixed the nightly docs, once the webserver mirrors them from SVN they 
> should appear. The developer-resources page was completely broken. It now 
> also contains references to the stable 3.x branch as most users would prefer 
> that one to fix latest bugs but don’t want to have a backwards-incompatible 
> version.
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>
>> -Original Message-
>> From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
>> Sent: Saturday, April 30, 2011 9:42 PM
>> To: java-u...@lucene.apache.org
>> Cc: dev@lucene.apache.org
>> Subject: Re: Link to nightly build test reports on main Lucene site needs
>> updating
>>
>> thanks tom,
>> I cced dev@l.a.o
>>
>> simon
>>
>> On Fri, Apr 29, 2011 at 11:14 PM, Burton-West, Tom 
>> wrote:
>> > Hello,
>> >
>> > I went to look at the "Hudson nightly builds" and tried to follow the
>> > link from the main Lucene page
>> > http://lucene.apache.org/java/docs/developer-resources.html#Nightly
>> >
>> >
>> > The links  to the Clover Test Coverage Reports  point to
>> http://hudson.zones.apache.org/hudson/view/Lucene/job/Lucene-
>> trunk/lastSuccessfulBuild/clover/  but apparently hudson.zones.apache.org
>> is no longer being used.  I think the link should point to somewhere
>> on  https://builds.apache.org/hudson/job/Lucene-trunk/.
>> > Is this the right list to alert whoever maintains the main Lucene pages on
>> lucene.apache.org?
>> > Tom
>> >
>> >
>> >
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional
>> commands, e-mail: dev-h...@lucene.apache.org
>
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



33 Days left to Berlin Buzzwords 2011

2011-05-04 Thread Simon Willnauer
hey folks,

Berlin Buzzwords 2011 is close: only 33 days left until the big Search,
Store and Scale open source crowd gathers
in Berlin on June 6th/7th.

The conference again focuses on the topics search,
data analysis and NoSQL. It is to take place on June 6/7th 2011 in Berlin.

We are looking forward to two awesome keynote speakers who shaped the world of
open source data analysis: Doug Cutting (founder of Apache Lucene and
Hadoop) as
well as Ted Dunning (Chief Application Architect at MapR Technologies
and active
developer at Apache Hadoop and Mahout).

We are amazed by the number and quality of the talk submissions we
received. As a result,
this year we have added one more track to the main conference. If you haven't
done so already, make sure to book your ticket now - early bird tickets have
been sold out since April 7th and there might not be many tickets left.

As we would like to give visitors of our main conference a reason to stay in
town for the whole week, we have been talking to local co-working spaces and
companies asking them for free space and WiFi to host Hackathons right after the
main conference - that is on June 8th through 10th.

If you would like to gather with fellow developers and users of your project,
fix bugs together, hack on new features or give users a hands-on introduction to
your tools, please submit your workshop proposal to our wiki:

http://berlinbuzzwords.de/node/428

Please note that slots are assigned on a first-come, first-served basis. We are
doing our best to get you connected; however, space is limited.

The deal is simple: We get you in touch with a conference room provider. Your
event gets promoted in our schedule. Coordination however is completely up to
you: Make sure to provide an interesting abstract, provide a Hackathon
registration area - see the Barcamp page for a good example:

http://berlinbuzzwords.de/wiki/barcamp

Attending Hackathons requires a Berlin Buzzwords ticket and (then free)
registration at the Hackathon in question.

Hope I see you all around in Berlin,

Simon

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: modularization discussion

2011-05-04 Thread Simon Willnauer
On Wed, May 4, 2011 at 3:49 PM, Mark Miller  wrote:
>
> On May 4, 2011, at 9:42 AM, Uwe Schindler wrote:
>
>> Solr has no performance testing framework, see the issue from today 
>> (SOLR-2493).
>
> Come to Berlin Buzzwords!
I think I will come :)
simon
>
> (I know you already are :) )
>
> - Mark Miller
> lucidimagination.com
>
> Lucene/Solr User Conference
> May 25-26, San Francisco
> www.lucenerevolution.org
>
>
>
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: modularization discussion

2011-05-05 Thread Simon Willnauer
Hey folks

On Tue, May 3, 2011 at 6:49 PM, Michael McCandless
 wrote:
> Isn't our end goal here a bunch of well factored search modules?  Ie,
> fast forward a year or two and I think we should have modules like
> these:

I think we have two camps here (10k feet view):

1. wants to move towards modularization might support all the modules
mike has listed below
2. wants to stick with Solr's current architecture and remain
"monolithic" (not negative in this case) as much as possible

I think we can meet somewhere in between and agree on certain modules
that should be available to Lucene users as well. The ones I have in
mind are
primary search features like:
- Faceting
- Highlighting
- Suggest
- Function Query (consolidation is needed here!)
- Analyzer factories

things like distribution and replication should remain in solr IMO but
might be moved to a more extensible API so that people can add their
own implementations. I am thinking about things like the ZooKeeper
support, which might not be a good solution for everybody, e.g. where folks
already have JGroups infrastructure. So I think we can work towards 2
distinct goals:
1. extract common search features into modules
2. refactor solr to be more "elastic" / "distributed"  and extensible
with respect to those goals.

maybe we can get agreement on such a basis though.

let me know what you think

simon
>
>  * Faceting
>
>  * Highlighting
>
>  * Suggest (good patch is on LUCENE-2995)
>
>  * Schema
>
>  * Query impls
>
>  * Query parsers
>
>  * Analyzers (good progress here already, thanks Robert!),
>    incl. factories/XML configuration (still need this)
>
>  * Database import (DIH)
>
>  * Web app
>
>  * Distribution/replication
>
>  * Doc set representations
>
>  * Collapse/grouping
>
>  * Caches
>
>  * Similarity/scoring impls (BM25, etc.)
>
>  * Codecs
>
>  * Joins
>
>  * Lucene core
>
> In this future, much of this code came from what is now Solr and
> Lucene, but we should freely and aggressively poach from other
> projects when appropriate (and license/provenance is OK).
>
> I keep seeing all these cool "compressed int set" projects popping
> up... surely these are useful for us.  Solr poached a doc set impl
> from Nutch; probably there's other stuff to poach from Nutch, Mahout,
> etc.
>
> Katta's doing something sweet with distribution/replication; let's
> poach & merge w/ Solr's approach.  There are various facet impls out
> there (Bobo browse/Zoie; Toke's; Elastic Search); let's poach & merge
> with Solr's.
>
> Elastic Search has lots of cool stuff, too, under ASL2.
>
> All these external open-source projects are fair game for poaching and
> refactoring into shared modules, along with what is now Solr and
> Lucene sources.
>
> In this ideal future, Solr becomes the bundling and default/example
> configuration of the Web App and other modules, much like how the
> various Linux distros bundle different stuff together around the Linux
> kernel.  And if you are an advanced app and don't need the webapp
> part, you can cherry pick the huper duper modules you do need and
> directly embedded into your app.
>
> Isn't this the future we are working towards?
>
> Mike
>
> http://blog.mikemccandless.com
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-tests-only-docvalues-branch - Build # 1064 - Failure

2011-05-05 Thread Simon Willnauer
I removed the @Override annotation on that file!

simon

On Thu, May 5, 2011 at 11:03 AM, Apache Jenkins Server
 wrote:
> Build: 
> https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-docvalues-branch/1064/
>
> No tests ran.
>
> Build Log (for compile errors):
> [...truncated 63 lines...]
> + cd 
> /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout
> + JAVA_HOME=/home/hudson/tools/java/latest1.5 
> /home/hudson/tools/ant/latest1.7/bin/ant clean
> Buildfile: build.xml
>
> clean:
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build
>
> clean:
>
> clean:
>     [echo] Building analyzers-common...
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/common
>     [echo] Building analyzers-icu...
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/icu
>     [echo] Building analyzers-phonetic...
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/phonetic
>     [echo] Building analyzers-smartcn...
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/smartcn
>     [echo] Building analyzers-stempel...
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/analysis/build/stempel
>     [echo] Building benchmark...
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/modules/benchmark/build
>
> clean-contrib:
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/analysis-extras/build
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/analysis-extras/lucene-libs
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/clustering/build
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/dataimporthandler/target
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/extraction/build
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/contrib/uima/build
>
> clean:
>   [delete] Deleting directory 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/solr/build
>
> BUILD SUCCESSFUL
> Total time: 7 seconds
> + cd 
> /home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene
> + JAVA_HOME=/home/hudson/tools/java/latest1.5 
> /home/hudson/tools/ant/latest1.7/bin/ant compile compile-test build-contrib
> Buildfile: build.xml
>
> jflex-uptodate-check:
>
> jflex-notice:
>
> javacc-uptodate-check:
>
> javacc-notice:
>
> init:
>
> clover.setup:
>
> clover.info:
>     [echo]
>     [echo]       Clover not found. Code coverage reports disabled.
>     [echo]
>
> clover:
>
> common.compile-core:
>    [mkdir] Created dir: 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
>    [javac] Compiling 536 source files to 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/build/classes/java
>    [javac] 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/util/Version.java:80:
>  warning: [dep-ann] deprecated name isnt annotated with @Deprecated
>    [javac]   public boolean onOrAfter(Version other) {
>    [javac]                  ^
>    [javac] 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/index/codecs/DefaultDocValuesConsumer.java:49:
>  method does not override a method from its superclass
>    [javac]   @Override
>    [javac]    ^
>    [javac] 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/queryParser/CharStream.java:34:
>  warning: [dep-ann] deprecated name isnt annotated with @Deprecated
>    [javac]   int getColumn();
>    [javac]       ^
>    [javac] 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-docvalues-branch/checkout/lucene/src/java/org/apache/lucene/que

Re: modularization discussion

2011-05-05 Thread Simon Willnauer
On Thu, May 5, 2011 at 4:41 PM, Mark Miller  wrote:
>
> On May 5, 2011, at 10:25 AM, Grant Ingersoll wrote:
>
>> 3.  Those who think most should be modularized, but realize it's a ton of 
>> work for an unproven gain (although most admit it is a highly likely gain) 
>> and should be handled on a case-by-case basis as people do the work.   I 
>> don't have anything against modularization, I just know, given my schedule, 
>> I won't be able to block off weeks of time to do it.  I'm happy to review 
>> where/when I can.
>
> +1. From what I have gathered, Grant and I come down pretty much on the same 
> page on most of this stuff. Yeah, that means I'm reevaluating my position :) 
> but seems to be the case.

so this is one thing I really don't understand. you say you are in the
3rd camp. Guys in that camp have not much time to do the work but
still are not willing to sign up for what we want to modularize.
Nobody asks you to do the work; I only ask you to say ok, I think this
is good, and NOT sit in the way blocking others. This is really
what the 3rd camp is about to me, but maybe I misunderstand something
here.

Again you are saying you are not in camp 1, but you still want to
fiddle around with long discussions before we get anything done (and
eventually be against it - nothing personal) because you don't have
enough time to fit stuff in your schedule. This makes no sense to me.
That case-by-case stuff makes me sick. Let's put some goals out, say
ok, this makes sense in a module and this doesn't, and let folks work on it.
We need some agreement here and I think we have written enough emails
to make our points. I think we should agree on a set of things and
once we are there we can talk again. Dreams vs. Babysteps!

Let's settle on something now, today or next week, and stop this waste of
time. I am happy with an agreement that we don't factor anything out and
all remains in Solr, but we need to move here! After all these
discussions I don't have any motivation to work on it anyway. I think I
need to step back for a while along those lines!

simon
>
> Except I'm more open to IRC discussion :)
>
> - Mark Miller
> lucidimagination.com
>
> Lucene/Solr User Conference
> May 25-26, San Francisco
> www.lucenerevolution.org
>
>
>
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 7757 - Failure

2011-05-05 Thread Simon Willnauer
the actual exception we are tripping here is

 java.lang.RuntimeException: java.lang.AssertionError
[junit] at
org.apache.lucene.index.TestFlushByRamOrCountsPolicy$IndexThread.run(TestFlushByRamOrCountsPolicy.java:328)
[junit] Caused by: java.lang.AssertionError
[junit] at
org.apache.lucene.index.DocumentsWriterFlushControl.setFlushPending(DocumentsWriterFlushControl.java:169)
[junit] at
org.apache.lucene.index.DocumentsWriterFlushControl.internalTryCheckOutForFlush(DocumentsWriterFlushControl.java:202)
[junit] at
org.apache.lucene.index.DocumentsWriterFlushControl.markForFullFlush(DocumentsWriterFlushControl.java:333)
[junit] at
org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:500)
[junit] at
org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2622)
[junit] at 
org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2599)
[junit] at
org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2465)
[junit] at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2538)
[junit] at
org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2520)
[junit] at
org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2504)
[junit] at
org.apache.lucene.index.TestFlushByRamOrCountsPolicy$IndexThread.run(TestFlushByRamOrCountsPolicy.java:326)
[junit] *** Thread: Thread-106 ***

I will take care of it tomorrow...

On Thu, May 5, 2011 at 9:45 PM, Apache Jenkins Server
 wrote:
> Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/7757/
>
> 1 tests failed.
> REGRESSION:  
> org.apache.lucene.index.TestFlushByRamOrCountsPolicy.testHealthyness
>
> Error Message:
>  flushingQueue: DWDQ: [ generation: 9 ] currentqueue: DWDQ: [ generation: 10 
> ] perThread queue: DWDQ: [ generation: 0 ] numDocsInRam: 3
>
> Stack Trace:
> junit.framework.AssertionFailedError:  flushingQueue: DWDQ: [ generation: 9 ] 
> currentqueue: DWDQ: [ generation: 10 ] perThread queue: DWDQ: [ generation: 0 
> ] numDocsInRam: 3
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
>        at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)
>        at 
> org.apache.lucene.index.DocumentsWriterFlushControl.markForFullFlush(DocumentsWriterFlushControl.java:326)
>        at 
> org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:500)
>        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2622)
>        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2599)
>        at 
> org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1051)
>        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1015)
>        at 
> org.apache.lucene.index.TestFlushByRamOrCountsPolicy.testHealthyness(TestFlushByRamOrCountsPolicy.java:276)
>
>
>
>
> Build Log (for compile errors):
> [...truncated 3370 lines...]
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 7757 - Failure

2011-05-05 Thread Simon Willnauer
I committed a fix for this in revision 1100103

simon

On Thu, May 5, 2011 at 10:24 PM, Simon Willnauer
 wrote:
> the actual exception we are tripping here is
>
>  java.lang.RuntimeException: java.lang.AssertionError
>    [junit]     at
> org.apache.lucene.index.TestFlushByRamOrCountsPolicy$IndexThread.run(TestFlushByRamOrCountsPolicy.java:328)
>    [junit] Caused by: java.lang.AssertionError
>    [junit]     at
> org.apache.lucene.index.DocumentsWriterFlushControl.setFlushPending(DocumentsWriterFlushControl.java:169)
>    [junit]     at
> org.apache.lucene.index.DocumentsWriterFlushControl.internalTryCheckOutForFlush(DocumentsWriterFlushControl.java:202)
>    [junit]     at
> org.apache.lucene.index.DocumentsWriterFlushControl.markForFullFlush(DocumentsWriterFlushControl.java:333)
>    [junit]     at
> org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:500)
>    [junit]     at
> org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2622)
>    [junit]     at 
> org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2599)
>    [junit]     at
> org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2465)
>    [junit]     at
> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2538)
>    [junit]     at
> org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2520)
>    [junit]     at
> org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2504)
>    [junit]     at
> org.apache.lucene.index.TestFlushByRamOrCountsPolicy$IndexThread.run(TestFlushByRamOrCountsPolicy.java:326)
>    [junit] *** Thread: Thread-106 ***
>
> I will take care of it tomorrow...
>
> On Thu, May 5, 2011 at 9:45 PM, Apache Jenkins Server
>  wrote:
>> Build: 
>> https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/7757/
>>
>> 1 tests failed.
>> REGRESSION:  
>> org.apache.lucene.index.TestFlushByRamOrCountsPolicy.testHealthyness
>>
>> Error Message:
>>  flushingQueue: DWDQ: [ generation: 9 ] currentqueue: DWDQ: [ generation: 10 
>> ] perThread queue: DWDQ: [ generation: 0 ] numDocsInRam: 3
>>
>> Stack Trace:
>> junit.framework.AssertionFailedError:  flushingQueue: DWDQ: [ generation: 9 
>> ] currentqueue: DWDQ: [ generation: 10 ] perThread queue: DWDQ: [ 
>> generation: 0 ] numDocsInRam: 3
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1260)
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1189)
>>        at 
>> org.apache.lucene.index.DocumentsWriterFlushControl.markForFullFlush(DocumentsWriterFlushControl.java:326)
>>        at 
>> org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:500)
>>        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:2622)
>>        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:2599)
>>        at 
>> org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1051)
>>        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1015)
>>        at 
>> org.apache.lucene.index.TestFlushByRamOrCountsPolicy.testHealthyness(TestFlushByRamOrCountsPolicy.java:276)
>>
>>
>>
>>
>> Build Log (for compile errors):
>> [...truncated 3370 lines...]
>>
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 7777 - Still Failing

2011-05-06 Thread Simon Willnauer
seems like there is a file missing - I am working on it

simon

On Fri, May 6, 2011 at 9:40 AM, Apache Jenkins Server
 wrote:
> Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x//
>
> No tests ran.
>
> Build Log (for compile errors):
> [...truncated 472 lines...]
>    [javac] location: class 
> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>    [javac]     ReversePathHierarchyTokenizer t = new 
> ReversePathHierarchyTokenizer( new StringReader(path) );
>    [javac]     ^
>    [javac] 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:50:
>  cannot find symbol
>    [javac] symbol  : class ReversePathHierarchyTokenizer
>    [javac] location: class 
> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>    [javac]     ReversePathHierarchyTokenizer t = new 
> ReversePathHierarchyTokenizer( new StringReader(path) );
>    [javac]                                           ^
>    [javac] 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:61:
>  cannot find symbol
>    [javac] symbol  : class ReversePathHierarchyTokenizer
>    [javac] location: class 
> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>    [javac]     ReversePathHierarchyTokenizer t = new 
> ReversePathHierarchyTokenizer( new StringReader(path) );
>    [javac]     ^
>    [javac] 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:61:
>  cannot find symbol
>    [javac] symbol  : class ReversePathHierarchyTokenizer
>    [javac] location: class 
> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>    [javac]     ReversePathHierarchyTokenizer t = new 
> ReversePathHierarchyTokenizer( new StringReader(path) );
>    [javac]                                           ^
>    [javac] 
> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:72:
>  cannot find symbol
>    [javac] symbol  : class ReversePathHierarchyTokenizer
>    [javac] location: class 
> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>    [javac]     ReversePathHierarchyTokenizer t = new 
> ReversePathHierarchyTokenizer( new StringReader(path) );
>    [javac]     ^
>    [javac] 

Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 7777 - Still Failing

2011-05-06 Thread Simon Willnauer
merged missing file in and committed revision 1100131.

simon


On Fri, May 6, 2011 at 10:40 AM, Simon Willnauer
 wrote:
> seems like there is a file missing - I am working on it
>
> simon
>
> On Fri, May 6, 2011 at 9:40 AM, Apache Jenkins Server
>  wrote:
>> Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x//
>>
>> No tests ran.
>>
>> Build Log (for compile errors):
>> [...truncated 472 lines...]
>>    [javac] location: class 
>> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>>    [javac]     ReversePathHierarchyTokenizer t = new 
>> ReversePathHierarchyTokenizer( new StringReader(path) );
>>    [javac]     ^
>>    [javac] 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:50:
>>  cannot find symbol
>>    [javac] symbol  : class ReversePathHierarchyTokenizer
>>    [javac] location: class 
>> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>>    [javac]     ReversePathHierarchyTokenizer t = new 
>> ReversePathHierarchyTokenizer( new StringReader(path) );
>>    [javac]                                           ^
>>    [javac] 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:61:
>>  cannot find symbol
>>    [javac] symbol  : class ReversePathHierarchyTokenizer
>>    [javac] location: class 
>> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>>    [javac]     ReversePathHierarchyTokenizer t = new 
>> ReversePathHierarchyTokenizer( new StringReader(path) );
>>    [javac]     ^
>>    [javac] 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:61:
>>  cannot find symbol
>>    [javac] symbol  : class ReversePathHierarchyTokenizer
>>    [javac] location: class 
>> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>>    [javac]     ReversePathHierarchyTokenizer t = new 
>> ReversePathHierarchyTokenizer( new StringReader(path) );
>>    [javac]                                           ^
>>    [javac] 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:72:
>>  cannot find symbol
>>    [javac] symbol  : class ReversePathHierarchyTokenizer
>>    [javac] location: class 
>> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>>    [javac]     ReversePathHierarchyTokenizer t = new 
>> ReversePathHierarchyTokenizer( new StringReader(path) );
>>    [javac]     ^
>>    [javac] 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:72:
>>  cannot find symbol
>>    [javac] symbol  : class ReversePathHierarchyTokenizer
>>    [javac] location: class 
>> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>>    [javac]     ReversePathHierarchyTokenizer t = new 
>> ReversePathHierarchyTokenizer( new StringReader(path) );
>>    [javac]                                           ^
>>    [javac] 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:83:
>>  cannot find symbol
>>    [javac] symbol  : class ReversePathHierarchyTokenizer
>>    [javac] location: class 
>> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>>    [javac]     ReversePathHierarchyTokenizer t = new 
>> ReversePathHierarchyTokenizer( new StringReader(path) );
>>    [javac]     ^
>>    [javac] 
>> /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-3.x/checkout/lucene/contrib/analyzers/common/src/test/org/apache/lucene/analysis/path/TestReversePathHierarchyTokenizer.java:83:
>>  cannot find symbol
>>    [javac] symbol  : class ReversePathHierarchyTokenizer
>>    [javac] location: class 
>> org.apache.lucene.analysis.path.TestReversePathHierarchyTokenizer
>>    [javac]     ReversePathHierarchyTokenizer t = new 
>> ReversePathHierarchyTokenizer( new StringReader(path) );
>>    [javac]                               

Re: modularization discussion

2011-05-07 Thread Simon Willnauer
On Sat, May 7, 2011 at 12:30 PM, Michael McCandless
 wrote:
> I agree: refactoring is TONS of work.  Even cases that seem cut and
> dry, from a distance, quickly prove to be hairy (just ask Robert about
> refactoring analyzers).
>
> However, I think "unproven gain" is too strong.  EG, just a few days
> ago we had a user thread asking how to use auto-suggest outside of
> Solr.  Once we commit the suggest module, this is easy/ier for that
> user, and now we have one more user testing things, finding bugs,
> maybe offering improvements, etc.  I think the gains of each
> refactoring are potentially large, but they are not immediate -- they
> accrue over time.  It's an investment.
>
> Also: I'm in no way asking/expecting other devs to sign up to do
> refactoring (your response seems to imply this).  Nobody can do such a
> thing.  We all scratch our own itches and I'm not asking you to
> scratch mine :)
>
> What I am asking is that if someone wants to scratch this itch (factor
> out XXX as a module), they are fully free to do so, as long as it
> doesn't harm Solr's/Lucene's current functions, performance, etc.  We
> don't seem to have this freedom today, and this is, I think, the core
> conflict.
>
> Grant if I'm reading your response right, you agree with that freedom
> (others are free to refactor); you're just tempering in a good dose of
> reality ("refactoring is hard"), which I agree with.

Mike, thank you for this email - this is the consensus we need to have!!!

+1 for this... I think this is also what the board report should
contain but I will reply to this separately.

simon
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Thu, May 5, 2011 at 10:25 AM, Grant Ingersoll  wrote:
>>
>> On May 5, 2011, at 4:15 AM, Simon Willnauer wrote:
>>
>>> Hey folks
>>>
>>> On Tue, May 3, 2011 at 6:49 PM, Michael McCandless
>>>  wrote:
>>>> Isn't our end goal here a bunch of well factored search modules?  Ie,
>>>> fast forward a year or two and I think we should have modules like
>>>> these:
>>>
>>> I think we have two camps here (10k feet view):
>>>
>>
>> I'd say 3 camps:
>>
>>> 1. wants to move towards modularization might support all the modules
>>> mike has listed below
>>> 2. wants to stick with Solr's current architecture and remain
>>> "monolithic" (not negative in this case) as much as possible
>>
>> 3.  Those who think most should be modularized, but realize it's a ton of 
>> work for an unproven gain (although most admit it is a highly likely gain) 
>> and should be handled on a case-by-case basis as people do the work.   I 
>> don't have anything against modularization, I just know, given my schedule, 
>> I won't be able to block off weeks of time to do it.  I'm happy to review 
>> where/when I can.
>>
>>
>>>
>>> I think we can meet somewhere in between and agree on certain modules
>>> that should be available to Lucene users as well. The ones I have in
>>> mind are
>>> primary search features like:
>>> - Faceting
>>
>> Yeah, for instance, Bobo seems to have some interesting faceting 
>> implementations that are ASL, perhaps we can combine into this new faceting 
>> module.
>>
>>> - Highlighting
>>> - Suggest
>>> - Function Query (consolidation is needed here!)
>>> - Analyzer factories
>>
>> +1.
>>
>>>
>>> things like distribution and replication should remain in solr IMO but
>>> might be moved to a more extensible API so that people can add their
>>> own implementation.
>>
>> And, of course, all the web tier stuff (response writers, inputs, etc.)
>>
>>> I am thinking about things like the ZooKeeper
>>> support that might not be a good solution for everybody, e.g. where folks
>>> already have JGroups infrastructure.
>>
>> Or other similar solutions.  I wonder about using a ZeroConf implementation 
>> that can do self-discovery.
>>
>>> So I think we can work towards 2
>>> distinct goals.
>>> 1. extract common search features into modules
>>> 2. refactor solr to be more "elastic" / "distributed"  and extensible
>>> with respect to those goals.
>>
>> 3. Make it easier for Solr to be programmatically configured by decoupling 
>> the reading of schema.xml and solrconfig.xml from the code that actually 
>> contains the structures for the properties (I

Re: modularization discussion

2011-05-07 Thread Simon Willnauer
On Sat, May 7, 2011 at 1:02 PM, Michael McCandless
 wrote:
> OK I opened:
>
>    https://issues.apache.org/jira/browse/LUCENE-3079
awesome!

+1
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Sat, May 7, 2011 at 6:46 AM, Michael McCandless
>  wrote:
>> I agree!  And I think you're saying the same thing as Grant.
>>
>> Ie, others are fully free to refactor stuff, as long as they don't
>> hurt Solr/Lucene (functionality, performance).
>>
>> But you are tempering that with a nice dose of reality (successfully
>> factoring out faceting will be insanely hard).
>>
>> I very much agree with that.
>>
>> And, I (and other refactor-itchers) very much want to hear the
>> specific technical skepticism/concerns on a given module: that
>> assessment is awesome and very useful.  In fact, I love your
>> enumeration of how faceting is so well integrated into Solr so much
>> that I'll go open an issue (to factor out faceting), and put your list
>> in!
>>
>> I think this will mean, in practice, that the refactoring should
>> itself proceed in baby steps.  Ie, birthing a new faceting module,
>> iterating on it, etc., and then at some point cutting Solr over to it,
>> are two events likely spread out substantially in time.
>>
>> Freedom to refactor/poach is the bread and butter of open source.
>>
>> Mike
>>
>> http://blog.mikemccandless.com
>>
>> On Fri, May 6, 2011 at 4:35 PM, Chris Hostetter
>>  wrote:
>>>
>>> : To me, the third camp is just saying the proof is in the pudding.  If
>>> : you want to refactor, then go for it.  Just make sure everything still
>>> : works, which of course I know people will (but part of that means
>>> : actually running Solr, IMO).  Perhaps more importantly, don't get mad
>>> : if I have only one day a week to work on Lucene/Solr and I spend
>>> : it putting a specific feature in a specific place.  Just because
>>> : something can/should be modularized, doesn't mean that a person working
>>> : in that area must do it before they add whatever they were working on.
>>> : For instance, if and when function queries are a module, I will add to
>>> : them there and be happy to do so.  In the meantime, I will likely add to
>>> : them in Solr if that is something I happen to be interested in at that
>>> : time b/c I can certainly add a new function in a day, but I can't
>>> : refactor the whole module _and_ add my new function in a day.
>>>
>>> +1
>>>
>>> I want to get that printed on a t-shirt
>>>
>>> the corollary issue in my mind...
>>>
>>> I am happily in favor of code reuse and modularization in the abstract,
>>> and when it works in practice I'm pleasantly delighted.
>>>
>>> But when people talk about modularization as a goal, and make a laundry
>>> list of things in Solr that people think should be refactored into modules
>>> (w/o showing specifics of what that module would look like) then i have a
>>> hard time buying into some of these ideas panning out in a way that:
>>>  a) is a useful module to people in and of itself
>>>  b) doesn't hamstring the evolution/performance in solr.
>>>
>>> To look at "faceting" as a concrete example, there are big reasons
>>> faceting works so well in Solr: Solr has total control over the
>>> index, knows exactly when the index has changed to rebuild caches, has a
>>> strict schema so it can make sense of field types and
>>> pick faceting algos accordingly, has multi-phase distributed search
>>> approach to get exact counts efficiently across multiple shards, etc...
>>> (and there are still a lot of additional enhancements and improvements
>>> that can be made to take even more advantage of knowledge solr has because
>>> it "owns" the index that no one has had time to tackle)
>>>
>>> I find it really hard to picture a way that this code could be refactored
>>> into a reusable module in such a way that it could have an API that would
>>> be easily usable outside of Solr -- and when i do get a glimmer of an
>>> inkling of what that might look like, that vision scares me because of how
>>> that API might then "hobble" Solr's ability to leverage it's total control
>>> of the underlying index to add additional performance/features.
>>>
>>> To be crystal clear: I recognize that this is *my* hangup -- I am not
>>> suggesting that "I am short sighted and have little imagination
>>> therefore this code should never be modularized."
>>>
>>> I'm trying to explain why i *personally* am hesitant and sceptical of how
>>> well modularizations of features like this might actually work in
>>> practice, and why i'm not eager to jump in and contribute on a goal whose
>>> end result is something that i can't fully picture (and when i can picture
>>> it, i'm a little scared by what i see)
>>>
>>> That doesn't mean i'm opposed to it happening -- i would love to live in
>>> the land of candy where houses are made of ginger bread and sugar plums
>>> grow on trees, I'm just too skeptical that such a land exists (or is as
>>> great as legend describes) to go slogging al

Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 7924 - Failure

2011-05-10 Thread Simon Willnauer
On Tue, May 10, 2011 at 8:02 PM, Michael McCandless
 wrote:
> I committed fix... false failure tickled by the cool new sneaky
> throttling MockDirWrapper now does!

YAY! :)

simon
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Tue, May 10, 2011 at 1:57 PM, Apache Jenkins Server
>  wrote:
>> Build: 
>> https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-trunk/7924/
>>
>> 1 tests failed.
>> REGRESSION:  
>> org.apache.lucene.index.TestIndexWriter.testThreadInterruptDeadlock
>>
>> Error Message:
>> Some threads threw uncaught exceptions!
>>
>> Stack Trace:
>> junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1282)
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1211)
>>        at 
>> org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:557)
>>
>>
>>
>>
>> Build Log (for compile errors):
>> [...truncated 3241 lines...]
>>
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>>
>
>
>




Re: an optimizable point about Solr

2011-05-12 Thread Simon Willnauer
Hey good catch :)

we already addressed this issue AFAIK here:
https://issues.apache.org/jira/browse/SOLR-2493

So the question is whether we need to do another release, since this seems serious.

simon

2011/5/12 shuigen kang :
> Hi all:
>
> I recently used Solr to set up a search engine for our web site.
> During performance testing, I used JProfiler to analyse Solr's runtime
> behavior and found an optimizable point in the method
> getLuceneVersion(String path, Version def) of class
> org.apache.solr.core.config. This method parses the XML config file
> every time it is invoked.
>
> It is terrible; look at this:
>
> http://dl.iteye.com/upload/picture/pic/89884/00ff1ee5-a156-354d-8eaa-35abdcd1cfa6.jpg
>
> 15.1% + 14.2%: it costs 29.3% of the CPU time, for a method of only minor importance.
>
> So I suggest editing this method like this:
>
> 
>
> public Version getLuceneVersion(String path) {
>     if (luceneVersion == null) {
>         luceneVersion = parseLuceneVersionString(getVal(path, true));
>     }
>     return luceneVersion;
> }
> 
>
> Run the parseLuceneVersionString() method only the first time, to save
> valuable and limited CPU resources.
>
>
>
> Any problems with this? What do you think?
>
>
>
> Best regards.
>
>
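The lazy-initialization idea above is sound, but under concurrent requests two threads could both observe a null cache and parse the XML twice. A minimal self-contained sketch of a thread-safe version of the same caching pattern - all class and method names here are hypothetical stand-ins, not Solr's actual Config API:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the config class: parse once, cache thereafter.
class LazyVersionCache {
    private volatile String luceneVersion;              // cached parse result
    private final AtomicInteger parseCount = new AtomicInteger();

    // Stand-in for the expensive parseLuceneVersionString(getVal(path, true)) call.
    private String parseLuceneVersionString(String raw) {
        parseCount.incrementAndGet();                   // count how often we really parse
        return raw.trim().toUpperCase();
    }

    String getLuceneVersion(String raw) {
        String v = luceneVersion;                       // lock-free fast path (volatile read)
        if (v == null) {
            synchronized (this) {                       // only the very first call pays for this
                v = luceneVersion;
                if (v == null) {                        // re-check: another thread may have won
                    v = parseLuceneVersionString(raw);
                    luceneVersion = v;
                }
            }
        }
        return v;
    }

    int parses() { return parseCount.get(); }
}
```

The volatile read keeps the common case lock-free; the synchronized double-check only matters for the first call, so the 29.3% CPU cost disappears after warm-up.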




Re: Lucene PMC Welcomes 3 New Members

2011-05-12 Thread Simon Willnauer
On Thu, May 12, 2011 at 12:24 PM, Michael McCandless
 wrote:
> Welcome aboard!!
>
+1
> Mike
>
> http://blog.mikemccandless.com
>
> On Wed, May 11, 2011 at 10:08 PM, Grant Ingersoll  wrote:
>> The Lucene PMC would like to announce the addition of Steve Rowe, Shai Erera 
>> and Doron Cohen to the PMC.  All of them have been long time committers to 
>> Lucene and Solr and we look forward to having them on the PMC!
>>
>> Congratulations!
>>
>> -Grant
>>
>>
>
>
>




Re: [JENKINS] Lucene-trunk - Build # 1559 - Still Failing

2011-05-12 Thread Simon Willnauer
I found the issue - created LUCENE-3090 for it...
I am on it.

On Thu, May 12, 2011 at 3:08 PM, Michael McCandless
 wrote:
> I dug some into this heap dump...
>
> Somehow we have a DWPT using WAY too much RAM (301 MB).  A 2nd DWPT
> has 105 MB tied up.  I'm getting these numbers from the bytesUsed
> AtomicLong in the DWPT class from the dump.
>
> This is baffling because, although this test indexes a large amount of
> content for a reasonably long time (5 minutes w/ -nightly), IW's RAM
> buffer is set to 16 MB, so these two DWPTs should have flushed long
> ago.
>
> However, I can't tell from the dump (it doesn't have enough info)
> whether these DWPTs are created by TestNRTThreads, or one of the many
> other tests that run inside the same JVM.  Ie, it's possible (but I
> think unlikely) some other test made these massive DWPTs and then
> somehow failed to clean them up (ie, left references to them).
>
> So my best theory at this point is something is wrong w/ the
> FlushPolicy -- it's not flushing after crossing the 16 MB threshold.
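The numbers above (a 301 MB DWPT against a 16 MB buffer) can only happen if the flush trigger is missed or its result ignored. A minimal sketch of the invariant such a policy should enforce - the names here are hypothetical and deliberately simplified, not Lucene's actual FlushPolicy/DWPT API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified model of a per-writer flush check: once buffered bytes cross
// the configured RAM threshold, the writer must be marked for flushing.
class FlushPolicySketch {
    static final long MB = 1024L * 1024L;
    private final long ramBufferBytes;                     // e.g. 16 MB, as in the test config
    private final AtomicLong bytesUsed = new AtomicLong(); // mirrors the DWPT bytesUsed counter

    FlushPolicySketch(long ramBufferMb) {
        this.ramBufferBytes = ramBufferMb * MB;
    }

    // Called after each document is indexed; true means "flush now".
    // If a caller ignores this result, bytesUsed grows without bound -
    // exactly the 301 MB symptom seen in the heap dump.
    boolean onDocUpdated(long docBytes) {
        return bytesUsed.addAndGet(docBytes) >= ramBufferBytes;
    }

    void onFlushed() {
        bytesUsed.set(0);                                  // flush resets the accounting
    }

    long bytesUsed() {
        return bytesUsed.get();
    }
}
```

With a 16 MB threshold, staying at 15 MB reports no flush, while crossing to 17 MB must report one; a bug anywhere between the counter update and acting on the trigger reproduces the observed leak.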
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Wed, May 11, 2011 at 10:59 PM, Apache Jenkins Server
>  wrote:
>> Build: https://builds.apache.org/hudson/job/Lucene-trunk/1559/
>>
>> 1 tests failed.
>> FAILED:  org.apache.lucene.index.TestNRTThreads.testNRTThreads
>>
>> Error Message:
>> this writer hit an OutOfMemoryError; cannot commit
>>
>> Stack Trace:
>> java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot 
>> commit
>>        at 
>> org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:2456)
>>        at 
>> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2538)
>>        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2520)
>>        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2504)
>>        at 
>> org.apache.lucene.index.TestNRTThreads.testNRTThreads(TestNRTThreads.java:223)
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1282)
>>        at 
>> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1211)
>>
>>
>>
>>
>> Build Log (for compile errors):
>> [...truncated 11983 lines...]
>>
>>
>>
>>
>>
>
>
>




Re: 3.2.0 (or 3.1.1)

2011-05-16 Thread Simon Willnauer
+1 for pushing 3.2!!

There have been discussions about porting DWPT to 3.x, but I think it's
a little premature now and I am still not sure if we should do it at
all. The refactoring is pretty intense throughout all of IndexWriter,
and it integrates with Flex / Codecs. I am not saying it's impossible -
it's certainly doable - but I am not sure it's worth the hassle; let's
rather concentrate on 4.0.

The question is whether we should backport stuff like LUCENE-2881 to
3.2 or hold off until 3.3. Should we do it at all?

simon

On Sat, May 14, 2011 at 12:30 PM, Michael McCandless
 wrote:
> +1 for 3.2.
>
> Mike
>
> http://blog.mikemccandless.com
>
> On Sat, May 14, 2011 at 12:32 AM, Shai Erera  wrote:
>> +1 for 3.2!
>>
>> And also, we should adopt that approach going forward (no more bug fix
>> releases for the stable branch, except for the last release before 4.0
>> is out). That means updating the release TODO with e.g., not creating
>> a branch for 3.2.x, only tag it. When 4.0 is out, we branch 3.x.y out
>> of the last 3.x tag.
>>
>> Shai
>>
>> On Saturday, May 14, 2011, Ryan McKinley  wrote:
>>> On Fri, May 13, 2011 at 6:40 PM, Grant Ingersoll  
>>> wrote:
 It's been just over 1 month since the last release.  We've all said we 
 want to get to about a 3 month release cycle (if not more often).  I think 
 this means we should start shooting for a next release sometime in June.  
 Which, in my mind, means we should start working on wrapping up issues 
 now, IMO.

 Here's what's open for 3.2 against:
 Lucene: https://issues.apache.org/jira/browse/LUCENE/fixforversion/12316070
 Solr: https://issues.apache.org/jira/browse/SOLR/fixforversion/12316172

 Thoughts?

>>>
>>> +1 for 3.2 with a new feature freeze pretty soon
>>>
>>>
>>>
>>
>>
>>
>
>
>




Re: 3.2.0 (or 3.1.1)

2011-05-16 Thread Simon Willnauer
On Mon, May 16, 2011 at 1:30 PM, Robert Muir  wrote:
> On Mon, May 16, 2011 at 7:10 AM, Simon Willnauer
>  wrote:
>> the question is if we should backport stuff like LUCENE-2881 to 3.2 or
>> if we should hold off until 3.3, should we do it at all?
>>
>
> I think it depends solely on whether someone is willing to do the work. The
> only thing I would suggest is that if we did such a thing, it would really
> be preferred to give it around 2 weeks of Hudson to knock
> out problems.
>

Absolutely, but I think we can safely move that to 3.3. I am
busy with other things right now.

simon




Moving towards Lucene 4.0

2011-05-16 Thread Simon Willnauer
Hey folks,

we just started the discussion about Lucene 3.2 and releasing more
often. Yet, I think we should also start planning for Lucene 4.0 soon.
We have tons of stuff in trunk that people want to have and we can't
just keep on talking about it - we need to push this out to our users.
From my perspective we should decide on at least the big outstanding
issues like:

- BulkPostings (my +1 since I want to enable positional scoring on all queries)
- DocValues (pretty close)
- FlexibleScoring (+- 0 I think we should wait how gsoc turns out and
decide then?)
- Codec Support for Stored Fields, Norms & TV (not sure about that but
seems doable at least an API and current impl as default)
- Realtime Search aka Searchable RAM Buffer (this seems quite far off;
much as I would love to have it, it seems we need to push this past
4.0)

For DocValues the decision seems easy since we are very close with
that, and I expect it to land by the end of June. I want to kick off
the discussion here, so nothing will be set in stone really, but I
think we should plan to release somewhere near the end of the year?!


simon



