Encoding data in terms; UTF8 concerns?

2014-05-10 Thread david.w.smi...@gmail.com
I’m working on an encoding of numbers / data into indexed terms.  In the
past I limited the encoding to ASCII but now I’m doing it at a more
raw/byte level.  Do I have to be aware of UTF8 / sorting issues when I do
this?  I noticed the following code in NumericUtils.java, line 186:
while (nChars > 0) {
  // Store 7 bits per byte for compatibility
  // with UTF-8 encoding of terms
  bytes.bytes[nChars--] = (byte)(sortableBits & 0x7f);
  sortableBits >>>= 7;
}
It’s the comment more than anything that has my attention. Do I have to
limit my bytes to only the low 7 bits?  If so, why?  I’ve already written a
bunch of code that generates the terms without consideration for this, and
I think a bug I’m looking at could be related to this.

~ David
p.s. sorry to be CC’ing some folks directly but the mailing list is having
problems


DocumentsWriterPerThread architecture

2014-04-30 Thread david.w.smi...@gmail.com
Is this still up to date?:
https://blog.trifork.com/2011/04/01/gimme-all-resources-you-have-i-can-use-them/
I thought at some point subsequently, some significant work was done, and
perhaps it was blogged. But I can’t find it.
~ David


Re: maximum number of shards per SolrCloud

2014-04-21 Thread david.w.smi...@gmail.com
Zhifeng,
Please ask Solr questions on the solr-user list.

Thanks.
~ David


On Mon, Apr 21, 2014 at 9:54 PM, Zhifeng Wang zhifeng.wang...@gmail.com wrote:

 Hi,

 We are facing a high incoming rate of usually small documents (logs). The
 incoming rate is initially assumed at 2K/sec but could reach as high as
 20K/sec. So a year's worth of data could reach roughly 60G (60 billion)
 searchable documents, assuming the 2K/sec rate.

 Since a single shard can contain no more than 2G documents, we will need
 at least 30 shards per year. Considering that we don't want to fill shards
 to their maximum capacity, the number of shards we need will be considerably higher.

 My question is whether there is a hard (not possible) or soft (bad
 performance) limit on the number of shards per SolrCloud. ZooKeeper
 defaults its file size limit to 1M, so I guess that imposes some limit. If I set
 that value to a larger number, will SolrCloud really scale OK if there are
 thousands of shards?  Or would I be better off using multiple SolrCloud
 clusters to handle the data (result aggregation is done outside of SolrCloud)?

 Thanks,
 Zhifeng
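As a rough sanity check of the arithmetic in the quoted question, here is a small sketch; the 2K/sec rate and Lucene's Integer.MAX_VALUE per-index document ceiling are the assumptions, and the numbers are back-of-envelope, not a recommendation.

// Back-of-envelope check of the shard-count estimate above.
// Assumptions: 2,000 docs/sec sustained and Integer.MAX_VALUE docs per shard.
public class ShardEstimate {
  public static void main(String[] args) {
    long docsPerSecond = 2000L;
    long docsPerYear = docsPerSecond * 60L * 60L * 24L * 365L; // ~63 billion
    long maxDocsPerShard = Integer.MAX_VALUE;                  // ~2.1 billion
    long minShards = (docsPerYear + maxDocsPerShard - 1) / maxDocsPerShard;
    System.out.println(docsPerYear + " docs/year -> at least " + minShards + " shards");
  }
}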



Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_55) - Build # 10106 - Still Failing!

2014-04-18 Thread david.w.smi...@gmail.com
This build started before I fixed the issue; it’s already fixed.


On Fri, Apr 18, 2014 at 9:12 AM, Policeman Jenkins Server 
jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10106/
 Java: 64bit/jdk1.7.0_55 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

 All tests passed

 Build Log:
 [...truncated 44392 lines...]
 -documentation-lint:
  [echo] checking for broken html...
 [jtidy] Checking for broken html (such as invalid tags)...
[delete] Deleting directory
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build/jtidy_tmp
  [echo] Checking for broken links...
  [exec]
  [exec] Crawl/parse...
  [exec]
  [exec] Verify...
  [exec]
  [exec]
 file:///build/docs/spatial/org/apache/lucene/spatial/prefix/PrefixTreeStrategy.html
  [exec]   BROKEN LINK:
 file:///build/docs/core/org/apache/lucene/spatial.prefix.CellTokenStream.html
  [exec]   BROKEN LINK:
 file:///build/docs/core/org/apache/lucene/spatial.prefix.CellTokenStream.html
  [exec]
  [exec]
 file:///build/docs/spatial/org/apache/lucene/spatial/prefix/RecursivePrefixTreeStrategy.html
  [exec]   BROKEN LINK:
 file:///build/docs/core/org/apache/lucene/spatial.prefix.CellTokenStream.html
  [exec]   BROKEN LINK:
 file:///build/docs/core/org/apache/lucene/spatial.prefix.CellTokenStream.html
  [exec]
  [exec] Broken javadocs links were found!

 BUILD FAILED
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:467: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:63: The
 following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:208:
 The following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/build.xml:221:
 The following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/lucene/common-build.xml:2330:
 exec returned: 1

 Total time: 68 minutes 2 seconds
 Build step 'Invoke Ant' marked build as failure
 Description set: Java: 64bit/jdk1.7.0_55 -XX:+UseCompressedOops
 -XX:+UseConcMarkSweepGC
 Archiving artifacts
 Recording test results
 Email was triggered for: Failure
 Sending email for trigger: Failure




 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Solr: Serving Javadoc from Jetty server

2014-04-17 Thread david.w.smi...@gmail.com
Alex,
Yes, it would be useful (of course)!  The admin UI should also have a link
to it, in addition to the generic documentation link. Create an issue and
I'll commit it.
~ David
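Alexandre's contribution is a Jetty context file (linked in his message below). As a rough illustration of the same idea in plain Java, here is a hedged sketch using embedded Jetty to expose an unpacked Javadoc directory under /javadoc; the port and directory path are assumptions, and this is not the contents of the linked repo.

import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.ContextHandler;
import org.eclipse.jetty.server.handler.ResourceHandler;

// Hypothetical standalone demo: serve a static Javadoc directory at /javadoc.
public class JavadocContextDemo {
  public static void main(String[] args) throws Exception {
    Server server = new Server(8983);        // port is an assumption
    ResourceHandler resources = new ResourceHandler();
    resources.setResourceBase("docs");       // path to the unpacked Javadoc, an assumption
    ContextHandler context = new ContextHandler("/javadoc");
    context.setHandler(resources);
    server.setHandler(context);
    server.start();
    server.join();
  }
}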


On Thu, Apr 17, 2014 at 6:54 AM, Alexandre Rafalovitch
arafa...@gmail.com wrote:

 Hello,

 The binary Solr distribution includes Javadoc, but it just sits there.

 I just tested adding a second Jetty context that serves that Javadoc
 under a /javadoc handle.

 I think it is useful because Javadoc sometimes breaks when it is loaded
 from the local filesystem (I think), plus it opens up other options like
 linking to it from other places.

 Would this be useful as a contribution? The context file is at:
 https://github.com/arafalov/Solr-Javadoc/tree/master/JettyContext

 Regards,
Alex.

 Personal website: http://www.outerthoughts.com/
 Current project: http://www.solr-start.com/ - Accelerating your Solr
 proficiency

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Blog post: Indexing Polygons In Lucene With Accuracy

2014-04-11 Thread david.w.smi...@gmail.com
FYI I published this blog post today:
http://www.opensourceconnections.com/2014/04/11/indexing-polygons-in-lucene-with-accuracy/
There's a strong Spatial4j connection because the SerializedDVStrategy
referenced uses the new BinaryCodec from Spatial4j 0.4.

~ David


Re: 4.7.2

2014-04-08 Thread david.w.smi...@gmail.com
LOL indeed ;-)
But in all seriousness, that should have no bearing on this conversation.

On Tue, Apr 8, 2014 at 3:00 AM, Alexandre Rafalovitch arafa...@gmail.com wrote:

 Let's hope nobody is trying to finish any books right now. :-)
 Personal website: http://www.outerthoughts.com/
 Current project: http://www.solr-start.com/ - Accelerating your Solr
 proficiency


 On Tue, Apr 8, 2014 at 1:55 PM, Simon Willnauer
 simon.willna...@gmail.com wrote:
  +1 to both 4.7.3 and 4.8 soon
 
  On Tue, Apr 8, 2014 at 8:40 AM, Uwe Schindler u...@thetaphi.de wrote:
  Hi,
 
  I am fine! I would also like to push the first 4.8 RC builds soon! I
 will check the changes list and open issues and make a proposal soon.
 
  Uwe
 
  -
  Uwe Schindler
  H.-H.-Meier-Allee 63, D-28213 Bremen
  http://www.thetaphi.de
  eMail: u...@thetaphi.de
 
 
  -Original Message-
  From: Robert Muir [mailto:rcm...@gmail.com]
  Sent: Monday, April 07, 2014 11:37 PM
  To: dev@lucene.apache.org
  Subject: 4.7.2
 
  Hello,
 
  I would like a 4.7.2 that fixes the corruption bug
  (https://issues.apache.org/jira/browse/LUCENE-5574).
 
  I'd like to build an RC tomorrow night for this (I'll be RM). I think
 its fine if we
  followup with e.g. a 4.7.3 out, but I want to be aggressive about this
  corruption stuff.
 
  Thanks,
  Robert
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For
 additional
  commands, e-mail: dev-h...@lucene.apache.org
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: Anticipating a benchmark for direct posting format

2014-04-07 Thread david.w.smi...@gmail.com
Benson, I like your idea.

I think your idea can be achieved as a codec, one that wraps another codec
that establishes the on-disk format.  By default the wrapped codec can be
Lucene's default codec.  I think, if implemented, this would be a change to
DPF instead of an additional DPF-variant codec.

~ David


On Mon, Apr 7, 2014 at 9:22 AM, Benson Margulies bimargul...@gmail.com wrote:

 On Mon, Apr 7, 2014 at 9:14 AM, Robert Muir rcm...@gmail.com wrote:
  On Thu, Apr 3, 2014 at 12:27 PM, Benson Margulies bimargul...@gmail.com
 wrote:
 
 
  My takeaway from the prior conversation was that various people didn't
  entirely believe that I'd seen a dramatic improvement in query performance
  using D-P-F, and so would not smile upon a patch intended to liberate
  D-P-F from codecs. It could be that the effect I saw has to do with
  the fact that our system depends on hitting and scoring 50% of the
  documents in an index with a lot of documents.
 
 
  I don't understand the word liberate here. Why is it such a problem
  that this is a codec?

  I don't want to have to declare my intentions at the time I create
 the index. I don't want to have to use D-P-F for all readers all the
 time. Because I want to be able to decide to open up an index with an
 arbitrary on-disk format and get the in-memory cache behavior of
 D-P-F. Thus 'liberate' -- split the question of 'keep a copy in
 memory' from the choice of the on-disk format.


 
  i do not think we should give it any more status than that, it wastes
  too much ram.

 It didn't seem like 'waste' when it solved a big practical problem for us. We
 had an application that was too slow, and had plenty of RAM available,
 and we were able to trade space for time by applying D-P-F.

 Maybe I'm going about this backwards; if I can come up with a small,
 inconspicuous proposed change that does what I want, there won't be
 any disagreement.


 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: Anticipating a benchmark for direct posting format

2014-04-07 Thread david.w.smi...@gmail.com
Aaaah, nice idea to simply use FilterAtomicReader -- of course!  So this
would ultimately be a new IndexReaderFactory that creates
FilterAtomicReaders for a subset of the fields you want to do this on.
 Cool!  With that, I don't think there would be a need for
DirectPostingsFormat as a postings format, would there be?

~ David
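To make the idea concrete, here is a rough sketch of the kind of FilterAtomicReader being discussed (the class and field names are hypothetical, and the caching step is only a placeholder; a real version would materialize postings the way DirectPostingsFormat does). In Solr this could be wired in via a custom IndexReaderFactory, so the decision of which fields to keep in memory lives entirely at search time rather than at index time.

import java.io.IOException;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.Fields;
import org.apache.lucene.index.FilterAtomicReader;
import org.apache.lucene.index.Terms;

// Hypothetical reader wrapper: leaves the on-disk codec alone and only
// changes how postings for selected "hot" fields are read at search time.
public class CachingFieldsReader extends FilterAtomicReader {
  private final Set<String> hotFields;
  private final Map<String, Terms> cache = new ConcurrentHashMap<String, Terms>();

  public CachingFieldsReader(AtomicReader in, Set<String> hotFields) {
    super(in);
    this.hotFields = hotFields;
  }

  @Override
  public Fields fields() throws IOException {
    final Fields delegate = super.fields();
    if (delegate == null) {
      return null;
    }
    return new FilterFields(delegate) {
      @Override
      public Terms terms(String field) throws IOException {
        if (!hotFields.contains(field)) {
          return super.terms(field);
        }
        Terms terms = cache.get(field);
        if (terms == null) {
          // Placeholder: a real implementation would build an in-memory
          // representation here (as DirectPostingsFormat does) instead of
          // merely caching the on-disk Terms instance.
          terms = super.terms(field);
          if (terms != null) {
            cache.put(field, terms);
          }
        }
        return terms;
      }
    };
  }
}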


On Mon, Apr 7, 2014 at 10:58 AM, Shai Erera ser...@gmail.com wrote:

 The only problem is how the Codec makes a dynamic decision on whether to
 use the wrapped Codec for reading vs pre-load data into in-memory
 structures, because Codecs are loaded through reflection by the SPI loading
 mechanism.

 There is also a TODO in DirectPF to allow wrapping arbitrary PFs, just
 mentioning in case you want to tackle DPF.

 I think that if we allowed passing something like a CodecLookupService,
 with an SPILookupService default impl, you could easily pass that to
 DirectoryReader which will use your runtime logic to load the right PF
 (e.g. DPF) instead of the one the index was created with.

 But it sounds like the core problem is that when we load a Codec/PF/DVF
 for reading, we cannot pass it any arguments, and so we must make an
 index-time decision about how we're going to read the data later on. If we
 could somehow support that, I think that will help you to achieve what you
 want too.

 E.g. currently it's an all-or-nothing decision, but if we could pass a
 parameter like 50% available heap, the Codec/PF/DVF could cache the
 frequently accessed postings instead of loading all of them into memory.
 But, that can also be achieved at the IndexReader level, through a custom
 FilterAtomicReader. And if you could reuse DPF's structures (like
 DirectTermsEnum, DirectFields...), it should be easier to do this. So
 perhaps we can think about a DirectAtomicReader which does that? I believe
 it can share some code w/ DPF, as long as we don't make these APIs public,
 or make them @super.experimental and @super.expert.

 Just throwing some ideas...

 Shai


 On Mon, Apr 7, 2014 at 5:35 PM, david.w.smi...@gmail.com 
 david.w.smi...@gmail.com wrote:

 Benson, I like your idea.

 I think your idea can be achieved as a codec, one that wraps another
 codec that establishes the on-disk format.  By default the wrapped codec
 can be Lucene's default codec.  I think, if implemented, this would be a
 change to DPF instead of an additional DPF-variant codec.

 ~ David


 On Mon, Apr 7, 2014 at 9:22 AM, Benson Margulies 
 bimargul...@gmail.com wrote:

 On Mon, Apr 7, 2014 at 9:14 AM, Robert Muir rcm...@gmail.com wrote:
  On Thu, Apr 3, 2014 at 12:27 PM, Benson Margulies 
 bimargul...@gmail.com wrote:
 
 
  My takeaway from the prior conversation was that various people didn't
  entirely believe that I'd seen a dramatic improvement in query performance
  using D-P-F, and so would not smile upon a patch intended to liberate
  D-P-F from codecs. It could be that the effect I saw has to do with
  the fact that our system depends on hitting and scoring 50% of the
  documents in an index with a lot of documents.
 
 
  I don't understand the word liberate here. Why is it such a problem
  that this is a codec?

  I don't want to have to declare my intentions at the time I create
 the index. I don't want to have to use D-P-F for all readers all the
 time. Because I want to be able to decide to open up an index with an
 arbitrary on-disk format and get the in-memory cache behavior of
 D-P-F. Thus 'liberate' -- split the question of 'keep a copy in
 memory' from the choice of the on-disk format.


 
  i do not think we should give it any more status than that, it wastes
  too much ram.

 It didn't seem like 'waste' when it solved a big practical problem for us. We
 had an application that was too slow, and had plenty of RAM available,
 and we were able to trade space for time by applying D-P-F.

 Maybe I'm going about this backwards; if I can come up with a small,
 inconspicuous proposed change that does what I want, there won't be
 any disagreement.


 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org






Re: Welcome Tim Potter as Lucene/Solr committer

2014-04-07 Thread david.w.smi...@gmail.com
Welcome Tim!


On Tue, Apr 8, 2014 at 12:40 AM, Steve Rowe sar...@gmail.com wrote:

 I'm pleased to announce that Tim Potter has accepted the PMC's invitation
 to become a committer.

 Tim, it's tradition that you introduce yourself with a brief bio.

 Once your account has been created - could take a few days - you'll be
 able to add yourself to the committers section of the Who We Are page on
 the website: http://lucene.apache.org/whoweare.html (use the ASF CMS
 bookmarklet at the bottom of the page here: 
 https://cms.apache.org/#bookmark - more info here 
 http://www.apache.org/dev/cms.html).

 Check out the ASF dev page - lots of useful links: 
 http://www.apache.org/dev/.

 Congratulations and welcome!

 Steve
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: Welcome Alan Woodward to the PMC

2014-04-02 Thread david.w.smi...@gmail.com
Welcome Alan!
~ David


On Wed, Apr 2, 2014 at 8:23 AM, Steve Rowe sar...@gmail.com wrote:

 I'm pleased to announce that Alan Woodward has accepted the PMC's
 invitation to join.

 Welcome Alan!

 - Steve
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: [VOTE] Lucene / Solr 4.7.1 RC2

2014-03-31 Thread david.w.smi...@gmail.com
+1

SUCCESS! [1:51:37.952160]


On Sat, Mar 29, 2014 at 4:46 AM, Steve Rowe sar...@gmail.com wrote:

 Please vote for the second Release Candidate for Lucene/Solr 4.7.1.

 Download it here:
 
 https://people.apache.org/~sarowe/staging_area/lucene-solr-4.7.1-RC2-rev1582953/
 

 Smoke tester cmdline (from the lucene_solr_4_7 branch):

 python3.2 -u dev-tools/scripts/smokeTestRelease.py \

 https://people.apache.org/~sarowe/staging_area/lucene-solr-4.7.1-RC2-rev1582953/\
 1582953 4.7.1 /tmp/4.7.1-smoke

 The smoke tester passed for me: SUCCESS! [0:50:29.936732]

 My vote: +1

 Steve
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: [VOTE] Lucene / Solr 4.7.1 RC1

2014-03-26 Thread david.w.smi...@gmail.com
+1

SUCCESS! [2:13:44.301402]


On Tue, Mar 25, 2014 at 6:46 PM, Steve Rowe sar...@gmail.com wrote:

 Please vote for the first Release Candidate for Lucene/Solr 4.7.1.

 Download it here:
 
 http://people.apache.org/~sarowe/staging_area/lucene-solr-4.7.1-RC1-rev1581444/
 

 Smoke tester cmdline:

 python3.2 -u dev-tools/scripts/smokeTestRelease.py \

 http://people.apache.org/~sarowe/staging_area/lucene-solr-4.7.1-RC1-rev1581444/\
 1581444 4.7.1 /tmp/4.7.1-smoke

 The smoke tester passed for me: SUCCESS! [1:08:24.099010]

 My vote: +1

 Steve


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b10) - Build # 9882 - Still Failing!

2014-03-23 Thread david.w.smi...@gmail.com
I'm looking in to this.


On Sun, Mar 23, 2014 at 5:45 AM, Policeman Jenkins Server 
jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9882/
 Java: 64bit/jdk1.7.0_60-ea-b10 -XX:-UseCompressedOops -XX:+UseSerialGC

 1 tests failed.
 FAILED:
  org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testWithin
 {#3 seed=[270CA22550192CCD:5FB2F89BF67A00A9]}

 Error Message:
 Shouldn't match I#2:Rect(minX=104.0,maxX=110.0,minY=-127.0,maxY=-119.0)
 Q:Pt(x=6.0,y=0.0)

 Stack Trace:
 java.lang.AssertionError: Shouldn't match
 I#2:Rect(minX=104.0,maxX=110.0,minY=-127.0,maxY=-119.0) Q:Pt(x=6.0,y=0.0)
 at
 __randomizedtesting.SeedInfo.seed([270CA22550192CCD:5FB2F89BF67A00A9]:0)
 at org.junit.Assert.fail(Assert.java:93)
 at
 org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:358)
 at
 org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:338)
 at
 org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testWithin(SpatialOpRecursivePrefixTreeTest.java:120)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876)
 at
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782)
 at
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
 at
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359)
 at java.lang.Thread.run(Thread.java:744)




 Build Log:
 

Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60-ea-b10) - Build # 9867 - Failure!

2014-03-21 Thread david.w.smi...@gmail.com
I'm definitely looking at it and I've found the problem.  I'm working on a
fix right now.

On Fri, Mar 21, 2014 at 3:27 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 Is someone looking at this test failure?  Should we @BadApple it, or
 revert recent spatial changes, or something?

 Mike McCandless

 http://blog.mikemccandless.com


 On Fri, Mar 21, 2014 at 12:26 PM, Policeman Jenkins Server
 jenk...@thetaphi.de wrote:
  Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/9867/
  Java: 64bit/jdk1.7.0_60-ea-b10 -XX:-UseCompressedOops
 -XX:+UseConcMarkSweepGC
 
  1 tests failed.
  FAILED:
  org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testWithin
 {#9 seed=[E934CCA05FA676E7:BFB8DC407C97398]}
 
  Error Message:
  Shouldn't match I#4:Rect(minX=48.0,maxX=76.0,minY=-44.0,maxY=27.0)
 Q:Pt(x=120.0,y=0.0)
 



Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_51) - Build # 9725 - Still Failing!

2014-03-18 Thread david.w.smi...@gmail.com
I'll look into this one and get it fixed ASAP.


On Tue, Mar 18, 2014 at 2:26 AM, Policeman Jenkins Server 
jenk...@thetaphi.de wrote:

 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/9725/
 Java: 32bit/jdk1.7.0_51 -server -XX:+UseSerialGC

 2 tests failed.
 FAILED:
  
 org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains
 {#7 seed=[175A10038A619363:146B27D88CDE16A]}

 Error Message:
 Shouldn't match I#0:Rect(minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0)
 Q:ShapePair(Rect(minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0) ,
 Rect(minX=-21.0,maxX=-14.0,minY=-26.0,maxY=-21.0))

 Stack Trace:
 java.lang.AssertionError: Shouldn't match
 I#0:Rect(minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0)
 Q:ShapePair(Rect(minX=-180.0,maxX=180.0,minY=-90.0,maxY=90.0) ,
 Rect(minX=-21.0,maxX=-14.0,minY=-26.0,maxY=-21.0))
 at
 __randomizedtesting.SeedInfo.seed([175A10038A619363:146B27D88CDE16A]:0)
 at org.junit.Assert.fail(Assert.java:93)
 at
 org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:355)
 at
 org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:335)
 at
 org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains(SpatialOpRecursivePrefixTreeTest.java:126)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1617)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:826)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:862)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:876)
 at
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:359)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:783)
 at
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:443)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:835)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:737)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:771)
 at
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:782)
 at
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
 at
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
 at
 

Re: Welcome back, Wolfgang Hoschek!

2013-09-26 Thread david.w.smi...@gmail.com
Nice!  Welcome back Wolfgang!


On Thu, Sep 26, 2013 at 6:21 AM, Uwe Schindler uschind...@apache.org wrote:

 Hi,

 I'm pleased to announce that after a long abstinence, Wolfgang Hoschek
 rejoined the Lucene/Solr committer team. He is working now at Cloudera and
 plans to help with the integration of Solr and Hadoop.
 Wolfgang originally wrote the MemoryIndex, which is used by the classical
 Lucene highlighter and ElasticSearch's percolator module.

 Looking forward to new contributions.

 Welcome back & heavy committing! :-)
 Uwe

 P.S.: Wolfgang, as soon as you have setup your subversion access, you
 should add yourself back to the committers list on the website as well.

 -
 Uwe Schindler
 uschind...@apache.org
 Apache Lucene PMC Chair / Committer
 Bremen, Germany
 http://lucene.apache.org/





Fwd: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_45) - Build # 6066 - Still Failing!

2013-06-14 Thread david.w.smi...@gmail.com
Dawid,

Could you please take a look at the reproducibility of this test failure in
lucene/spatial?  I tried to reproduce it but couldn't, and I thought
perhaps you might have some insight because I'm using some
RandomizedTesting features that aren't as often used, like @Repeat.  For
example, one thing fishy is this log message:

[junit4:junit4]   2 NOTE: reproduce with: ant test
 -Dtestcase=SpatialOpRecursivePrefixTreeTest -Dtests.method=testContains
{#1 seed=[9166D28D6532217A:472BE5C4B7344982]}
-Dtests.seed=9166D28D6532217A -Dtests.multiplier=3 -Dtests.slow=true
-Dtests.locale=uk_UA -Dtests.timezone=Etc/GMT-6 -Dtests.file.encoding=UTF-8

Notice the "-Dtests.method=testContains {#1
seed=[9166D28D6532217A:472BE5C4B7344982]}" part, which is wrong because if
I do that, it won't find the method to test.  If I change this to simply
"testContains", and set the seed normally with -Dtests.seed=91 then I still
can't reproduce the problem.  This test appears to have failed a bunch of
times lately with different seeds.

~ David

-- Forwarded message --
From: Policeman Jenkins Server jenk...@thetaphi.de
Date: Fri, Jun 14, 2013 at 9:33 PM
Subject: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.6.0_45) - Build # 6066
- Still Failing!
To: dev@lucene.apache.org


Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6066/
Java: 32bit/jdk1.6.0_45 -server -XX:+UseSerialGC

1 tests failed.
FAILED:
 org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains
{#1 seed=[9166D28D6532217A:472BE5C4B7344982]}

Error Message:
Shouldn't match I
#0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) ,
Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0))
Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)

Stack Trace:
java.lang.AssertionError: Shouldn't match I
#0:ShapePair(Rect(minX=102.0,maxX=112.0,minY=-36.0,maxY=120.0) ,
Rect(minX=168.0,maxX=175.0,minY=-1.0,maxY=11.0))
Q:Rect(minX=0.0,maxX=256.0,minY=-128.0,maxY=128.0)
at
__randomizedtesting.SeedInfo.seed([9166D28D6532217A:472BE5C4B7344982]:0)
at org.junit.Assert.fail(Assert.java:93)
at
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.fail(SpatialOpRecursivePrefixTreeTest.java:287)
at
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.doTest(SpatialOpRecursivePrefixTreeTest.java:273)
at
org.apache.lucene.spatial.prefix.SpatialOpRecursivePrefixTreeTest.testContains(SpatialOpRecursivePrefixTreeTest.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at

BooleanFilter MUST clauses and getDocIdSet(acceptDocs)

2012-11-07 Thread david.w.smi...@gmail.com
I am about to write a Filter that only operates on a set of documents that
have already passed other filter(s).  It's rather expensive, since it has
to use DocValues to examine a value and then determine if it's a match.  So
it scales O(n) where n is the number of documents it must see.  The 2nd arg
of getDocIdSet is Bits acceptDocs.  Unfortunately Bits doesn't have an int
iterator, but I can deal with that by seeing if it extends DocIdSet.
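For concreteness, here is a minimal sketch of such a filter (the class name and the per-document check are hypothetical; the point is only how acceptDocs is consulted so the expensive work happens solely on already-accepted documents):

import java.io.IOException;

import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.search.DocIdSet;
import org.apache.lucene.search.Filter;
import org.apache.lucene.util.Bits;
import org.apache.lucene.util.FixedBitSet;

// Hypothetical expensive per-document filter: it only examines documents
// that acceptDocs lets through, so cheaper filters should run first.
public class ExpensiveValueFilter extends Filter {
  @Override
  public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs) throws IOException {
    final int maxDoc = context.reader().maxDoc();
    final FixedBitSet result = new FixedBitSet(maxDoc);
    for (int doc = 0; doc < maxDoc; doc++) {
      if (acceptDocs != null && !acceptDocs.get(doc)) {
        continue; // skip docs already rejected by earlier clauses
      }
      if (matchesExpensiveCheck(context, doc)) {
        result.set(doc);
      }
    }
    return result; // FixedBitSet is a DocIdSet in Lucene 4.x
  }

  // Stand-in for the real per-document test, which would read DocValues.
  private boolean matchesExpensiveCheck(AtomicReaderContext context, int doc) {
    return false;
  }
}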

I'm looking at BooleanFilter which I want to use and I notice that it
passes null to filter.getDocIdSet for acceptDocs, and it justifies this
with the following comment:
// we dont pass acceptDocs, we will filter at the end using an additional
filter
Uwe wrote this comment in relation to LUCENE-1536 (r1188624).
For the MUST clause loop, couldn't it give it the accumulated bits of the
MUST clauses?

~ David


Changes as we approach v4

2012-09-21 Thread david.w.smi...@gmail.com
Rob,
  It appears you are, in effect, the Release Manager for v4.0, so I'm
asking you this question.  Clearly v4 is going to be out soon and
consequently we're not pushing new features to the v4 branch.
Regarding the new spatial codebase, there isn't a backwards
compatibility concern to changes until v4 is actually released.  In
your opinion, is it too late to do class renames in this area? --
LUCENE-4374 is about renaming TwoDoublesStrategy to
PointVectorStrategy (much better name; the old name is crap and that's
my fault).   And FYI I intend to add a bunch of javadocs to all
spatial classes this weekend.

Thanks for all the time you spend on doing your R.M. duties -- it's a
ton of work that few people would step forward to do.

~ David

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-trunk-Linux-Java7-64 - Build # 438 - Failure!

2012-06-29 Thread david.w.smi...@gmail.com
I added the missing ASL header.

On Thu, Jun 28, 2012 at 4:54 PM, Policeman Jenkins Server 
jenk...@sd-datasolutions.de wrote:

 Build:
 http://jenkins.sd-datasolutions.de/job/Lucene-Solr-trunk-Linux-Java7-64/438/

 All tests passed

 Build Log:
 [...truncated 15182 lines...]
 BUILD FAILED
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java7-64/checkout/build.xml:62:
 The following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java7-64/checkout/lucene/build.xml:270:
 The following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java7-64/checkout/lucene/common-build.xml:1435:
 The following error occurred while executing this line:
 /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux-Java7-64/checkout/lucene/common-build.xml:1275:
 Rat problems were found!

 Total time: 6 seconds
 Build step 'Execute shell' marked build as failure
 Archiving artifacts
 Recording test results
 Email was triggered for: Failure
 Sending email for trigger: Failure




