Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b106) - Build # 7511 - Failure!

2013-09-17 Thread Dawid Weiss
The assert is a consequence of a missed update to upto. I don't think
there's any sensible way to guard against that -- we could try to
rewrite the code in a way that doesn't trigger this bug, but it's
going to show up somewhere else, I'm sure.

Dawid
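Dawid's point can be pictured with a minimal sketch (hypothetical code, not Lucene's actual ByteSliceReader): the invariant that trips is essentially a bounds assert on the read cursor, and if a buggy JVM drops the cursor update, there is nothing the surrounding code can do to defend itself.

```java
// Hypothetical sketch, not Lucene's ByteSliceReader: a slice reader whose
// assert documents the invariant "upto advances past every byte read".
// If a miscompiling JIT drops an update to upto, the cursor drifts and the
// assert (or a downstream consumer) fails; rewriting the code only moves
// where the failure shows up.
class SliceReader {
  private final byte[] buffer;
  private int upto;        // current read position within the slice
  private final int limit; // one past the last readable byte

  SliceReader(byte[] buffer, int start, int limit) {
    this.buffer = buffer;
    this.upto = start;
    this.limit = limit;
  }

  byte readByte() {
    assert upto < limit : "upto=" + upto + " limit=" + limit;
    return buffer[upto++]; // the update that must not be missed
  }
}

class SliceReaderDemo {
  public static void main(String[] args) {
    SliceReader r = new SliceReader(new byte[] {42, 7, 9}, 0, 3);
    System.out.println(r.readByte()); // prints 42
  }
}
```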

On Wed, Sep 18, 2013 at 8:26 AM, Robert Muir  wrote:
> Scary that sometimes it doesn't trip the assert and instead broken
> stuff makes it to the postingswriter:
>
>[junit4] Suite: org.apache.lucene.index.TestNorms
>[junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestNorms
> -Dtests.method=testMaxByteNorms -Dtests.seed=2B65838ED85E2427
> -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=zh_SG
> -Dtests.timezone=Australia/Lindeman -Dtests.file.encoding=ISO-8859-1
>[junit4] ERROR   0.64s J1 | TestNorms.testMaxByteNorms <<<
>[junit4]> Throwable #1:
> org.apache.lucene.index.CorruptIndexException: docs out of order (58
> <= 58 ) (docOut:
> MockIndexOutputWrapper(org.apache.lucene.store.FSDirectory$FSIndexOutput@ba8419))
>[junit4]> at
> __randomizedtesting.SeedInfo.seed([2B65838ED85E2427:9F9A16B0F21250F1]:0)
>[junit4]> at
> org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.startDoc(Lucene41PostingsWriter.java:297)
>[junit4]> at
> org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:470)
>[junit4]> at
> org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
>[junit4]> at 
> org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
>[junit4]> at
> org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
>[junit4]> at
> org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
>[junit4]> at
> org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:466)
>[junit4]> at
> org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:508)
>[junit4]> at
> org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:618)
>[junit4]> at
> org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2769)
>[junit4]> at
> org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2919)
>[junit4]> at
> org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2894)
>[junit4]> at
> org.apache.lucene.index.RandomIndexWriter.commit(RandomIndexWriter.java:233)
>[junit4]> at
> org.apache.lucene.index.TestNorms.buildIndex(TestNorms.java:144)
>[junit4]> at
> org.apache.lucene.index.TestNorms.testMaxByteNorms(TestNorms.java:111)
>[junit4]> at java.lang.Thread.run(Thread.java:724)
>
>
> On Wed, Sep 18, 2013 at 2:22 AM, Dawid Weiss
>  wrote:
>> Yes, it looks like LUCENE-5168 (G1GC, 32-bit).
>>
>> Dawid
>>
>> On Wed, Sep 18, 2013 at 3:04 AM, Robert Muir  wrote:
>>> I think it's "dawid's" JVM bug:
>>>
>>>[junit4] ERROR   1.61s J1 | TestDuelingCodecs.testEquals <<<
>>>[junit4]> Throwable #1: java.lang.AssertionError
>>>[junit4]>at org.apache.lucene.index.
>>> ByteSliceReader.readByte(ByteSliceReader.java:73)
>>>[junit4]>at
>>> org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
>>>[junit4]>at
>>> org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:500)
>>>
>>> On Tue, Sep 17, 2013 at 5:59 PM, Policeman Jenkins Server
>>>  wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7511/
 Java: 32bit/jdk1.8.0-ea-b106 -server -XX:+UseG1GC

 3 tests failed.
 REGRESSION:  org.apache.lucene.index.TestDuelingCodecs.testEquals

 Error Message:
 MockDirectoryWrapper: cannot close: there are still open locks: 
 [write.lock]

 Stack Trace:
 java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are 
 still open locks: [write.lock]
 at 
 org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:626)
 at 
 org.apache.lucene.index.TestDuelingCodecs.tearDown(TestDuelingCodecs.java:111)
 at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:491)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:795)
 at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at 
 org.apache.lucene.util.Abstrac

Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b106) - Build # 7511 - Failure!

2013-09-17 Thread Robert Muir
Scary that sometimes it doesn't trip the assert and instead broken
stuff makes it to the postingswriter:

   [junit4] Suite: org.apache.lucene.index.TestNorms
   [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestNorms
-Dtests.method=testMaxByteNorms -Dtests.seed=2B65838ED85E2427
-Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=zh_SG
-Dtests.timezone=Australia/Lindeman -Dtests.file.encoding=ISO-8859-1
   [junit4] ERROR   0.64s J1 | TestNorms.testMaxByteNorms <<<
   [junit4]> Throwable #1:
org.apache.lucene.index.CorruptIndexException: docs out of order (58
<= 58 ) (docOut:
MockIndexOutputWrapper(org.apache.lucene.store.FSDirectory$FSIndexOutput@ba8419))
   [junit4]> at
__randomizedtesting.SeedInfo.seed([2B65838ED85E2427:9F9A16B0F21250F1]:0)
   [junit4]> at
org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.startDoc(Lucene41PostingsWriter.java:297)
   [junit4]> at
org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:470)
   [junit4]> at
org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
   [junit4]> at org.apache.lucene.index.TermsHash.flush(TermsHash.java:116)
   [junit4]> at
org.apache.lucene.index.DocInverter.flush(DocInverter.java:53)
   [junit4]> at
org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:81)
   [junit4]> at
org.apache.lucene.index.DocumentsWriterPerThread.flush(DocumentsWriterPerThread.java:466)
   [junit4]> at
org.apache.lucene.index.DocumentsWriter.doFlush(DocumentsWriter.java:508)
   [junit4]> at
org.apache.lucene.index.DocumentsWriter.flushAllThreads(DocumentsWriter.java:618)
   [junit4]> at
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2769)
   [junit4]> at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2919)
   [junit4]> at
org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2894)
   [junit4]> at
org.apache.lucene.index.RandomIndexWriter.commit(RandomIndexWriter.java:233)
   [junit4]> at
org.apache.lucene.index.TestNorms.buildIndex(TestNorms.java:144)
   [junit4]> at
org.apache.lucene.index.TestNorms.testMaxByteNorms(TestNorms.java:111)
   [junit4]> at java.lang.Thread.run(Thread.java:724)



[jira] [Commented] (LUCENE-5214) Add new FreeTextSuggester, to handle "long tail" suggestions

2013-09-17 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770487#comment-13770487
 ] 

Dawid Weiss commented on LUCENE-5214:
-

Pretty cool, thanks Mike.

> Add new FreeTextSuggester, to handle "long tail" suggestions
> 
>
> Key: LUCENE-5214
> URL: https://issues.apache.org/jira/browse/LUCENE-5214
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/spellchecker
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 5.0, 4.6
>
> Attachments: LUCENE-5214.patch
>
>
> The current suggesters are all based on a finite space of possible
> suggestions, i.e. the ones they were built on, so they can only
> suggest a full suggestion from that space.
> This means if the current query goes outside of that space then no
> suggestions will be found.
> The goal of FreeTextSuggester is to address this, by giving
> predictions based on an ngram language model, i.e. using the last few
> tokens from the user's query to predict the likely following token.
> I got the idea from this blog post about Google's suggest:
> http://googleblog.blogspot.com/2011/04/more-predictions-in-autocomplete.html
> This is very much still a work in progress, but it seems to be
> working.  I've tested it on the AOL query logs, using an interactive
> tool from luceneutil to show the suggestions, and it seems to work well.
> It's fun to use that tool to explore the word associations...
> I don't think this suggester would be used standalone; rather, I think
> it'd be a fallback for times when the primary suggester fails to find
> anything.  You can see this behavior on google.com, if you type "the
> fast and the ", you see entire queries being suggested, but then if
> the next word you type is "burning" then suddenly you see the
> suggestions are only based on the last word, not the entire query.
> It uses ShingleFilter under the hood to generate the token ngrams and
> stores them in an FST; once LUCENE-5180 is in, it will be able to
> properly handle a user query that ends with stop-words (e.g. "wizard
> of ").
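The ngram-language-model idea described above can be sketched in a few lines of plain Java. This is a toy illustration of predicting the next token from the last token of the query (a bigram model), not the Lucene implementation, which analyzes with ShingleFilter and stores the ngrams in an FST:

```java
import java.util.*;

// Toy bigram suggester: count which token follows which in the indexed
// text, then predict the most frequent follower of the query's last token.
class NgramSuggester {
  private final Map<String, Map<String, Integer>> followers = new HashMap<>();

  void index(String sentence) {
    String[] tokens = sentence.toLowerCase().split("\\s+");
    for (int i = 0; i + 1 < tokens.length; i++) {
      followers.computeIfAbsent(tokens[i], k -> new HashMap<>())
               .merge(tokens[i + 1], 1, Integer::sum);
    }
  }

  // Suggest the most frequent token seen after the last token of the query.
  Optional<String> suggest(String query) {
    String[] tokens = query.toLowerCase().trim().split("\\s+");
    String last = tokens[tokens.length - 1];
    return Optional.ofNullable(followers.get(last))
        .flatMap(m -> m.entrySet().stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey));
  }
}

class NgramSuggesterDemo {
  public static void main(String[] args) {
    NgramSuggester s = new NgramSuggester();
    s.index("the fast and the furious");
    s.index("the fast and the furious");
    s.index("the fast and the dead");
    System.out.println(s.suggest("the fast and the").orElse("?")); // prints fast
  }
}
```

This also shows the fallback behavior the issue mentions: whatever the full query was, only the trailing context is used to predict the next token.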

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b106) - Build # 7511 - Failure!

2013-09-17 Thread Dawid Weiss
Yes, it looks like LUCENE-5168 (G1GC, 32-bit).

Dawid

On Wed, Sep 18, 2013 at 3:04 AM, Robert Muir  wrote:
> I think it's "dawid's" JVM bug:
>
>[junit4] ERROR   1.61s J1 | TestDuelingCodecs.testEquals <<<
>[junit4]> Throwable #1: java.lang.AssertionError
>[junit4]>at org.apache.lucene.index.
> ByteSliceReader.readByte(ByteSliceReader.java:73)
>[junit4]>at
> org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
>[junit4]>at
> org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:500)
>
> On Tue, Sep 17, 2013 at 5:59 PM, Policeman Jenkins Server
>  wrote:
>> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7511/
>> Java: 32bit/jdk1.8.0-ea-b106 -server -XX:+UseG1GC
>>
>> 3 tests failed.
>> REGRESSION:  org.apache.lucene.index.TestDuelingCodecs.testEquals
>>
>> Error Message:
>> MockDirectoryWrapper: cannot close: there are still open locks: [write.lock]
>>
>> Stack Trace:
>> java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are 
>> still open locks: [write.lock]
>> at 
>> org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:626)
>> at 
>> org.apache.lucene.index.TestDuelingCodecs.tearDown(TestDuelingCodecs.java:111)
>> at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
>> at 
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:491)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:795)
>> at 
>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>> at 
>> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>> at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at 
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> at 
>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>> at 
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
>> at 
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
>> at 
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at 
>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>> at 
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> at 
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at 
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
>> at 
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>> at 
>> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
>> at 
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at 
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(

Re: Modifying the Lucene 4 index?

2013-09-17 Thread Ralf Bierig
The answer to this is that I am not fully sure, but I would like to know
how index modification is supported and how far it can be taken
if needed. I will know more specifics by tonight, as we are
working on it today for the first time, but I wanted to collect more
information in advance. For example, I read that Lucene uses Codecs to
allow exchanging index implementations, and that should allow driving
such a modification.


Ralf

On 18.09.2013 03:44, Kranti Parisa wrote:

got you. so having custom analyzers is not enough for your requirements?

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Tue, Sep 17, 2013 at 6:43 PM, Ralf Bierig wrote:


Hi Kranti,

No, I am familiar with creating indices and adding / updating
documents, creating fields and using analyzers in the preparation
of the content. I am asking about making actual changes to the
indexing mechanics underneath, like substituting Term with
something else, like concepts, or changing the codec. However, I
know too little yet to be more specific and was just asking for
general documentation of any kind that explains some larger
changes to the index structure and Lucene's mechanics on the index.

Best.
Ralf


On 17.09.2013 23:57, Kranti Parisa wrote:

Are you asking about the steps to update the existing document?

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Tue, Sep 17, 2013 at 5:34 PM, Ralf Bierig
<ralf.bie...@gmail.com> wrote:

Hi all,

is there any good documentation of how to change and modify the
index of Lucene version 4 other than what is already on the
website? Blogs, papers, reports etc. or just a report on experience
in some form --- anything would be good.

Based on an early-stage project, I would like to get first-hand
experience in order to a) get an overview about what is possible /
what can be modified and b) how difficult it is to do in general. I
remember the Lucene index being rather closed up in the early
times, but this might be very different now...

Best,
Ralf


   



Re: [jira] [Updated] (SOLR-2345) Extend geodist() to support MultiValued lat long field

2013-09-17 Thread David Smiley (@MITRE.org)
Pradeep, you should actually comment in JIRA.  Responding to JIRA emails
doesn't work (I think).

To answer your question, the fundamental capabilities in SOLR-2155 made it
into Solr 4.  SOLR-2345 is specifically about using the geodist() function
query in addition to the more awkward method of using the score=distance
local-param. If you search, you may find info on that approach if you can't
yet go to Solr 4.5.

~ David
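For reference, the two approaches look roughly like this (illustrative request parameters only; the field name and point are made up, and the exact syntax should be checked against the Solr 4.5 spatial docs):

```
# Sorting by geodist() on a multiValued RPT field (what SOLR-2345 enables):
...&sfield=store_rpt&pt=45.15,-93.85&sort=geodist() asc

# The more awkward score=distance local-param approach:
...&q={!geofilt score=distance sfield=store_rpt pt=45.15,-93.85 d=10}&sort=score asc
```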


Pradeep Pujari-4 wrote
> Hi Bill,
> 
> We have been using the SOLR-2155 patch in Solr 3.6 for quite a long time.
> This works well for us. We are currently migrating to Solr 4.0. Is
> SOLR-2345 the equivalent of SOLR-2155 in terms of functionality?
> 
> Thanks
> 
> 
> 
>  From: David Smiley (JIRA) <jira@>
> To: dev@.apache
> Sent: Tuesday, July 16, 2013 10:20 PM
> Subject: [jira] [Updated] (SOLR-2345) Extend geodist() to support
> MultiValued lat long field
>  
> 
> 
>      [
> https://issues.apache.org/jira/browse/SOLR-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> ]
> 
> David Smiley updated SOLR-2345:
> ---
> 
>     Attachment: SOLR-2345_geodist_support_for_RPT.patch
> 
> The attached patch implements the desired functionality.  It depends on
> LUCENE-5118 being applied first.
> 
> This hack wasn't too terrible after all.
>                 
>> Extend geodist() to support MultiValued lat long field
>> --
>>
>>                 Key: SOLR-2345
>>                 URL: https://issues.apache.org/jira/browse/SOLR-2345
>>             Project: Solr
>>          Issue Type: New Feature
>>          Components: spatial
>>            Reporter: Bill Bell
>>            Assignee: David Smiley
>>             Fix For: 4.5
>>
>>         Attachments: SOLR-2345_geodist_refactor.patch,
SOLR-2345_geodist_support_for_RPT.patch
>>
>>
>> Extend geodist() and {!geofilt} to support a multiValued lat,long field
>> without using geohash.
>> sort=geodist() asc
> 




-
 Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book



[jira] [Created] (SOLR-5245) Create a test for SOLR-5243.

2013-09-17 Thread Mark Miller (JIRA)
Mark Miller created SOLR-5245:
-

 Summary: Create a test for SOLR-5243.
 Key: SOLR-5245
 URL: https://issues.apache.org/jira/browse/SOLR-5245
 Project: Solr
  Issue Type: Test
  Components: SolrCloud
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 5.0, 4.6







[jira] [Resolved] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-5243.
---

Resolution: Fixed

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Yonik Seeley
>Assignee: Mark Miller
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Updated] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5243:
--

Component/s: SolrCloud

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>  Components: SolrCloud
>Reporter: Yonik Seeley
>Assignee: Mark Miller
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Assigned] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-5243:
-

Assignee: Mark Miller

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Assignee: Mark Miller
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770386#comment-13770386
 ] 

ASF subversion and git services commented on SOLR-5243:
---

Commit 1524291 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_5'
[ https://svn.apache.org/r1524291 ]

SOLR-5243: CHANGES entry.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770381#comment-13770381
 ] 

ASF subversion and git services commented on SOLR-5243:
---

Commit 1524289 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1524289 ]

SOLR-5243: CHANGES entry.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770385#comment-13770385
 ] 

ASF subversion and git services commented on SOLR-5243:
---

Commit 1524290 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_5'
[ https://svn.apache.org/r1524290 ]

SOLR-5243: Killing a shard in one collection can result in leader election in a 
different collection if they share the same coreNodeName.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770378#comment-13770378
 ] 

ASF subversion and git services commented on SOLR-5243:
---

Commit 1524287 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1524287 ]

SOLR-5243: Killing a shard in one collection can result in leader election in a 
different collection if they share the same coreNodeName.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770379#comment-13770379
 ] 

ASF subversion and git services commented on SOLR-5243:
---

Commit 1524288 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1524288 ]

SOLR-5243: CHANGES entry.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770377#comment-13770377
 ] 

ASF subversion and git services commented on SOLR-5243:
---

Commit 1524286 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1524286 ]

SOLR-5243: Killing a shard in one collection can result in leader election in a 
different collection if they share the same coreNodeName.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Commented] (SOLR-5082) Implement ie=charset parameter

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770375#comment-13770375
 ] 

ASF subversion and git services commented on SOLR-5082:
---

Commit 1524284 from [~thetaphi] in branch 'dev/branches/lucene_solr_4_5'
[ https://svn.apache.org/r1524284 ]

Merged revision(s) 1524282 from lucene/dev/trunk:
SOLR-5082: Fix credits

> Implement ie=charset parameter
> --
>
> Key: SOLR-5082
> URL: https://issues.apache.org/jira/browse/SOLR-5082
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.4
>Reporter: Shawn Heisey
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5082.patch, SOLR-5082.patch
>
>
> Allow a user to send a query or update to Solr in a character set other than 
> UTF-8 and inform Solr what charset to use with an "ie" parameter, for input 
> encoding.  This was discussed in SOLR-4265 and SOLR-4283.
> Changing the default charset is a bad idea because distributed search 
> (SolrCloud) relies on UTF-8.




[jira] [Comment Edited] (LUCENE-5223) IndexUpgrader (4.4.0) fails when -verbose is not set, or when any value of -dir-impl is specified

2013-09-17 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770368#comment-13770368
 ] 

Uwe Schindler edited comment on LUCENE-5223 at 9/18/13 3:13 AM:


Hi Hoss,
thanks for the patch. You can assign the issue to yourself. I just took it 
because I wrote the code originally, so I wanted to find out what changed.

In this case the change since Lucene 3 was that IndexWriterConfig no longer 
allows {{null}} as InfoStream but instead requires the NO_OUTPUT constant (which 
is a good thing). While making this change, the code in IndexUpgrader was 
unfortunately not updated. Your new tests are fine. The random directory is OK.

I don't think that we should allow {{null}} in IndexWriterConfig. I don't like 
the behaviour in Solr (e.g., SolrResourceLoader and a lot of other classes) of 
using crazy defaults if {{null}} is passed somewhere. Always be explicit.

The problem is a missing @NotNull annotation, like Java 8 provides (does it?).

We should state in the javadocs that {{null}} is not allowed.

  was (Author: thetaphi):
Hi Hoss,
thanks for the patch. You can assign the issue to yourself. I just took it, 
because I wrote the code originally, so I wanted find out what changed.

In that case the change to Luecen 3 was, that IndexWriterConfig no longer 
allows {{null}} as InfoStream, but instead requires NO_OUTPUT constant (which 
is a good thing). While doing this change, the code in IndexUpgrader was 
unfortunately not upgraded. Your new tests are fine. The random directory is OK.

I don't think that we should allow {{null} in IndexWriterConfig. I don't like 
this behaviour in Solr (e.g., SolrResourceLoader and a lot of other classes) to 
use crazy defaults if somewhere {{null}} is passed. Always be explicit.

The problem is a missing @NotNull annotation, like Java 8 provides (does it?).
  
> IndexUpgrader (4.4.0) fails when -verbose is not set, or when any value of 
> -dir-impl is specified
> -
>
> Key: LUCENE-5223
> URL: https://issues.apache.org/jira/browse/LUCENE-5223
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.4
> Environment: Linux
>Reporter: Bruce Karsh
>Assignee: Uwe Schindler
>Priority: Minor
> Attachments: LUCENE-5223.patch
>
>
> Here it fails because -verbose is not set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader ./INDEX
> Exception in thread "main" java.lang.IllegalArgumentException: printStream 
> must not be null
>   at 
> org.apache.lucene.index.IndexWriterConfig.setInfoStream(IndexWriterConfig.java:514)
>   at org.apache.lucene.index.IndexUpgrader.(IndexUpgrader.java:126)
>   at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:109)
> Here it works with -verbose set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader -verbose ./INDEX
> IFD 0 [Mon Sep 16 18:25:53 PDT 2013; main]: init: current segments file is 
> "segments_5"; 
> deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@42698403
> ...
> IW 0 [Mon Sep 16 18:25:53 PDT 2013; main]: at close: _2(4.4):C4




[jira] [Commented] (LUCENE-5223) IndexUpgrader (4.4.0) fails when -verbose is not set, or when any value of -dir-impl is specified

2013-09-17 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770368#comment-13770368
 ] 

Uwe Schindler commented on LUCENE-5223:
---

Hi Hoss,
thanks for the patch. You can assign the issue to yourself. I just took it 
because I wrote the code originally, so I wanted to find out what changed.

In this case the change since Lucene 3 was that IndexWriterConfig no longer 
allows {{null}} as InfoStream but instead requires the NO_OUTPUT constant (which 
is a good thing). While making this change, the code in IndexUpgrader was 
unfortunately not updated. Your new tests are fine. The random directory is OK.

I don't think that we should allow {{null}} in IndexWriterConfig. I don't like 
the behaviour in Solr (e.g., SolrResourceLoader and a lot of other classes) of 
using crazy defaults if {{null}} is passed somewhere. Always be explicit.

The problem is a missing @NotNull annotation, like Java 8 provides (does it?).
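Uwe's point — reject {{null}} explicitly and give callers a sentinel constant instead of silently substituting a default — can be sketched in plain Java. This is illustrative code, not Lucene's actual InfoStream/IndexWriterConfig implementation; all names here are made up:

```java
import java.util.Objects;

// Sketch of the "no null, use an explicit sentinel" pattern (hypothetical names).
public class ConfigSketch {

    interface InfoStream {
        void message(String s);

        /** Explicit "silent" sentinel, so callers never need to pass null. */
        InfoStream NO_OUTPUT = s -> { /* swallow all messages */ };
    }

    private InfoStream infoStream = InfoStream.NO_OUTPUT;

    /** Fail fast on null instead of silently substituting a default. */
    public void setInfoStream(InfoStream is) {
        this.infoStream = Objects.requireNonNull(is,
                "infoStream must not be null; use InfoStream.NO_OUTPUT to disable logging");
    }

    public static void main(String[] args) {
        ConfigSketch c = new ConfigSketch();
        c.setInfoStream(InfoStream.NO_OUTPUT); // fine: explicit and self-documenting
        try {
            c.setInfoStream(null);             // programming error: rejected immediately
        } catch (NullPointerException expected) {
            System.out.println("null rejected: " + expected.getMessage());
        }
    }
}
```

The runtime check approximates what a compile-time @NotNull annotation would give: the error surfaces at the call site, not later as a mysterious failure deep inside the writer.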

> IndexUpgrader (4.4.0) fails when -verbose is not set, or when any value of 
> -dir-impl is specified
> -
>
> Key: LUCENE-5223
> URL: https://issues.apache.org/jira/browse/LUCENE-5223
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.4
> Environment: Linux
>Reporter: Bruce Karsh
>Assignee: Uwe Schindler
>Priority: Minor
> Attachments: LUCENE-5223.patch
>
>
> Here it fails because -verbose is not set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader ./INDEX
> Exception in thread "main" java.lang.IllegalArgumentException: printStream 
> must not be null
>   at 
> org.apache.lucene.index.IndexWriterConfig.setInfoStream(IndexWriterConfig.java:514)
>   at org.apache.lucene.index.IndexUpgrader.(IndexUpgrader.java:126)
>   at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:109)
> Here it works with -verbose set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader -verbose ./INDEX
> IFD 0 [Mon Sep 16 18:25:53 PDT 2013; main]: init: current segments file is 
> "segments_5"; 
> deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@42698403
> ...
> IW 0 [Mon Sep 16 18:25:53 PDT 2013; main]: at close: _2(4.4):C4




[jira] [Commented] (SOLR-5244) Full Search Result Export

2013-09-17 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770364#comment-13770364
 ] 

Joel Bernstein commented on SOLR-5244:
--

As currently written, the ExportQParserPlugin defeats all caching, so it always 
runs. PostFilters can't be cached in the FilterCache, so only the 
QueryResultCache is being bypassed here.

I see your point that if you need caching and the ability to use the request 
context, then you are out of luck. I'm not sure exactly how to solve this, though.

> Full Search Result Export
> -
>
> Key: SOLR-5244
> URL: https://issues.apache.org/jira/browse/SOLR-5244
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 5.0
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 5.0
>
> Attachments: SOLR-5244.patch
>
>
> It would be great if Solr could efficiently export entire search result sets 
> without scoring or ranking documents. This would allow external systems to 
> perform rapid bulk imports from Solr. It also provides a possible platform 
> for exporting results to support distributed join scenarios within Solr.
> This ticket provides a patch that has two pluggable components:
> 1) ExportQParserPlugin: which is a post filter that gathers a BitSet with 
> document results and does not delegate to ranking collectors. Instead it puts 
> the BitSet on the request context.
> 2) BinaryExportWriter: Is a output writer that iterates the BitSet and prints 
> the entire result as a binary stream. A header is provided at the beginning 
> of the stream so external clients can self configure.
> Note:
> These two components will be sufficient for a non-distributed environment. 
> For distributed export a new Request handler will need to be developed.
> After applying the patch and building the dist or example, you can register 
> the components through the following changes to solrconfig.xml
> Register export contrib libraries:
> 
>  
> Register the "export" queryParser with the following line:
> <queryParser name="export" class="org.apache.solr.export.ExportQParserPlugin"/>
> Register the "xbin" writer:
> <queryResponseWriter name="xbin" class="org.apache.solr.export.BinaryExportWriter"/>
> The following query will perform the export:
> {code}
> http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
> {code}
> Initial patch supports export of four data-types:
> 1) Single value trie int, long and float
> 2) Binary doc values.
> The numerics are currently exported from the FieldCache and the Binary doc 
> values can be in memory or on disk.
> Since this is designed to export very large result sets efficiently, stored 
> fields are not used for the export.
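The core idea in the description — a post filter that gathers matching docIds into a BitSet without scoring them, then streams the set bits in docId order — can be sketched independently of Solr's actual Collector and PostFilter APIs. All names below are illustrative, not the patch's real classes:

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

/** Toy sketch: collect matching doc IDs into a BitSet, rank nothing. */
public class ExportSketch {

    /** Set a bit for every docId in [0, maxDoc) that matches the filter. */
    static BitSet collect(int maxDoc, IntPredicate matches) {
        BitSet docs = new BitSet(maxDoc);
        for (int docId = 0; docId < maxDoc; docId++) {
            if (matches.test(docId)) {
                docs.set(docId); // no score, no priority queue: membership only
            }
        }
        return docs;
    }

    /** "Export": iterate set bits in docId order, as a response writer would. */
    static int export(BitSet docs) {
        int exported = 0;
        for (int docId = docs.nextSetBit(0); docId >= 0; docId = docs.nextSetBit(docId + 1)) {
            exported++;          // a real writer would stream field values here
        }
        return exported;
    }

    public static void main(String[] args) {
        BitSet hits = collect(100, docId -> docId % 2 == 0); // even docIds "match"
        System.out.println("exported " + export(hits));       // exported 50
    }
}
```

Because membership, not ranking, is all that is tracked, memory stays at one bit per document regardless of result-set size — which is what makes full result-set export feasible.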




[jira] [Commented] (SOLR-5082) Implement ie=charset parameter

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770363#comment-13770363
 ] 

ASF subversion and git services commented on SOLR-5082:
---

Commit 1524283 from [~thetaphi] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1524283 ]

SOLR-5082: Fix credits

> Implement ie=charset parameter
> --
>
> Key: SOLR-5082
> URL: https://issues.apache.org/jira/browse/SOLR-5082
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.4
>Reporter: Shawn Heisey
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5082.patch, SOLR-5082.patch
>
>
> Allow a user to send a query or update to Solr in a character set other than 
> UTF-8 and inform Solr what charset to use with an "ie" parameter, for input 
> encoding.  This was discussed in SOLR-4265 and SOLR-4283.
> Changing the default charset is a bad idea because distributed search 
> (SolrCloud) relies on UTF-8.




[jira] [Commented] (SOLR-5082) Implement ie=charset parameter

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770362#comment-13770362
 ] 

ASF subversion and git services commented on SOLR-5082:
---

Commit 1524282 from [~thetaphi] in branch 'dev/trunk'
[ https://svn.apache.org/r1524282 ]

SOLR-5082: Fix credits

> Implement ie=charset parameter
> --
>
> Key: SOLR-5082
> URL: https://issues.apache.org/jira/browse/SOLR-5082
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.4
>Reporter: Shawn Heisey
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5082.patch, SOLR-5082.patch
>
>
> Allow a user to send a query or update to Solr in a character set other than 
> UTF-8 and inform Solr what charset to use with an "ie" parameter, for input 
> encoding.  This was discussed in SOLR-4265 and SOLR-4283.
> Changing the default charset is a bad idea because distributed search 
> (SolrCloud) relies on UTF-8.




Re: Modifying the Lucene 4 index?

2013-09-17 Thread Kranti Parisa
Got you. So having custom analyzers is not enough for your requirements?

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Tue, Sep 17, 2013 at 6:43 PM, Ralf Bierig  wrote:

> Hi Kranti,
>
> No, I am familiar with creating indices and adding / updating documents,
> creating fields and using analyzers in the preparation of the content. I am
> asking about making actual changes to the indexing mechanics underneath, like
> substituting Term with something else, such as concepts, or changing the codec.
> However, I know too little yet to be more specific and was just asking for
> general documentation of any kind that explains some larger changes to the
> index structure and Lucene's mechanics on the index.
>
> Best.
> Ralf
>
>
> On 17.09.2013 23:57, Kranti Parisa wrote:
>
>> Are you asking about the steps to update the existing document?
>>
>> Thanks & Regards,
>> Kranti K Parisa
>> http://www.linkedin.com/in/krantiparisa
>>
>>
>>
>> On Tue, Sep 17, 2013 at 5:34 PM, Ralf Bierig <ralf.bie...@gmail.com> wrote:
>>
>> Hi all,
>>
>> is there any good documentation of how to change and modify the
>> index of
>> Lucene version 4 other than what is already on the website? Blogs,
>> papers, reports etc. or just a report on experience in some form ---
>> anything would be good.
>>
>> Based on an early-stage project, I would like to get first-hand
>> experience in order to a) get an overview of what is possible /
>> what can be modified, and
>> b) learn how difficult it is to do in general. I remember the Lucene
>> index being rather closed up in the early days, but this might be
>> very different now...
>>
>> Best,
>> Ralf
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>
>
> -
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] [Commented] (LUCENE-5222) TestExpressionSorts fails sometimes when using expression returning score

2013-09-17 Thread Ryan Ernst (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770289#comment-13770289
 ] 

Ryan Ernst commented on LUCENE-5222:


A couple things I found:
* scorer.score() in ScoreFunctionValues is sometimes returning NaN
* If I add the last two boolean values explicitly as true (doDocScores and 
doMaxScore) to both search() and searchAfter() calls, the test succeeds
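For context on why a NaN score is so disruptive to sorting: every ordering comparison against NaN evaluates to false, while Float.compare() imposes a total order that puts NaN above every value, so a comparator-based sort and a manual "score > best" check can disagree about the same documents. A self-contained illustration of those Java semantics:

```java
// Demonstrates why NaN scores break sort consistency: primitive
// comparisons and Float.compare() treat NaN completely differently.
public class NaNSort {
    public static void main(String[] args) {
        float nan = Float.NaN;
        System.out.println(nan > 1.0f);   // false
        System.out.println(nan < 1.0f);   // false
        System.out.println(nan == nan);   // false: NaN is not even equal to itself
        // Float.compare imposes a total order and places NaN above everything,
        // even positive infinity, so comparator-based sorts rank NaN first
        // while a manual "score > best" loop never selects it.
        System.out.println(Float.compare(nan, Float.POSITIVE_INFINITY) > 0); // true
    }
}
```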

> TestExpressionSorts fails sometimes when using expression returning score
> -
>
> Key: LUCENE-5222
> URL: https://issues.apache.org/jira/browse/LUCENE-5222
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>
> Jenkins picked this up.  Repeat with:
> {code}
> ant test  -Dtestcase=TestExpressionSorts -Dtests.method=testQueries 
> -Dtests.seed=115AD00ED89D9F7B -Dtests.multiplier=3 -Dtests.slow=true 
> -Dtests.locale=no_NO -Dtests.timezone=America/Nassau 
> -Dtests.file.encoding=US-ASCII
> {code}
> It appears to have to do with scoring, as removing the score sort from the 
> original sorts causes the tests to pass.  If you remove the possible 
> discrepancy between doDocScores and docMaxScore params to searcher.search, 
> then the test gets farther before failing.




Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b106) - Build # 7511 - Failure!

2013-09-17 Thread Robert Muir
I think it's "dawid's" JVM bug:

   [junit4] ERROR   1.61s J1 | TestDuelingCodecs.testEquals <<<
   [junit4]> Throwable #1: java.lang.AssertionError
   [junit4]>at org.apache.lucene.index.
ByteSliceReader.readByte(ByteSliceReader.java:73)
   [junit4]>at
org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
   [junit4]>at
org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:500)

On Tue, Sep 17, 2013 at 5:59 PM, Policeman Jenkins Server
 wrote:
> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7511/
> Java: 32bit/jdk1.8.0-ea-b106 -server -XX:+UseG1GC
>
> 3 tests failed.
> REGRESSION:  org.apache.lucene.index.TestDuelingCodecs.testEquals
>
> Error Message:
> MockDirectoryWrapper: cannot close: there are still open locks: [write.lock]
>
> Stack Trace:
> java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are 
> still open locks: [write.lock]
> at 
> org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:626)
> at 
> org.apache.lucene.index.TestDuelingCodecs.tearDown(TestDuelingCodecs.java:111)
> at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:491)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:795)
> at 
> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
> at 
> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at 
> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
> at 
> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
> at 
> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
> at 
> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
> at 
> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at 
> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
> at 
> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
> at 
> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
> at 
> org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
> at 
> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> at 
> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
> at java.lang.Thread.run(Thread.java:724)
>
>
> REGRESSION:  org.apache.lucene.index.TestNorms.testMaxByteNorms
>
> Error Message:
> docs out of order (58 <= 58 ) (docOut: 
> 

[JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0-ea-b106) - Build # 7511 - Failure!

2013-09-17 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7511/
Java: 32bit/jdk1.8.0-ea-b106 -server -XX:+UseG1GC

3 tests failed.
REGRESSION:  org.apache.lucene.index.TestDuelingCodecs.testEquals

Error Message:
MockDirectoryWrapper: cannot close: there are still open locks: [write.lock]

Stack Trace:
java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 
open locks: [write.lock]
at 
org.apache.lucene.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:626)
at 
org.apache.lucene.index.TestDuelingCodecs.tearDown(TestDuelingCodecs.java:111)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:491)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:795)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:724)


REGRESSION:  org.apache.lucene.index.TestNorms.testMaxByteNorms

Error Message:
docs out of order (58 <= 58 ) (docOut: 
MockIndexOutputWrapper(org.apache.lucene.store.FSDirectory$FSIndexOutput@ba8419))

Stack Trace:
org.apache.lucene.index.CorruptIndexException: docs out of order (58 <= 58 ) 
(docOut: 
MockIndexOutputWrapper(org.apache.lucene.store.FSDirectory$FSIndexOutput@ba8419))
at 
__randomizedtesting.SeedInfo.seed([2B65838ED85E2427:9F9A16B0F21250F1]:0)
at 
org.apache.lucene.codecs.lucene41.Lucene41PostingsWriter.startDoc(Lucene41PostingsWriter.java:297)
at 
org.apache.lucene.index.FreqProxTermsWriterPerField.flush(FreqProxTermsWriterPerField.java:470)
at 
org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
at org.apache.lucene.index.TermsHas

[jira] [Updated] (LUCENE-5223) IndexUpgrader (4.4.0) fails when -verbose is not set, or when any value of -dir-impl is specified

2013-09-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-5223:
-

Summary: IndexUpgrader (4.4.0) fails when -verbose is not set, or when any 
value of -dir-impl is specified  (was: IndexUpgrader (4.4.0) fails when 
-verbose is not set)

> IndexUpgrader (4.4.0) fails when -verbose is not set, or when any value of 
> -dir-impl is specified
> -
>
> Key: LUCENE-5223
> URL: https://issues.apache.org/jira/browse/LUCENE-5223
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.4
> Environment: Linux
>Reporter: Bruce Karsh
>Assignee: Uwe Schindler
>Priority: Minor
> Attachments: LUCENE-5223.patch
>
>
> Here it fails because -verbose is not set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader ./INDEX
> Exception in thread "main" java.lang.IllegalArgumentException: printStream 
> must not be null
>   at 
> org.apache.lucene.index.IndexWriterConfig.setInfoStream(IndexWriterConfig.java:514)
>   at org.apache.lucene.index.IndexUpgrader.(IndexUpgrader.java:126)
>   at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:109)
> Here it works with -verbose set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader -verbose ./INDEX
> IFD 0 [Mon Sep 16 18:25:53 PDT 2013; main]: init: current segments file is 
> "segments_5"; 
> deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@42698403
> ...
> IW 0 [Mon Sep 16 18:25:53 PDT 2013; main]: at close: _2(4.4):C4

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5223) IndexUpgrader (4.4.0) fails when -verbose is not set

2013-09-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-5223:
-

Attachment: LUCENE-5223.patch

As mentioned to Uwe on IRC, I took a stab at improving the tests for 
IndexUpgrader -- we weren't doing any testing of the command-line parsing 
logic, and only one of the class constructors was being tested.

The attached patch includes the new tests, as well as a quick attempt at 
fixing the reported issue. It also fixes another issue discovered by the tests: 
"-dir-impl" apparently didn't work at all, because it would be interpreted as 
the name of the directory, and the subsequent option value would then be 
considered bogus and cause the usage message to be printed.

This patch still has some problems...

1) Since it now randomly tests "verbose" mode, it's really verbose. I don't 
have any good suggestion here other than to create a static variable that 
defaults to System.out when the code runs normally, so that the tests could set 
it to some MockOutputStream in a @BeforeClass.
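The static-variable suggestion above could look roughly like this; ToolOutput and its methods are hypothetical names used for illustration, not actual Lucene code:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class ToolOutput {
  // Defaults to System.out when the tool runs normally.
  static PrintStream out = System.out;

  static void printUsage() {
    out.println("Usage: IndexUpgrader [-verbose] [-dir-impl X] indexDir");
  }

  // A test's @BeforeClass could redirect output to a buffer like this,
  // keeping the randomized "verbose" runs quiet.
  static String captureUsage() {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    PrintStream saved = out;
    out = new PrintStream(sink, true);
    try {
      printUsage();
    } finally {
      out = saved;  // restore so later runs still print normally
    }
    return sink.toString();
  }

  public static void main(String[] args) {
    System.out.println(captureUsage().startsWith("Usage:"));
  }
}
```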

2) There are some nocommit lines related to how we randomize the "-dir-impl" 
option ... it's really kludgy now, but it's the best I could come up with 
without making changes to LuceneTestCase ... hopefully someone else has a 
suggestion.

3) As I mentioned on IRC, I'm not convinced that IndexWriterConfig doesn't also 
need fixing...

* The IndexWriterConfig.setInfoStream(InfoStream) javadocs say "If non-null, 
... will be printed to this", but it throws an error if you try to set it to 
null -- why doesn't it just implicitly use NO_OUTPUT if the arg is null? Why 
don't the javadocs mention NO_OUTPUT?
* The IndexWriterConfig.setInfoStream(PrintStream) javadocs just say it's a 
convenience wrapper using PrintStreamInfoStream, with no mention of null at all 
-- even if setInfoStream(InfoStream) is going to be strict about null, why 
can't setInfoStream(PrintStream) implicitly use NO_OUTPUT when its arg is null?
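A null-tolerant setter of the kind being asked about could be sketched like this; InfoStreamConfig and its NO_OUTPUT constant are hypothetical stand-ins, not the real IndexWriterConfig/InfoStream API:

```java
import java.io.OutputStream;
import java.io.PrintStream;

public class InfoStreamConfig {
  // Stand-in for InfoStream.NO_OUTPUT: a sink that discards everything.
  static final PrintStream NO_OUTPUT =
      new PrintStream(OutputStream.nullOutputStream());

  private PrintStream infoStream = NO_OUTPUT;

  // Null-tolerant variant: null means "no output" instead of throwing
  // an IllegalArgumentException as the current setter does.
  public InfoStreamConfig setInfoStream(PrintStream ps) {
    this.infoStream = (ps == null) ? NO_OUTPUT : ps;
    return this;
  }

  public PrintStream getInfoStream() {
    return infoStream;
  }

  public static void main(String[] args) {
    InfoStreamConfig cfg = new InfoStreamConfig().setInfoStream(null);
    System.out.println(cfg.getInfoStream() == NO_OUTPUT);
  }
}
```

With a setter like this, `IndexUpgrader` without `-verbose` would simply get a silent config instead of crashing.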

> IndexUpgrader (4.4.0) fails when -verbose is not set
> 
>
> Key: LUCENE-5223
> URL: https://issues.apache.org/jira/browse/LUCENE-5223
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.4
> Environment: Linux
>Reporter: Bruce Karsh
>Assignee: Uwe Schindler
>Priority: Minor
> Attachments: LUCENE-5223.patch
>
>
> Here it fails because -verbose is not set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader ./INDEX
> Exception in thread "main" java.lang.IllegalArgumentException: printStream 
> must not be null
>   at 
> org.apache.lucene.index.IndexWriterConfig.setInfoStream(IndexWriterConfig.java:514)
>   at org.apache.lucene.index.IndexUpgrader.(IndexUpgrader.java:126)
>   at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:109)
> Here it works with -verbose set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader -verbose ./INDEX
> IFD 0 [Mon Sep 16 18:25:53 PDT 2013; main]: init: current segments file is 
> "segments_5"; 
> deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@42698403
> ...
> IW 0 [Mon Sep 16 18:25:53 PDT 2013; main]: at close: _2(4.4):C4

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5225) ToParentBlockJoinQuery don't accumulate the child doc ids and scores if ToParentBlockJoinCollector is not used

2013-09-17 Thread Martijn van Groningen (JIRA)
Martijn van Groningen created LUCENE-5225:
-

 Summary: ToParentBlockJoinQuery don't accumulate the child doc ids 
and scores if ToParentBlockJoinCollector is not used
 Key: LUCENE-5225
 URL: https://issues.apache.org/jira/browse/LUCENE-5225
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Martijn van Groningen
Assignee: Martijn van Groningen
Priority: Minor


The BlockJoinScorer temporarily saves the child docids and scores in two arrays 
(pendingChildDocs/pendingChildScores) for the current block (parent/child docs) 
being processed. This is only needed for ToParentBlockJoinCollector; when that 
collector isn't used, these two arrays shouldn't be used either.

I've seen cases where only the ToParentBlockJoinQuery is used and there are 
many child docs (100k and up); in those cases these two arrays are a waste of 
resources.
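A minimal sketch of the guard being discussed, under the assumption that accumulation is simply skipped when no collector needs it; PendingChildBuffer is a hypothetical name, not the actual BlockJoinScorer code:

```java
import java.util.Arrays;

public class PendingChildBuffer {
  private final boolean trackChildren;  // false when no collector needs them
  private int[] pendingChildDocs = new int[0];
  private float[] pendingChildScores = new float[0];
  private int count;

  PendingChildBuffer(boolean trackChildren) {
    this.trackChildren = trackChildren;
  }

  void collectChild(int doc, float score) {
    if (!trackChildren) {
      return;  // skip the arrays entirely; no allocation, no copying
    }
    if (count == pendingChildDocs.length) {
      int newSize = Math.max(8, count * 2);
      pendingChildDocs = Arrays.copyOf(pendingChildDocs, newSize);
      pendingChildScores = Arrays.copyOf(pendingChildScores, newSize);
    }
    pendingChildDocs[count] = doc;
    pendingChildScores[count] = score;
    count++;
  }

  int size() {
    return count;
  }

  public static void main(String[] args) {
    PendingChildBuffer off = new PendingChildBuffer(false);
    PendingChildBuffer on = new PendingChildBuffer(true);
    for (int d = 0; d < 100_000; d++) {
      off.collectChild(d, 1f);
      on.collectChild(d, 1f);
    }
    System.out.println(off.size() + " " + on.size());
  }
}
```

With 100k child docs per block, the disabled path allocates nothing, which is the resource win described above.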

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-4771) Query-time join collectors could maybe be more efficient

2013-09-17 Thread Martijn van Groningen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770130#comment-13770130
 ] 

Martijn van Groningen commented on LUCENE-4771:
---

Yes, that would work really nicely. The TermsCollector can then just collect 
global ordinals in a BitSet impl, and the TermsQuery can just iterate from this 
BitSet.
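A toy illustration of the ordinal-based idea, using plain java.util.BitSet and hypothetical names rather than Lucene's FieldCache/ordinal APIs:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class OrdinalJoin {
  // Collect side: mark each seen global ordinal instead of hashing the
  // term bytes per collected document.
  static BitSet collectOrds(int[] ordPerDoc) {
    BitSet seen = new BitSet();
    for (int ord : ordPerDoc) {
      if (ord >= 0) {  // -1 meaning "no value", as in docvalues conventions
        seen.set(ord);
      }
    }
    return seen;
  }

  // Query side: iterate the set ordinals and look each term up exactly once.
  static List<String> matchedTerms(BitSet seen, String[] ordToTerm) {
    List<String> terms = new ArrayList<>();
    for (int ord = seen.nextSetBit(0); ord >= 0; ord = seen.nextSetBit(ord + 1)) {
      terms.add(ordToTerm[ord]);
    }
    return terms;
  }

  public static void main(String[] args) {
    String[] dict = {"apple", "banana", "cherry"};
    BitSet seen = collectOrds(new int[]{2, 0, 2, -1});
    System.out.println(matchedTerms(seen, dict));
  }
}
```

The point of the design: per-document work is a single bit set, and term bytes are materialized at most once per distinct ordinal.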

> Query-time join collectors could maybe be more efficient
> 
>
> Key: LUCENE-4771
> URL: https://issues.apache.org/jira/browse/LUCENE-4771
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/join
>Reporter: Robert Muir
> Attachments: LUCENE-4771_prototype.patch, 
> LUCENE-4771-prototype.patch, LUCENE-4771_prototype_without_bug.patch
>
>
> I was looking @ these collectors on LUCENE-4765 and I noticed:
> * SingleValued collector (SV) pulls FieldCache.getTerms and adds the bytes to 
> a bytesrefhash per-collect.
> * MultiValued  collector (MV) pulls FieldCache.getDocTermsOrds, but doesnt 
> use the ords, just looks up each value and adds the bytes per-collect.
> I think instead its worth investigating if SV should use getTermsIndex, and 
> both collectors just collect-up their per-segment ords in something like a 
> BitSet[maxOrd]. 
> When asked for the terms at the end in getCollectorTerms(), they could merge 
> these into one BytesRefHash.
> Of course, if you are going to turn around and execute the query against the 
> same searcher anyway (is this the typical case?), this could even be more 
> efficient: No need to hash or instantiate all the terms in memory, we could 
> do postpone the lookups to SeekingTermSetTermsEnum.accept()/nextSeekTerm() i 
> think... somehow :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5225) ToParentBlockJoinQuery don't accumulate the child doc ids and scores if ToParentBlockJoinCollector is not used

2013-09-17 Thread Martijn van Groningen (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-5225:
--

Attachment: LUCENE-5225.patch

Added a boolean option to ToParentBlockJoinQuery that controls whether the two 
arrays are used.

Not really happy with this approach... too many if statements; perhaps there 
should be a dedicated ToParentBlockJoinQuery impl for 
ToParentBlockJoinCollector.

> ToParentBlockJoinQuery don't accumulate the child doc ids and scores if 
> ToParentBlockJoinCollector is not used
> --
>
> Key: LUCENE-5225
> URL: https://issues.apache.org/jira/browse/LUCENE-5225
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Martijn van Groningen
>Assignee: Martijn van Groningen
>Priority: Minor
> Attachments: LUCENE-5225.patch
>
>
> The BlockJoinScorer temporarily saves the child docids and scores in two 
> arrays (pendingChildDocs/pendingChildScores) for the current block 
> (parent/child docs) being processed. This is only needed for 
> ToParentBlockJoinCollector; when that collector isn't used, these two arrays 
> shouldn't be used either.
> I've seen cases where only the ToParentBlockJoinQuery is used and there are 
> many child docs (100k and up), in that case these two arrays are a waste of 
> resources.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Modifying the Lucene 4 index?

2013-09-17 Thread Ralf Bierig

Hi Kranti,

No, I am familiar with creating indices, adding/updating documents, creating 
fields, and using analyzers to prepare content. I am asking about making actual 
changes to the indexing mechanics underneath, like substituting Term with 
something else (concepts, for example) or changing the codec. However, I know 
too little yet to be more specific, and was just asking for general 
documentation of any kind that explains larger changes to the index structure 
and Lucene's index mechanics.


Best.
Ralf

On 17.09.2013 23:57, Kranti Parisa wrote:

Are you asking about the steps to update the existing document?

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Tue, Sep 17, 2013 at 5:34 PM, Ralf Bierig wrote:


Hi all,

is there any good documentation of how to change and modify the
index of
Lucene version 4 other than what is already on the website? Blogs,
papers, reports etc. or just a report on experience in some form ---
anything would be good.

Based on an early-stage project, I would like to get first hand
experience in order to a) get an overview about what is possible /
what can be modified and
b) how difficult is it to do in general. I remember the Lucene
index being rather closed up in the early times, but this might be
very different now...

Best,
Ralf


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org

For additional commands, e-mail: dev-h...@lucene.apache.org






-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Modifying the Lucene 4 index?

2013-09-17 Thread Kranti Parisa
Are you asking about the steps to update the existing document?

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Tue, Sep 17, 2013 at 5:34 PM, Ralf Bierig  wrote:

> Hi all,
>
> is there any good documentation of how to change and modify the index of
> Lucene version 4 other than what is already on the website? Blogs,
> papers, reports etc. or just a report on experience in some form ---
> anything would be good.
>
> Based on an early-stage project, I would like to get first hand
> experience in order to a) get an overview about what is possible / what
> can be modified and
> b) how difficult is it to do in general. I remember the Lucene index being
> rather closed up in the early times, but this might be very different now...
>
> Best,
> Ralf
>
>
> -
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>
>


[jira] [Created] (LUCENE-5224) org.apache.lucene.analysis.hunspell.HunspellDictionary should implement ICONV and OCONV lines in the affix file

2013-09-17 Thread George Rhoten (JIRA)
George Rhoten created LUCENE-5224:
-

 Summary: org.apache.lucene.analysis.hunspell.HunspellDictionary 
should implement ICONV and OCONV lines in the affix file
 Key: LUCENE-5224
 URL: https://issues.apache.org/jira/browse/LUCENE-5224
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/analysis
Affects Versions: 4.4, 4.0
Reporter: George Rhoten


There are some Hunspell dictionaries that need to emulate Unicode normalization 
and collation in order to get the correct stem of a word. The original Hunspell 
provides a way to do this with the ICONV and OCONV lines in the affix file. The 
Lucene HunspellDictionary ignores these lines right now.

Please support these keys in the affix file.

This bit of functionality is briefly described in the hunspell man page 
http://manpages.ubuntu.com/manpages/lucid/man4/hunspell.4.html

This functionality is practically required in order to use a Korean dictionary 
because you want only some of the Jamos of a Hangul character (grapheme 
cluster) when using stemming. Other languages will find this to be helpful 
functionality.

Here is an example for a .aff file:

{code}
ICONV 각 각
...
OCONV 각 각
{code}

Here is the same example escaped.

{code}
ICONV \uAC01 \u1100\u1161\u11A8
...
OCONV \u1100\u1161\u11A8 \uAC01
{code}
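A rough sketch of what applying an ICONV-style table could look like; ConvTable is a hypothetical name, and the matching policy here (pairs tried in insertion order at each position, rather than hunspell's longest-match rule) is a simplification:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConvTable {
  // Each ICONV/OCONV entry maps an input sequence to an output sequence.
  private final Map<String, String> pairs = new LinkedHashMap<>();

  void add(String from, String to) {
    pairs.put(from, to);
  }

  // Scan the string left to right; at each position, apply the first
  // matching pair, otherwise copy the character through unchanged.
  String apply(String s) {
    StringBuilder out = new StringBuilder();
    int i = 0;
    outer:
    while (i < s.length()) {
      for (Map.Entry<String, String> e : pairs.entrySet()) {
        if (s.startsWith(e.getKey(), i)) {
          out.append(e.getValue());
          i += e.getKey().length();
          continue outer;
        }
      }
      out.append(s.charAt(i++));
    }
    return out.toString();
  }

  public static void main(String[] args) {
    ConvTable iconv = new ConvTable();
    // ICONV \uAC01 -> \u1100\u1161\u11A8: precomposed Hangul syllable to Jamo,
    // as in the .aff example above.
    iconv.add("\uAC01", "\u1100\u1161\u11A8");
    System.out.println(iconv.apply("\uAC01").equals("\u1100\u1161\u11A8"));
  }
}
```

ICONV would run on input before dictionary lookup and OCONV on output after it, which is how the Korean Jamo decomposition/recomposition described above would plug in.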

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5211) StopFilterFactory docs do not advertise/explain the "format" option

2013-09-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-5211:
-

Attachment: LUCENE-5211.code.patch
LUCENE-5211.stopfilecomments.patch

two patches to make it easier to review...

* patch that improves the StopFilterFactory javadocs to mention format, as well 
as improving the error handling of the format param (includes tests)
* patch that updates all the snowball-formatted files with a comment pointing 
out the need to use format="snowball" with those files.

FWIW: the second patch was generated by the following perl script...

{code}
#!/usr/bin/perl -i -n

my $msg = q{NOTE: To use this file with StopFilterFactory, you must specify 
format="snowball"};
print $_;
if (m/This notice was added./) {
print " |\n | $msg\n";
}
{code}

Run as...
{{find -name \*.txt | xargs grep -l "This notice was added" | xargs 
~/tmp/lucene5211.note.in.snowballfiles.pl}}


> StopFilterFactory docs do not advertise/explain the "format" option
> ---
>
> Key: LUCENE-5211
> URL: https://issues.apache.org/jira/browse/LUCENE-5211
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 4.2
>Reporter: Hayden Muhl
>Assignee: Hoss Man
>Priority: Minor
> Attachments: LUCENE-5211.code.patch, 
> LUCENE-5211.stopfilecomments.patch
>
>
> StopFilterFactory supports a "format" option for controlling whether 
> "getWordSet" or "getSnowballWordSet" is used to parse the file, but this 
> option is not advertised, and people can be confused by looking at the 
> example stopword files included in the releases (some of which are in the 
> snowball format w/ "|" comments), trying to use them w/o explicitly 
> specifying {{format="snowball"}}, and silently getting useless stopwords 
> (that include the "| comments" as literal portions of the stopwords).
> We need to better document the use of "format" and consider updating all of 
> the example stopword files we ship that are in the snowball format with a 
> note about the need to use {{format="snowball"}} with those files.
> {panel:title=Initial Bug Report}
> The StopFilterFactory builds a CharArraySet directly from the raw lines of 
> the supplied words file. This causes a problem when using the stop word files 
> supplied with the Solr/Lucene distribution. In particular, the comments in 
> those files get added to the CharArraySet. A line like this...
> ceci   |  this
> Should result in the string "ceci" being added to the CharArraySet, but "ceci 
>   |  this" is what actually gets added.
> Workaround: Remove all comments from stop word files you are using.
> Suggested fix: The StopFilterFactory should strip any comments, then strip 
> trailing whitespace. The stop word files supplied with the distribution 
> should be edited to conform to the supported comment format.
> {panel}
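The fix suggested in the initial report (strip any comment, then strip trailing whitespace) can be sketched as follows; StopLineParser is a hypothetical name, not the actual StopFilterFactory parsing code:

```java
public class StopLineParser {
  // Snowball-format stopword lines may carry a trailing "| comment";
  // strip everything from the first '|' onward, then trim whitespace.
  static String stripComment(String line) {
    int bar = line.indexOf('|');
    if (bar >= 0) {
      line = line.substring(0, bar);
    }
    return line.trim();
  }

  public static void main(String[] args) {
    // "ceci   |  this" should yield just "ceci", per the bug report.
    System.out.println(stripComment("ceci   |  this"));
    // A comment-only line should yield an empty string (i.e. no stopword).
    System.out.println(stripComment("   | pure comment").isEmpty());
  }
}
```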

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (LUCENE-5223) IndexUpgrader (4.4.0) fails when -verbose is not set

2013-09-17 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler reassigned LUCENE-5223:
-

Assignee: Uwe Schindler

> IndexUpgrader (4.4.0) fails when -verbose is not set
> 
>
> Key: LUCENE-5223
> URL: https://issues.apache.org/jira/browse/LUCENE-5223
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/index
>Affects Versions: 4.4
> Environment: Linux
>Reporter: Bruce Karsh
>Assignee: Uwe Schindler
>Priority: Minor
>
> Here it fails because -verbose is not set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader ./INDEX
> Exception in thread "main" java.lang.IllegalArgumentException: printStream 
> must not be null
>   at 
> org.apache.lucene.index.IndexWriterConfig.setInfoStream(IndexWriterConfig.java:514)
>   at org.apache.lucene.index.IndexUpgrader.(IndexUpgrader.java:126)
>   at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:109)
> Here it works with -verbose set:
> $ java -cp ./lucene-core-4.4-SNAPSHOT.jar 
> org.apache.lucene.index.IndexUpgrader -verbose ./INDEX
> IFD 0 [Mon Sep 16 18:25:53 PDT 2013; main]: init: current segments file is 
> "segments_5"; 
> deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@42698403
> ...
> IW 0 [Mon Sep 16 18:25:53 PDT 2013; main]: at close: _2(4.4):C4

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5243:
--

Attachment: SOLR-5243.patch

Another patch.

Had to remove an assert in unregister, because it actually did not make sense 
on a failed core start - but this bug was hiding that.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch, SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5082) Implement ie=charset parameter

2013-09-17 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770034#comment-13770034
 ] 

Uwe Schindler commented on SOLR-5082:
-

Hi David. Sorry about the credit mix-up; the credit was meant for Shawn Heisey, 
so maybe you can add him instead. I can also do this.

Uwe

> Implement ie=charset parameter
> --
>
> Key: SOLR-5082
> URL: https://issues.apache.org/jira/browse/SOLR-5082
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: 4.4
>Reporter: Shawn Heisey
>Assignee: Uwe Schindler
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5082.patch, SOLR-5082.patch
>
>
> Allow a user to send a query or update to Solr in a character set other than 
> UTF-8 and inform Solr what charset to use with an "ie" parameter, for input 
> encoding.  This was discussed in SOLR-4265 and SOLR-4283.
> Changing the default charset is a bad idea because distributed search 
> (SolrCloud) relies on UTF-8.
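The "ie" (input encoding) idea can be illustrated with plain JDK decoding; InputEncoding and decodeParam are hypothetical names for illustration, and this is not Solr's actual implementation:

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class InputEncoding {
  // Decode a percent-encoded parameter value using the charset the client
  // declared via "ie", falling back to the UTF-8 default otherwise.
  static String decodeParam(String rawValue, String ie) {
    Charset cs = (ie != null) ? Charset.forName(ie) : StandardCharsets.UTF_8;
    return URLDecoder.decode(rawValue, cs);
  }

  public static void main(String[] args) {
    // "é" percent-encoded in ISO-8859-1 is %E9; in UTF-8 it is %C3%A9.
    System.out.println(decodeParam("caf%E9", "ISO-8859-1"));
    System.out.println(decodeParam("caf%C3%A9", null));
  }
}
```

This also shows why the default must stay UTF-8: the same bytes decode to different strings under different charsets, so distributed requests need one agreed encoding.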

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Modifying the Lucene 4 index?

2013-09-17 Thread Ralf Bierig

Hi all,

is there any good documentation of how to change and modify the index of
Lucene version 4 other than what is already on the website? Blogs,
papers, reports etc. or just a report on experience in some form ---
anything would be good.

Based on an early-stage project, I would like to get first-hand experience in 
order to a) get an overview of what is possible / what can be modified, and 
b) learn how difficult it is to do in general. I remember the Lucene index 
being rather closed up in the early days, but this might be very different 
now...

Best,
Ralf


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5238) Update the .css for the Ref Guide

2013-09-17 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-5238.


Resolution: Fixed

no problem: h3 size block removed.

> Update the .css for the Ref Guide
> -
>
> Key: SOLR-5238
> URL: https://issues.apache.org/jira/browse/SOLR-5238
> Project: Solr
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 4.5
>
> Attachments: SolrRefGuide.css
>
>
> I put a custom .css into the Ref Guide before it was uploaded. I was going 
> for something parallel to the Solr website, but only spent a little time with 
> it. In terms of readability of the text online, it's not great, which is 
> putting it nicely. It's also very difficult to differentiate between "normal" 
> text and monospaced text that indicates a command, program name, etc.
> I'm attaching a new .css that can simply replace what's already in the Space 
> Admin -> Stylesheet section. I did several things with this:
> * cleaned up the .css generally, consolidated some repetitive sections, and 
> added more comments in case future changes are desired.
> * changed the font throughout to Helvetica, Arial, or sans-serif and updated 
> the color to a slightly less strong black.
> * changed the monospace font to match the font used in the code boxes 
> (Consolas) and made them the same color as the text (default is a lot 
> lighter).
> * added a bit more space between lines.
> * removed the negative margin in the header/breadcrumbs to give it a bit more 
> space.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5244) Full Search Result Export

2013-09-17 Thread Kranti Parisa (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769993#comment-13769993
 ] 

Kranti Parisa commented on SOLR-5244:
-

Joel,

In one of my emails to the dev-group I have asked the following question
-
I am sure this doesn't exist today, but I am just wondering about your thoughts.

When we use Join queries (for the first time, or without hitting the Filter 
Cache) and say debug=true, we are able to see a good amount of debug info in 
the response. 

Do we have any plans to support this debug info even when we hit the Filter 
Cache? I believe this information would be helpful with or without hitting the 
caches.

Consider this use case: in production, a request comes in and builds the Filter 
Cache for a Join Query; at some point we want to run that query manually with 
debug turned on, but we can't see the very useful stats/numbers.
---

Will ExportQParserPlugin save the BitSet into the request context even when we 
hit the caches? The idea of saving the BitSets into the request context is very 
helpful when we do Joins, because when we write the response, for each document 
we would want to specify all the cores in which that document matched the given 
criteria/filters.

So I think it is also a good idea to support an extra local param in the new 
join implementations (SOLR-4787), say matchFlag="true", and if it's true, save 
the BitSet into the request context (even in the case of a cache hit). By 
default it can be "false" so that we don't need to keep the BitSet in memory.

Example response (XML tags restored; element names here are illustrative):

{code}
<doc>
  <str name="id">111</str>
  <str name="title">my title</str>
  <arr name="matchedCores">
    <str>coreA</str>
    <str>coreB</str>
  </arr>
</doc>
{code}



I was able to achieve that by saving the BitSet into the join debug info, but 
was not able to get the point about cache hits. I think your idea of saving it 
into the request context makes more sense.

Your thoughts?

> Full Search Result Export
> -
>
> Key: SOLR-5244
> URL: https://issues.apache.org/jira/browse/SOLR-5244
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 5.0
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 5.0
>
> Attachments: SOLR-5244.patch
>
>
> It would be great if Solr could efficiently export entire search result sets 
> without scoring or ranking documents. This would allow external systems to 
> perform rapid bulk imports from Solr. It also provides a possible platform 
> for exporting results to support distributed join scenarios within Solr.
> This ticket provides a patch that has two pluggable components:
> 1) ExportQParserPlugin: which is a post filter that gathers a BitSet with 
> document results and does not delegate to ranking collectors. Instead it puts 
> the BitSet on the request context.
> 2) BinaryExportWriter: an output writer that iterates the BitSet and writes 
> the entire result as a binary stream. A header is provided at the beginning 
> of the stream so external clients can self-configure.
> Note:
> These two components will be sufficient for a non-distributed environment. 
> For distributed export a new Request handler will need to be developed.
> After applying the patch and building the dist or example, you can register 
> the components through the following changes to solrconfig.xml
> Register export contrib libraries:
> 
>  
> Register the "export" queryParser with the following line:
> {code}
> <queryParser name="export" class="org.apache.solr.export.ExportQParserPlugin"/>
> {code}
> Register the "xbin" writer:
> {code}
> <queryResponseWriter name="xbin" class="org.apache.solr.export.BinaryExportWriter"/>
> {code}
> The following query will perform the export:
> {code}
> http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
> {code}
> Initial patch supports export of four data-types:
> 1) Single value trie int, long and float
> 2) Binary doc values.
> The numerics are currently exported from the FieldCache and the Binary doc 
> values can be in memory or on disk.
> Since this is designed to export very large result sets efficiently, stored 
> fields are not used for the export.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Documentation on the new compressed DocIdSet implementations

2013-09-17 Thread Michael McCandless
On Tue, Sep 17, 2013 at 1:24 PM, Smiley, David W.  wrote:
> Lucene has got some new compressed DocIdSet implementations that are
> technically very interesting and exciting: PForDeltaDocIdSet, WAH8DocIdSet,
> EliasFanoDocIdSet, … any more?  Yet it's difficult (at least for me) to
> understand their pros/cons to know when to pick amongst them.  They all seem
> great yet why do we have 3?  Only one is actually used by Lucene itself —
> WAH8DocIdSet in CachingWrapperFilter.   Javadocs are hit & miss; the JIRA
> issues have lots of fascinating background but it's time consuming to
> distill.  I think it would be very useful to summarily document key
> characteristics on class level javadocs — not so much implementation details
> but information to help a user choose it versus another.  And as a bonus a
> table perhaps showing relative performance characteristics in package-level
> javadocs.
>
> Related to this is, I'm wondering does it make sense for a codec's postings
> (assuming no doc freq & no positions?) to be implemented as a serialized
> version of one of these compressed doc id sets?  I think it would be really
> great, not just for compression but also because it might support
> Terms.advance() since some of these compressed formats have indexes.

I think it makes sense; there's an issue for it: LUCENE-5052.  Also,
LUCENE-5123 (invert the PostingsFormat writing APIs) should make it
easier, since you can iterate the postings for each term more than
once, e.g. to decide in the first pass whether to encode using a
bitset or not ...
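As a toy illustration of the two-pass idea, a writer could count a term's docs on the first pass and pick an encoding on the second; the names and the size heuristic below are illustrative assumptions, not Lucene's actual logic:

```java
public class PostingsEncodingChoice {
  // First pass over a term's postings yields docCount; the second pass can
  // then pick a bitset when the term is dense enough that the bitset is no
  // larger than a delta-coded list, and plain deltas otherwise.
  static String choose(int docCount, int maxDoc) {
    long bitsetBytes = (maxDoc + 7) / 8;          // one bit per doc in range
    long deltaBytesEstimate = (long) docCount * 2; // rough vint-per-gap guess
    return bitsetBytes <= deltaBytesEstimate ? "bitset" : "deltas";
  }

  public static void main(String[] args) {
    System.out.println(choose(500_000, 1_000_000)); // dense term
    System.out.println(choose(100, 1_000_000));     // sparse term
  }
}
```

A bitset-backed posting list is also what would make cheap advance() possible, since membership tests become O(1) bit lookups.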

Mike McCandless

http://blog.mikemccandless.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5221) SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity

2013-09-17 Thread Yubin Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yubin Kim updated LUCENE-5221:
--

Attachment: LUCENE-5221.patch

Here's the patch.

> SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity
> ---
>
> Key: LUCENE-5221
> URL: https://issues.apache.org/jira/browse/LUCENE-5221
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.4
>Reporter: Yubin Kim
>  Labels: normalize, search, similarity
> Attachments: LUCENE-5221.patch
>
>
> {{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should 
> be encoded in the same way as {{TFIDFSimilarity}}. However, when 
> {{discountOverlaps}} is {{false}}, what gets encoded is 
> {{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
> rather than  {{SmallFloat.floatToByte315((boost / (float) 
> Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
> {{SimilarityBase.computeNorm}}: 
>   final float numTerms;
>   if (discountOverlaps)
>     numTerms = state.getLength() - state.getNumOverlap();
>   else
>     numTerms = state.getLength() / state.getBoost();
>   return encodeNormValue(state.getBoost(), numTerms);
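To see the discrepancy numerically, here is a small Python check (ignoring SmallFloat's lossy byte encoding) comparing the documented value, boost / sqrt(length), with what the extra division actually feeds the encoder; the two agree only when boost is 1.

```python
import math

def intended(boost, length):
    # What the javadoc promises (TFIDFSimilarity-style): boost / sqrt(length)
    return boost / math.sqrt(length)

def actual(boost, length):
    # What SimilarityBase.computeNorm effectively computes when
    # discountOverlaps is false: boost / sqrt(length / boost)
    return boost / math.sqrt(length / boost)

# Identical at boost == 1, divergent otherwise:
assert intended(1.0, 16) == actual(1.0, 16)
assert intended(2.0, 16) != actual(2.0, 16)
```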

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5244) Full Search Result Export

2013-09-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769990#comment-13769990
 ] 

Mark Miller commented on SOLR-5244:
---

bq. And you would have to deal with huge result sets and limited ram of course.

When I was thinking about this before, I was thinking about perhaps the deep 
paging stuff that was done as a cursor - but I have not looked at that feature 
at all yet.

> Full Search Result Export
> -
>
> Key: SOLR-5244
> URL: https://issues.apache.org/jira/browse/SOLR-5244
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 5.0
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 5.0
>
> Attachments: SOLR-5244.patch
>
>
> It would be great if Solr could efficiently export entire search result sets 
> without scoring or ranking documents. This would allow external systems to 
> perform rapid bulk imports from Solr. It also provides a possible platform 
> for exporting results to support distributed join scenarios within Solr.
> This ticket provides a patch that has two pluggable components:
> 1) ExportQParserPlugin: a post filter that gathers a BitSet of
> document results and does not delegate to ranking collectors. Instead it puts
> the BitSet on the request context.
> 2) BinaryExportWriter: an output writer that iterates the BitSet and prints
> the entire result as a binary stream. A header is provided at the beginning
> of the stream so external clients can self-configure.
> Note:
> These two components will be sufficient for a non-distributed environment. 
> For distributed export a new Request handler will need to be developed.
> After applying the patch and building the dist or example, you can register 
> the components through the following changes to solrconfig.xml
> Register export contrib libraries:
> 
>  
> Register the "export" queryParser with the following line:
> <queryParser name="export" class="org.apache.solr.export.ExportQParserPlugin"/>
> Register the "xbin" writer:
> <queryResponseWriter name="xbin" class="org.apache.solr.export.BinaryExportWriter"/>
> The following query will perform the export:
> {code}
> http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
> {code}
> Initial patch supports export of four data-types:
> 1) Single value trie int, long and float
> 2) Binary doc values.
> The numerics are currently exported from the FieldCache and the Binary doc 
> values can be in memory or on disk.
> Since this is designed to export very large result sets efficiently, stored 
> fields are not used for the export.
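A conceptual Python analogue of the post-filter half of this design (not Solr's actual PostFilter/DelegatingCollector API; all names here are illustrative): matching doc ids are recorded in a bitset and stashed on the request context instead of being forwarded to ranking collectors.

```python
class ExportCollector:
    """Conceptual analogue of ExportQParserPlugin's post filter: record
    matching doc ids in a bitset and do NOT delegate to ranking collectors."""
    def __init__(self, max_doc, request_context):
        self.bits = bytearray((max_doc + 7) // 8)
        self.ctx = request_context
    def collect(self, doc_id):
        self.bits[doc_id >> 3] |= 1 << (doc_id & 7)
    def finish(self):
        # The writer later picks the bitset up from the request context.
        self.ctx["export"] = self.bits

def iter_bits(bits):
    """What a writer like BinaryExportWriter would do: walk the set bits
    in doc-id order, with no scoring or ranking involved."""
    for i in range(len(bits) * 8):
        if bits[i >> 3] & (1 << (i & 7)):
            yield i
```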

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5244) Full Search Result Export

2013-09-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769988#comment-13769988
 ] 

Mark Miller commented on SOLR-5244:
---

bq. I think scoring and ranking will be difficult because the priority queues 
will take up too much memory and be too slow.

I agree that it will be more difficult and certainly a slower operation, but if 
you are looking to export an entire results list, 'slow' is very relative and 
use case dependent.

My main interest in this is in 1 - it's a pretty common want. Using search for 
sub-selection that can be processed by something else.

I think it would be great if that sub selection could come out ranked though - 
I think that is also valuable for 1 - and while the other system could somehow 
rank, it would have to dupe the lucene logic to do it as well. It would be nice 
to just be able to dump either way and make your decision based on use case and 
speed reqs. It's obviously going to be much slower though. And you would have 
to deal with huge result sets and limited ram of course.

> Full Search Result Export
> -
>
> Key: SOLR-5244
> URL: https://issues.apache.org/jira/browse/SOLR-5244
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 5.0
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 5.0
>
> Attachments: SOLR-5244.patch
>
>
> It would be great if Solr could efficiently export entire search result sets 
> without scoring or ranking documents. This would allow external systems to 
> perform rapid bulk imports from Solr. It also provides a possible platform 
> for exporting results to support distributed join scenarios within Solr.
> This ticket provides a patch that has two pluggable components:
> 1) ExportQParserPlugin: a post filter that gathers a BitSet of
> document results and does not delegate to ranking collectors. Instead it puts
> the BitSet on the request context.
> 2) BinaryExportWriter: an output writer that iterates the BitSet and prints
> the entire result as a binary stream. A header is provided at the beginning
> of the stream so external clients can self-configure.
> Note:
> These two components will be sufficient for a non-distributed environment. 
> For distributed export a new Request handler will need to be developed.
> After applying the patch and building the dist or example, you can register 
> the components through the following changes to solrconfig.xml
> Register export contrib libraries:
> 
>  
> Register the "export" queryParser with the following line:
> <queryParser name="export" class="org.apache.solr.export.ExportQParserPlugin"/>
> Register the "xbin" writer:
> <queryResponseWriter name="xbin" class="org.apache.solr.export.BinaryExportWriter"/>
> The following query will perform the export:
> {code}
> http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
> {code}
> Initial patch supports export of four data-types:
> 1) Single value trie int, long and float
> 2) Binary doc values.
> The numerics are currently exported from the FieldCache and the Binary doc 
> values can be in memory or on disk.
> Since this is designed to export very large result sets efficiently, stored 
> fields are not used for the export.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5244) Full Search Result Export

2013-09-17 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769972#comment-13769972
 ] 

Joel Bernstein commented on SOLR-5244:
--

I think scoring and ranking will be difficult because the priority queues will 
take up too much memory and be too slow. 

There are two use cases that I was mainly thinking of:

1) Providing a searchable data source for Hadoop or other aggregation engines. 
So Map jobs could search Solr to bring in millions of records very quickly.

2) Doing distributed joins. Allowing a remote search engine to pull data very 
quickly from Solr so it can filter local search results.  

> Full Search Result Export
> -
>
> Key: SOLR-5244
> URL: https://issues.apache.org/jira/browse/SOLR-5244
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 5.0
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 5.0
>
> Attachments: SOLR-5244.patch
>
>
> It would be great if Solr could efficiently export entire search result sets 
> without scoring or ranking documents. This would allow external systems to 
> perform rapid bulk imports from Solr. It also provides a possible platform 
> for exporting results to support distributed join scenarios within Solr.
> This ticket provides a patch that has two pluggable components:
> 1) ExportQParserPlugin: a post filter that gathers a BitSet of
> document results and does not delegate to ranking collectors. Instead it puts
> the BitSet on the request context.
> 2) BinaryExportWriter: an output writer that iterates the BitSet and prints
> the entire result as a binary stream. A header is provided at the beginning
> of the stream so external clients can self-configure.
> Note:
> These two components will be sufficient for a non-distributed environment. 
> For distributed export a new Request handler will need to be developed.
> After applying the patch and building the dist or example, you can register 
> the components through the following changes to solrconfig.xml
> Register export contrib libraries:
> 
>  
> Register the "export" queryParser with the following line:
> <queryParser name="export" class="org.apache.solr.export.ExportQParserPlugin"/>
> Register the "xbin" writer:
> <queryResponseWriter name="xbin" class="org.apache.solr.export.BinaryExportWriter"/>
> The following query will perform the export:
> {code}
> http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
> {code}
> Initial patch supports export of four data-types:
> 1) Single value trie int, long and float
> 2) Binary doc values.
> The numerics are currently exported from the FieldCache and the Binary doc 
> values can be in memory or on disk.
> Since this is designed to export very large result sets efficiently, stored 
> fields are not used for the export.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5234) Allow SolrResourceLoader to load resources from URLs

2013-09-17 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769866#comment-13769866
 ] 

Alan Woodward commented on SOLR-5234:
-

I had more in mind things like resources specified in solrconfig.xml, files 
used to configure lemmatizers or whatever, rather than the whole solrconfig.xml 
itself.  Although I suppose it would work for that too.  But generally, you'd 
only be fetching things on core reload, which is an expensive operation anyway.

> Allow SolrResourceLoader to load resources from URLs
> 
>
> Key: SOLR-5234
> URL: https://issues.apache.org/jira/browse/SOLR-5234
> Project: Solr
>  Issue Type: Improvement
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Minor
> Attachments: SOLR-5234.patch, SOLR-5234.patch
>
>
> This would allow multiple Solr instances to share large configuration files.
> It would also help resolve problems caused by attempting to store >1MB files
> in ZooKeeper.
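A minimal sketch of the proposed dispatch (hypothetical names, not Solr's actual SolrResourceLoader API): resource names that parse as http(s) URLs are fetched remotely, and everything else resolves against the local config directory as before.

```python
from urllib.parse import urlparse

def open_resource(name, open_local, open_url):
    """Sketch of the proposed behavior: if a resource name referenced from
    solrconfig.xml looks like a URL, fetch it remotely; otherwise fall back
    to the existing local lookup. open_local/open_url are injected so the
    sketch stays I/O-free."""
    if urlparse(name).scheme in ("http", "https"):
        return open_url(name)
    return open_local(name)
```

Since such fetches would mostly happen on core reload (already an expensive operation, as noted above), the extra network round-trip is unlikely to matter.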

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-5193) Need better docs about Atomic Updates + Optimistic Concurrency

2013-09-17 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett reassigned SOLR-5193:
---

Assignee: Cassandra Targett

> Need better docs about Atomic Updates + Optimistic Concurrency
> --
>
> Key: SOLR-5193
> URL: https://issues.apache.org/jira/browse/SOLR-5193
> Project: Solr
>  Issue Type: Improvement
>  Components: documentation
>Reporter: HeXin
>Assignee: Cassandra Targett
>Priority: Minor
>
> Atomic Update features are mentioned in passing in the Ref Guide, but not 
> explained as well as on the wiki (and doesn't refer to it as "atomic updates" 
> so it's hard to find)
> https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-UpdatingOnlyPartofaDocument
> https://wiki.apache.org/solr/Atomic_Updates
> In particular, the Optimistic Concurrency options on atomic updates are not
> really documented at all.
> ---
> (Initial issue description requested a "check" option that could be used on
> the uniqueKey field when doing atomic updates that would cause the update to
> fail if the uniqueKey specified did not exist -- but this type of feature is
> not needed since we already support the equivalent optimistic concurrency
> guarantees.)
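For the docs, the core optimistic-concurrency rule is easy to state: an update carrying a positive _version_ succeeds only if it matches the stored version. Here is a toy Python sketch of that compare-and-set semantics (purely illustrative; Solr's real _version_ values are clock-derived longs and the check happens server-side in the update chain):

```python
class VersionConflict(Exception):
    """Stands in for Solr's HTTP 409 response on a version mismatch."""
    pass

def atomic_update(store, doc_id, expected_version, field_sets):
    """Apply 'set' operations only if the stored _version_ still matches,
    mirroring the optimistic-concurrency rule for _version_ > 0."""
    doc = store.get(doc_id)
    if doc is None or doc["_version_"] != expected_version:
        raise VersionConflict(doc_id)
    doc.update(field_sets)
    doc["_version_"] += 1   # toy bump; Solr assigns a new clock-based long
    return doc["_version_"]
```

A second update reusing the stale version then fails, which is exactly the behavior the Ref Guide should spell out.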

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5223) IndexUpgrader (4.4.0) fails when -verbose is not set

2013-09-17 Thread Bruce Karsh (JIRA)
Bruce Karsh created LUCENE-5223:
---

 Summary: IndexUpgrader (4.4.0) fails when -verbose is not set
 Key: LUCENE-5223
 URL: https://issues.apache.org/jira/browse/LUCENE-5223
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
 Environment: Linux
Reporter: Bruce Karsh
Priority: Minor


Here it fails because -verbose is not set:

$ java -cp ./lucene-core-4.4-SNAPSHOT.jar org.apache.lucene.index.IndexUpgrader 
./INDEX
Exception in thread "main" java.lang.IllegalArgumentException: printStream must 
not be null
at 
org.apache.lucene.index.IndexWriterConfig.setInfoStream(IndexWriterConfig.java:514)
at org.apache.lucene.index.IndexUpgrader.&lt;init&gt;(IndexUpgrader.java:126)
at org.apache.lucene.index.IndexUpgrader.main(IndexUpgrader.java:109)

Here it works with -verbose set:

$ java -cp ./lucene-core-4.4-SNAPSHOT.jar org.apache.lucene.index.IndexUpgrader 
-verbose ./INDEX
IFD 0 [Mon Sep 16 18:25:53 PDT 2013; main]: init: current segments file is 
"segments_5"; 
deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@42698403

...

IW 0 [Mon Sep 16 18:25:53 PDT 2013; main]: at close: _2(4.4):C4
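The failure mode is the classic null-for-"no logging" trap: setInfoStream rejects null, so the non-verbose path needs a no-op sink rather than a null reference. A hedged Python sketch of the null-object pattern that avoids it (illustrative only, not the actual Lucene fix):

```python
import io
import os

def make_info_stream(verbose):
    """Null-object pattern: always return a writable stream, so callers
    never special-case 'no logging' with None."""
    if verbose:
        return io.StringIO()          # stand-in for System.out in this sketch
    return open(os.devnull, "w")      # silently discards all output

# Safe to write unconditionally, verbose or not:
s = make_info_stream(False)
s.write("IFD 0 init: current segments file is segments_5\n")
s.close()
```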

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769927#comment-13769927
 ] 

Mark Miller commented on SOLR-5243:
---

Hmm.. not all tests passing with that patch yet.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Updated] (SOLR-2345) Extend geodist() to support MultiValued lat long field

2013-09-17 Thread Pradeep Pujari
Hi Bill,

We are using the SOLR-2155 patch in Solr 3.6 and it has worked well for us for
quite a long time. We are currently migrating to Solr 4.0. Is SOLR-2345
equivalent to SOLR-2155 in terms of functionality?

Thanks



 From: David Smiley (JIRA) 
To: dev@lucene.apache.org 
Sent: Tuesday, July 16, 2013 10:20 PM
Subject: [jira] [Updated] (SOLR-2345) Extend geodist() to support MultiValued 
lat long field
 


     [ 
https://issues.apache.org/jira/browse/SOLR-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-2345:
---

    Attachment: SOLR-2345_geodist_support_for_RPT.patch

The attached patch implements the desired functionality.  It depends on 
LUCENE-5118 being applied first.

This hack wasn't too terrible after all.
                
> Extend geodist() to support MultiValued lat long field
> --
>
>                 Key: SOLR-2345
>                 URL: https://issues.apache.org/jira/browse/SOLR-2345
>             Project: Solr
>          Issue Type: New Feature
>          Components: spatial
>            Reporter: Bill Bell
>            Assignee: David Smiley
>             Fix For: 4.5
>
>         Attachments: SOLR-2345_geodist_refactor.patch, 
>SOLR-2345_geodist_support_for_RPT.patch
>
>
> Extend geodist() and {!geofilt} to support a multiValued lat,long field 
> without using geohash.
> sort=geodist() asc

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769913#comment-13769913
 ] 

Mark Miller commented on SOLR-5243:
---

It would seem this only happens if you have the same core node name in 
different collections.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769842#comment-13769842
 ] 

Yonik Seeley commented on SOLR-5243:


It appears like the election process is OK... it's the unload that results in 
the wrong ephemeral node being removed.

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.8.0-ea-b106) - Build # 7485 - Failure!

2013-09-17 Thread Shai Erera
Ok this is one tricky bug. Without prints, it took 2000 iters to reproduce
on Windows, 700 on Linux.
But the more prints I add, the faster it reproduces. So anti-heisen
(fortunately!)

Anyway, I think I found the cause, running some tests and exploring how
best to solve it. Just wanted to update the status.

Shai


On Mon, Sep 16, 2013 at 10:34 AM, Shai Erera  wrote:

> I failed to reproduce with the reported seed, master seed, random seeds
> ... all with iters. I'll dig.
>
> Shai
>
>
> On Mon, Sep 16, 2013 at 9:07 AM, Policeman Jenkins Server <
> jenk...@thetaphi.de> wrote:
>
>> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/7485/
>> Java: 64bit/jdk1.8.0-ea-b106 -XX:-UseCompressedOops -XX:+UseSerialGC
>>
>> 1 tests failed.
>> REGRESSION:
>>  org.apache.lucene.index.TestNumericDocValuesUpdates.testManyReopensAndFields
>>
>> Error Message:
>> invalid value for doc=351, field=f1 expected:<15> but was:<14>
>>
>> Stack Trace:
>> java.lang.AssertionError: invalid value for doc=351, field=f1
>> expected:<15> but was:<14>
>> at
>> __randomizedtesting.SeedInfo.seed([5E1E0079E35D52E:331D82281FC0B632]:0)
>> at org.junit.Assert.fail(Assert.java:93)
>> at org.junit.Assert.failNotEquals(Assert.java:647)
>> at org.junit.Assert.assertEquals(Assert.java:128)
>> at org.junit.Assert.assertEquals(Assert.java:472)
>> at
>> org.apache.lucene.index.TestNumericDocValuesUpdates.testManyReopensAndFields(TestNumericDocValuesUpdates.java:757)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:491)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
>> at
>> org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>> at
>> org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
>> at
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> at
>> org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>> at
>> org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
>> at
>> org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
>> at
>> com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
>> at
>> com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
>> at
>> org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>> at
>> org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>> at
>> com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>> at
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at
>> com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>> at
>> com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>> at
>> org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
>> at
>> org.apache.lucene.util.TestRuleMarkFailure$1

[jira] [Commented] (LUCENE-5221) SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity

2013-09-17 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769908#comment-13769908
 ] 

Robert Muir commented on LUCENE-5221:
-

+1, thank you for reporting this!


> SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity
> ---
>
> Key: LUCENE-5221
> URL: https://issues.apache.org/jira/browse/LUCENE-5221
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.4
>Reporter: Yubin Kim
>  Labels: normalize, search, similarity
>
> {{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should 
> be encoded in the same way as {{TFIDFSimilarity}}. However, when 
> {{discountOverlaps}} is {{false}}, what gets encoded is 
> {{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
> rather than  {{SmallFloat.floatToByte315((boost / (float) 
> Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
> {{SimilarityBase.computeNorm}}: 
>   final float numTerms;
>   if (discountOverlaps)
>     numTerms = state.getLength() - state.getNumOverlap();
>   else
>     numTerms = state.getLength() / state.getBoost();
>   return encodeNormValue(state.getBoost(), numTerms);

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769905#comment-13769905
 ] 

Mark Miller commented on SOLR-5243:
---

It looks like Sami was storing the electionContexts by coreNodeName.
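That keying bug is easy to demonstrate abstractly: a map keyed by coreNodeName alone lets a second collection's election context silently replace the first's, which matches the cross-collection leader election seen here. A toy Python sketch (not Solr's actual code):

```python
# Buggy: election contexts keyed by coreNodeName only.
# The same coreNodeName in two collections collides.
buggy = {}
buggy["core_node1"] = ("collection1", "ctx-A")
buggy["core_node1"] = ("collection2", "ctx-B")   # clobbers collection1's entry
assert len(buggy) == 1                           # collection1's context is gone

# Fixed: key by (collection, coreNodeName) so entries stay distinct.
fixed = {}
fixed[("collection1", "core_node1")] = "ctx-A"
fixed[("collection2", "core_node1")] = "ctx-B"
assert len(fixed) == 2
```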

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5193) Need better docs about Atomic Updates + Optimistic Concurrency

2013-09-17 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett updated SOLR-5193:


Fix Version/s: 4.5

> Need better docs about Atomic Updates + Optimistic Concurrency
> --
>
> Key: SOLR-5193
> URL: https://issues.apache.org/jira/browse/SOLR-5193
> Project: Solr
>  Issue Type: Improvement
>  Components: documentation
>Reporter: HeXin
>Assignee: Cassandra Targett
>Priority: Minor
> Fix For: 4.5
>
>
> Atomic Update features are mentioned in passing in the Ref Guide, but not 
> explained as well as on the wiki (and doesn't refer to it as "atomic updates" 
> so it's hard to find)
> https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-UpdatingOnlyPartofaDocument
> https://wiki.apache.org/solr/Atomic_Updates
> In particular, the Optimistic Concurrency options on atomic updates are not
> really documented at all.
> ---
> (Initial issue description requested a "check" option that could be used on
> the uniqueKey field when doing atomic updates that would cause the update to
> fail if the uniqueKey specified did not exist -- but this type of feature is
> not needed since we already support the equivalent optimistic concurrency
> guarantees.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5244) Full Search Result Export

2013-09-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769923#comment-13769923
 ] 

Mark Miller commented on SOLR-5244:
---

bq. It would be great if Solr could efficiently export entire search result 
sets 

Definitely!

bq. without scoring or ranking documents.

With scoring or ranking would be great too :) I'm sure it can wait though.

> Full Search Result Export
> -
>
> Key: SOLR-5244
> URL: https://issues.apache.org/jira/browse/SOLR-5244
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 5.0
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 5.0
>
> Attachments: SOLR-5244.patch
>
>
> It would be great if Solr could efficiently export entire search result sets 
> without scoring or ranking documents. This would allow external systems to 
> perform rapid bulk imports from Solr. It also provides a possible platform 
> for exporting results to support distributed join scenarios within Solr.
> This ticket provides a patch that has two pluggable components:
> 1) ExportQParserPlugin: a post filter that gathers a BitSet with 
> document results and does not delegate to ranking collectors. Instead it puts 
> the BitSet on the request context.
> 2) BinaryExportWriter: an output writer that iterates the BitSet and prints 
> the entire result as a binary stream. A header is provided at the beginning 
> of the stream so external clients can self-configure.
> Note:
> These two components will be sufficient for a non-distributed environment. 
> For distributed export a new Request handler will need to be developed.
> After applying the patch and building the dist or example, you can register 
> the components through the following changes to solrconfig.xml
> Register export contrib libraries:
> 
>  
> Register the "export" queryParser with the following line:
>  
>  class="org.apache.solr.export.ExportQParserPlugin"/>
>  
> Register the "xbin" writer:
>  
>  class="org.apache.solr.export.BinaryExportWriter"/>
>  
> The following query will perform the export:
> {code}
> http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
> {code}
> Initial patch supports export of four data-types:
> 1) Single value trie int, long and float
> 2) Binary doc values.
> The numerics are currently exported from the FieldCache and the Binary doc 
> values can be in memory or on disk.
> Since this is designed to export very large result sets efficiently, stored 
> fields are not used for the export.
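The description above says BinaryExportWriter iterates the BitSet and emits a header followed by the result as a binary stream. The following standalone sketch only illustrates that shape; the class name `ExportSketch`, the method `writeExport`, and the header layout (a single int doc count) are invented for illustration and are not the patch's actual wire format:

```java
import java.io.*;
import java.util.BitSet;

public class ExportSketch {
    // Write a tiny header (doc count) followed by the matching doc ids,
    // mimicking the "header then binary stream" shape described above.
    static byte[] writeExport(BitSet docs) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(docs.cardinality());   // header: number of docs in the stream
        for (int doc = docs.nextSetBit(0); doc >= 0; doc = docs.nextSetBit(doc + 1)) {
            out.writeInt(doc);              // body: one int per matching doc id
        }
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        BitSet docs = new BitSet();
        docs.set(3); docs.set(7); docs.set(42);
        byte[] export = writeExport(docs);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(export));
        System.out.println(in.readInt());   // prints 3 (the header doc count)
    }
}
```

A client that knows the header can then read exactly that many values, which is the "self configure" idea mentioned in the description.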




[jira] [Updated] (SOLR-5244) Full Search Result Export

2013-09-17 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5244:
-

Summary: Full Search Result Export  (was: Export search results)

> Full Search Result Export
> -
>
> Key: SOLR-5244
> URL: https://issues.apache.org/jira/browse/SOLR-5244
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 5.0
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 5.0
>
> Attachments: SOLR-5244.patch
>
>
> It would be great if Solr could efficiently export entire search result sets 
> without scoring or ranking documents. This would allow external systems to 
> perform rapid bulk imports from Solr. It also provides a possible platform 
> for exporting results to support distributed join scenarios within Solr.
> This ticket provides a patch that has two pluggable components:
> 1) ExportQParserPlugin: a post filter that gathers a BitSet with 
> document results and does not delegate to ranking collectors. Instead it puts 
> the BitSet on the request context.
> 2) BinaryExportWriter: an output writer that iterates the BitSet and prints 
> the entire result as a binary stream. A header is provided at the beginning 
> of the stream so external clients can self-configure.
> Note:
> These two components will be sufficient for a non-distributed environment. 
> For distributed export a new Request handler will need to be developed.
> After applying the patch and building the dist or example, you can register 
> the components through the following changes to solrconfig.xml
> Register export contrib libraries:
> 
>  
> Register the "export" queryParser with the following line:
>  
>  class="org.apache.solr.export.ExportQParserPlugin"/>
>  
> Register the "xbin" writer:
>  
>  class="org.apache.solr.export.BinaryExportWriter"/>
>  
> The following query will perform the export:
> {code}
> http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
> {code}
> Initial patch supports export of four data-types:
> 1) Single value trie int, long and float
> 2) Binary doc values.
> The numerics are currently exported from the FieldCache and the Binary doc 
> values can be in memory or on disk.
> Since this is designed to export very large result sets efficiently, stored 
> fields are not used for the export.




[jira] [Reopened] (SOLR-5238) Update the .css for the Ref Guide

2013-09-17 Thread Cassandra Targett (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cassandra Targett reopened SOLR-5238:
-


Sorry, need to reopen this again.

I failed to notice that in some early iteration of this, I defined the h3 
header size to "15.22px", which is about 11pt. The h4 headers aren't customized 
with this css, so by default are about 14pt. Thus, h3 is smaller than h4.

If you could simply remove the section that defines the h3 size, that should be 
fine - the default h3 is bigger than the default h4 and is perfectly acceptable.

The section to remove looks like this:

/* Set H3 header size */
.wiki-content h3 {
   font-size: 15.22px; 
}

Thanks

> Update the .css for the Ref Guide
> -
>
> Key: SOLR-5238
> URL: https://issues.apache.org/jira/browse/SOLR-5238
> Project: Solr
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 4.5
>
> Attachments: SolrRefGuide.css
>
>
> I put a custom .css into the Ref Guide before it was uploaded. I was going 
> for something parallel to the Solr website, but only spent a little time with 
it. In terms of readability of the text online, it's not great, which is 
> putting it nicely. It's also very difficult to differentiate between "normal" 
> text and monospaced text to indicate a command, program name, etc.
> I'm attaching a new .css that can simply replace what's already in the Space 
> Admin -> Stylesheet section. I did several things with this:
> * cleaned up the .css generally, consolidated some repetitive sections, and 
> added more comments in case future changes are desired.
> * changed the font throughout to Helvetica, Arial, or sans-serif and updated 
> the color to a slightly less strong black.
> * changed the monospace font to match the font used in the code boxes 
> (Consolas) and made them the same color as the text (default is a lot 
> lighter).
> * added a bit more space between lines.
> * removed the negative margin in the header/breadcrumbs to give it a bit more 
> space.




[jira] [Updated] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5243:
--

Attachment: SOLR-5243.patch

> killing a shard in one collection can result in leader election in a 
> different collection
> -
>
> Key: SOLR-5243
> URL: https://issues.apache.org/jira/browse/SOLR-5243
> Project: Solr
>  Issue Type: Bug
>Reporter: Yonik Seeley
>Priority: Blocker
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-5243.patch
>
>
> Discovered while doing some more ad-hoc testing... if I create two 
> collections with the same shard name and then kill the leader in one, it can 
> sometimes cause a leader election in the other (leaving the first leaderless).




[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769816#comment-13769816
 ] 

ASF subversion and git services commented on SOLR-4816:
---

Commit 1524177 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1524177 ]

SOLR-4816: add missing CHANGES credit

> Add document routing to CloudSolrServer
> ---
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Joel Bernstein
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: RequestTask-removal.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
>
> This issue adds the following enhancements to CloudSolrServer's update logic:
> 1) Document routing: Updates are routed directly to the correct shard leader 
> eliminating document routing at the server.
> 2) Optional parallel update execution: Updates for each shard are executed in 
> a separate thread so parallel indexing can occur across the cluster.
> These enhancements should allow for near linear scalability on indexing 
> throughput.
> Usage:
> CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
> cloudClient.setParallelUpdates(true); 
> SolrInputDocument doc1 = new SolrInputDocument();
> doc1.addField(id, "0");
> doc1.addField("a_t", "hello1");
> SolrInputDocument doc2 = new SolrInputDocument();
> doc2.addField(id, "2");
> doc2.addField("a_t", "hello2");
> UpdateRequest request = new UpdateRequest();
> request.add(doc1);
> request.add(doc2);
> request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
> NamedList<Object> response = cloudClient.request(request); // Returns a backwards 
> compatible condensed response.
> //To get more detailed response down cast to RouteResponse:
> CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response;




[jira] [Commented] (LUCENE-5222) TestExpressionSorts fails sometimes when using expression returning score

2013-09-17 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769902#comment-13769902
 ] 

Robert Muir commented on LUCENE-5222:
-

Note this only happens with the paging collector... searchAfter.
If you change the test to page the old-fashioned way it's fine.

I suspect a bug in TopFieldCollector.PagingFieldCollector, e.g. when there are 
multiple sorts and tiebreaks or something. I don't even think TestSearchAfter 
tests the case of multiple sorts this well.


> TestExpressionSorts fails sometimes when using expression returning score
> -
>
> Key: LUCENE-5222
> URL: https://issues.apache.org/jira/browse/LUCENE-5222
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>
> Jenkins picked this up.  Repeat with:
> {code}
> ant test  -Dtestcase=TestExpressionSorts -Dtests.method=testQueries 
> -Dtests.seed=115AD00ED89D9F7B -Dtests.multiplier=3 -Dtests.slow=true 
> -Dtests.locale=no_NO -Dtests.timezone=America/Nassau 
> -Dtests.file.encoding=US-ASCII
> {code}
> It appears to have to do with scoring, as removing the score sort from the 
> original sorts causes the tests to pass.  If you remove the possible 
> discrepancy between doDocScores and docMaxScore params to searcher.search, 
> then the test gets farther before failing.




[jira] [Updated] (LUCENE-5222) TestExpressionSorts fails sometimes when using expression returning score

2013-09-17 Thread Ryan Ernst (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ryan Ernst updated LUCENE-5222:
---

Description: 
Jenkins picked this up.  Repeat with:

{code}
ant test  -Dtestcase=TestExpressionSorts -Dtests.method=testQueries 
-Dtests.seed=115AD00ED89D9F7B -Dtests.multiplier=3 -Dtests.slow=true 
-Dtests.locale=no_NO -Dtests.timezone=America/Nassau 
-Dtests.file.encoding=US-ASCII
{code}

It appears to have to do with scoring, as removing the score sort from the 
original sorts causes the tests to pass.  If you remove the possible 
discrepancy between doDocScores and docMaxScore params to searcher.search, then 
the test gets farther before failing.

  was:
Jenkins picked this up.  Repeat with:

{quote}
ant test  -Dtestcase=TestExpressionSorts -Dtests.method=testQueries 
-Dtests.seed=115AD00ED89D9F7B -Dtests.multiplier=3 -Dtests.slow=true 
-Dtests.locale=no_NO -Dtests.timezone=America/Nassau 
-Dtests.file.encoding=US-ASCII
{quote}

It appears to have to do with scoring, as removing the score sort from the 
original sorts causes the tests to pass.  If you remove the possible 
discrepancy between doDocScores and docMaxScore params to searcher.search, then 
the test gets farther before failing.


> TestExpressionSorts fails sometimes when using expression returning score
> -
>
> Key: LUCENE-5222
> URL: https://issues.apache.org/jira/browse/LUCENE-5222
> Project: Lucene - Core
>  Issue Type: Bug
>Reporter: Ryan Ernst
>
> Jenkins picked this up.  Repeat with:
> {code}
> ant test  -Dtestcase=TestExpressionSorts -Dtests.method=testQueries 
> -Dtests.seed=115AD00ED89D9F7B -Dtests.multiplier=3 -Dtests.slow=true 
> -Dtests.locale=no_NO -Dtests.timezone=America/Nassau 
> -Dtests.file.encoding=US-ASCII
> {code}
> It appears to have to do with scoring, as removing the score sort from the 
> original sorts causes the tests to pass.  If you remove the possible 
> discrepancy between doDocScores and docMaxScore params to searcher.search, 
> then the test gets farther before failing.




[jira] [Created] (LUCENE-5222) TestExpressionSorts fails sometimes when using expression returning score

2013-09-17 Thread Ryan Ernst (JIRA)
Ryan Ernst created LUCENE-5222:
--

 Summary: TestExpressionSorts fails sometimes when using expression 
returning score
 Key: LUCENE-5222
 URL: https://issues.apache.org/jira/browse/LUCENE-5222
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Ryan Ernst


Jenkins picked this up.  Repeat with:

{quote}
ant test  -Dtestcase=TestExpressionSorts -Dtests.method=testQueries 
-Dtests.seed=115AD00ED89D9F7B -Dtests.multiplier=3 -Dtests.slow=true 
-Dtests.locale=no_NO -Dtests.timezone=America/Nassau 
-Dtests.file.encoding=US-ASCII
{quote}

It appears to have to do with scoring, as removing the score sort from the 
original sorts causes the tests to pass.  If you remove the possible 
discrepancy between doDocScores and docMaxScore params to searcher.search, then 
the test gets farther before failing.




[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769836#comment-13769836
 ] 

Mark Miller commented on SOLR-4816:
---

Parallel threads are now on by default. Document routing works if the right 
id field is set - it defaults to "id".
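Since SOLR-4816's core idea is routing each update straight to its shard leader and batching per shard, here is a minimal standalone sketch of that idea; `RoutingSketch`, `routeShard`, and the plain hash-mod routing are invented for illustration (Solr's real compositeId router and leader lookup via ZooKeeper are considerably more involved):

```java
import java.util.*;

public class RoutingSketch {
    // Pick a target shard for a document id by hashing into a fixed shard list.
    // This only illustrates client-side routing; it is not Solr's actual router.
    static String routeShard(String id, List<String> shards) {
        int bucket = Math.floorMod(id.hashCode(), shards.size());
        return shards.get(bucket);
    }

    public static void main(String[] args) {
        List<String> shards = Arrays.asList("shard1", "shard2");
        // Group documents by target shard before sending, so each shard's
        // batch can be dispatched directly (and in parallel) to its leader.
        Map<String, List<String>> batches = new HashMap<>();
        for (String id : Arrays.asList("0", "2", "5")) {
            batches.computeIfAbsent(routeShard(id, shards), k -> new ArrayList<>()).add(id);
        }
        int total = 0;
        for (List<String> batch : batches.values()) total += batch.size();
        System.out.println(total); // prints 3: every doc lands in exactly one batch
    }
}
```

The per-shard batches are what make the "separate thread per shard" parallel update execution described in the issue possible.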

> Add document routing to CloudSolrServer
> ---
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Joel Bernstein
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: RequestTask-removal.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
>
> This issue adds the following enhancements to CloudSolrServer's update logic:
> 1) Document routing: Updates are routed directly to the correct shard leader 
> eliminating document routing at the server.
> 2) Optional parallel update execution: Updates for each shard are executed in 
> a separate thread so parallel indexing can occur across the cluster.
> These enhancements should allow for near linear scalability on indexing 
> throughput.
> Usage:
> CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
> cloudClient.setParallelUpdates(true); 
> SolrInputDocument doc1 = new SolrInputDocument();
> doc1.addField(id, "0");
> doc1.addField("a_t", "hello1");
> SolrInputDocument doc2 = new SolrInputDocument();
> doc2.addField(id, "2");
> doc2.addField("a_t", "hello2");
> UpdateRequest request = new UpdateRequest();
> request.add(doc1);
> request.add(doc2);
> request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
> NamedList<Object> response = cloudClient.request(request); // Returns a backwards 
> compatible condensed response.
> //To get more detailed response down cast to RouteResponse:
> CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse)response;




Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Steve Rowe
Yeah, this configuration style is definitely a kludge around this Maven 
shortcoming.  

About submodules, though, you can see this in lucene/codecs/pom.xml.template 
aggregation POM - this is how maven specifies submodules (rather than via 
implicit recursive directories; submodules can be physically located in sibling 
directories):

  <modules>
    <module>src/java</module>
    <module>src/test</module>
  </modules>

Probably should be a comment in each of the three pom.xml.template files for 
each module: in the aggregation POM, the src/java POM and the src/test POM.

On Sep 17, 2013, at 2:48 PM, Robert Muir  wrote:

> yes a comment like that is good. I think what confused me was maybe
> the fact the submodules are under "src".
> 
> if you look at this, you don't think there are any nested modules.
> 
> [mac:maven/lucene/codecs] rmuir% ls -la
> total 8
> drwxr-xr-x   4 rmuir  staff   136 Oct 27  2012 .
> drwxr-xr-x  22 rmuir  staff   748 Jul 13 09:48 ..
> -rw-r--r--   1 rmuir  staff  1794 Sep  9  2012 pom.xml.template
> drwxr-xr-x   4 rmuir  staff   136 Oct 27  2012 src
> 
> On Tue, Sep 17, 2013 at 2:45 PM, Steve Rowe  wrote:
>> +1 to add this comment to the pom.xml.template files for lucene-codec (and 
>> lucene-core, solr-core and solr-solrj), but the "dark magic" isn't necessary 
>> for Ant, since it's more flexible than Maven, so I don't think any mention 
>> needs to be made of this in build.xml files.
>> 
>> Maven considers dependencies to be cyclic even if they are used in different 
>> build phases - e.g. lucene-test-framework has a compile-phase dependency on 
>> lucene-codecs, and lucene-codecs has a test-phase dependency on 
>> lucene-test-framework.
>> 
>> On Sep 17, 2013, at 2:29 PM, Chris Hostetter  
>> wrote:
>>> : Some background on the lucene-codec module's complex maven config
>>> : (lucene-core, solr-core, and solr-solrj all have the same setup):
>>> :  - the two attached
>>> : images show the cyclic dependency situation involving the test-framework
>>> : modules before[1] and after[2] the maven config complexification.
>>> 
>>> perhaps we need a comment in the lucene-codec pom.xml/build.xml files...
>>> 
>>>  There is dark magic here, do not use this file as a template,
>>>   or inspiration, when adding new modules...
>>>  https://issues.apache.org/jira/browse/LUCENE-4365
>>> 
>>> 
>>> -Hoss
>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: dev-h...@lucene.apache.org
>> 
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
> 





[jira] [Updated] (SOLR-5244) Export search results

2013-09-17 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5244:
-

Attachment: SOLR-5244.patch

> Export search results
> -
>
> Key: SOLR-5244
> URL: https://issues.apache.org/jira/browse/SOLR-5244
> Project: Solr
>  Issue Type: New Feature
>  Components: search
>Affects Versions: 5.0
>Reporter: Joel Bernstein
>Priority: Minor
> Fix For: 5.0
>
> Attachments: SOLR-5244.patch
>
>
> It would be great if Solr could efficiently export entire search result sets 
> without scoring or ranking documents. This would allow external systems to 
> perform rapid bulk imports from Solr. It also provides a possible platform 
> for exporting results to support distributed join scenarios within Solr.
> This ticket provides a patch that has two pluggable components:
> 1) ExportQParserPlugin: a post filter that gathers a BitSet with 
> document results and does not delegate to ranking collectors. Instead it puts 
> the BitSet on the request context.
> 2) BinaryExportWriter: an output writer that iterates the BitSet and prints 
> the entire result as a binary stream. A header is provided at the beginning 
> of the stream so external clients can self-configure.
> Note:
> These two components will be sufficient for a non-distributed environment. 
> For distributed export a new Request handler will need to be developed.
> After applying the patch and building the dist or example, you can register 
> the components through the following changes to solrconfig.xml
> Register export contrib libraries:
> 
>  
> Register the "export" queryParser with the following line:
>  
>  class="org.apache.solr.export.ExportQParserPlugin"/>
>  
> Register the "xbin" writer:
>  
>  class="org.apache.solr.export.BinaryExportWriter"/>
>  
> The following query will perform the export:
> {code}
> http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
> {code}
> Initial patch supports export of four data-types:
> 1) Single value trie int, long and float
> 2) Binary doc values.
> The numerics are currently exported from the FieldCache and the Binary doc 
> values can be in memory or on disk.
> Since this is designed to export very large result sets efficiently, stored 
> fields are not used for the export.




Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Robert Muir
I probably would have still screwed it up... I just saw what looked
like a build file along some source folders, and "java" and "test"
still to me look like they would hold source code.

Really I should have said "WTF: why is there source code here at all"
but that didn't click either.

I don't think normally someone would use codecs/ as a "template"
though. I just thought "who has resources" and I knew this guy had
META-INF for the codec stuff so it was the victim I picked.

On Tue, Sep 17, 2013 at 12:15 PM, Steve Rowe  wrote:
> Actually, now that I think about it, locating the submodules under src/java/ 
> and src/test/ is a vestige from when the interpolated pom.xml files went into 
> the source directories, and so had to use the existing structure.
>
> Under the top-level maven-build/, where the interpolated pom.xml files are 
> placed these days, the structure could be anything we want - the src/ 
> directory could just go away:
>
> maven/
> +--lucene/
>+--codecs/
>   +--pom.xml.template <-- aggregator POM
>   +--java/
>   |  +--pom.xml.template  <-- compile-phase POM
>   +--test/
>  +--pom.xml.template  <-- test-phase POM
>
> Robert, do you think the above structure would have provided the "nested 
> modules here" cue?
>
> Steve
>
> On Sep 17, 2013, at 2:58 PM, Steve Rowe  wrote:
>
>> Yeah, this configuration style is definitely a kludge around this Maven 
>> shortcoming.
>>
>> About submodules, though, you can see this in lucene/codecs/pom.xml.template 
>> aggregation POM - this is how maven specifies submodules (rather than via 
>> implicit recursive directories; submodules can be physically located in 
>> sibling directories):
>>
>>  <modules>
>>    <module>src/java</module>
>>    <module>src/test</module>
>>  </modules>
>>
>> Probably should be a comment in each of the three pom.xml.template files for 
>> each module: in the aggregation POM, the src/java POM and the src/test POM.
>>
>> On Sep 17, 2013, at 2:48 PM, Robert Muir  wrote:
>>
>>> yes a comment like that is good. I think what confused me was maybe
>>> the fact the submodules are under "src".
>>>
>>> if you look at this, you don't think there are any nested modules.
>>>
>>> [mac:maven/lucene/codecs] rmuir% ls -la
>>> total 8
>>> drwxr-xr-x   4 rmuir  staff   136 Oct 27  2012 .
>>> drwxr-xr-x  22 rmuir  staff   748 Jul 13 09:48 ..
>>> -rw-r--r--   1 rmuir  staff  1794 Sep  9  2012 pom.xml.template
>>> drwxr-xr-x   4 rmuir  staff   136 Oct 27  2012 src
>>>
>>> On Tue, Sep 17, 2013 at 2:45 PM, Steve Rowe  wrote:
 +1 to add this comment to the pom.xml.template files for lucene-codec (and 
 lucene-core, solr-core and solr-solrj), but the "dark magic" isn't 
 necessary for Ant, since it's more flexible than Maven, so I don't think 
 any mention needs to be made of this in build.xml files.

 Maven considers dependencies to be cyclic even if they are used in 
 different build phases - e.g. lucene-test-framework has a compile-phase 
 dependency on lucene-codecs, and lucene-codecs has a test-phase dependency 
 on lucene-test-framework.

 On Sep 17, 2013, at 2:29 PM, Chris Hostetter  
 wrote:
> : Some background on the lucene-codec module's complex maven config
> : (lucene-core, solr-core, and solr-solrj all have the same setup):
> :  - the two attached
> : images show the cyclic dependency situation involving the test-framework
> : modules before[1] and after[2] the maven config complexification.
>
> perhaps we need a comment in the lucene-codec pom.xml/build.xml files...
>
> There is dark magic here, do not use this file as a template,
>  or inspiration, when adding new modules...
> https://issues.apache.org/jira/browse/LUCENE-4365
>
>
> -Hoss


 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org

>>>
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769812#comment-13769812
 ] 

ASF subversion and git services commented on SOLR-4816:
---

Commit 1524176 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1524176 ]

SOLR-4816: add missing CHANGES credit

> Add document routing to CloudSolrServer
> ---
>
> Key: SOLR-4816
> URL: https://issues.apache.org/jira/browse/SOLR-4816
> Project: Solr
>  Issue Type: Improvement
>  Components: SolrCloud
>Affects Versions: 4.3
>Reporter: Joel Bernstein
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 4.5, 5.0
>
> Attachments: RequestTask-removal.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, 
> SOLR-4816.patch, SOLR-4816.patch, SOLR-4816.patch, SOLR-4816-sriesenberg.patch
>
>
> This issue adds the following enhancements to CloudSolrServer's update logic:
> 1) Document routing: updates are routed directly to the correct shard leader, 
> eliminating document routing at the server.
> 2) Optional parallel update execution: updates for each shard are executed in 
> a separate thread, so parallel indexing can occur across the cluster.
> These enhancements should allow for near-linear scalability of indexing 
> throughput.
> Usage:
> {code}
> CloudSolrServer cloudClient = new CloudSolrServer(zkAddress);
> cloudClient.setParallelUpdates(true);
> SolrInputDocument doc1 = new SolrInputDocument();
> doc1.addField("id", "0");
> doc1.addField("a_t", "hello1");
> SolrInputDocument doc2 = new SolrInputDocument();
> doc2.addField("id", "2");
> doc2.addField("a_t", "hello2");
> UpdateRequest request = new UpdateRequest();
> request.add(doc1);
> request.add(doc2);
> request.setAction(AbstractUpdateRequest.ACTION.OPTIMIZE, false, false);
> NamedList response = cloudClient.request(request); // returns a backwards-compatible condensed response
> // To get a more detailed response, downcast to RouteResponse:
> CloudSolrServer.RouteResponse rr = (CloudSolrServer.RouteResponse) response;
> {code}
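The routing-plus-parallel-dispatch idea described above can be sketched with plain JDK types. This is only an illustration of the technique, not CloudSolrServer's actual implementation: the class name, `route`, and `partition` are hypothetical, and sending a batch to a shard leader is stood in for by a trivial task.

```java
import java.util.*;
import java.util.concurrent.*;

public class RoutedParallelUpdates {
    // Stable hash of the document id picks the target shard.
    static int route(String id, int numShards) {
        return Math.floorMod(id.hashCode(), numShards);
    }

    // Group document ids into one batch per shard.
    static Map<Integer, List<String>> partition(List<String> ids, int numShards) {
        Map<Integer, List<String>> batches = new HashMap<>();
        for (String id : ids) {
            batches.computeIfAbsent(route(id, numShards), k -> new ArrayList<>()).add(id);
        }
        return batches;
    }

    public static void main(String[] args) throws Exception {
        int numShards = 2;
        Map<Integer, List<String>> batches =
                partition(Arrays.asList("0", "1", "2", "3"), numShards);

        // One task per shard, so batches are indexed in parallel.
        ExecutorService pool = Executors.newFixedThreadPool(numShards);
        List<Future<Integer>> futures = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> e : batches.entrySet()) {
            // Stand-in for "send this batch to the shard leader".
            futures.add(pool.submit(() -> e.getValue().size()));
        }
        int sent = 0;
        for (Future<Integer> f : futures) sent += f.get();
        pool.shutdown();
        System.out.println(sent); // prints 4: every document went to exactly one shard
    }
}
```

Because each document id hashes to exactly one shard, the per-shard batches form a partition of the input, which is what lets the sends proceed independently.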

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769801#comment-13769801
 ] 

ASF subversion and git services commented on SOLR-4816:
---

Commit 1524171 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1524171 ]

SOLR-4816: deal with leader=null case and init map with known size




[jira] [Created] (SOLR-5244) Export search results

2013-09-17 Thread Joel Bernstein (JIRA)
Joel Bernstein created SOLR-5244:


 Summary: Export search results
 Key: SOLR-5244
 URL: https://issues.apache.org/jira/browse/SOLR-5244
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 5.0


It would be great if Solr could efficiently export entire search result sets 
without scoring or ranking documents. This would allow external systems to 
perform rapid bulk imports from Solr. It also provides a possible platform for 
exporting results to support distributed join scenarios within Solr.

This ticket provides a patch that has two pluggable components:

1) ExportQParserPlugin: a post filter that gathers a BitSet of document results 
and does not delegate to ranking collectors. Instead, it puts the BitSet on the 
request context.

2) BinaryExportWriter: an output writer that iterates the BitSet and writes the 
entire result as a binary stream. A header at the beginning of the stream lets 
external clients self-configure.

Note:
These two components are sufficient for a non-distributed environment. For 
distributed export, a new request handler will need to be developed.

After applying the patch and building the dist or example, you can register the 
components through the following changes to solrconfig.xml

Register export contrib libraries:


 
Register the "export" queryParser with the following line:
 

 
Register the "xbin" writer:
 

 
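The actual registration snippets did not survive this mail archive (the XML was stripped), so the following is only a guess at what they might look like. The lib paths and class names here are hypothetical; the real ones are in the SOLR-5244 patch.

```xml
<!-- hypothetical sketch; check the SOLR-5244 patch for the real paths and classes -->
<lib dir="../../../contrib/export/lib" regex=".*\.jar" />

<queryParser name="export"
             class="org.apache.solr.search.ExportQParserPlugin" />

<queryResponseWriter name="xbin"
                     class="org.apache.solr.response.BinaryExportWriter" />
```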
The following query will perform the export:
{code}
http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
{code}

The initial patch supports export of four data types:

1) Single-valued trie int, long, and float fields.
2) Binary doc values.

The numerics are currently exported from the FieldCache and the Binary doc 
values can be in memory or on disk.

Since this is designed to export very large result sets efficiently, stored 
fields are not used for the export.
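The export flow above can be sketched with plain JDK types: iterate a BitSet of matching docids in order and write one field value per doc to a binary stream, preceded by a small header so a client can self-configure. This mirrors the idea only; the class and method names are hypothetical, not the BinaryExportWriter from the patch.

```java
import java.io.*;
import java.util.BitSet;

public class BitSetExport {
    // Write a tiny header (doc count) followed by one int field value
    // per matching doc, pulled by docid rather than from stored fields.
    static byte[] export(BitSet matches, int[] fieldValues) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(matches.cardinality()); // header: how many docs follow
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            out.writeInt(fieldValues[doc]);  // field value for this docid
        }
        out.flush();
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        BitSet matches = new BitSet();
        matches.set(0);
        matches.set(2);
        byte[] data = export(matches, new int[] {10, 20, 30});
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(in.readInt()); // prints 2  (header: doc count)
        System.out.println(in.readInt()); // prints 10 (value for doc 0)
        System.out.println(in.readInt()); // prints 30 (value for doc 2)
    }
}
```

Because only the BitSet and a columnar field source are consulted, no documents are scored, ranked, or fetched from stored fields, which is what makes full-result-set export cheap.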







Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Robert Muir
yes a comment like that is good. i think what confused me was maybe
the fact the submodules are under "src".

if you look at this, you dont think there is any nested modules.

[mac:maven/lucene/codecs] rmuir% ls -la
total 8
drwxr-xr-x   4 rmuir  staff   136 Oct 27  2012 .
drwxr-xr-x  22 rmuir  staff   748 Jul 13 09:48 ..
-rw-r--r--   1 rmuir  staff  1794 Sep  9  2012 pom.xml.template
drwxr-xr-x   4 rmuir  staff   136 Oct 27  2012 src

On Tue, Sep 17, 2013 at 2:45 PM, Steve Rowe  wrote:
> +1 to add this comment to the pom.xml.template files for lucene-codec (and 
> lucene-core, solr-core and solr-solrj), but the "dark magic" isn't necessary 
> for Ant, since it's more flexible than Maven, so I don't think any mention 
> needs to be made of this in build.xml files.
>
> Maven considers dependencies to be cyclic even if they are used in different 
> build phases - e.g. lucene-test-framework has a compile-phase dependency on 
> lucene-codecs, and lucene-codecs has a test-phase dependency on 
> lucene-test-framework.
>
> On Sep 17, 2013, at 2:29 PM, Chris Hostetter  wrote:
>> : Some background on the lucene-codec module's complex maven config
>> : (lucene-core, solr-core, and solr-solrj all have the same setup):
>> :  - the two attached
>> : images show the cyclic dependency situation involving the test-framework
>> : modules before[1] and after[2] the maven config complexification.
>>
>> perhaps we need a comment in the lucene-codec pom.xml/build.xml files...
>>
>>   There is dark magic here, do not use this file as a template,
>>or inspiration, when adding new modules...
>>   https://issues.apache.org/jira/browse/LUCENE-4365
>>
>>
>> -Hoss
>
>




Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Steve Rowe
Actually, now that I think about it, locating the submodules under src/java/ 
and src/test/ is a vestige from when the interpolated pom.xml files went into 
the source directories, and so had to use the existing structure. 

Under the top-level maven-build/, where the interpolated pom.xml files are 
placed these days, the structure could be anything we want - the src/ directory 
could just go away:

maven/
+--lucene/
   +--codecs/
      +--pom.xml.template     <-- aggregator POM
      +--java/
      |  +--pom.xml.template  <-- compile-phase POM
      +--test/
         +--pom.xml.template  <-- test-phase POM

Robert, do you think the above structure would have provided the "nested 
modules here" cue?

Steve

On Sep 17, 2013, at 2:58 PM, Steve Rowe  wrote:

> Yeah, this configuration style is definitely a kludge around this Maven 
> shortcoming.  
> 
> About submodules, though, you can see this in lucene/codecs/pom.xml.template 
> aggregation POM - this is how maven specifies submodules (rather than via 
> implicit recursive directories; submodules can be physically located in 
> sibling directories):
> 
>   <modules>
>     <module>src/java</module>
>     <module>src/test</module>
>   </modules>
> 
> Probably should be a comment in each of the three pom.xml.template files for 
> each module: in the aggregation POM, the src/java POM and the src/test POM.
> 
> On Sep 17, 2013, at 2:48 PM, Robert Muir  wrote:
> 
>> yes a comment like that is good. i think what confused me was maybe
>> the fact the submodules are under "src".
>> 
>> if you look at this, you dont think there is any nested modules.
>> 
>> [mac:maven/lucene/codecs] rmuir% ls -la
>> total 8
>> drwxr-xr-x   4 rmuir  staff   136 Oct 27  2012 .
>> drwxr-xr-x  22 rmuir  staff   748 Jul 13 09:48 ..
>> -rw-r--r--   1 rmuir  staff  1794 Sep  9  2012 pom.xml.template
>> drwxr-xr-x   4 rmuir  staff   136 Oct 27  2012 src
>> 
>> On Tue, Sep 17, 2013 at 2:45 PM, Steve Rowe  wrote:
>>> +1 to add this comment to the pom.xml.template files for lucene-codec (and 
>>> lucene-core, solr-core and solr-solrj), but the "dark magic" isn't 
>>> necessary for Ant, since it's more flexible than Maven, so I don't think 
>>> any mention needs to be made of this in build.xml files.
>>> 
>>> Maven considers dependencies to be cyclic even if they are used in 
>>> different build phases - e.g. lucene-test-framework has a compile-phase 
>>> dependency on lucene-codecs, and lucene-codecs has a test-phase dependency 
>>> on lucene-test-framework.
>>> 
>>> On Sep 17, 2013, at 2:29 PM, Chris Hostetter  
>>> wrote:
 : Some background on the lucene-codec module's complex maven config
 : (lucene-core, solr-core, and solr-solrj all have the same setup):
 :  - the two attached
 : images show the cyclic dependency situation involving the test-framework
 : modules before[1] and after[2] the maven config complexification.
 
 perhaps we need a comment in the lucene-codec pom.xml/build.xml files...
 
 There is dark magic here, do not use this file as a template,
  or inspiration, when adding new modules...
 https://issues.apache.org/jira/browse/LUCENE-4365
 
 
 -Hoss
>>> 
>>> 
>> 
> 





[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-17 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769813#comment-13769813
 ] 

Markus Jelsma commented on SOLR-4816:
-

Mark, is it still enabled the same way as Joel's original patches?




Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Steve Rowe
+1 to add this comment to the pom.xml.template files for lucene-codec (and 
lucene-core, solr-core and solr-solrj), but the "dark magic" isn't necessary 
for Ant, since it's more flexible than Maven, so I don't think any mention 
needs to be made of this in build.xml files.

Maven considers dependencies to be cyclic even if they are used in different 
build phases - e.g. lucene-test-framework has a compile-phase dependency on 
lucene-codecs, and lucene-codecs has a test-phase dependency on 
lucene-test-framework.

On Sep 17, 2013, at 2:29 PM, Chris Hostetter  wrote:
> : Some background on the lucene-codec module's complex maven config 
> : (lucene-core, solr-core, and solr-solrj all have the same setup): 
> :  - the two attached 
> : images show the cyclic dependency situation involving the test-framework 
> : modules before[1] and after[2] the maven config complexification.
> 
> perhaps we need a comment in the lucene-codec pom.xml/build.xml files...
> 
>   There is dark magic here, do not use this file as a template, 
>or inspiration, when adding new modules...
>   https://issues.apache.org/jira/browse/LUCENE-4365
> 
> 
> -Hoss





[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769806#comment-13769806
 ] 

Mark Miller commented on SOLR-4816:
---

Thanks Joel and Thanks Shikhar!




[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769799#comment-13769799
 ] 

ASF subversion and git services commented on SOLR-4816:
---

Commit 1524170 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1524170 ]

SOLR-4816: deal with leader=null case and init map with known size




[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-17 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769805#comment-13769805
 ] 

Mark Miller commented on SOLR-4816:
---

That should address some of the recent jenkins fails this has caused and 
addresses Shikhar's last comment.




[jira] [Commented] (SOLR-4816) Add document routing to CloudSolrServer

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769803#comment-13769803
 ] 

ASF subversion and git services commented on SOLR-4816:
---

Commit 1524174 from [~markrmil...@gmail.com] in branch 
'dev/branches/lucene_solr_4_5'
[ https://svn.apache.org/r1524174 ]

SOLR-4816: deal with leader=null case and init map with known size




[jira] [Created] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Yonik Seeley (JIRA)
Yonik Seeley created SOLR-5243:
--

 Summary: killing a shard in one collection can result in leader 
election in a different collection
 Key: SOLR-5243
 URL: https://issues.apache.org/jira/browse/SOLR-5243
 Project: Solr
  Issue Type: Bug
Reporter: Yonik Seeley
Priority: Blocker
 Fix For: 4.5, 5.0


Discovered while doing some more ad-hoc testing... if I create two collections 
with the same shard name and then kill the leader in one, it can sometimes 
cause a leader election in the other (leaving the first leaderless).




[jira] [Commented] (SOLR-5243) killing a shard in one collection can result in leader election in a different collection

2013-09-17 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769807#comment-13769807
 ] 

Yonik Seeley commented on SOLR-5243:


Steps to reproduce:

{code}
#Bring up 2 nodes
cp -rp example example2
cd example
java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myConf 
-DzkRun -DnumShards=2 -jar start.jar

cd example2
java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar

#if both leaders aren't on port 8983, kill example2 and then bring it back up.

#look up the core name for the c2/s1 leader and unload it
curl "http://localhost:8983/solr/admin/cores?action=UNLOAD&core=c2_s1_replica2";

# now see two things:
# 1) c2/s1 is now leaderless
# 2) The leader of c3/s1 has switched to port 7574
# from the logs on port 7574 we can see that leader election was kicked off for 
the wrong collection...

102432 [main-EventThread] INFO  
org.apache.solr.cloud.ShardLeaderElectionContext  – Running the leader process 
for shard s1
102484 [main-EventThread] INFO  
org.apache.solr.cloud.ShardLeaderElectionContext  – Checking if I should try 
and be the leader.
102484 [main-EventThread] INFO  
org.apache.solr.cloud.ShardLeaderElectionContext  – My last published State was 
Active, it's okay to be the leader.
102484 [main-EventThread] INFO  
org.apache.solr.cloud.ShardLeaderElectionContext  – I may be the new leader - 
try and sync
102485 [main-EventThread] INFO  org.apache.solr.cloud.SyncStrategy  – Sync 
replicas to http://192.168.1.104:7574/solr/c3_s1_replica2/

{code}





Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Chris Hostetter

: Some background on the lucene-codec module's complex maven config 
: (lucene-core, solr-core, and solr-solrj all have the same setup): 
:  - the two attached 
: images show the cyclic dependency situation involving the test-framework 
: modules before[1] and after[2] the maven config complexification.

perhaps we need a comment in the lucene-codec pom.xml/build.xml files...

There is dark magic here, do not use this file as a template, 
or inspiration, when adding new modules...
   https://issues.apache.org/jira/browse/LUCENE-4365


-Hoss




Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Steve Rowe
Thanks Robert.

Some background on the lucene-codec module's complex maven config (lucene-core, 
solr-core, and solr-solrj all have the same setup): 
 - the two attached images 
show the cyclic dependency situation involving the test-framework modules 
before[1] and after[2] the maven config complexification.

Steve

[1] 

[2] 


On Sep 17, 2013, at 1:23 PM, Robert Muir  wrote:

> OK I see it. My problem was i looked at codecs/ for an example, and i
> saw it didnt have any resources/ in its config, so i thought it was
> unnecessary.
> 
> but this was a bad module for me to look at, because its more
> complicated (it makes separate src/test submodules probably due to the
> way its used in tests).
> 
> Ill fix this
> 
> On Tue, Sep 17, 2013 at 10:17 AM, Robert Muir  wrote:
>> Looks like the maven configuration is not pulling in src/resources?
>> 
>> I don't know whats wrong with the pom.template...
>> 
>> On Tue, Sep 17, 2013 at 9:56 AM, Apache Jenkins Server
>>  wrote:
>>> Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/450/
>>> 
>>> 83 tests failed.
>>> FAILED:  
>>> org.apache.lucene.expressions.TestDemoExpressions.testLotsOfBindings
>>> 
>>> Error Message:
>>> null
>>> 
>>> Stack Trace:
>>> java.lang.ExceptionInInitializerError: null
>>>at java.io.Reader.<init>(Reader.java:78)
>>>at java.io.InputStreamReader.<init>(InputStreamReader.java:129)
>>>at org.apache.lucene.util.IOUtils.getDecodingReader(IOUtils.java:268)
>>>at org.apache.lucene.util.IOUtils.getDecodingReader(IOUtils.java:319)
>>>at 
>>> org.apache.lucene.expressions.js.JavascriptCompiler.<clinit>(JavascriptCompiler.java:512)
>>>at 
>>> org.apache.lucene.expressions.TestDemoExpressions.doTestLotsOfBindings(TestDemoExpressions.java:174)
>>>at 
>>> org.apache.lucene.expressions.TestDemoExpressions.testLotsOfBindings(TestDemoExpressions.java:156)
>>> 
> 
> 





[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/450/

83 tests failed.
FAILED:  org.apache.lucene.expressions.TestDemoExpressions.testLotsOfBindings

Error Message:
null

Stack Trace:
java.lang.ExceptionInInitializerError: null
at java.io.Reader.<init>(Reader.java:78)
at java.io.InputStreamReader.<init>(InputStreamReader.java:129)
at org.apache.lucene.util.IOUtils.getDecodingReader(IOUtils.java:268)
at org.apache.lucene.util.IOUtils.getDecodingReader(IOUtils.java:319)
at 
org.apache.lucene.expressions.js.JavascriptCompiler.<clinit>(JavascriptCompiler.java:512)
at 
org.apache.lucene.expressions.TestDemoExpressions.doTestLotsOfBindings(TestDemoExpressions.java:174)
at 
org.apache.lucene.expressions.TestDemoExpressions.testLotsOfBindings(TestDemoExpressions.java:156)


FAILED:  org.apache.lucene.expressions.TestDemoExpressions.testSortValues

Error Message:
Could not initialize class org.apache.lucene.expressions.js.JavascriptCompiler

Stack Trace:
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.lucene.expressions.js.JavascriptCompiler
at 
__randomizedtesting.SeedInfo.seed([25464DD2FE7F3FEA:3B2605A00EEC481]:0)
at 
org.apache.lucene.expressions.TestDemoExpressions.testSortValues(TestDemoExpressions.java:100)


FAILED:  org.apache.lucene.expressions.TestDemoExpressions.test

Error Message:
Could not initialize class org.apache.lucene.expressions.js.JavascriptCompiler

Stack Trace:
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.lucene.expressions.js.JavascriptCompiler
at 
__randomizedtesting.SeedInfo.seed([25464DD2FE7F3FEA:AD12720850835212]:0)
at 
org.apache.lucene.expressions.TestDemoExpressions.test(TestDemoExpressions.java:85)


FAILED:  
org.apache.lucene.expressions.TestDemoExpressions.testExpressionRefersToExpression

Error Message:
Could not initialize class org.apache.lucene.expressions.js.JavascriptCompiler

Stack Trace:
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.lucene.expressions.js.JavascriptCompiler
at 
__randomizedtesting.SeedInfo.seed([25464DD2FE7F3FEA:14448966FC3B3C4C]:0)
at 
org.apache.lucene.expressions.TestDemoExpressions.testExpressionRefersToExpression(TestDemoExpressions.java:136)


FAILED:  org.apache.lucene.expressions.TestDemoExpressions.testTwoOfSameBinding

Error Message:
Could not initialize class org.apache.lucene.expressions.js.JavascriptCompiler

Stack Trace:
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.lucene.expressions.js.JavascriptCompiler
at 
__randomizedtesting.SeedInfo.seed([25464DD2FE7F3FEA:DD68AE731CD77FC2]:0)
at 
org.apache.lucene.expressions.TestDemoExpressions.testTwoOfSameBinding(TestDemoExpressions.java:118)


FAILED:  org.apache.lucene.expressions.TestExpressionSorts.testQueries

Error Message:
Could not initialize class org.apache.lucene.expressions.js.JavascriptCompiler

Stack Trace:
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.lucene.expressions.js.JavascriptCompiler
at 
__randomizedtesting.SeedInfo.seed([94300E0855732A53:C8BEC2D34F1A9FFD]:0)
at 
org.apache.lucene.expressions.TestExpressionSorts.assertQuery(TestExpressionSorts.java:146)
at 
org.apache.lucene.expressions.TestExpressionSorts.assertQuery(TestExpressionSorts.java:130)
at 
org.apache.lucene.expressions.TestExpressionSorts.testQueries(TestExpressionSorts.java:101)


FAILED:  
org.apache.lucene.expressions.TestExpressionValidation.testValidExternals

Error Message:
Could not initialize class org.apache.lucene.expressions.js.JavascriptCompiler

Stack Trace:
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.lucene.expressions.js.JavascriptCompiler
at 
__randomizedtesting.SeedInfo.seed([4914F5371C97A83A:EE4F8D3D609326F8]:0)
at 
org.apache.lucene.expressions.TestExpressionValidation.testValidExternals(TestExpressionValidation.java:33)


FAILED:  org.apache.lucene.expressions.TestExpressionValidation.testCoRecursion

Error Message:
Could not initialize class org.apache.lucene.expressions.js.JavascriptCompiler

Stack Trace:
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.lucene.expressions.js.JavascriptCompiler
at 
__randomizedtesting.SeedInfo.seed([4914F5371C97A83A:6EA4F533DF48064E]:0)
at 
org.apache.lucene.expressions.TestExpressionValidation.testCoRecursion(TestExpressionValidation.java:78)


FAILED:  
org.apache.lucene.expressions.TestExpressionValidation.testInvalidExternal2

Error Message:
Could not initialize class org.apache.lucene.expressions.js.JavascriptCompiler

Stack Trace:
java.lang.NoClassDefFoundError: Could not initialize class 
org.apache.lucene.expressions.js.JavascriptCompiler
at 
__randomizedtesting.SeedInfo.seed([4914F5371C97A83A:1E9C19B102705B45]:0)
at 
org.apache.lucene.expressions.TestExpressionValidation.testInvalidExternal2(Test

[jira] [Commented] (SOLR-4221) Custom sharding

2013-09-17 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769675#comment-13769675
 ] 

Yonik Seeley commented on SOLR-4221:


Hmmm, it feels like the "routeField" parameter should be scoped under "router" 
somehow... I think there will be additional parameters to configure a router in 
the future (such as number of bits to allocate to parts of the compositeId 
router, etc), as well as custom routers, where their configuration could 
include additional parameters that would best be scoped.

router.routeField?


> Custom sharding
> ---
>
> Key: SOLR-4221
> URL: https://issues.apache.org/jira/browse/SOLR-4221
> Project: Solr
>  Issue Type: New Feature
>Reporter: Yonik Seeley
>Assignee: Noble Paul
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4221.patch, SOLR-4221.patch, SOLR-4221.patch, 
> SOLR-4221.patch, SOLR-4221.patch
>
>
> Features to let users control everything about sharding/routing.




[jira] [Comment Edited] (SOLR-4221) Custom sharding

2013-09-17 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769675#comment-13769675
 ] 

Yonik Seeley edited comment on SOLR-4221 at 9/17/13 4:58 PM:
-

Hmmm, it feels like the "routeField" parameter should be scoped under "router" 
somehow... I think there will be additional parameters to configure a router in 
the future (such as number of bits to allocate to parts of the compositeId 
router, etc), as well as custom routers, where their configuration could 
include additional parameters that would best be scoped.

router.routeField?

As far as persistence, it could be flat, but perhaps nicer to scope that as 
well...

"router" : {"name" : "implicit", "routField" : "companyName"}


  was (Author: ysee...@gmail.com):
Hmmm, it feels like the "routeField" parameter should be scoped under 
"router" somehow... I think there will be additional parameters to configure a 
router in the future (such as number of bits to allocate to parts of the 
compositeId router, etc), as well as custom routers, where their configuration 
could include additional parameters that would best be scoped.

router.routeField?

  
> Custom sharding
> ---
>
> Key: SOLR-4221
> URL: https://issues.apache.org/jira/browse/SOLR-4221
> Project: Solr
>  Issue Type: New Feature
>Reporter: Yonik Seeley
>Assignee: Noble Paul
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4221.patch, SOLR-4221.patch, SOLR-4221.patch, 
> SOLR-4221.patch, SOLR-4221.patch
>
>
> Features to let users control everything about sharding/routing.




[jira] [Commented] (SOLR-4221) Custom sharding

2013-09-17 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769685#comment-13769685
 ] 

Noble Paul commented on SOLR-4221:
--

Makes sense.

When the collection is created, let us accept parameters with the 'router' 
prefix, e.g. router.name=implicit&router.field=somefield. Any value with a 
router prefix will go into the router attribute; we can use it in the future 
for more attributes, and even custom attributes for custom routers, e.g. 
router.class=my.ClassName&router.attr1=x&router.attr2=y

When it is persisted, it can be

router : {"name":"implicit","field":"somefield"}




> Custom sharding
> ---
>
> Key: SOLR-4221
> URL: https://issues.apache.org/jira/browse/SOLR-4221
> Project: Solr
>  Issue Type: New Feature
>Reporter: Yonik Seeley
>Assignee: Noble Paul
> Fix For: 4.5, 5.0
>
> Attachments: SOLR-4221.patch, SOLR-4221.patch, SOLR-4221.patch, 
> SOLR-4221.patch, SOLR-4221.patch
>
>
> Features to let users control everything about sharding/routing.




Documentation on the new compressed DocIdSet implementations

2013-09-17 Thread Smiley, David W.
Lucene has got some new compressed DocIdSet implementations that are 
technically very interesting and exciting: PForDeltaDocIdSet, WAH8DocIdSet, 
EliasFanoDocIdSet, … any more?  Yet it's difficult (at least for me) to 
understand their pros/cons to know when to pick amongst them.  They all seem 
great yet why do we have 3?  Only one is actually used by Lucene itself — 
WAH8DocIdSet in CachingWrapperFilter. Javadocs are hit & miss; the JIRA 
issues have lots of fascinating background, but it's time-consuming to distill.  
I think it would be very useful to summarily document key characteristics on 
class level javadocs — not so much implementation details but information to 
help a user choose it versus another.  And as a bonus a table perhaps showing 
relative performance characteristics in package-level javadocs.

Related to this is, I'm wondering does it make sense for a codec's postings 
(assuming no doc freq & no positions?) to be implemented as a serialized 
version of one of these compressed doc id sets?  I think it would be really 
great, not just for compression but also because it might support 
Terms.advance() since some of these compressed formats have indexes.

~ David
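
As a rough illustration of the word-aligned-hybrid idea behind WAH8DocIdSet (a toy sketch only — this is not Lucene's actual class, encoding, or API): literal 32-bit bitmap words are kept as-is, while runs of all-zero words collapse into a single fill entry carrying the run length, so sparse sets compress well while dense regions stay bitmap-fast.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyWah {
  // Each encoded entry is {type, payload}: type 0 = one literal 32-bit
  // bitmap word, type 1 = run of `payload` consecutive all-zero words.
  public static List<int[]> encode(int[] sortedDocs, int maxDoc) {
    int numWords = (maxDoc + 31) / 32;
    int[] words = new int[numWords];
    for (int d : sortedDocs) words[d >>> 5] |= 1 << (d & 31);
    List<int[]> out = new ArrayList<>();
    int i = 0;
    while (i < numWords) {
      if (words[i] == 0) {
        int run = 0;
        while (i < numWords && words[i] == 0) { run++; i++; }
        out.add(new int[] {1, run});        // collapse the zero run
      } else {
        out.add(new int[] {0, words[i++]}); // keep the literal word
      }
    }
    return out;
  }

  public static boolean contains(List<int[]> enc, int doc) {
    int targetWord = doc >>> 5, word = 0;
    for (int[] e : enc) {
      if (e[0] == 1) {                      // zero run covers e[1] words
        if (targetWord < word + e[1]) return false;
        word += e[1];
      } else {
        if (word == targetWord) return (e[1] & (1 << (doc & 31))) != 0;
        word++;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // 3 docs spread over ~3126 bitmap words encode to just 5 entries.
    List<int[]> enc = encode(new int[] {3, 70, 100000}, 100001);
    System.out.println("entries: " + enc.size()); // prints "entries: 5"
  }
}
```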


Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Robert Muir
OK I see it. My problem was i looked at codecs/ for an example, and i
saw it didnt have any resources/ in its config, so i thought it was
unnecessary.

but this was a bad module for me to look at, because its more
complicated (it makes separate src/test submodules probably due to the
way its used in tests).

Ill fix this

On Tue, Sep 17, 2013 at 10:17 AM, Robert Muir  wrote:
> Looks like the maven configuration is not pulling in src/resources?
>
> I don't know whats wrong with the pom.template...
>
> On Tue, Sep 17, 2013 at 9:56 AM, Apache Jenkins Server
>  wrote:
>> Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/450/
>>
>> 83 tests failed.
>> FAILED:  org.apache.lucene.expressions.TestDemoExpressions.testLotsOfBindings
>>
>> Error Message:
>> null
>>
>> Stack Trace:
>> java.lang.ExceptionInInitializerError: null
>> at java.io.Reader.<init>(Reader.java:78)
>> at java.io.InputStreamReader.<init>(InputStreamReader.java:129)
>> at org.apache.lucene.util.IOUtils.getDecodingReader(IOUtils.java:268)
>> at org.apache.lucene.util.IOUtils.getDecodingReader(IOUtils.java:319)
>> at 
>> org.apache.lucene.expressions.js.JavascriptCompiler.<clinit>(JavascriptCompiler.java:512)
>> at 
>> org.apache.lucene.expressions.TestDemoExpressions.doTestLotsOfBindings(TestDemoExpressions.java:174)
>> at 
>> org.apache.lucene.expressions.TestDemoExpressions.testLotsOfBindings(TestDemoExpressions.java:156)
>>




Re: [JENKINS-MAVEN] Lucene-Solr-Maven-4.x #450: POMs out of sync

2013-09-17 Thread Robert Muir
Looks like the maven configuration is not pulling in src/resources?

I don't know whats wrong with the pom.template...

On Tue, Sep 17, 2013 at 9:56 AM, Apache Jenkins Server
 wrote:
> Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/450/
>
> 83 tests failed.
> FAILED:  org.apache.lucene.expressions.TestDemoExpressions.testLotsOfBindings
>
> Error Message:
> null
>
> Stack Trace:
> java.lang.ExceptionInInitializerError: null
> at java.io.Reader.<init>(Reader.java:78)
> at java.io.InputStreamReader.<init>(InputStreamReader.java:129)
> at org.apache.lucene.util.IOUtils.getDecodingReader(IOUtils.java:268)
> at org.apache.lucene.util.IOUtils.getDecodingReader(IOUtils.java:319)
> at 
> org.apache.lucene.expressions.js.JavascriptCompiler.<clinit>(JavascriptCompiler.java:512)
> at 
> org.apache.lucene.expressions.TestDemoExpressions.doTestLotsOfBindings(TestDemoExpressions.java:174)
> at 
> org.apache.lucene.expressions.TestDemoExpressions.testLotsOfBindings(TestDemoExpressions.java:156)
>




[jira] [Commented] (LUCENE-5180) ShingleFilter should make shingles from trailing holes

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769656#comment-13769656
 ] 

ASF subversion and git services commented on LUCENE-5180:
-

Commit 1524120 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1524120 ]

LUCENE-5180: ShingleFilter creates shingles from trailing holes

> ShingleFilter should make shingles from trailing holes
> --
>
> Key: LUCENE-5180
> URL: https://issues.apache.org/jira/browse/LUCENE-5180
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 5.0, 4.6
>
> Attachments: LUCENE-5180.patch, LUCENE-5180.patch
>
>
> When ShingleFilter hits a hole, it uses _ as the token, e.g. bigrams for "the 
> dog barked", if you have a StopFilter removing the, would be: "_ dog", "dog 
> barked".
> But if the input ends with a stopword, e.g. "wizard of", ShingleFilter fails 
> to produce "wizard _" due to LUCENE-3849 ... once we fix that I think we 
> should fix ShingleFilter to make shingles for trailing holes too ...

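The hole behavior described above can be sketched with a toy bigram shingler (editor's illustration, not Lucene's ShingleFilter): a null token stands for the position hole left by a removed stopword and is emitted as "_" inside a shingle, including the trailing-hole case ["wizard", hole] -> "wizard _".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ToyShingles {
  // Build bigrams over a token stream; null models a position hole.
  public static List<String> bigrams(List<String> tokens) {
    List<String> out = new ArrayList<>();
    for (int i = 0; i + 1 < tokens.size(); i++) {
      String a = tokens.get(i) == null ? "_" : tokens.get(i);
      String b = tokens.get(i + 1) == null ? "_" : tokens.get(i + 1);
      out.add(a + " " + b);
    }
    return out;
  }

  public static void main(String[] args) {
    // "the dog barked" with "the" stopped out: leading hole
    System.out.println(bigrams(Arrays.asList((String) null, "dog", "barked")));
    // -> [_ dog, dog barked]
    // "wizard of" with "of" stopped out: trailing hole
    System.out.println(bigrams(Arrays.asList("wizard", (String) null)));
    // -> [wizard _]
  }
}
```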



[jira] [Resolved] (LUCENE-5180) ShingleFilter should make shingles from trailing holes

2013-09-17 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless resolved LUCENE-5180.


Resolution: Fixed

> ShingleFilter should make shingles from trailing holes
> --
>
> Key: LUCENE-5180
> URL: https://issues.apache.org/jira/browse/LUCENE-5180
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 5.0, 4.6
>
> Attachments: LUCENE-5180.patch, LUCENE-5180.patch
>
>
> When ShingleFilter hits a hole, it uses _ as the token, e.g. bigrams for "the 
> dog barked", if you have a StopFilter removing the, would be: "_ dog", "dog 
> barked".
> But if the input ends with a stopword, e.g. "wizard of", ShingleFilter fails 
> to produce "wizard _" due to LUCENE-3849 ... once we fix that I think we 
> should fix ShingleFilter to make shingles for trailing holes too ...




[jira] [Commented] (LUCENE-5180) ShingleFilter should make shingles from trailing holes

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769657#comment-13769657
 ] 

ASF subversion and git services commented on LUCENE-5180:
-

Commit 1524122 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1524122 ]

LUCENE-5180: move CHANGES entry

> ShingleFilter should make shingles from trailing holes
> --
>
> Key: LUCENE-5180
> URL: https://issues.apache.org/jira/browse/LUCENE-5180
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 5.0, 4.6
>
> Attachments: LUCENE-5180.patch, LUCENE-5180.patch
>
>
> When ShingleFilter hits a hole, it uses _ as the token, e.g. bigrams for "the 
> dog barked", if you have a StopFilter removing the, would be: "_ dog", "dog 
> barked".
> But if the input ends with a stopword, e.g. "wizard of", ShingleFilter fails 
> to produce "wizard _" due to LUCENE-3849 ... once we fix that I think we 
> should fix ShingleFilter to make shingles for trailing holes too ...




[jira] [Commented] (LUCENE-5180) ShingleFilter should make shingles from trailing holes

2013-09-17 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769655#comment-13769655
 ] 

ASF subversion and git services commented on LUCENE-5180:
-

Commit 1524117 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1524117 ]

LUCENE-5180: ShingleFilter creates shingles from trailing holes

> ShingleFilter should make shingles from trailing holes
> --
>
> Key: LUCENE-5180
> URL: https://issues.apache.org/jira/browse/LUCENE-5180
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: modules/analysis
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: 5.0, 4.6
>
> Attachments: LUCENE-5180.patch, LUCENE-5180.patch
>
>
> When ShingleFilter hits a hole, it uses _ as the token, e.g. bigrams for "the 
> dog barked", if you have a StopFilter removing the, would be: "_ dog", "dog 
> barked".
> But if the input ends with a stopword, e.g. "wizard of", ShingleFilter fails 
> to produce "wizard _" due to LUCENE-3849 ... once we fix that I think we 
> should fix ShingleFilter to make shingles for trailing holes too ...




[jira] [Updated] (LUCENE-5221) SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity

2013-09-17 Thread Yubin Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yubin Kim updated LUCENE-5221:
--

Description: 
{{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should be 
encoded in the same way as {{TFIDFSimilarity}}. However, when 
{{discountOverlaps}} is {{false}}, what gets encoded is 
{{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
rather than  {{SmallFloat.floatToByte315((boost / (float) 
Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
{{SimilarityBase.computeNorm}}: 

@Override
public long computeNorm(FieldInvertState state) {
  final float numTerms;
  if (discountOverlaps)
numTerms = state.getLength() - state.getNumOverlap();
  else
numTerms = state.getLength() / state.getBoost();
  return encodeNormValue(state.getBoost(), numTerms);
}

  was:
{{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should be 
encoded in the same way as {{TFIDFSimilarity}}. However, when 
{{discountOverlaps}} is {{false}}, what gets encoded is 
{{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
rather than  {{SmallFloat.floatToByte315((boost / (float) 
Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
{{SimilarityBase.computeNorm}}: 

/** Encodes the document length in the same way as {@link TFIDFSimilarity}. */
@Override
public long computeNorm(FieldInvertState state) {
  final float numTerms;
  if (discountOverlaps)
numTerms = state.getLength() - state.getNumOverlap();
  else
numTerms = state.getLength() / state.getBoost();
  return encodeNormValue(state.getBoost(), numTerms);
}


> SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity
> ---
>
> Key: LUCENE-5221
> URL: https://issues.apache.org/jira/browse/LUCENE-5221
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.4
>Reporter: Yubin Kim
>  Labels: normalize, search, similarity
>
> {{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should 
> be encoded in the same way as {{TFIDFSimilarity}}. However, when 
> {{discountOverlaps}} is {{false}}, what gets encoded is 
> {{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
> rather than  {{SmallFloat.floatToByte315((boost / (float) 
> Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
> {{SimilarityBase.computeNorm}}: 
> @Override
> public long computeNorm(FieldInvertState state) {
>   final float numTerms;
>   if (discountOverlaps)
> numTerms = state.getLength() - state.getNumOverlap();
>   else
> numTerms = state.getLength() / state.getBoost();
>   return encodeNormValue(state.getBoost(), numTerms);
> }




[jira] [Updated] (LUCENE-5221) SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity

2013-09-17 Thread Yubin Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yubin Kim updated LUCENE-5221:
--

Description: 
{{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should be 
encoded in the same way as {{TFIDFSimilarity}}. However, when 
{{discountOverlaps}} is {{false}}, what gets encoded is 
{{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
rather than  {{SmallFloat.floatToByte315((boost / (float) 
Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
{{SimilarityBase.computeNorm}}: 

/** Encodes the document length in the same way as {@link TFIDFSimilarity}. */
@Override
public long computeNorm(FieldInvertState state) {
  final float numTerms;
  if (discountOverlaps)
numTerms = state.getLength() - state.getNumOverlap();
  else
numTerms = state.getLength() / state.getBoost();
  return encodeNormValue(state.getBoost(), numTerms);
}

  was:
{{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should be 
encoded in the same way as {{TFIDFSimilarity}}. However, when 
{{discountOverlaps}} is {{false}}, what gets encoded is 
{{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
rather than  {{SmallFloat.floatToByte315((boost / (float) 
Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
{{SimilarityBase.computeNorm}}: 

  /** Encodes the document length in the same way as {@link TFIDFSimilarity}. */
  @Override
  public long computeNorm(FieldInvertState state) {
final float numTerms;
if (discountOverlaps)
  numTerms = state.getLength() - state.getNumOverlap();
else
  numTerms = state.getLength() / state.getBoost();
return encodeNormValue(state.getBoost(), numTerms);
  }


> SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity
> ---
>
> Key: LUCENE-5221
> URL: https://issues.apache.org/jira/browse/LUCENE-5221
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.4
>Reporter: Yubin Kim
>  Labels: normalize, search, similarity
>
> {{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should 
> be encoded in the same way as {{TFIDFSimilarity}}. However, when 
> {{discountOverlaps}} is {{false}}, what gets encoded is 
> {{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
> rather than  {{SmallFloat.floatToByte315((boost / (float) 
> Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
> {{SimilarityBase.computeNorm}}: 
> /** Encodes the document length in the same way as {@link TFIDFSimilarity}. */
> @Override
> public long computeNorm(FieldInvertState state) {
>   final float numTerms;
>   if (discountOverlaps)
> numTerms = state.getLength() - state.getNumOverlap();
>   else
> numTerms = state.getLength() / state.getBoost();
>   return encodeNormValue(state.getBoost(), numTerms);
> }




[jira] [Updated] (LUCENE-5221) SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity

2013-09-17 Thread Yubin Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yubin Kim updated LUCENE-5221:
--

Description: 
{{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should be 
encoded in the same way as {{TFIDFSimilarity}}. However, when 
{{discountOverlaps}} is {{false}}, what gets encoded is 
{{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
rather than  {{SmallFloat.floatToByte315((boost / (float) 
Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
{{SimilarityBase.computeNorm}}: 

{{
public long computeNorm(FieldInvertState state) {
  final float numTerms;
  if (discountOverlaps)
numTerms = state.getLength() - state.getNumOverlap();
  else
numTerms = state.getLength() / state.getBoost();
  return encodeNormValue(state.getBoost(), numTerms);
}
}}

  was:
{{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should be 
encoded in the same way as {{TFIDFSimilarity}}. However, when 
{{discountOverlaps}} is {{false}}, what gets encoded is 
{{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
rather than  {{SmallFloat.floatToByte315((boost / (float) 
Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
{{SimilarityBase.computeNorm}}: 

@Override
public long computeNorm(FieldInvertState state) {
  final float numTerms;
  if (discountOverlaps)
numTerms = state.getLength() - state.getNumOverlap();
  else
numTerms = state.getLength() / state.getBoost();
  return encodeNormValue(state.getBoost(), numTerms);
}


> SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity
> ---
>
> Key: LUCENE-5221
> URL: https://issues.apache.org/jira/browse/LUCENE-5221
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: core/search
>Affects Versions: 4.4
>Reporter: Yubin Kim
>  Labels: normalize, search, similarity
>
> {{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should 
> be encoded in the same way as {{TFIDFSimilarity}}. However, when 
> {{discountOverlaps}} is {{false}}, what gets encoded is 
> {{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
> rather than  {{SmallFloat.floatToByte315((boost / (float) 
> Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
> {{SimilarityBase.computeNorm}}: 
> {{
> public long computeNorm(FieldInvertState state) {
>   final float numTerms;
>   if (discountOverlaps)
> numTerms = state.getLength() - state.getNumOverlap();
>   else
> numTerms = state.getLength() / state.getBoost();
>   return encodeNormValue(state.getBoost(), numTerms);
> }
> }}




[jira] [Created] (LUCENE-5221) SimilarityBase.computeNorm is inconsistent with TFIDFSimilarity

2013-09-17 Thread Yubin Kim (JIRA)
Yubin Kim created LUCENE-5221:
-

 Summary: SimilarityBase.computeNorm is inconsistent with 
TFIDFSimilarity
 Key: LUCENE-5221
 URL: https://issues.apache.org/jira/browse/LUCENE-5221
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/search
Affects Versions: 4.4
Reporter: Yubin Kim


{{SimilarityBase.computeNorm}} Javadoc indicates that the doc length should be 
encoded in the same way as {{TFIDFSimilarity}}. However, when 
{{discountOverlaps}} is {{false}}, what gets encoded is 
{{SmallFloat.floatToByte315((boost / (float) Math.sqrt(docLen / boost)));}} 
rather than  {{SmallFloat.floatToByte315((boost / (float) 
Math.sqrt(length)));}} due to the extra {{/ state.getBoost()}} term in 
{{SimilarityBase.computeNorm}}: 

  /** Encodes the document length in the same way as {@link TFIDFSimilarity}. */
  @Override
  public long computeNorm(FieldInvertState state) {
final float numTerms;
if (discountOverlaps)
  numTerms = state.getLength() - state.getNumOverlap();
else
  numTerms = state.getLength() / state.getBoost();  // <-- the extra division
return encodeNormValue(state.getBoost(), numTerms);
  }




[jira] [Updated] (LUCENE-5220) Expression SortField has crappy toString/bad equals/hashcode

2013-09-17 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5220:


Attachment: LUCENE-5220.patch

patch

> Expression SortField has crappy toString/bad equals/hashcode
> 
>
> Key: LUCENE-5220
> URL: https://issues.apache.org/jira/browse/LUCENE-5220
> Project: Lucene - Core
>  Issue Type: Bug
>Affects Versions: 4.6
>Reporter: Robert Muir
> Attachments: LUCENE-5220.patch
>
>
> While doing some unrelated debugging:
> I noticed when printing the Sort, the expressions get the inherited toString 
> for a CUSTOM comparator, which is not very good:
> {noformat}
> !
> {noformat}
> I think its better if it looks something like this instead:
> {noformat}
> !
> {noformat}
> Also equals/hashcode is wrong: it will bogusly report equals(true) if the 
> expression is the same: but bindings could be different!
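
The equals/hashCode point generalizes: a value object that omits a field from equals reports two logically different instances as equal, and an inherited Object.toString gives no useful debugging output. A generic sketch of the corrected contract (ExprSort and its fields are hypothetical stand-ins, not Lucene's actual expression SortField API):

```java
import java.util.Objects;

// Sketch of the fix pattern: compare ALL identity-bearing fields in
// equals/hashCode, and override toString with something descriptive.
final class ExprSort {
    final String expression;
    final String bindings;  // stand-in for the variable bindings

    ExprSort(String expression, String bindings) {
        this.expression = expression;
        this.bindings = bindings;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof ExprSort)) return false;
        ExprSort other = (ExprSort) o;
        // Omitting the bindings comparison reproduces the reported bug:
        // same expression, different bindings would bogusly compare equal.
        return expression.equals(other.expression)
            && bindings.equals(other.bindings);
    }

    @Override public int hashCode() {
        return Objects.hash(expression, bindings);
    }

    @Override public String toString() {
        // Descriptive, unlike the inherited default for a CUSTOM comparator.
        return "<expr \"" + expression + "\">";
    }
}
```

The hashCode must be updated in lockstep with equals so equal objects keep equal hashes.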



