[jira] [Commented] (LUCENE-5260) Make older Suggesters more accepting of TermFreqPayloadIterator
[ https://issues.apache.org/jira/browse/LUCENE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790075#comment-13790075 ] Areek Zillur commented on LUCENE-5260: -- Hey Michael, I was thinking about how to nicely replace TermFreqIterator. - I was thinking of having some kind of wrapper for TermFreqPayloadIterator that would nullify the payload field for the current TermFreqIterator consumers, and a way for the wrapper to signal early on to the consumers that they don't need to deal with the payload at all. - Also, it seems like there are a lot of implementations of TermFreqIterator (e.g. BufferedTermFreqIteratorWrapper, SortedTermFreqIteratorWrapper); I will make sure all these implementations work with TermFreqPayloadIterator and its new wrapper (for mimicking TermFreqIterator). Any thoughts? I will try to come up with a rough patch soon. Make older Suggesters more accepting of TermFreqPayloadIterator --- Key: LUCENE-5260 URL: https://issues.apache.org/jira/browse/LUCENE-5260 Project: Lucene - Core Issue Type: Improvement Components: core/search Reporter: Areek Zillur As discussed in https://issues.apache.org/jira/browse/LUCENE-5251, it would be nice to make the older suggesters accept TermFreqPayloadIterator and throw an exception if a payload is found (if it cannot be used). This will also allow us to nuke most of the other interfaces for BytesRefIterator. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
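A rough sketch of the wrapper idea described above, using stand-in types rather than the real Lucene interfaces (the actual TermFreqPayloadIterator API differs, e.g. it works with BytesRef; all names here are illustrative, not the committed API):

```java
// Stand-in for Lucene's TermFreqPayloadIterator; the real interface uses
// BytesRef and differs in detail -- this only illustrates the shape.
interface TermFreqPayloadIterator {
    String next();        // next term, or null when exhausted
    long weight();        // weight of the current term
    byte[] payload();     // payload of the current term, may be null
}

// Wrapper that mimics a payload-free TermFreqIterator: it nullifies the
// payload and signals early (hasPayloads() == false) so consumers never
// need to deal with payloads at all.
class PayloadNullifyingWrapper implements TermFreqPayloadIterator {
    private final TermFreqPayloadIterator delegate;

    PayloadNullifyingWrapper(TermFreqPayloadIterator delegate) {
        this.delegate = delegate;
    }

    boolean hasPayloads() { return false; }  // early signal to consumers

    @Override public String next()    { return delegate.next(); }
    @Override public long weight()    { return delegate.weight(); }
    @Override public byte[] payload() { return null; }  // payload nullified
}
```

Existing TermFreqIterator-style consumers could then be ported to take the payload-capable interface while checking hasPayloads() once up front instead of per entry.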
[jira] [Commented] (LUCENE-5251) New Dictionary Implementation for Suggester consumption
[ https://issues.apache.org/jira/browse/LUCENE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790079#comment-13790079 ] Areek Zillur commented on LUCENE-5251: -- Thanks for committing the patch, Michael! New Dictionary Implementation for Suggester consumption --- Key: LUCENE-5251 URL: https://issues.apache.org/jira/browse/LUCENE-5251 Project: Lucene - Core Issue Type: New Feature Components: core/search Reporter: Areek Zillur Fix For: 5.0, 4.6 Attachments: LUCENE-5251.patch, LUCENE-5251.patch, LUCENE-5251.patch, LUCENE-5251.patch With the vast array of new suggesters, it would be nice to have a dictionary implementation that could feed the suggesters terms, weights and (optionally) payloads from the Lucene index. The idea of this dictionary implementation is to grab stored documents from the index and use user-configured fields for terms, weights and payloads. Use-case: if you have a document with three fields - product_id - product_name - product_popularity_score - then using this implementation would enable you to have a suggester for product_name using the weight of product_popularity_score, and return you the payload of product_id, with which you can do further processing (for example, construct a URL).
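The use-case above can be sketched roughly like this; the documents are modelled as plain maps here instead of a real Lucene index, and all names are illustrative assumptions, not the committed API (real code would iterate stored documents from an IndexReader):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch: pull (term, weight, payload) triples out of stored
// documents, using user-configured field names for each role.
class StoredFieldDictionarySketch {
    static final class Entry {
        final String term; final long weight; final String payload;
        Entry(String term, long weight, String payload) {
            this.term = term; this.weight = weight; this.payload = payload;
        }
    }

    static List<Entry> extract(List<Map<String, String>> docs,
                               String termField, String weightField, String payloadField) {
        List<Entry> out = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            String term = doc.get(termField);
            if (term == null) continue;  // skip docs without the term field
            String w = doc.get(weightField);
            long weight = (w == null) ? 0L : Long.parseLong(w);
            String payload = (payloadField == null) ? null : doc.get(payloadField);
            out.add(new Entry(term, weight, payload));
        }
        return out;
    }
}
```

With the example fields, extract(docs, "product_name", "product_popularity_score", "product_id") yields one suggestion entry per product, whose payload (the product id) can then be used for further processing such as constructing a URL.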
[jira] [Updated] (SOLR-5320) Multi level compositeId router
[ https://issues.apache.org/jira/browse/SOLR-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anshum Gupta updated SOLR-5320: --- Remaining Estimate: 336h Original Estimate: 336h Multi level compositeId router -- Key: SOLR-5320 URL: https://issues.apache.org/jira/browse/SOLR-5320 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: Anshum Gupta Original Estimate: 336h Remaining Estimate: 336h This would enable multi-level routing, as compared to the 2-level routing available as of now. On the usage bit, here's an example: Document Id: myapp!dummyuser!doc myapp!dummyuser! can be used as the shard key for searching content for dummyuser. myapp! can be used for searching across all users of myapp. I am looking at either a 3 (or 4) level routing. The 32-bit hash would then comprise 8x4 components from each part (in the case of 4 levels).
[jira] [Created] (SOLR-5320) Multi level compositeId router
Anshum Gupta created SOLR-5320: -- Summary: Multi level compositeId router Key: SOLR-5320 URL: https://issues.apache.org/jira/browse/SOLR-5320 Project: Solr Issue Type: New Feature Components: SolrCloud Reporter: Anshum Gupta This would enable multi-level routing, as compared to the 2-level routing available as of now. On the usage bit, here's an example: Document Id: myapp!dummyuser!doc myapp!dummyuser! can be used as the shard key for searching content for dummyuser. myapp! can be used for searching across all users of myapp. I am looking at either a 3 (or 4) level routing. The 32-bit hash would then comprise 8x4 components from each part (in the case of 4 levels).
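The 8x4 idea (four parts, 8 bits of hash from each) might look roughly like this; how Solr's CompositeIdRouter actually allocates hash bits is more involved, so treat this purely as an illustration of the prefix property that makes shard-key routing possible:

```java
// Illustrative 4-level composite hash: pack the top 8 bits of each part's
// hash into one 32-bit value.  Ids with the same leading parts then share
// a hash prefix, which is what makes shard-key routing like "myapp!" or
// "myapp!dummyuser!" possible.
class MultiLevelHashSketch {
    static int hash4(String id) {
        String[] parts = id.split("!", 4);  // e.g. "myapp!dummyuser!doc"
        int result = 0;
        for (int i = 0; i < 4; i++) {
            int h = (i < parts.length) ? parts[i].hashCode() : 0;
            result = (result << 8) | ((h >>> 24) & 0xFF);  // top 8 bits per level
        }
        return result;
    }
}
```

Because each level contributes a fixed bit range, a query routed with the shard key myapp!dummyuser! only needs to match on the top 16 bits, and myapp! on the top 8.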
[jira] [Comment Edited] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13789958#comment-13789958 ] Littlestar edited comment on LUCENE-5267 at 10/9/13 6:54 AM: -
{noformat}
public static int decompress(DataInput compressed, int decompressedLen, byte[] dest, int dOff) throws IOException {
  final int destEnd = dest.length;
  do {
    ...
    // copying a multiple of 8 bytes can make decompression from 5% to 10% faster
    final int fastLen = (matchLen + 7) & 0xFFF8;
    if (matchDec < matchLen || dOff + fastLen > destEnd) {
      // overlap -> naive incremental copy
      for (int ref = dOff - matchDec, end = dOff + matchLen; dOff < end; ++ref, ++dOff) {
        dest[dOff] = dest[ref];
      }
    } else {
      // no overlap -> arraycopy
      try {
        System.arraycopy(dest, dOff - matchDec, dest, dOff, fastLen);
      } catch (Throwable e) {
        System.out.println("dest.length=" + dest.length + ",dOff=" + dOff + ",matchDec=" + matchDec + ",matchLen=" + matchLen + ",fastLen=" + fastLen);
      }
      dOff += matchLen;
    }
  } while (dOff < decompressedLen);
  return dOff;
}
{noformat}
was (Author: cnstar9988):
{noformat}
public static int decompress(DataInput compressed, int decompressedLen, byte[] dest, int dOff) throws IOException {
  final int destEnd = dest.length;
  do {
    ...
    // copying a multiple of 8 bytes can make decompression from 5% to 10% faster
    final int fastLen = (matchLen + 7) & 0xFFF8;
    if (matchDec < matchLen || dOff + fastLen > destEnd) {
      // overlap -> naive incremental copy
      for (int ref = dOff - matchDec, end = dOff + matchLen; dOff < end; ++ref, ++dOff) {
        dest[dOff] = dest[ref];
      }
    } else {
      // no overlap -> arraycopy
      // System.out.println("dest.length=" + dest.length + ",dOff=" + dOff + ",matchDec=" + matchDec + ",fastLen=" + fastLen);
      System.arraycopy(dest, dOff - matchDec, dest, dOff, fastLen); // here throws java.lang.ArrayIndexOutOfBoundsException
      dOff += matchLen;
    }
  } while (dOff < decompressedLen);
  return dOff;
}
{noformat}
java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (SOLR-5294) Pluggable Dictionary Implementation for Suggester
[ https://issues.apache.org/jira/browse/SOLR-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790105#comment-13790105 ] Areek Zillur commented on SOLR-5294: Thanks for reviewing this, Robert! {quote} Should we think about fixing the spellchecker stuff too (which seems to have totally separate implementations like FileBased and so on) to just change the dictionary {quote} This is an interesting point! After looking through the AbstractLuceneSpellChecker and all its implementations, it seems like it would be better to refactor those out too. I feel like that should be considered for the dictionaryImpl setting to work as expected. {quote} I am not sure if we want to keep spell and suggest entangled? {quote} It does make sense to untangle them, but I think that by itself is a bigger issue (I will open up an issue about that and will be happy to work on it). {quote} Should we name the DictionaryFactoryBase something better (SuggestDictionary? SpellingDictionary?) {quote} Given the situation, it seems like the dictionary plugin will be shared among both suggest and spelling; maybe call it DictionaryFactory? {quote} Maybe we can simplify the base plugin class to suit more use cases, like remove the setCore() and just check if it implements the CoreAware interface? {quote} That sounds good to me. {quote} I think it would be ideal if we could eliminate the additional hierarchy of FileBased* and IndexBased*: couldn't the FileBased impl just take its filename in via a parameter in params, and IndexBased take its fieldname in params the same way, and we push up create(IndexSearcher) to the base plugin class (the file-based just wouldn't use the indexsearcher argument). {quote} The reason for having the hierarchy was to separate out the two major types of dictionaries (index- and file-based). I can change that, but at the cost of reduced enforcement. I will upload another patch, incorporating your feedback!
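The flattened hierarchy being discussed could look roughly like this, with stand-in types instead of the real Solr/Lucene classes (all names here are assumptions, not the final API): create(searcher) lives on the shared base factory, and the file-based implementation simply ignores the searcher argument.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-ins for the real types (org.apache.lucene.search.spell.Dictionary,
// org.apache.lucene.search.IndexSearcher) so the sketch is self-contained.
interface Dictionary { String describe(); }
class IndexSearcher { }

// Shared base factory: configuration comes in via params, and create()
// always receives a searcher, which implementations may ignore.
abstract class DictionaryFactory {
    protected Map<String, String> params = new HashMap<>();
    void init(Map<String, String> params) { this.params = params; }
    abstract Dictionary create(IndexSearcher searcher);
}

// File-based: takes its filename from params and ignores the searcher.
class FileDictionaryFactory extends DictionaryFactory {
    @Override Dictionary create(IndexSearcher searcher) {
        final String location = params.get("sourceLocation");
        return () -> "file:" + location;
    }
}

// Index-based: takes its field name from params and would use the searcher.
class LuceneDictionaryFactory extends DictionaryFactory {
    @Override Dictionary create(IndexSearcher searcher) {
        final String field = params.get("field");
        return () -> "index:" + field;
    }
}
```

The trade-off mentioned in the comment is visible here: a single base class means the file-based factory accepts a searcher it never uses, so the type system no longer enforces the index/file split.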
Pluggable Dictionary Implementation for Suggester - Key: SOLR-5294 URL: https://issues.apache.org/jira/browse/SOLR-5294 Project: Solr Issue Type: Improvement Components: SearchComponents - other Reporter: Areek Zillur Attachments: SOLR-5294.patch, SOLR-5294.patch It would be nice to have the option of plugging in Dictionary implementations for the suggester to consume, like the lookup implementation setting that allows users to specify which Lucene suggesters to use. This would allow easy addition of new dictionary implementations that the Lucene suggesters can consume. New dictionary implementations like https://issues.apache.org/jira/browse/LUCENE-5251 could be easily added. I believe this would give the users more control over what they want their Lucene suggesters to consume. For the implementation, the user can add a new setting in the SpellCheckComponent in solrconfig.xml. The new setting would be a string identifying the class path of the dictionary implementation to be used (very similar to the existing lookupImpl). This setting would be used to call the relevant DictionaryFactory.
A sample solrconfig file would look as follows (note the new dictionaryImpl setting):
{code}
<searchComponent class="solr.SpellCheckComponent" name="fuzzy_suggest_analyzing_with_lucene_dict">
  <lst name="spellchecker">
    <str name="name">fuzzy_suggest_analyzing_with_lucene_dict</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FuzzyLookupFactory</str>
    <!-- new setting -->
    <str name="dictionaryImpl">org.apache.solr.spelling.suggest.LuceneDictionaryFactory</str>
    <str name="storeDir">fuzzy_suggest_analyzing</str>
    <str name="buildOnCommit">false</str>
    <!-- Suggester properties -->
    <bool name="exactMatchFirst">true</bool>
    <str name="suggestAnalyzerFieldType">text</str>
    <bool name="preserveSep">false</bool>
    <str name="fields">text</str>
  </lst>
</searchComponent>
{code}
[jira] [Assigned] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand reassigned LUCENE-5267: Assignee: Adrien Grand java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790108#comment-13790108 ] Adrien Grand commented on LUCENE-5267: -- Thanks for the report. Can you check if there are disk-related issues in your system logs and share the .fdx and .fdt files of the broken segment? java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790110#comment-13790110 ] Adrien Grand commented on LUCENE-5267: -- Can you also confirm that you are using Lucene42StoredFieldsFormat in your hybaseStd42x codec (and not, e.g., a customized CompressingStoredFieldsFormat)? java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790117#comment-13790117 ] Littlestar commented on LUCENE-5267: dOff - matchDec < 0, so it throws java.lang.ArrayIndexOutOfBoundsException:
{noformat}
dest.length=33288,dOff=3184,matchDec=34510,matchLen=15,fastLen=16
dest.length=33288,dOff=3213,matchDec=34724,matchLen=9,fastLen=16
dest.length=33288,dOff=3229,matchDec=45058,matchLen=12,fastLen=16
dest.length=33288,dOff=3255,matchDec=20482,matchLen=9,fastLen=16
dest.length=33288,dOff=3275,matchDec=26122,matchLen=12,fastLen=16
dest.length=33288,dOff=3570,matchDec=35228,matchLen=6,fastLen=8
{noformat}
java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790131#comment-13790131 ] Littlestar commented on LUCENE-5267:
{noformat}
// Lucene42Codec + LZ4
public final class Hybase42StandardCodec extends FilterCodec {
  public Hybase42StandardCodec() {
    super("hybaseStd42x", new Lucene42Codec());
  }
}
{noformat}
bq. disk-related issues in your system logs and share the .fdx and .fdt files of the broken segment
Too big (5G). java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790134#comment-13790134 ] Littlestar commented on LUCENE-5267: When the ArrayIndexOutOfBoundsException is omitted (swallowed by the try/catch above), CheckIndex reports:
{noformat}
ERROR [Invalid vLong detected (negative values disallowed)]
java.lang.RuntimeException: Invalid vLong detected (negative values disallowed)
    at org.apache.lucene.store.ByteArrayDataInput.readVLong(ByteArrayDataInput.java:152)
    at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:342)
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
    at org.apache.lucene.index.IndexReader.document(IndexReader.java:436)
    at org.apache.lucene.index.CheckIndex.testStoredFields(CheckIndex.java:1268)
    at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:626)
    at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1903)
test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
test: docvalues...........OK [12 docvalues fields; 7 BINARY; 3 NUMERIC; 2 SORTED; 0 SORTED_SET]
{noformat}
java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
Need help regarding Boolean queries with queryparser
In our search application, queries like test usage are not returning correct results, but a query like test AND usage works fine. We are using queryparser with the standard analyzer. Could someone please help me?
RE: Need help regarding Boolean queries with queryparser
Hi, you have to write your own query parser. Look e.g. at the flexible query parser module, which can be customized. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de/ eMail: u...@thetaphi.de From: Devi pulaparti [mailto:pvkd...@gmail.com] Sent: Wednesday, October 09, 2013 9:50 AM To: dev@lucene.apache.org Subject: Need help regarding Boolean queries with queryparser In our search application, queries like test usage are not returning correct results, but a query like test AND usage works fine. We are using queryparser with the standard analyzer. Could someone please help me?
Re: Need help regarding Boolean queries with queryparser
Hi Uwe, thanks a lot for the quick reply. I am very new to Lucene. Could you please shed some light on the capabilities of queryparser? Why do we need a flexible query parser module for the symbol to work? Doesn't queryparser handle this? On Wed, Oct 9, 2013 at 1:24 PM, Uwe Schindler u...@thetaphi.de wrote: Hi, you have to write your own query parser. Look e.g. at the flexible query parser module, which can be customized. Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de From: Devi pulaparti [mailto:pvkd...@gmail.com] Sent: Wednesday, October 09, 2013 9:50 AM To: dev@lucene.apache.org Subject: Need help regarding Boolean queries with queryparser In our search application, queries like test usage are not returning correct results, but a query like test AND usage works fine. We are using queryparser with the standard analyzer. Could someone please help me?
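For context on the observed difference: Lucene's classic QueryParser defaults to OR between bare terms, so test usage matches documents containing either term, while test AND usage requires both (the default can be changed via QueryParser.setDefaultOperator). A self-contained toy model of the two semantics, deliberately not using Lucene itself:

```java
import java.util.List;
import java.util.Set;

// Toy model of the two boolean semantics: with OR (the query parser's
// default operator) a document matching any query term qualifies; with AND
// every term must match.  This only illustrates the matching behaviour,
// not how Lucene parses or scores queries.
class DefaultOperatorDemo {
    static boolean matches(Set<String> docTerms, List<String> queryTerms, boolean andSemantics) {
        if (andSemantics) {
            return docTerms.containsAll(queryTerms);   // like "test AND usage"
        }
        for (String t : queryTerms) {                  // like "test usage" with default OR
            if (docTerms.contains(t)) return true;
        }
        return false;
    }
}
```

Under OR semantics a document containing only "test" still matches the two-term query, which is why the bare query can look like it returns incorrect results compared to the explicit AND form.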
[jira] [Resolved] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-5267. -- Resolution: Not A Problem bq. dOff - matchDec < 0, so it throws java.lang.ArrayIndexOutOfBoundsException bq. dest.length=33288,dOff=3184,matchDec=34510,matchLen=15,fastLen=16 Indeed, all the lines you pasted make no sense, since matchDec should be lower than dOff. To me this really looks like your index got corrupted somehow. It could be a single corrupt byte that makes LZ4 read a length on 2 bytes instead of 1, and this shift makes LZ4 try to decompress bytes that make no sense at all, explaining why the matchDecs are all higher than dOff. There are likely only a few chunks that are broken, so if you want to try to get back as many documents as possible from the corrupt segment, the following piece of code may help: https://gist.github.com/jpountz/6461246 java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
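The corruption signature described in this resolution can be stated as a simple invariant: inside an LZ4 match, the back-reference distance must be positive and no larger than the current write offset. A hedged sketch of that check (not the actual LZ4.java code):

```java
// Invariant for an LZ4 match copy into dest at offset dOff: the copy reads
// from dOff - matchDec, so matchDec must satisfy 0 < matchDec <= dOff or
// the read would start before the beginning of the buffer -- exactly the
// condition violated by the debug values reported in this thread.
class Lz4MatchCheck {
    static boolean isValidMatch(int dOff, int matchDec) {
        return matchDec > 0 && matchDec <= dOff;
    }
}
```

Every (dOff, matchDec) pair in the pasted debug output fails this check, which is why the corruption is attributed to the compressed stream rather than to the decompressor.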
[jira] [Commented] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen
[ https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790153#comment-13790153 ] Shalin Shekhar Mangar commented on SOLR-5319: - The doc router stored in the collection zk node is not used anywhere. We should just remove that code. Collection ZK nodes do not reflect the correct router chosen Key: SOLR-5319 URL: https://issues.apache.org/jira/browse/SOLR-5319 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 5.0 Reporter: Jessica Cheng Assignee: Shalin Shekhar Mangar Labels: solrcloud, zookeeper In ZkController.createCollectionZkNode, the doc router is determined by this code snippet:
{noformat}
if (collectionProps.get(DocCollection.DOC_ROUTER) == null) {
  Object numShards = collectionProps.get(ZkStateReader.NUM_SHARDS_PROP);
  if (numShards == null) {
    numShards = System.getProperty(ZkStateReader.NUM_SHARDS_PROP);
  }
  if (numShards == null) {
    collectionProps.put(DocCollection.DOC_ROUTER, ImplicitDocRouter.NAME);
  } else {
    collectionProps.put(DocCollection.DOC_ROUTER, DocRouter.DEFAULT_NAME);
  }
}
{noformat}
Since OverseerCollectionProcessor never passes on any params prefixed with "collection." other than collection.configName in its create core commands, collectionProps.get(DocCollection.DOC_ROUTER) will never be non-null. Thus, it needs to figure out whether the router is implicit or compositeId based on whether numShards is passed in. However, collectionProps.get(ZkStateReader.NUM_SHARDS_PROP) will also always be null for the same reason collectionProps.get(DocCollection.DOC_ROUTER) is null, and it isn't explicitly set in the code above, so the only way for numShards not to be null is if it's passed in as a system property.
As an example, here's a cluster state that's created with the compositeId router, but the collection ZK node says it's implicit. In clusterstate.json:
{noformat}
"example":{
  "shards":{
    "shard1":{
      "range":"8000-7fff",
      "state":"active",
      "replicas":{
        "core_node1":{
          "state":"active",
          "core":"example_shard1_replica1",
          "node_name":"localhost:8983_solr",
          "base_url":"http://localhost:8983/solr",
          "leader":"true"}}}},
  "router":"compositeId"}
{noformat}
In /collections/example data:
{noformat}
{"configName":"myconf", "router":"implicit"}
{noformat}
I'm not sure if the collection ZK node router info is actually used anywhere, so it may not matter, but it's confusing. I think the best fix is for OverseerCollectionProcessor to pass on params prefixed with "collection." to the core creation requests. Otherwise, ZkController.createCollectionZkNode can explicitly set numShards in collectionProps from cd.getNumShards() too.
[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary
[ https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790156#comment-13790156 ] ASF subversion and git services commented on LUCENE-3069: - Commit 1530520 from [~billy] in branch 'dev/trunk' [ https://svn.apache.org/r1530520 ] LUCENE-3069: add CHANGES, move new postingsformats to oal.codecs Lucene should have an entirely memory resident term dictionary -- Key: LUCENE-3069 URL: https://issues.apache.org/jira/browse/LUCENE-3069 Project: Lucene - Core Issue Type: Improvement Components: core/index, core/search Affects Versions: 4.0-ALPHA Reporter: Simon Willnauer Assignee: Han Jiang Labels: gsoc2013 Fix For: 4.6 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch The FST-based TermDictionary has been a great improvement, yet it still uses a delta codec file for scanning to terms. Some environments have enough memory available to keep the entire FST-based term dict in memory. We should add a TermDictionary implementation that encodes all needed information for each term into the FST (custom fst.Output) and builds an FST from the entire term, not just the delta.
[jira] [Commented] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen
[ https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790157#comment-13790157 ] ASF subversion and git services commented on SOLR-5319: --- Commit 1530521 from sha...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1530521 ] SOLR-5319: Remove unused and incorrect router name from Collection ZK nodes

Collection ZK nodes do not reflect the correct router chosen
Key: SOLR-5319 URL: https://issues.apache.org/jira/browse/SOLR-5319 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.4, 4.5, 5.0 Reporter: Jessica Cheng Assignee: Shalin Shekhar Mangar Labels: solrcloud, zookeeper Fix For: 5.0, 4.6

In ZkController.createCollectionZkNode, the doc router is determined by this code snippet:
{code}
if (collectionProps.get(DocCollection.DOC_ROUTER) == null) {
  Object numShards = collectionProps.get(ZkStateReader.NUM_SHARDS_PROP);
  if (numShards == null) {
    numShards = System.getProperty(ZkStateReader.NUM_SHARDS_PROP);
  }
  if (numShards == null) {
    collectionProps.put(DocCollection.DOC_ROUTER, ImplicitDocRouter.NAME);
  } else {
    collectionProps.put(DocCollection.DOC_ROUTER, DocRouter.DEFAULT_NAME);
  }
}
{code}
Since OverseerCollectionProcessor never passes on any params prefixed with "collection." other than collection.configName in its create-core commands, collectionProps.get(DocCollection.DOC_ROUTER) will never be non-null. The code therefore has to infer whether the router is implicit or compositeId from whether numShards is passed in. However, collectionProps.get(ZkStateReader.NUM_SHARDS_PROP) will also always be null for the same reason, and it isn't explicitly set in the code above, so the only way for numShards to be non-null is if it's passed in as a system property.
As an example, here's a cluster state that's created with the compositeId router, but the collection ZK node says it's implicit. In clusterstate.json:
{code}
"example":{
  "shards":{
    "shard1":{
      "range":"8000-7fff",
      "state":"active",
      "replicas":{
        "core_node1":{
          "state":"active",
          "core":"example_shard1_replica1",
          "node_name":"localhost:8983_solr",
          "base_url":"http://localhost:8983/solr",
          "leader":"true"}}}},
  "router":"compositeId"}
{code}
In /collections/example, the data is:
{code}
{"configName":"myconf", "router":"implicit"}
{code}
I'm not sure whether the collection ZK node's router info is actually used anywhere, so it may not matter, but it's confusing. I think the best fix is for OverseerCollectionProcessor to pass on params prefixed with "collection." to the core creation requests. Alternatively, ZkController.createCollectionZkNode can explicitly set numShards in collectionProps from cd.getNumShards().
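The defaulting logic quoted in the issue description can be reduced to a few lines. Below is a minimal, self-contained sketch (the class name is hypothetical, and the literal "implicit"/"compositeId" strings stand in for ImplicitDocRouter.NAME and DocRouter.DEFAULT_NAME; this is not the actual ZkController code) showing why, when neither a router nor a numShards param is forwarded, the ZK node always ends up marked implicit:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the router-defaulting logic in
// ZkController.createCollectionZkNode.
public class RouterDefaultDemo {
    static final String DOC_ROUTER = "router";        // stand-in for DocCollection.DOC_ROUTER
    static final String NUM_SHARDS_PROP = "numShards"; // stand-in for ZkStateReader.NUM_SHARDS_PROP

    // Returns the router name the snippet would store in collectionProps.
    static String chooseRouter(Map<String, Object> collectionProps) {
        if (collectionProps.get(DOC_ROUTER) == null) {
            Object numShards = collectionProps.get(NUM_SHARDS_PROP);
            if (numShards == null) {
                numShards = System.getProperty(NUM_SHARDS_PROP);
            }
            if (numShards == null) {
                collectionProps.put(DOC_ROUTER, "implicit");    // ImplicitDocRouter.NAME
            } else {
                collectionProps.put(DOC_ROUTER, "compositeId"); // DocRouter.DEFAULT_NAME
            }
        }
        return (String) collectionProps.get(DOC_ROUTER);
    }

    public static void main(String[] args) {
        // Neither router nor numShards arrive via the forwarded params, so the
        // node is always marked implicit unless the system property is set.
        Map<String, Object> props = new HashMap<>();
        System.out.println(chooseRouter(props));
    }
}
```

Passing numShards through (as a collection-prefixed param or via cd.getNumShards()) would flip the result to compositeId, which is the gist of the proposed fix.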
[jira] [Commented] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen
[ https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790158#comment-13790158 ] ASF subversion and git services commented on SOLR-5319: --- Commit 1530523 from sha...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530523 ] SOLR-5319: Remove unused and incorrect router name from Collection ZK nodes
[jira] [Resolved] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen
[ https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-5319. - Resolution: Fixed Fix Version/s: 4.6, 5.0
[jira] [Updated] (SOLR-5318) create command don't take into account the transient core property
[ https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] olivier soyez updated SOLR-5318: Description: The create core admin command doesn't take into account the transient core property when the core is registered (so the core will never be closed by the transient core cache).

To reproduce: set transientCacheSize=2 and start with no cores. Create 3 cores:
{code}
curl "http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"
{code}
Look at the status: http://ip:port/solr/admin/cores?action=STATUS
All cores are still loaded. One core should not be loaded (closed by the transient cache).

was: The create core admin command doesn't take into account the transient core property when the core is registered (so the core will never be closed by the transient core cache).

create command don't take into account the transient core property
Key: SOLR-5318 URL: https://issues.apache.org/jira/browse/SOLR-5318 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 4.6 Reporter: olivier soyez Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5318.patch
[jira] [Updated] (SOLR-5318) create command don't take into account the transient core property
[ https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] olivier soyez updated SOLR-5318: Description: The create core admin command doesn't take into account the transient core property when the core is registered (so the core will never be closed by the transient core cache).

To reproduce: set transientCacheSize=2 and start with no cores. Create 3 cores:
{code}
curl "http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"
{code}
Look at the status: http://ip:port/solr/admin/cores?action=STATUS
All cores are still loaded. One core should not be loaded (closed by the transient cache).

create command don't take into account the transient core property
Key: SOLR-5318 URL: https://issues.apache.org/jira/browse/SOLR-5318 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 4.6 Reporter: olivier soyez Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5318.patch
[jira] [Commented] (SOLR-5318) create command don't take into account the transient core property
[ https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790169#comment-13790169 ] olivier soyez commented on SOLR-5318: - We are using Solr 4.2.1 in production, but I also tested Solr 4.4 and the svn branch_4x: same issue. I have completed the description and the steps to reproduce the issue. Not correlated with SOLR-4862.
[jira] [Updated] (SOLR-5318) create command don't take into account the transient core property
[ https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] olivier soyez updated SOLR-5318: Affects Version/s: 4.4

create command don't take into account the transient core property
Key: SOLR-5318 URL: https://issues.apache.org/jira/browse/SOLR-5318 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 4.4, 4.6 Reporter: olivier soyez Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5318.patch
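The eviction the reporter expects can be sketched with a plain access-ordered LinkedHashMap. This is only an illustration of LRU behavior at transientCacheSize=2 (the class and method names are hypothetical), not Solr's actual transient core cache implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the transient core cache: an access-ordered LRU
// map that would "close" the eldest core once more than transientCacheSize
// cores are open.
public class TransientCacheDemo {
    static LinkedHashMap<String, String> newCache(int transientCacheSize) {
        // accessOrder=true makes iteration order least-recently-used first.
        return new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                // In Solr this is where the evicted core would be closed.
                return size() > transientCacheSize;
            }
        };
    }

    public static void main(String[] args) {
        LinkedHashMap<String, String> cache = newCache(2);
        cache.put("core1", "open");
        cache.put("core2", "open");
        cache.put("core3", "open"); // evicts core1, the least recently used
        System.out.println(cache.keySet()); // prints [core2, core3]
    }
}
```

With cores created via the CREATE command and the transient flag ignored at registration, no core ever enters such a cache, which is why all three stay loaded in the reproduction above.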
[jira] [Created] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent
Shalin Shekhar Mangar created SOLR-5321: --- Summary: Overseer.updateState tries to use router name from message but none is sent Key: SOLR-5321 URL: https://issues.apache.org/jira/browse/SOLR-5321 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5 Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Priority: Minor Fix For: 5.0, 4.6

The Overseer.updateSlice method has the following code:
{code}
String router = message.getStr(OverseerCollectionProcessor.ROUTER, DocRouter.DEFAULT_NAME);
List<String> shardNames = new ArrayList<String>();

// collection does not yet exist, create placeholders if num shards is specified
boolean collectionExists = state.getCollections().contains(collection);
if (!collectionExists && numShards != null) {
  if (ImplicitDocRouter.NAME.equals(router)) {
    getShardNames(shardNames, message.getStr("shards", null));
    numShards = shardNames.size();
  } else {
    getShardNames(numShards, shardNames);
  }
  state = createCollection(state, collection, shardNames, message);
}
{code}
Here it tries to read the router name from the message. Even if we ignore that the key used to look up the router is wrong here, the router name is never sent in a state message. Considering that we don't even support creating a collection with the implicit router from the command line, we should stop expecting the parameter.
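A minimal stand-in for the lookup shows why the implicit branch is unreachable from state messages: with no router entry in the message, the default name always wins. The getStr helper here is a hypothetical mirror of a get-with-default lookup, not the actual ZkNodeProps code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the router lookup in Overseer.updateSlice: the state message
// never carries a router entry, so the lookup always falls back to the
// default router name.
public class OverseerRouterDemo {
    // Hypothetical mirror of a getStr(key, default)-style accessor.
    static String getStr(Map<String, Object> message, String key, String def) {
        Object v = message.get(key);
        return v == null ? def : v.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> stateMessage = new HashMap<>(); // no "router" key is ever sent
        String router = getStr(stateMessage, "router", "compositeId"); // DocRouter.DEFAULT_NAME
        System.out.println(router); // prints "compositeId"
    }
}
```

Since the default always wins, ImplicitDocRouter.NAME.equals(router) is always false here, and the implicit-router branch is dead code, which is what the fix removes.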
Re: svn commit: r1530537 - in /lucene/dev/trunk/lucene: common-build.xml ivy-settings.xml
Thanks for updating this! I think we should merge this back to branch 4.x too, so that the source code tar.gz works from China for our next release?

2013/10/9 h...@apache.org:
Author: han
Date: Wed Oct 9 08:56:15 2013
New Revision: 1530537
URL: http://svn.apache.org/r1530537
Log: update broken links for maven mirror

Modified: lucene/dev/trunk/lucene/common-build.xml lucene/dev/trunk/lucene/ivy-settings.xml

Modified: lucene/dev/trunk/lucene/common-build.xml
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/common-build.xml?rev=1530537&r1=1530536&r2=1530537&view=diff
--- lucene/dev/trunk/lucene/common-build.xml (original)
+++ lucene/dev/trunk/lucene/common-build.xml Wed Oct 9 08:56:15 2013
@@ -360,7 +360,7 @@
   <property name="ivy_install_path" location="${user.home}/.ant/lib" />
   <property name="ivy_bootstrap_url1" value="http://repo1.maven.org/maven2"/>
   <!-- you might need to tweak this from china so it works -->
-  <property name="ivy_bootstrap_url2" value="http://mirror.netcologne.de/maven2"/>
+  <property name="ivy_bootstrap_url2" value="http://uk.maven.org/maven2"/>
   <property name="ivy_checksum_sha1" value="c5ebf1c253ad4959a29f4acfe696ee48cdd9f473"/>
   <target name="ivy-availability-check" unless="ivy.available">

Modified: lucene/dev/trunk/lucene/ivy-settings.xml
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/ivy-settings.xml?rev=1530537&r1=1530536&r2=1530537&view=diff
--- lucene/dev/trunk/lucene/ivy-settings.xml (original)
+++ lucene/dev/trunk/lucene/ivy-settings.xml Wed Oct 9 08:56:15 2013
@@ -35,7 +35,7 @@
   <ibiblio name="maven.restlet.org" root="http://maven.restlet.org" m2compatible="true" />
   <!-- you might need to tweak this from china so it works -->
-  <ibiblio name="working-chinese-mirror" root="http://mirror.netcologne.de/maven2" m2compatible="true" />
+  <ibiblio name="working-chinese-mirror" root="http://uk.maven.org/maven2" m2compatible="true" />
   <!-- temporary to try Clover 3.2.0 snapshots, see https://issues.apache.org/jira/browse/LUCENE-5243, https://jira.atlassian.com/browse/CLOV-1368 -->
   <ibiblio name="atlassian-clover-snapshots" root="https://maven.atlassian.com/content/repositories/atlassian-public-snapshot" m2compatible="true" />
Re: svn commit: r1530537 - in /lucene/dev/trunk/lucene: common-build.xml ivy-settings.xml
oh, yes, I'll do that!

On Wed, Oct 9, 2013 at 5:17 PM, Robert Muir rcm...@gmail.com wrote:
Thanks for updating this! I think we should merge this back to branch 4.x too? This way the source code tar.gz is working from China for our next release?
[jira] [Commented] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.
[ https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790203#comment-13790203 ] Adrien Grand commented on LUCENE-5264: -- +1 CommonTermsQuery ignores minMustMatch if only high freq terms are present. -- Key: LUCENE-5264 URL: https://issues.apache.org/jira/browse/LUCENE-5264 Project: Lucene - Core Issue Type: Bug Components: modules/other Affects Versions: 5.0, 4.5 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.6 Attachments: LUCENE-5264.patch if we only have high freq terms we move to a pure conjunction and ignore the min must match entirely if it is 0. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5310) Add a collection admin command to remove a replica
[ https://issues.apache.org/jira/browse/SOLR-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-5310: - Description: The only way a replica can be removed is by unloading the core. There is no way to remove a replica that is down, so the clusterstate will have unreferenced nodes if a few nodes go down over time. We need a cluster admin command to clean that up, e.g.:
{code}
/admin/collections?action=DELETEREPLICA&collection=coll1&shard=shard1&replica=core_node3
{code}
The system would first see if the replica is active. If yes, a core UNLOAD command is fired, which takes care of deleting the replica from the clusterstate as well. If the state is inactive, then the core or node may be down; in that case the entry is removed from the cluster state.

was: The only way a replica can be removed is by unloading the core. There is no way to remove a replica that is down, so the clusterstate will have unreferenced nodes if a few nodes go down over time. We need a cluster admin command to clean that up, e.g.:
{code}
/admin/collections?action=REMOVEREPLICA&collection=coll1&shard=shard1&replica=core_node3
{code}

Add a collection admin command to remove a replica
Key: SOLR-5310 URL: https://issues.apache.org/jira/browse/SOLR-5310 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Noble Paul Assignee: Noble Paul Original Estimate: 72h Remaining Estimate: 72h
[jira] [Resolved] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent
[ https://issues.apache.org/jira/browse/SOLR-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar resolved SOLR-5321. - Resolution: Fixed
[jira] [Commented] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent
[ https://issues.apache.org/jira/browse/SOLR-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790223#comment-13790223 ] ASF subversion and git services commented on SOLR-5321: --- Commit 1530555 from sha...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1530555 ] SOLR-5321: Remove unnecessary code in Overseer.updateState method which tries to use router name from message where none is ever sent
[jira] [Commented] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent
[ https://issues.apache.org/jira/browse/SOLR-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790225#comment-13790225 ] ASF subversion and git services commented on SOLR-5321: --- Commit 1530556 from sha...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530556 ] SOLR-5321: Remove unnecessary code in Overseer.updateState method which tries to use router name from message where none is ever sent
[jira] [Created] (SOLR-5322) Permisions didn't check when call discoverUnder
Said Chavkin created SOLR-5322: -- Summary: Permissions aren't checked when calling discoverUnder Key: SOLR-5322 URL: https://issues.apache.org/jira/browse/SOLR-5322 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: Centos 6.4 tomcat6 Reporter: Said Chavkin Hello. When the solr/home directory contains a subdirectory that Solr does not have permission to read, Solr fails to start with this exception: 2108 [main] INFO org.apache.solr.core.CoresLocator - Looking for core definitions underneath /var/lib/solr 2109 [main] ERROR org.apache.solr.servlet.SolrDispatchFilter - Could not start Solr. Check solr/home property and the logs 2138 [main] ERROR org.apache.solr.core.SolrCore - null:java.lang.NullPointerException at org.apache.solr.core.CorePropertiesLocator.discoverUnder(CorePropertiesLocator.java:121) at org.apache.solr.core.CorePropertiesLocator.discoverUnder(CorePropertiesLocator.java:130) at org.apache.solr.core.CorePropertiesLocator.discover(CorePropertiesLocator.java:113) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:226) at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:177) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:127) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422) at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:115) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3838) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4488) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526) at
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:637) at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:563) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:498) at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1277) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:321) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053) at org.apache.catalina.core.StandardHost.start(StandardHost.java:722) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at org.apache.catalina.core.StandardService.start(StandardService.java:516) at org.apache.catalina.core.StandardServer.start(StandardServer.java:710) at org.apache.catalina.startup.Catalina.start(Catalina.java:593) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414) 2138 [main] INFO org.apache.solr.servlet.SolrDispatchFilter - SolrDispatchFilter.init() done For example: the Solr home is located at /var/lib/solr. /var/lib/solr is a separate file system, so it contains a lost+found directory. As a result, Solr can't start. Yours faithfully.
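A plausible cause (an assumption, not verified against the Solr sources): File.listFiles() returns null, rather than an empty array, when a directory cannot be read, and iterating over that null result would throw exactly the NullPointerException seen at CorePropertiesLocator.discoverUnder. A minimal defensive sketch, with hypothetical class and method names:

```java
import java.io.File;

// Hypothetical helper illustrating the failure mode: File.listFiles()
// returns null (not an empty array) when a directory is unreadable or is
// not a directory at all, and iterating over null throws an NPE.
public class SafeDiscovery {
    // Returns the entries of dir, skipping it with a warning instead of
    // crashing when the JVM lacks permission to read it.
    public static File[] listEntriesSafely(File dir) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            System.err.println("WARN: cannot read directory (permissions?): " + dir);
            return new File[0]; // skip instead of NullPointerException
        }
        return entries;
    }

    public static void main(String[] args) {
        // A path that does not exist behaves like an unreadable one:
        // listFiles() returns null, and the guard turns that into an empty array.
        File bogus = new File("/nonexistent-path-for-demo-12345");
        System.out.println(listEntriesSafely(bogus).length); // prints 0
    }
}
```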
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790236#comment-13790236 ] Littlestar commented on LUCENE-5267: Thanks, most of the records were recovered. But why did the index get corrupted? Maybe the compressor or the writer has a bug... java.lang.ArrayIndexOutOfBoundsException on reading data Key: LUCENE-5267 URL: https://issues.apache.org/jira/browse/LUCENE-5267 Project: Lucene - Core Issue Type: Bug Components: core/index Affects Versions: 4.4 Reporter: Littlestar Assignee: Adrien Grand Labels: LZ4 java.lang.ArrayIndexOutOfBoundsException at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132) at org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135) at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212) at org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365) at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110) at org.apache.lucene.index.IndexReader.document(IndexReader.java:447) at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)
[jira] [Assigned] (LUCENE-5236) Use broadword bit selection in EliasFanoDecoder
[ https://issues.apache.org/jira/browse/LUCENE-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand reassigned LUCENE-5236: Assignee: Adrien Grand Use broadword bit selection in EliasFanoDecoder --- Key: LUCENE-5236 URL: https://issues.apache.org/jira/browse/LUCENE-5236 Project: Lucene - Core Issue Type: Improvement Reporter: Paul Elschot Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5236.patch, LUCENE-5236.patch, TestDocIdSetBenchmark.java Try and speed up decoding
[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data
[ https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790248#comment-13790248 ] Adrien Grand commented on LUCENE-5267: -- Good question. I've had this issue myself once, and the dmesg of the system was full of disk-related errors, so something really bad probably happened with the disk. I am actually thinking of adding some basic checksumming to the future stored fields format (4 bytes per chunk, which wouldn't hurt the compression ratio much) in order to be able to easily distinguish index corruptions from bugs in the stored fields format (and especially the compression layer).
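The per-chunk checksum idea can be sketched with a plain CRC32 (this is an illustration of the concept only, not Lucene's actual stored fields format; all names here are made up):

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Illustrative sketch: append a 4-byte CRC32 to each stored-fields chunk so
// that on-disk corruption can be told apart from decompressor bugs at read
// time. Chunk layout: payload bytes followed by 4 checksum bytes.
public class ChunkChecksum {
    static int crc(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data, 0, data.length);
        return (int) c.getValue(); // CRC32 fits in exactly 4 bytes
    }

    public static byte[] writeChunk(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(payload.length + 4);
        buf.put(payload);
        buf.putInt(crc(payload));
        return buf.array();
    }

    // Returns true when the stored checksum matches the payload.
    public static boolean verifyChunk(byte[] chunk) {
        ByteBuffer buf = ByteBuffer.wrap(chunk);
        byte[] payload = new byte[chunk.length - 4];
        buf.get(payload);
        return buf.getInt() == crc(payload);
    }

    public static void main(String[] args) {
        byte[] chunk = writeChunk("some stored fields".getBytes());
        System.out.println(verifyChunk(chunk)); // true
        chunk[3] ^= 0x7f;                       // simulate a flipped bit on disk
        System.out.println(verifyChunk(chunk)); // false
    }
}
```

A mismatch here would point at the disk; a clean checksum over undecompressable bytes would point at the compression layer.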
[jira] [Updated] (LUCENE-5261) add simple API to build queries from analysis chain
[ https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5261: Attachment: LUCENE-5261.patch Simplified patch: * I removed get/set defaultOperator and slop, restoring these to the QPs (so fewer changes there, including no API impact). * I removed the operator enum completely and just use Occur for that. * Instead, createFieldQuery just takes Occur and slop as parameters. * Added javadocs. On the direct-use side, I just added createBooleanQuery(String,String,Occur) and createPhraseQuery(String,String,int). I think this is much more intuitive; these parameters are really per-query anyway and shouldn't be getters/setters on this class. (That's just brain damage from our crazy QP.) I think this is ready. add simple API to build queries from analysis chain --- Key: LUCENE-5261 URL: https://issues.apache.org/jira/browse/LUCENE-5261 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch Currently this is pretty crazy stuff. Additionally, it is duplicated in three or four places in our codebase (I noticed it while doing LUCENE-5259). We can solve that duplication, make it easy to simply create queries from an analyzer (it's been asked about on the user list), and make it easier to build new query parsers.
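To make the proposed createBooleanQuery(field, text, occur) shape concrete, here is a deliberately toy model: it runs the text through a stand-in "analysis chain" (whitespace split plus lowercasing) and combines one clause per token under a single occur flag. This mimics the API shape only; it is not the real Lucene QueryBuilder:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of building a boolean query from an analysis chain.
// The "analyzer" here is just whitespace tokenization plus lowercasing,
// and the "query" is a printable string rather than a Lucene Query.
public class ToyQueryBuilder {
    public enum Occur { MUST, SHOULD }

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) tokens.add(t.toLowerCase());
        }
        return tokens;
    }

    // Builds "field:tok1 field:tok2 ..." with a '+' prefix for MUST clauses.
    public static String createBooleanQuery(String field, String text, Occur occur) {
        List<String> clauses = new ArrayList<>();
        for (String token : analyze(text)) {
            clauses.add((occur == Occur.MUST ? "+" : "") + field + ":" + token);
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        System.out.println(createBooleanQuery("body", "Fast Fuzzy Search", Occur.MUST));
        // +body:fast +body:fuzzy +body:search
    }
}
```

The point of the patch is that this per-token plumbing lives in one place instead of being reimplemented by every query parser.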
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790257#comment-13790257 ] Erick Erickson commented on SOLR-2548: -- 1. no. Could be extended too, I think, if you have the energy. 2. no 3. yes 4. all Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Fix For: 4.5, 5.0 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting.
[jira] [Resolved] (SOLR-5322) Permissions aren't checked when calling discoverUnder
[ https://issues.apache.org/jira/browse/SOLR-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5322. -- Resolution: Invalid Please raise this kind of issue on the user's list before raising a JIRA, to see whether it's really a bug in Solr or a configuration issue. You can reopen this if you think it's something Solr should manage. What would you have Solr do? If it's not being run as a process that has permissions to a necessary directory, what can it do _but_ fail on startup? You as the sysadmin are responsible for permissions.
[jira] [Created] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
John Berryman created SOLR-5323: --- Summary: Solr requires -Dsolr.clustering.enabled=false when pointing at example config Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 5.0, 4.6 My typical use of Solr is something like this: cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar But in Solr 4.5.0 this fails to start. I get an error: org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' The reason is that solr.clustering.enabled now defaults to true. I don't know why this might be the case. You can get around it with java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar SOLR-4708 is where this became an issue.
[jira] [Updated] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Hatcher updated SOLR-5323: --- Description: My typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in Solr 4.5.0 this fails to start. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is that solr.clustering.enabled now defaults to true. I don't know why this might be the case. You can get around it with {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is where this became an issue. was: my typical use of Solr is something like this: cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar But in solr 4.5.0 this fails to start successfully. I get an error: org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' The reason is because solr.clustering.enabled defaults to true now. I don't know why this might be the case. you can get around it with java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar SOLR-4708 is when this became an issue.
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790362#comment-13790362 ] Erik Hatcher commented on SOLR-5323: I think we should have the lib elements in solrconfig.xml be something like this: {code} <lib dir="${solr.install.dir}/contrib/clustering/lib/" regex=".*\.jar" /> {code} where solr.install.dir is a property defined automatically by Solr at startup that holds the root of the Solr installation. I've done this manually by adjusting the configuration in this exact scenario (copying the example configuration, changing all the lib directives in this way, and defining solr.install.dir on the command line), but Solr should be able to do this better.
[jira] [Updated] (LUCENE-5266) Optimization of the direct PackedInts readers
[ https://issues.apache.org/jira/browse/LUCENE-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5266: Attachment: LUCENE-5266.patch Here is a patch from playing around this morning. I'm afraid of specialization here, but this one should help at relatively low bpv, I think, by using readShort. Optimization of the direct PackedInts readers - Key: LUCENE-5266 URL: https://issues.apache.org/jira/browse/LUCENE-5266 Project: Lucene - Core Issue Type: Improvement Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5266.patch Given that the initial focus for PackedInts readers was more on in-memory readers (for storing stuff like the mapping from old to new doc IDs at merging time), I never spent time trying to optimize the direct readers, although it could be beneficial now that they are used for disk-based doc values.
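For context on what a "direct" reader has to do, here is an illustrative bit-packing sketch (this is not Lucene's PackedInts code): reading an arbitrary bitsPerValue-wide value means shifting and masking across byte boundaries on every call, which is the per-value cost that specializations such as a readShort fast path for low bpv try to cut down:

```java
// Illustrative LSB-first bit packing, NOT Lucene's PackedInts implementation:
// it shows the byte-at-a-time shifting a direct (on-disk) reader must do for
// an arbitrary bitsPerValue.
public class PackedSketch {
    // Write 'value' (bitsPerValue bits) at logical index 'index'.
    public static void set(byte[] packed, int bitsPerValue, int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int remaining = bitsPerValue;
        while (remaining > 0) {
            int byteIndex = (int) (bitPos >>> 3);
            int bitOffset = (int) (bitPos & 7);
            int take = Math.min(8 - bitOffset, remaining); // bits that fit in this byte
            long bits = (value >>> (bitsPerValue - remaining)) & ((1L << take) - 1);
            packed[byteIndex] |= (byte) (bits << bitOffset);
            bitPos += take;
            remaining -= take;
        }
    }

    // Read the value back; each call may touch two bytes per 8 bits read.
    public static long get(byte[] packed, int bitsPerValue, int index) {
        long bitPos = (long) index * bitsPerValue;
        int remaining = bitsPerValue;
        long value = 0;
        while (remaining > 0) {
            int byteIndex = (int) (bitPos >>> 3);
            int bitOffset = (int) (bitPos & 7);
            int take = Math.min(8 - bitOffset, remaining);
            long bits = ((packed[byteIndex] & 0xFFL) >>> bitOffset) & ((1L << take) - 1);
            value |= bits << (bitsPerValue - remaining);
            bitPos += take;
            remaining -= take;
        }
        return value;
    }

    public static void main(String[] args) {
        int bpv = 5, n = 32;
        byte[] packed = new byte[(n * bpv + 7) / 8]; // 20 bytes for 32 5-bit values
        for (int i = 0; i < n; i++) set(packed, bpv, i, i);
        for (int i = 0; i < n; i++) {
            if (get(packed, bpv, i) != i) throw new AssertionError("round-trip failed at " + i);
        }
        System.out.println("32 values round-tripped at 5 bits per value");
    }
}
```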
[jira] [Created] (SOLR-5324) Make sub shard replica recovery and shard state switch asynchronous
Shalin Shekhar Mangar created SOLR-5324: --- Summary: Make sub shard replica recovery and shard state switch asynchronous Key: SOLR-5324 URL: https://issues.apache.org/jira/browse/SOLR-5324 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 5.0, 4.6 Currently the shard split command waits for all replicas of all sub shards to recover and then switches the state of parent to inactive and sub-shards to active. The problem is that shard split (ab)uses the CoreAdmin WaitForState action to ask the sub shard leader to wait until the replica states are active. This action is prone to timeout. We should make the shard state switching asynchronous. Once all replicas of all sub-shards are 'active', the shard states should be switched automatically.
[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.
[ https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790411#comment-13790411 ] Mark Miller commented on SOLR-1301: --- I have a new patch I'm cleaning up that tackles some of the packaging: * Split out solr-morphlines-core and solr-morphlines-cell into their own modules. * Updated to trunk, and the new modules are now using the new dependency version tracking system. * Fixed an issue in the code around the TokenStream contract being violated - the latest code detected this and failed a test - end and close are now called. * Updated to use Morphlines from CDK 0.8. * Set up the main class in the solr-mr jar manifest. * I enabled an ignored test which exposed a few bugs because of the required solr.xml in Solr 5.0 - I addressed those bugs. * Added a missing metrics health-check dependency that somehow popped up. * I played around with naming the solr-mr artifact MapReduceIndexTool.jar, but the system really wants us to follow the artifact rules and have something like solr-solr-mr-5.0.jar. Anything else has some random issues, such as with javadoc, and if your name does not start with solr-, it will be changed to start with lucene-. I'm not yet sure if it's worth the trouble to expand the system or use a different name, so for now it's still just using the default jar name based on the contrib module name (solr-mr). Besides the naming issue, there are a couple other things to button up: * How we are going to set up the classpath - script, in the manifest, leave it up to the user and doc, etc. * All dependencies are currently in solr-morphlines-core - this was a simple way to split out the modules since solr-mr and solr-morphlines-cell depend on solr-morphlines-core. Finally, we will probably need some help from [~steve_rowe] to get the Maven build set up correctly. I spent a bunch of time trying to use asm to work around the hacked test policy issue.
There are multiple problems I ran into. One is that another module uses asm 4.1, but Hadoop brings in asm 3.1 - if you are doing some asm coding, this can cause compile issues with your ide (at least eclipse). It also ends up being really hard to get an injection in the right place because of how the yarn code is structured. After spending a bunch of time trying to get this to work, I'm backing out and considering other options. Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce. - Key: SOLR-1301 URL: https://issues.apache.org/jira/browse/SOLR-1301 Project: Solr Issue Type: New Feature Reporter: Andrzej Bialecki Assignee: Mark Miller Fix For: 4.6 Attachments: commons-logging-1.0.4.jar, commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, log4j-1.2.15.jar, README.txt, SOLR-1301-hadoop-0-20.patch, SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SolrRecordWriter.java This patch contains a contrib module that provides distributed indexing (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is twofold: * provide an API that is familiar to Hadoop developers, i.e. that of OutputFormat * avoid unnecessary export and (de)serialization of data maintained on HDFS. SolrOutputFormat consumes data produced by reduce tasks directly, without storing it in intermediate files. Furthermore, by using an EmbeddedSolrServer, the indexing task is split into as many parts as there are reducers, and the data to be indexed is not sent over the network. Design -- Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, which in turn uses SolrRecordWriter to write this data. 
SolrRecordWriter instantiates an EmbeddedSolrServer, and it also instantiates an implementation of SolrDocumentConverter, which is responsible for turning Hadoop (key, value) into a SolrInputDocument. This data is then added to a batch, which is periodically submitted to EmbeddedSolrServer. When reduce task completes, and the OutputFormat is closed, SolrRecordWriter calls commit() and optimize() on the EmbeddedSolrServer. The API provides facilities to specify an arbitrary existing solr.home directory, from which the conf/ and lib/ files will be taken. This process results in the creation of as many partial Solr home directories as there were reduce tasks. The output shards are placed in the output directory on the default
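The batching flow described above (buffer documents, submit periodically, flush and commit on close) can be sketched like this, with a stub standing in for EmbeddedSolrServer; the class and method names here are illustrative, not the contrib's real ones:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the SolrRecordWriter batching pattern: documents are buffered
// and submitted in batches, and close() flushes the remainder and commits.
// DocSink is a stub for the role EmbeddedSolrServer plays in the design.
public class BatchingWriterSketch {
    public interface DocSink {
        void add(List<String> batch);   // index a batch of documents
        void commit();                  // make them searchable
    }

    private final DocSink sink;
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();

    public BatchingWriterSketch(DocSink sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    public void write(String doc) {     // called once per reduce output pair
        buffer.add(doc);
        if (buffer.size() >= batchSize) flush();
    }

    private void flush() {
        if (!buffer.isEmpty()) {
            sink.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public void close() {               // OutputFormat close: flush + commit
        flush();
        sink.commit();
    }

    public static void main(String[] args) {
        DocSink stdout = new DocSink() {
            public void add(List<String> batch) { System.out.println("add batch of " + batch.size()); }
            public void commit() { System.out.println("commit"); }
        };
        BatchingWriterSketch w = new BatchingWriterSketch(stdout, 100);
        for (int i = 0; i < 250; i++) w.write("doc" + i);
        w.close(); // add batch of 100, add batch of 100, add batch of 50, commit
    }
}
```

Because each reducer owns its own writer (and embedded server), no document crosses the network during indexing, which is the point of the design.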
[jira] [Commented] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.
[ https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790413#comment-13790413 ] ASF subversion and git services commented on LUCENE-5264: - Commit 1530651 from [~simonw] in branch 'dev/trunk' [ https://svn.apache.org/r1530651 ] LUCENE-5264: CommonTermsQuery ignores minMustMatch if only high freq terms are present CommonTermsQuery ignores minMustMatch if only high freq terms are present. -- Key: LUCENE-5264 URL: https://issues.apache.org/jira/browse/LUCENE-5264 Project: Lucene - Core Issue Type: Bug Components: modules/other Affects Versions: 5.0, 4.5 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.6 Attachments: LUCENE-5264.patch if we only have high freq terms we move to a pure conjunction and ignore the min must match entirely if it is 0.
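A toy model of the reported behavior (illustrative names and a string "query plan" output; this is not Lucene's API): when no low-frequency terms survive, the query degenerated into a pure conjunction, whereas honoring minMustMatch means emitting optional clauses with a minimum-should-match constraint:

```java
import java.util.List;

// Toy model of the CommonTermsQuery high-freq-only case. plan() returns a
// readable description: either a pure conjunction "AND(t1,t2)" (only valid
// when no minMustMatch is set) or a disjunction with a minimum-should-match
// constraint "OR(t1,t2)~mm".
public class HighFreqOnlySketch {
    public static String plan(List<String> highFreq, float minMustMatch) {
        String terms = String.join(",", highFreq);
        if (minMustMatch <= 0f) {
            return "AND(" + terms + ")";           // conjunction is fine here
        }
        int mm = Math.round(minMustMatch >= 1f
                ? minMustMatch                     // absolute clause count
                : minMustMatch * highFreq.size()); // ratio of the clauses
        return "OR(" + terms + ")~" + mm;          // honor minShouldMatch
    }
}
```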
[jira] [Commented] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.
[ https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790421#comment-13790421 ] ASF subversion and git services commented on LUCENE-5264: - Commit 1530657 from [~simonw] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530657 ] LUCENE-5264: CommonTermsQuery ignores minMustMatch if only high freq terms are present
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790425#comment-13790425 ] Markus Jelsma commented on SOLR-2548: - I'm having a hard time measuring performance differences with and without facet.threads. On my development machine there are no differences on warmed indexes; both measure around 1 ms. They're also almost identical after a stop/start of Jetty with no warm-up queries, around 40 ms; after that, fast again. We're faceting on four fields this time, and there are also four threads.
[jira] [Updated] (SOLR-5324) Make sub shard replica recovery and shard state switch asynchronous
[ https://issues.apache.org/jira/browse/SOLR-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-5324: Attachment: SOLR-5324.patch Changes: # A new shard state, 'recovery', is added. # After all sub-shard replicas have been created, the sub-shard state is set to 'recovery'. If the replication factor is 1, then the sub-shards are set to 'active'. The splitshard API returns at this point. # The state change events in the overseer are used to track when all replicas of all sub-shards become 'active'. Once that happens, the parent shard is set to 'inactive' and the sub-shards are set to 'active'. # To facilitate the above, a slice property called 'parent' is introduced, which is removed once the slice becomes 'active'. # If the split is retried, then we use the 'deleteshard' API to completely remove the sub-shards before starting the splitting process.
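The state flow in these changes can be sketched as a small model (illustrative only; not SolrCloud's actual Overseer code): sub-shards carry a 'parent' property while in 'recovery', and the switch runs only once every replica of every sub-shard is active:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the asynchronous shard-state switch: on each replica
// state-change event, check whether all replicas of all sub-shards of a
// parent are active; if so, flip parent -> INACTIVE and subs -> ACTIVE,
// dropping the 'parent' property as described in the patch notes.
public class ShardSwitchSketch {
    public enum State { ACTIVE, RECOVERY, INACTIVE }

    public static class Slice {
        public State state;
        public String parent;                      // null once the slice is active
        public List<State> replicas = new ArrayList<>();
        public Slice(State state, String parent) { this.state = state; this.parent = parent; }
    }

    // Called on state-change events; returns true when the switch ran.
    public static boolean maybeSwitch(Map<String, Slice> slices, String parentName) {
        List<Slice> subs = new ArrayList<>();
        for (Slice s : slices.values()) {
            if (parentName.equals(s.parent)) subs.add(s);
        }
        if (subs.isEmpty()) return false;
        for (Slice sub : subs) {
            for (State r : sub.replicas) {
                if (r != State.ACTIVE) return false;  // some replica still recovering
            }
        }
        slices.get(parentName).state = State.INACTIVE;
        for (Slice sub : subs) {
            sub.state = State.ACTIVE;
            sub.parent = null;                        // 'parent' property removed
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Slice> slices = new HashMap<>();
        slices.put("shard1", new Slice(State.ACTIVE, null));
        Slice sub = new Slice(State.RECOVERY, "shard1");
        sub.replicas.add(State.ACTIVE);
        slices.put("shard1_0", sub);
        System.out.println(maybeSwitch(slices, "shard1")); // true: only replica is active
    }
}
```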
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790435#comment-13790435 ] Markus Jelsma commented on SOLR-2548: - Alright, I took another index and faceted on many more fields, and now I see a small improvement after startup of about 12%. It is not much; perhaps this machine is too fast in this case. Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Fix For: 4.5, 5.0 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.
[ https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-5264. - Resolution: Fixed Lucene Fields: New,Patch Available (was: New) CommonTermsQuery ignores minMustMatch if only high freq terms are present. -- Key: LUCENE-5264 URL: https://issues.apache.org/jira/browse/LUCENE-5264 Project: Lucene - Core Issue Type: Bug Components: modules/other Affects Versions: 5.0, 4.5 Reporter: Simon Willnauer Assignee: Simon Willnauer Priority: Minor Fix For: 5.0, 4.6 Attachments: LUCENE-5264.patch if we only have high freq terms we move to a pure conjunction and ignore the min must match entirely if it is 0. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790452#comment-13790452 ] Erik Hatcher commented on SOLR-5323: This isn't specific to the clustering component, except that it gets loaded non-lazily. See these comments: https://issues.apache.org/jira/browse/SOLR-4708?focusedCommentId=13630567page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13630567 Solr requires -Dsolr.clustering.enabled=false when pointing at example config - Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 5.0, 4.6 my typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in solr 4.5.0 this fails to start successfully. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is because solr.clustering.enabled defaults to true now. I don't know why this might be the case. you can get around it with {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is when this became an issue. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5255) Make DocumentsWriter reference final in IW
[ https://issues.apache.org/jira/browse/LUCENE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790519#comment-13790519 ] ASF subversion and git services commented on LUCENE-5255: - Commit 1530679 from [~simonw] in branch 'dev/trunk' [ https://svn.apache.org/r1530679 ] LUCENE-5255: Make DocumentsWriter reference final in IW Make DocumentsWriter reference final in IW -- Key: LUCENE-5255 URL: https://issues.apache.org/jira/browse/LUCENE-5255 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 5.0, 4.6 Reporter: Simon Willnauer Fix For: 5.0, 4.6 Attachments: LUCENE-5255.patch the DocumentWriter ref is nulled on close which seems unnecessary altogether. We can just make it final instead. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5255) Make DocumentsWriter reference final in IW
[ https://issues.apache.org/jira/browse/LUCENE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790529#comment-13790529 ] ASF subversion and git services commented on LUCENE-5255: - Commit 1530685 from [~simonw] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530685 ] LUCENE-5255: Make DocumentsWriter reference final in IW Make DocumentsWriter reference final in IW -- Key: LUCENE-5255 URL: https://issues.apache.org/jira/browse/LUCENE-5255 Project: Lucene - Core Issue Type: Improvement Components: core/index Affects Versions: 5.0, 4.6 Reporter: Simon Willnauer Fix For: 5.0, 4.6 Attachments: LUCENE-5255.patch the DocumentWriter ref is nulled on close which seems unnecessary altogether. We can just make it final instead. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2548) Multithreaded faceting
[ https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790544#comment-13790544 ] David Smiley commented on SOLR-2548: Multithreaded faceting is useful when your CPU core count is much greater than the number of Solr cores you have, and you have a ton of data and need to facet on multiple fields. You could theoretically get similar results by sharding more but you should limit sharding based on disk IO capabilities (especially when there's so much it won't get in RAM), which isn't necessary one-for-one with the CPU count. Multithreaded faceting -- Key: SOLR-2548 URL: https://issues.apache.org/jira/browse/SOLR-2548 Project: Solr Issue Type: Improvement Components: search Affects Versions: 3.1 Reporter: Janne Majaranta Assignee: Erick Erickson Priority: Minor Labels: facet Fix For: 4.5, 5.0 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch Add multithreading support for faceting. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
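The idea in the comment above — one facet task per field, run concurrently when CPU cores outnumber Solr cores — can be sketched in plain Java. This is a toy illustration of the concurrency shape only, not Solr's actual faceting code; the document/field representation here is made up for the example:

```java
import java.util.*;
import java.util.concurrent.*;

public class ParallelFacets {
    // Count value frequencies for one field across all docs (a "facet").
    static Map<String, Integer> facet(List<Map<String, String>> docs, String field) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, String> doc : docs) {
            String v = doc.get(field);
            if (v != null) counts.merge(v, 1, Integer::sum);
        }
        return counts;
    }

    // Facet several fields concurrently, one task per field,
    // analogous to what a facet.threads-style option enables.
    static Map<String, Map<String, Integer>> facetAll(List<Map<String, String>> docs,
                                                      List<String> fields, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<String, Future<Map<String, Integer>>> futures = new LinkedHashMap<>();
            for (String f : fields) {
                futures.put(f, pool.submit(() -> facet(docs, f)));
            }
            Map<String, Map<String, Integer>> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Map<String, Integer>>> e : futures.entrySet()) {
                result.put(e.getKey(), e.getValue().get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Map<String, String>> docs = List.of(
            Map.of("color", "red", "size", "L"),
            Map.of("color", "red", "size", "M"),
            Map.of("color", "blue"));
        System.out.println(facetAll(docs, List.of("color", "size"), 2));
    }
}
```

As the comment notes, this only pays off when faceting work per request is large relative to the available cores; with small warmed indexes the thread-handoff overhead can erase the gain.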
[jira] [Commented] (LUCENE-5261) add simple API to build queries from analysis chain
[ https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790547#comment-13790547 ] ASF subversion and git services commented on LUCENE-5261: - Commit 1530693 from [~rcmuir] in branch 'dev/trunk' [ https://svn.apache.org/r1530693 ] LUCENE-5261: add simple API to build queries from analysis chain add simple API to build queries from analysis chain --- Key: LUCENE-5261 URL: https://issues.apache.org/jira/browse/LUCENE-5261 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch Currently this is pretty crazy stuff. Additionally its duplicated in like 3 or 4 places in our codebase (i noticed it doing LUCENE-5259) We can solve that duplication, and make it easy to simply create queries from an analyzer (its been asked on the user list), as well as make it easier to build new queryparsers. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790551#comment-13790551 ] Yonik Seeley commented on SOLR-5323: Hmmm, I agree this is a bug. My comment in SOLR-4708 was +1, provided that everything (except clustering) still works if you copy example somewhere else. Solr requires -Dsolr.clustering.enabled=false when pointing at example config - Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 5.0, 4.6 my typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in solr 4.5.0 this fails to start successfully. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is because solr.clustering.enabled defaults to true now. I don't know why this might be the case. you can get around it with {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is when this became an issue. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790568#comment-13790568 ] Erik Hatcher commented on SOLR-5323: bq. My comment in SOLR-4708 was +1, provided that everything (except clustering) still works if you copy example somewhere else. And that's the reason I didn't commit it before. I thought somehow Dawid had worked some magic to alleviate this issue when he took it on. We should perhaps have lazy loaded SearchComponents too? Solr requires -Dsolr.clustering.enabled=false when pointing at example config - Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 5.0, 4.6 my typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in solr 4.5.0 this fails to start successfully. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is because solr.clustering.enabled defaults to true now. I don't know why this might be the case. you can get around it with {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is when this became an issue. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5324) Make sub shard replica recovery and shard state switch asynchronous
[ https://issues.apache.org/jira/browse/SOLR-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shalin Shekhar Mangar updated SOLR-5324: Attachment: SOLR-5324.patch # On unsuccessful replica recovery, the sub-shard state was incorrectly being set active # The split by route field test should wait for the right collection to recover Make sub shard replica recovery and shard state switch asynchronous --- Key: SOLR-5324 URL: https://issues.apache.org/jira/browse/SOLR-5324 Project: Solr Issue Type: Improvement Components: SolrCloud Reporter: Shalin Shekhar Mangar Assignee: Shalin Shekhar Mangar Fix For: 5.0, 4.6 Attachments: SOLR-5324.patch, SOLR-5324.patch Currently the shard split command waits for all replicas of all sub shards to recover and then switches the state of parent to inactive and sub-shards to active. The problem is that shard split (ab)uses the CoreAdmin WaitForState action to ask the sub shard leader to wait until the replica states are active. This action is prone to timeout. We should make the shard state switching asynchronous. Once all replicas of all sub-shards are 'active', the shard states should be switched automatically. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5325) zk connection loss causes overseer leader loss
Christine Poerschke created SOLR-5325: - Summary: zk connection loss causes overseer leader loss Key: SOLR-5325 URL: https://issues.apache.org/jira/browse/SOLR-5325 Project: Solr Issue Type: Bug Reporter: Christine Poerschke -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5325) zk connection loss causes overseer leader loss
[ https://issues.apache.org/jira/browse/SOLR-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-5325: -- Description: The problem we saw was that when the solr overseer leader experienced temporary zk connectivity problems it stopped processing overseer queue events. This first happened when quorum within the external zk ensemble was lost due to too many zookeepers being stopped (similar to SOLR-5199). The second time it happened when there was a sufficient number of zookeepers but they were holding zookeeper leadership elections and thus refused connections (the elections were taking several seconds, we were using the default zookeeper.cnxTimeout=5s value and it was hit for one ensemble member). Affects Version/s: 4.3 4.4 zk connection loss causes overseer leader loss -- Key: SOLR-5325 URL: https://issues.apache.org/jira/browse/SOLR-5325 Project: Solr Issue Type: Bug Affects Versions: 4.3, 4.4 Reporter: Christine Poerschke The problem we saw was that when the solr overseer leader experienced temporary zk connectivity problems it stopped processing overseer queue events. This first happened when quorum within the external zk ensemble was lost due to too many zookeepers being stopped (similar to SOLR-5199). The second time it happened when there was a sufficient number of zookeepers but they were holding zookeeper leadership elections and thus refused connections (the elections were taking several seconds, we were using the default zookeeper.cnxTimeout=5s value and it was hit for one ensemble member). -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
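As a hedged illustration of the election-timeout point above: zookeeper.cnxTimeout is the JVM system property governing how long quorum members wait when opening leader-election connections (default 5 s, matching the report). One common way to raise it on each ensemble member is via SERVER_JVMFLAGS in the ZooKeeper environment script — the mechanism and the 20000 ms value here are illustrative assumptions, not a recommendation from the issue:

```
# conf/zookeeper-env.sh (or however JVM flags reach your ZooKeeper start script)
# 20000 ms is an illustrative value only
SERVER_JVMFLAGS="-Dzookeeper.cnxTimeout=20000"
```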
[jira] [Commented] (LUCENE-5261) add simple API to build queries from analysis chain
[ https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790580#comment-13790580 ] ASF subversion and git services commented on LUCENE-5261: - Commit 1530701 from [~rcmuir] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530701 ] LUCENE-5261: add simple API to build queries from analysis chain add simple API to build queries from analysis chain --- Key: LUCENE-5261 URL: https://issues.apache.org/jira/browse/LUCENE-5261 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch Currently this is pretty crazy stuff. Additionally its duplicated in like 3 or 4 places in our codebase (i noticed it doing LUCENE-5259) We can solve that duplication, and make it easy to simply create queries from an analyzer (its been asked on the user list), as well as make it easier to build new queryparsers. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5261) add simple API to build queries from analysis chain
[ https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5261. - Resolution: Fixed Fix Version/s: 4.6 5.0 add simple API to build queries from analysis chain --- Key: LUCENE-5261 URL: https://issues.apache.org/jira/browse/LUCENE-5261 Project: Lucene - Core Issue Type: New Feature Reporter: Robert Muir Fix For: 5.0, 4.6 Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch Currently this is pretty crazy stuff. Additionally its duplicated in like 3 or 4 places in our codebase (i noticed it doing LUCENE-5259) We can solve that duplication, and make it easy to simply create queries from an analyzer (its been asked on the user list), as well as make it easier to build new queryparsers. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790579#comment-13790579 ] Mark Miller commented on SOLR-5307: --- Ouch - this sounds like a pretty bad bug. Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger Labels: cloud, collection-api, zookeeper This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5325) zk connection loss causes overseer leader loss
[ https://issues.apache.org/jira/browse/SOLR-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christine Poerschke updated SOLR-5325: -- Attachment: SOLR-5325.patch Attaching an Overseer.java patch for Solr 4.4.0; OverseerCollectionProcessor.java could be changed in a similar way. zk connection loss causes overseer leader loss -- Key: SOLR-5325 URL: https://issues.apache.org/jira/browse/SOLR-5325 Project: Solr Issue Type: Bug Affects Versions: 4.3, 4.4 Reporter: Christine Poerschke Attachments: SOLR-5325.patch The problem we saw was that when the solr overseer leader experienced temporary zk connectivity problems it stopped processing overseer queue events. This first happened when quorum within the external zk ensemble was lost due to too many zookeepers being stopped (similar to SOLR-5199). The second time it happened when there was a sufficient number of zookeepers but they were holding zookeeper leadership elections and thus refused connections (the elections were taking several seconds, we were using the default zookeeper.cnxTimeout=5s value and it was hit for one ensemble member). -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs
[ https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790590#comment-13790590 ] Christine Poerschke commented on SOLR-5213: --- Two occurrences of lost documents were seen. The one with the majority of documents lost was tracked down to operational error (shardX files were copied to be shardY files); a second loss was of only a few dozen documents, and for that one we never figured out whether it was operational or something else. Other shard splits since then were fine, i.e. no losses. collections?action=SPLITSHARD parent vs. sub-shards numDocs --- Key: SOLR-5213 URL: https://issues.apache.org/jira/browse/SOLR-5213 Project: Solr Issue Type: Improvement Components: update Affects Versions: 4.4 Reporter: Christine Poerschke Assignee: Shalin Shekhar Mangar Attachments: SOLR-5213.patch The problem we saw was that splitting a shard took a long time and at the end of it the sub-shards contained fewer documents than the original shard. The root cause was eventually tracked down to the disappearing documents not falling into the hash ranges of the sub-shards. Could SolrIndexSplitter split report per-segment numDocs for parent and sub-shards, with at least a warning logged for any discrepancies (documents falling into none of the sub-shards or documents falling into several sub-shards)? Additionally, could a case be made for erroring out when discrepancies are detected, i.e. not proceeding with the shard split? Either to always error, or to have a verifyNumDocs=false/true optional parameter for the SPLITSHARD action. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
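The verification proposed in SOLR-5213 — flag any document whose hash falls into zero or several sub-shard ranges — can be sketched as a standalone audit. This is an illustrative sketch only, not SolrIndexSplitter's actual code; the Range class merely imitates the shape of Solr's hash ranges, and plain int hashes stand in for the real router hashes:

```java
import java.util.*;

public class SplitCheck {
    // Inclusive hash range, imitating the shape of a sub-shard's range.
    static class Range {
        final int min, max;
        Range(int min, int max) { this.min = min; this.max = max; }
        boolean includes(int hash) { return hash >= min && hash <= max; }
    }

    // For each document hash, count how many sub-shard ranges claim it.
    // Anything other than exactly one owner means a document would be
    // lost (0 owners) or duplicated (2+ owners) by the split.
    static Map<Integer, Integer> audit(int[] hashes, List<Range> subRanges) {
        Map<Integer, Integer> suspicious = new LinkedHashMap<>();
        for (int h : hashes) {
            int owners = 0;
            for (Range r : subRanges) {
                if (r.includes(h)) owners++;
            }
            if (owners != 1) suspicious.put(h, owners);
        }
        return suspicious;
    }

    public static void main(String[] args) {
        List<Range> subs = List.of(new Range(0, 49), new Range(50, 99));
        // 10 and 75 each land in exactly one sub-range; 120 in none.
        System.out.println(audit(new int[]{10, 75, 120}, subs)); // {120=0}
    }
}
```

An empty audit result is what a verifyNumDocs-style check would require before declaring the split safe; a non-empty one is exactly the warning-or-abort case the issue asks about.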
[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs
[ https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790608#comment-13790608 ] Shalin Shekhar Mangar commented on SOLR-5213: - I'm seeing similar problems as well on the ShardSplitTest sporadically. I've opened SOLR-5309 to track it. I'll review and commit your patch shortly. collections?action=SPLITSHARD parent vs. sub-shards numDocs --- Key: SOLR-5213 URL: https://issues.apache.org/jira/browse/SOLR-5213 Project: Solr Issue Type: Improvement Components: update Affects Versions: 4.4 Reporter: Christine Poerschke Assignee: Shalin Shekhar Mangar Attachments: SOLR-5213.patch The problem we saw was that splitting a shard took a long time and at the end of it the sub-shards contained fewer documents than the original shard. The root cause was eventually tracked down to the disappearing documents not falling into the hash ranges of the sub-shards. Could SolrIndexSplitter split report per-segment numDocs for parent and sub-shards, with at least a warning logged for any discrepancies (documents falling into none of the sub-shards or documents falling into several sub-shards)? Additionally, could a case be made for erroring out when discrepancies are detected i.e. not proceeding with the shard split? Either to always error or to have an verifyNumDocs=false/true optional parameter for the SPLITSHARD action. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790612#comment-13790612 ] Shalin Shekhar Mangar commented on SOLR-5307: - bq. Ouch - this sounds like a pretty bad bug. Yeah, SOLR-5317 too. Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger Labels: cloud, collection-api, zookeeper This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5307: - Assignee: Mark Miller Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger Assignee: Mark Miller Labels: cloud, collection-api, zookeeper This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5306: - Assignee: Mark Miller can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection: http://localhost:8080/solr/admin/collections?action=CREATEname=patent_main_1numShards=1collection.configName=patent. It shows errors: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs
[ https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790657#comment-13790657 ] Christine Poerschke commented on SOLR-5213: --- A variation of the patch i uploaded here would be to 'rescue' (and id+hash log) any documents that would have been lost otherwise e.g. always put them in the first sub-shard, they don't belong there but at least that way they are not lost and could be analysed and dealt with later on. collections?action=SPLITSHARD parent vs. sub-shards numDocs --- Key: SOLR-5213 URL: https://issues.apache.org/jira/browse/SOLR-5213 Project: Solr Issue Type: Improvement Components: update Affects Versions: 4.4 Reporter: Christine Poerschke Assignee: Shalin Shekhar Mangar Attachments: SOLR-5213.patch The problem we saw was that splitting a shard took a long time and at the end of it the sub-shards contained fewer documents than the original shard. The root cause was eventually tracked down to the disappearing documents not falling into the hash ranges of the sub-shards. Could SolrIndexSplitter split report per-segment numDocs for parent and sub-shards, with at least a warning logged for any discrepancies (documents falling into none of the sub-shards or documents falling into several sub-shards)? Additionally, could a case be made for erroring out when discrepancies are detected i.e. not proceeding with the shard split? Either to always error or to have an verifyNumDocs=false/true optional parameter for the SPLITSHARD action. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5326) admin/collections?action=SPLITSHARD support for multiple shards
Christine Poerschke created SOLR-5326: - Summary: admin/collections?action=SPLITSHARD support for multiple shards Key: SOLR-5326 URL: https://issues.apache.org/jira/browse/SOLR-5326 Project: Solr Issue Type: New Feature Affects Versions: 4.4 Reporter: Christine Poerschke The problem we saw was that splitting one shard took 'a long time' (around 4 hours); with 'many' (8 at the time) shards to split, and the solr overseer serialising action=SPLITSHARD requests, a full collection split would have taken 'a very long time'. Separately, having shard splitting distribute replica2, replica3, etc. of each shard randomly across machines was not desirable, and as in SOLR-5004, splitting into 'n' rather than '2' sub-shards was useful. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5263) Deletes may be silently lost if an IOException is hit and later not hit (e.g., disk fills up and then frees up)
[ https://issues.apache.org/jira/browse/LUCENE-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790677#comment-13790677 ] ASF subversion and git services commented on LUCENE-5263: - Commit 1530741 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1530741 ] LUCENE-5263: remove extra deleter.checkpoint Deletes may be silently lost if an IOException is hit and later not hit (e.g., disk fills up and then frees up) --- Key: LUCENE-5263 URL: https://issues.apache.org/jira/browse/LUCENE-5263 Project: Lucene - Core Issue Type: Bug Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.6, 5.0 Attachments: LUCENE-5263.patch, LUCENE-5263.patch This case is tricky to handle, yet I think realistic: disk fills up temporarily, causes an exception in writeLiveDocs, and then the app keeps using the IW instance. Meanwhile disk later frees up again, IW is closed successfully. In certain cases, we can silently lose deletes in this case. I had already committed TestIndexWriterDeletes.testNoLostDeletesOnDiskFull, and Jenkins seems happy with it so far, but when I added fangs to the test (cutover to RandomIndexWriter from IndexWriter, allow IOE during getReader, add randomness to when exc is thrown, etc.), it uncovered some real/nasty bugs: * ReaderPool.dropAll was suppressing any exception it hit, because {code}if (priorE != null){code} should instead be {code}if (priorE == null){code} * After a merge, we have to write deletes before committing the segment, because an exception when writing deletes means we need to abort the merge * Several places that were directly calling deleter.checkpoint must also increment the changeCount else on close IW thinks there are no changes and doesn't write a new segments file. * closeInternal was dropping pooled readers after writing the segments file, which would lose deletes still buffered due to a previous exc. 
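The first bug listed above — dropAll testing {{if (priorE != null)}} where it should test {{if (priorE == null)}} — is an instance of the standard keep-the-first-exception pattern. A minimal self-contained sketch of that pattern (not Lucene's actual ReaderPool code; the class and method names here are illustrative):

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseAll {
    // Close every resource, keeping the FIRST exception as primary and
    // attaching later ones as suppressed. The dropAll bug was the inverted
    // null check: testing priorE != null here silently discarded the first
    // (and usually only) failure.
    static void closeAll(Closeable... resources) throws IOException {
        IOException priorE = null;
        for (Closeable c : resources) {
            try {
                c.close();
            } catch (IOException e) {
                if (priorE == null) {        // correct: record only the first
                    priorE = e;
                } else {
                    priorE.addSuppressed(e); // later failures ride along
                }
            }
        }
        if (priorE != null) {
            throw priorE;                    // nothing thrown if all closed cleanly
        }
    }
}
```

With the inverted condition, the common single-failure case assigns nothing (priorE stays null), so the method returns normally and the caller never learns an exception was hit — matching the silent-suppression symptom described in the issue.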
-- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5268) Cutover more postings formats to the inverted pull API
Michael McCandless created LUCENE-5268: -- Summary: Cutover more postings formats to the inverted pull API Key: LUCENE-5268 URL: https://issues.apache.org/jira/browse/LUCENE-5268 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0 In LUCENE-5123, we added a new, more flexible, pull API for writing postings. This API allows the postings format to iterate the fields/terms/postings more than once, and mirrors the API for writing doc values. But that was just the first step (only SimpleText was cutover to the new API). I want to cutover more components, so we can (finally) e.g. play with different encodings depending on the term's postings, such as using a bitset for high freq DOCS_ONLY terms (LUCENE-5052).
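As a rough illustration of why the pull API matters (this sketch is not Lucene's FieldsConsumer API; all names here are made up): a pull-style codec can make one pass over a term's postings to measure density and a second pass to encode, e.g. choosing a bitset for high-freq DOCS_ONLY terms.

```java
import java.util.BitSet;
import java.util.List;

// Illustrative sketch (not Lucene's codec API): with a pull API the codec can
// iterate a term's postings twice -- first to measure density, then to encode.
// Very dense DOCS_ONLY terms can be stored as a bitset (one bit per doc);
// sparse terms as delta-coded ints.
class PostingsEncodingSketch {
    // Pass 1: decide the encoding from the term's document frequency.
    public static String chooseEncoding(List<Integer> docIds, int maxDoc) {
        return docIds.size() * 2 > maxDoc ? "bitset" : "deltas";
    }

    // Pass 2 (bitset case): encode the postings as one bit per document.
    public static BitSet toBitSet(List<Integer> docIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int doc : docIds) {
            bits.set(doc);
        }
        return bits;
    }
}
```

A push-only API hands the codec each posting exactly once, so this measure-then-encode split is impossible unless the codec buffers everything itself.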
[jira] [Updated] (LUCENE-5268) Cutover more postings formats to the inverted pull API
[ https://issues.apache.org/jira/browse/LUCENE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-5268: --- Attachment: LUCENE-5268.patch Patch with these changes: * Cutover BlockTreeTermsWriter, BlockTermsWriter, FST/OrdTermsWriter from PushFieldsConsumer to FieldsConsumer * Changed PostingsBaseWriter to a pull API, with a single method to write the current term's postings, and then added a new PushPostingsBaseWriter that has the push API. * Cutover some formats to new PostingsBaseWriter; pulsing and bloom were nice cleanups. For the rest I just switched them to PushPostingsBaseWriter. * Only two PushFieldsConsumers remain: MemoryPF and RAMOnlyPF (test-framework); I'm tempted to just cut those over and then remove PushFieldsConsumer here. Still a few nocommits but I think it's close ... Cutover more postings formats to the inverted pull API Key: LUCENE-5268 URL: https://issues.apache.org/jira/browse/LUCENE-5268 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 5.0 Attachments: LUCENE-5268.patch In LUCENE-5123, we added a new, more flexible, pull API for writing postings. This API allows the postings format to iterate the fields/terms/postings more than once, and mirrors the API for writing doc values. But that was just the first step (only SimpleText was cutover to the new API). I want to cutover more components, so we can (finally) e.g. play with different encodings depending on the term's postings, such as using a bitset for high freq DOCS_ONLY terms (LUCENE-5052). -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5248) Improve the data structure used in ReaderAndLiveDocs to hold the updates
[ https://issues.apache.org/jira/browse/LUCENE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-5248: --- Attachment: LUCENE-5248.patch Patch replaces MapNumericFieldUpdates with PackedNumericFieldUpdates which holds the docs/values data in PagedMutable and PagedGrowableWriter respectively. It also holds a FixedBitSet the size of maxDoc to mark which documents have a numeric value (e.g. for unsetting a value from a document). Improve the data structure used in ReaderAndLiveDocs to hold the updates Key: LUCENE-5248 URL: https://issues.apache.org/jira/browse/LUCENE-5248 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch Currently ReaderAndLiveDocs holds the updates in two structures: +Map<String,Map<Integer,Long>>+ Holds a mapping from each field, to all docs that were updated and their values. This structure is updated when applyDeletes is called, and needs to satisfy several requirements: # Un-ordered writes: if a field f is updated by two terms, termA and termB, in that order, and termA affects doc=100 and termB doc=2, then the updates are applied in that order, meaning we cannot rely on updates coming in order. # Same document may be updated multiple times, either by same term (e.g. several calls to IW.updateNDV) or by different terms. Last update wins. # Sequential read: when writing the updates to the Directory (fieldsConsumer), we iterate on the docs in-order and for each one check if it's updated and if not, pull its value from the current DV. # A single update may affect several million documents, therefore it needs to be efficient w.r.t. memory consumption. +Map<Integer,Map<String,Long>>+ Holds a mapping from a document, to all the fields that it was updated in and the updated value for each field.
This is used by IW.commitMergedDeletes to apply the updates that came in while the segment was merging. The requirements this structure needs to satisfy are: # Access in doc order: this is how commitMergedDeletes works. # One-pass: we visit a document once (currently) and so if we can, it's better if we know all the fields in which it was updated. The updates are applied to the merged ReaderAndLiveDocs (where they are stored in the first structure mentioned above). Comments with proposals will follow next.
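The docs/values-plus-FixedBitSet layout from the patch note can be sketched in a few lines (illustrative only; this is NOT the actual PackedNumericFieldUpdates, which pages its storage precisely because a flat array per field is too memory-hungry):

```java
import java.util.BitSet;

// Simplified sketch of the layout described above (NOT the actual
// PackedNumericFieldUpdates): per-field numeric values indexed by docID, plus
// a bitset marking which docs really have an updated value. Writes may arrive
// out of doc order and may hit the same doc more than once; last write wins.
class NumericUpdatesSketch {
    private final long[] values;   // the real patch uses PagedGrowableWriter here
    private final BitSet hasValue; // analogous to the FixedBitSet in the patch

    NumericUpdatesSketch(int maxDoc) {
        values = new long[maxDoc];
        hasValue = new BitSet(maxDoc);
    }

    // Requirements 1 and 2: un-ordered writes, last update wins.
    public void update(int doc, long value) {
        values[doc] = value;
        hasValue.set(doc);
    }

    // Requirement 3: sequential read -- for each doc in order, use the updated
    // value if present, otherwise fall back to the current doc-values value.
    public long valueOrDefault(int doc, long fallback) {
        return hasValue.get(doc) ? values[doc] : fallback;
    }
}
```

A flat `long[maxDoc]` fails requirement 4 (memory efficiency when an update touches millions of docs), which is why the patch moves to PagedMutable/PagedGrowableWriter.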
[jira] [Updated] (SOLR-5027) Result Set Collapse and Expand Plugins
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Attachment: SOLR-5027.patch Added support for the QueryElevationComponent and test case. Result Set Collapse and Expand Plugins -- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. Collapse based on the highest scoring document: {code} fq={!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax: expand=true - Turns on the expand component. expand.field=field - Expands results for this field expand.limit=5 - Limits the documents for each expanded group. expand.sort=sort spec - The sort spec for the expanded documents. Default is score. expand.rows=500 - The max number of expanded results to bring back. Default is 500.
*Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.
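The collapse semantics above are easy to state in plain code. A toy illustration (NOT Solr's implementation; the `Doc` shape here is made up): keep exactly one document per collapse-field value, chosen by the max of a numeric field.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of collapse-by-max (NOT Solr code; the Doc shape is made
// up): keep exactly one document per collapse-field value, choosing the doc
// with the highest value of a numeric field.
class CollapseSketch {
    static class Doc {
        public final String group; // the collapse field value
        public final long value;   // the numeric field used for max
        Doc(String group, long value) { this.group = group; this.value = value; }
    }

    public static Map<String, Doc> collapseByMax(List<Doc> docs) {
        Map<String, Doc> best = new HashMap<>();
        for (Doc d : docs) {
            Doc cur = best.get(d.group);
            if (cur == null || d.value > cur.value) {
                best.put(d.group, d); // this doc wins its group so far
            }
        }
        return best;
    }
}
```

Collapse-by-min is the same with the comparison flipped; the nullPolicy options decide what happens when the group value is null (drop the doc, keep it as its own group, or collapse all nulls together).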
[jira] [Resolved] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud
[ https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller resolved SOLR-5307. --- Resolution: Duplicate Solr 4.5 collection api ignores collection.configName when used in cloud Key: SOLR-5307 URL: https://issues.apache.org/jira/browse/SOLR-5307 Project: Solr Issue Type: Bug Affects Versions: 4.5 Reporter: Nathan Neulinger Assignee: Mark Miller Labels: cloud, collection-api, zookeeper This worked properly in 4.4, but on 4.5, specifying collection.configName when creating a collection doesn't work - it gets the default regardless of what has been uploaded into zk. Explicitly linking config name to collection ahead of time with zkcli.sh is a workaround I'm using for the moment, but that did not appear to be necessary with 4.4 unless I was doing something wrong and not realizing it. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5306: -- Attachment: SOLR-5306.patch can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical Attachments: SOLR-5306.patch I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection with http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows errors: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully.
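For reference, the CREATE call in this report is an ordinary HTTP GET with `&`-separated parameters; a small helper sketch (hypothetical, not part of Solr or SolrJ) that assembles such a query string:

```java
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Solr/SolrJ): assemble a collections-API URL
// with properly '&'-separated parameters. Host and names follow the
// reporter's example.
class CreateCollectionUrl {
    public static String build(String base, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(e.getKey() + "=" + e.getValue()); // values assumed URL-safe
        }
        return base + "?" + query;
    }
}
```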
[jira] [Updated] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5306: -- Fix Version/s: 5.0 4.6 4.5.1 can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical Fix For: 4.5.1, 4.6, 5.0 Attachments: SOLR-5306.patch I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection with http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows errors: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully.
[jira] [Updated] (SOLR-5317) CoreAdmin API is not persisting data properly
[ https://issues.apache.org/jira/browse/SOLR-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5317: -- Fix Version/s: 5.0 4.6 4.5.1 CoreAdmin API is not persisting data properly - Key: SOLR-5317 URL: https://issues.apache.org/jira/browse/SOLR-5317 Project: Solr Issue Type: Bug Reporter: Yago Riveiro Priority: Critical Fix For: 4.5.1, 4.6, 5.0 There is a regression between 4.4 and 4.5 in the CoreAdmin API: the command does not save its result to solr.xml at the time it is executed. The full process is described here: https://gist.github.com/yriveiro/6883208
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Summary: Field Collapsing PostFilter (was: Result Set Collapse and Expand Plugins) Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax: expand=true - Turns on the expand component. expand.field=field - Expands results for this field expand.limit=5 - Limits the documents for each expanded group. expand.sort=sort spec - The sort spec for the expanded documents. Default is score. expand.rows=500 - The max number of expanded results to bring back. Default is 500. 
*Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possibly moved to its own ticket.
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Description: This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes and ExpandComponent that expands the collapsed groups for the current search result page. This functionality will moved to it's own ticket. was: This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and the *ExpandComponent*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The *ExpandComponent* is a search component that takes the collapsed docList and expands the groups for a single page based on parameters provided. Initial syntax: expand=true - Turns on the expand component. expand.field=field - Expands results for this field expand.limit=5 - Limits the documents for each expanded group. expand.sort=sort spec - The sort spec for the expanded documents. Default is score. expand.rows=500 - The max number of expanded results to bring back. Default is 500. *Note:* Recent patches don't contain the expand component. The July 16 patch does. This will be brought back in when the collapse is finished, or possible moved to it's own ticket. Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Priority: Minor Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes
[jira] [Assigned] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein reassigned SOLR-5027: Assignee: Joel Bernstein Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.6, 5.0 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces the *CollapsingQParserPlugin*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milliseconds. Sample syntax: Collapse based on the highest scoring document: {code} fq={!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes an ExpandComponent that expands the collapsed groups for the current search result page. This functionality will be moved to its own ticket.
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Fix Version/s: 5.0 4.6 Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.6, 5.0 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces the *CollapsingQParserPlugin*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milliseconds. Sample syntax: Collapse based on the highest scoring document: {code} fq={!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes an ExpandComponent that expands the collapsed groups for the current search result page. This functionality will be moved to its own ticket.
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Description: This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The CollapsingQParserPlugin also fully supports the QueryElevationComponent *Note:* The July 16 patch also includes and ExpandComponent that expands the collapsed groups for the current search result page. This functionality will moved to it's own ticket. was: This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. 
For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. *Note:* The July 16 patch also includes and ExpandComponent that expands the collapsed groups for the current search result page. This functionality will moved to it's own ticket. Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.6, 5.0 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch This ticket introduces the *CollapsingQParserPlugin* The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. This is a high performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example in one performance test, a search with 10 million full results and 1 million collapsed groups: Standard grouping with ngroups : 17 seconds. CollapsingQParserPlugin: 300 milli-seconds. 
Sample syntax: Collapse based on the highest scoring document: {code} fq=(!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore : removes docs with a null value in the collapse field (default). expand : treats each doc with a null value in the collapse field as a separate group. collapse : collapses all docs with a null value into a single group using either highest score, or min/max. The CollapsingQParserPlugin also fully supports the QueryElevationComponent *Note:*
[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter
[ https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein updated SOLR-5027: - Description: This ticket introduces the *CollapsingQParserPlugin*. The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. It is a high-performance alternative to standard Solr field collapsing (with *ngroups*) when the number of distinct groups in the result set is high. For example, in one performance test of a search with 10 million full results and 1 million collapsed groups: standard grouping with ngroups took 17 seconds; the CollapsingQParserPlugin took 300 milliseconds. Sample syntax: Collapse based on the highest scoring document: {code} fq={!collapse field=field_name} {code} Collapse based on the min value of a numeric field: {code} fq={!collapse field=field_name min=field_name} {code} Collapse based on the max value of a numeric field: {code} fq={!collapse field=field_name max=field_name} {code} Collapse with a null policy: {code} fq={!collapse field=field_name nullPolicy=null_policy} {code} There are three null policies: ignore: removes docs with a null value in the collapse field (default); expand: treats each doc with a null value in the collapse field as a separate group; collapse: collapses all docs with a null value into a single group using either highest score or min/max. The CollapsingQParserPlugin also fully supports the QueryElevationComponent. *Note:* The July 16 patch also includes an ExpandComponent that expands the collapsed groups for the current search result page. This functionality will be moved to its own ticket.
Field Collapsing PostFilter --- Key: SOLR-5027 URL: https://issues.apache.org/jira/browse/SOLR-5027 Project: Solr Issue Type: New Feature Components: search Affects Versions: 5.0 Reporter: Joel Bernstein Assignee: Joel Bernstein Priority: Minor Fix For: 4.6, 5.0 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch
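The default collapse behavior described above (keep a single document per distinct value of the collapse field, chosen by highest score, with the nullPolicy=ignore default dropping null-valued docs) can be sketched in plain Java. This is an illustrative sketch only, not the actual PostFilter implementation; the class and field names are invented:

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified illustration of score-based field collapsing:
 *  keep only the highest-scoring doc per collapse-field value. */
public class CollapseSketch {
    public static class Doc {
        final int id;
        final String groupValue; // value of the collapse field (may be null)
        final float score;
        public Doc(int id, String groupValue, float score) {
            this.id = id; this.groupValue = groupValue; this.score = score;
        }
    }

    /** nullPolicy=ignore (the default): docs with a null collapse value are dropped. */
    public static Map<String, Doc> collapseByMaxScore(Iterable<Doc> docs) {
        Map<String, Doc> best = new HashMap<>();
        for (Doc d : docs) {
            if (d.groupValue == null) continue; // "ignore" null policy
            Doc cur = best.get(d.groupValue);
            if (cur == null || d.score > cur.score) {
                best.put(d.groupValue, d); // new best doc for this group
            }
        }
        return best;
    }
}
```

The reason this can beat grouping with ngroups is visible even in the sketch: one hash lookup per document replaces maintaining per-group result lists.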
[jira] [Commented] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790763#comment-13790763 ] ASF subversion and git services commented on SOLR-5306: --- Commit 1530772 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1530772 ] SOLR-5306: Extra collection creation parameters like collection.configName are not being respected. can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical Fix For: 4.5.1, 4.6, 5.0 Attachments: SOLR-5306.patch I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection: http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows the error: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5306) can not create collection when have over one config
[ https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790765#comment-13790765 ] ASF subversion and git services commented on SOLR-5306: --- Commit 1530773 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1530773 ] SOLR-5306: Extra collection creation parameters like collection.configName are not being respected. can not create collection when have over one config --- Key: SOLR-5306 URL: https://issues.apache.org/jira/browse/SOLR-5306 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.5 Environment: win7 jdk 7 Reporter: Liang Tianyu Assignee: Mark Miller Priority: Critical Fix For: 4.5.1, 4.6, 5.0 Attachments: SOLR-5306.patch I have uploaded two configs to ZooKeeper: patent and applicant. I cannot create a collection: http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent. It shows the error: patent_main_1_shard1_replica1: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection patent_main_1 found:[applicant, patent]. In Solr 4.4 I can create it successfully. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config
[ https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790788#comment-13790788 ] Dawid Weiss commented on SOLR-5323: --- I can't remember, but I think the problem was that it wasn't possible to define install-dir-relative directories for the lib element. I'll take a look. Solr requires -Dsolr.clustering.enabled=false when pointing at example config - Key: SOLR-5323 URL: https://issues.apache.org/jira/browse/SOLR-5323 Project: Solr Issue Type: Bug Components: contrib - Clustering Affects Versions: 4.5 Environment: vanilla mac Reporter: John Berryman Fix For: 4.6, 5.0 My typical use of Solr is something like this: {code} cd SOLR_HOME/example cp -r solr /myProjectDir/solr_home java -jar -Dsolr.solr.home=/myProjectDir/solr_home start.jar {code} But in Solr 4.5.0 this fails to start successfully. I get an error: {code} org.apache.solr.common.SolrException: Error loading class 'solr.clustering.ClusteringComponent' {code} The reason is that solr.clustering.enabled now defaults to true. I don't know why this might be the case. You can work around it with: {code} java -jar -Dsolr.solr.home=/myProjectDir/solr_home -Dsolr.clustering.enabled=false start.jar {code} SOLR-4708 is when this became an issue. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 400 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/400/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: first posInc must be 0 Stack Trace: java.lang.IllegalStateException: first posInc must be 0 at __randomizedtesting.SeedInfo.seed([D025BEA04DE60E8F:EDC497C10AF4134F]:0) at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:89) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:694) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506) at org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:925) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:616) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559) at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358) at java.lang.Thread.run(Thread.java:679) Build Log: [...truncated 4359 lines...] [junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains [junit4] 2 TEST FAIL: useCharFilter=false text='\ucd6f\u8537\uab05d\uf3cd qkt \u0136'
[jira] [Commented] (LUCENE-5248) Improve the data structure used in ReaderAndLiveDocs to hold the updates
[ https://issues.apache.org/jira/browse/LUCENE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790834#comment-13790834 ] Robert Muir commented on LUCENE-5248: - Hi Shai: should UpdatesIterator implement DISI? It seems like it might be a good fit. {code} +private final FixedBitSet docsWithField; +private PagedMutable docs; +private PagedGrowableWriter values; {code} When we have multiple related structures like this, maybe we can add a comment as to what each is? Something like: {code} // bit per docid: set if the value is real // TODO: is bitset(maxdoc) really needed since usually its sparse? why not an openbitset parallel with docs? private final FixedBitSet docsWithField; // holds a list of documents. // TODO: do these really need to be absolute-encoded? private PagedMutable docs; // holds a list of values, parallel with docs private PagedGrowableWriter values; {code} {code} + docsWithField = new FixedBitSet(maxDoc); + docsWithField.clear(0, maxDoc); {code} The clear should be unnecessary! {code} +public void add(int doc, Long value) { + assert value != null; + if (size == Integer.MAX_VALUE) { +throw new IllegalStateException("cannot support more than Integer.MAX_VALUE doc/value entries"); + } {code} Is this really a limitation? {code} +@Override +protected int compare(int i, int j) { + return (int) (docs.get(i) - docs.get(j)); +} {code} Can we just use Long.compare? This subtraction may be safe... but it would smell better. 
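The subtraction-vs-Long.compare point above can be made concrete with a small standalone demo (illustrative only, not part of the patch): with small non-negative doc ids the subtraction happens to be safe, but in general the difference of two longs can overflow, and casting it to int can even report two distinct values as equal.

```java
public class CompareDemo {
    public static void main(String[] args) {
        long a = Long.MAX_VALUE, b = -1;
        // a - b overflows to Long.MIN_VALUE; its low 32 bits are all zero,
        // so the int cast claims the two values are "equal".
        int bySubtraction = (int) (a - b);
        // Long.compare never overflows and reports the correct ordering.
        int byCompare = Long.compare(a, b);
        System.out.println(bySubtraction + " " + byCompare);
    }
}
```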
Improve the data structure used in ReaderAndLiveDocs to hold the updates Key: LUCENE-5248 URL: https://issues.apache.org/jira/browse/LUCENE-5248 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch Currently ReaderAndLiveDocs holds the updates in two structures: +Map<String,Map<Integer,Long>>+ Holds a mapping from each field to all docs that were updated and their values. This structure is updated when applyDeletes is called, and needs to satisfy several requirements: # Un-ordered writes: if a field f is updated by two terms, termA and termB, in that order, and termA affects doc=100 and termB doc=2, then the updates are applied in that order, meaning we cannot rely on updates coming in order. # Same document may be updated multiple times, either by the same term (e.g. several calls to IW.updateNDV) or by different terms. Last update wins. # Sequential read: when writing the updates to the Directory (fieldsConsumer), we iterate over the docs in order and for each one check if it's updated; if not, we pull its value from the current DV. # A single update may affect several million documents, and therefore needs to be efficient w.r.t. memory consumption. +Map<Integer,Map<String,Long>>+ Holds a mapping from a document to all the fields it was updated in and the updated value for each field. This is used by IW.commitMergedDeletes to apply the updates that came in while the segment was merging. The requirements this structure needs to satisfy are: # Access in doc order: this is how commitMergedDeletes works. # One-pass: we visit a document once (currently), so if we can, it's better if we know all the fields in which it was updated. The updates are applied to the merged ReaderAndLiveDocs (where they are stored in the first structure mentioned above). Comments with proposals will follow next. 
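The requirements above (un-ordered writes, last-update-wins, and sequential read) can be sketched with plain parallel arrays. This is a simplified illustration of the idea, not the actual patch (which uses packed structures like PagedMutable/PagedGrowableWriter for memory efficiency); the class name is invented:

```java
import java.util.Arrays;

/** Sketch: doc ids and values kept in parallel arrays; sorted by doc id
 *  before sequential consumption; for duplicate doc ids the later write wins. */
public class NumericUpdatesSketch {
    private long[] docs = new long[8];   // doc ids, parallel with values
    private long[] values = new long[8]; // updated values
    private int size;

    /** Records an update; writes may arrive in any doc order. */
    public void add(int doc, long value) {
        if (size == docs.length) {
            docs = Arrays.copyOf(docs, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        docs[size] = doc;
        values[size] = value;
        size++;
    }

    /** Sorts by doc id, stably: ties keep insertion order, so on duplicate
     *  doc ids the last update ends up last and wins. */
    public void sortByDoc() {
        // pack (doc id, insertion index) into one long so a plain sort is stable
        long[] packed = new long[size];
        for (int i = 0; i < size; i++) {
            packed[i] = (docs[i] << 32) | i;
        }
        Arrays.sort(packed);
        long[] newDocs = new long[size], newValues = new long[size];
        for (int i = 0; i < size; i++) {
            int src = (int) packed[i];        // low 32 bits: original position
            newDocs[i] = packed[i] >>> 32;    // high 32 bits: doc id
            newValues[i] = values[src];
        }
        docs = newDocs;
        values = newValues;
    }

    /** Returns the effective value for doc, or null if it was never updated. */
    public Long get(int doc) {
        Long result = null;
        for (int i = 0; i < size; i++) {
            if (docs[i] == doc) result = values[i]; // last write wins
        }
        return result;
    }
}
```

After sortByDoc() a consumer can walk the arrays front to back, satisfying the "sequential read" requirement without ever needing the updates to have arrived in doc order.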
-- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 400 - Still Failing
I will investigate. looks like fun. On Wed, Oct 9, 2013 at 4:18 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/400/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: first posInc must be 0
[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 900 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/900/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 10176 lines...] [junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_40.jdk/Contents/Home/jre/bin/java -XX:+UseCompressedOops -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=EE974D36626AC16B -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. -Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Dtests.disableHdfs=true -Dfile.encoding=ISO-8859-1 -classpath
Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 900 - Failure!
malloc/free bug. On Wed, Oct 9, 2013 at 4:47 PM, Policeman Jenkins Server jenk...@thetaphi.de wrote: Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/900/ Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 10176 lines...] [junit4] ERROR: JVM J0 ended with an exception
[jira] [Created] (LUCENE-5269) TestRandomChains failure
Robert Muir created LUCENE-5269: --- Summary: TestRandomChains failure Key: LUCENE-5269 URL: https://issues.apache.org/jira/browse/LUCENE-5269 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir One of EdgeNGramTokenizer, ShingleFilter, NGramTokenFilter is buggy, or possibly only the combination of them conspiring together. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5269) TestRandomChains failure
[ https://issues.apache.org/jira/browse/LUCENE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5269: Attachment: LUCENE-5269_test.patch Here's a test. For whatever reason the exact text in Jenkins wouldn't reproduce with checkAnalysisConsistency with the exact configuration. However, the random seed reproduces in Jenkins easily. I suspect maybe something isn't being reset and the linedocs file is triggering it? If I blast random data at the configuration it fails the same way. I then removed various harmless filters and so on until I was left with these three and it was still failing... TestRandomChains failure Key: LUCENE-5269 URL: https://issues.apache.org/jira/browse/LUCENE-5269 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Attachments: LUCENE-5269_test.patch One of EdgeNGramTokenizer, ShingleFilter, NGramTokenFilter is buggy, or possibly only the combination of them conspiring together. -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5270) add Terms.hasFreqs
Michael McCandless created LUCENE-5270: -- Summary: add Terms.hasFreqs Key: LUCENE-5270 URL: https://issues.apache.org/jira/browse/LUCENE-5270 Project: Lucene - Core Issue Type: Improvement Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.6, 5.0 While working on LUCENE-5268, I realized we have hasPositions/Offsets/Payloads methods in Terms but not hasFreqs ... -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-5317) CoreAdmin API is not persisting data properly
[ https://issues.apache.org/jira/browse/SOLR-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller reassigned SOLR-5317: - Assignee: Mark Miller CoreAdmin API is not persisting data properly - Key: SOLR-5317 URL: https://issues.apache.org/jira/browse/SOLR-5317 Project: Solr Issue Type: Bug Reporter: Yago Riveiro Assignee: Mark Miller Priority: Critical Fix For: 4.5.1, 4.6, 5.0 There is a regression between 4.4 and 4.5 with the CoreAdmin API, the command doesn't save the result on solr.xml at time that is executed. The full process is describe here: https://gist.github.com/yriveiro/6883208 -- This message was sent by Atlassian JIRA (v6.1#6144) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org