[jira] [Commented] (LUCENE-5260) Make older Suggesters more accepting of TermFreqPayloadIterator

2013-10-09 Thread Areek Zillur (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790075#comment-13790075
 ] 

Areek Zillur commented on LUCENE-5260:
--

Hey Michael, I was thinking about how to nicely replace TermFreqIterator.
 - I was thinking about having some kind of wrapper for TermFreqPayloadIterator 
that would nullify the payload field for the current TermFreqIterator consumers, 
plus a way for the wrapper to signal early on to consumers that they don't 
need to deal with the payload at all.
 - Also, it seems like there are a lot of implementations of TermFreqIterator 
(e.g. BufferedTermFreqIteratorWrapper, SortedTermFreqIteratorWrapper); I will 
make sure all of these implementations work with TermFreqPayloadIterator and 
its new wrapper (for mimicking TermFreqIterator).

Any thoughts? I will try to come up with a rough patch soon.
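For illustration, here is a rough sketch of the wrapper idea (the class and method names are hypothetical, not from an actual patch, and the real TermFreqPayloadIterator interface may differ slightly):

{code}
import java.io.IOException;
import java.util.Comparator;
import org.apache.lucene.util.BytesRef;

// Hypothetical sketch: presents a payload-carrying iterator to consumers
// that behave like the old TermFreqIterator and must ignore payloads.
public final class PayloadHidingWrapper implements TermFreqPayloadIterator {
  private final TermFreqPayloadIterator delegate;

  public PayloadHidingWrapper(TermFreqPayloadIterator delegate) {
    this.delegate = delegate;
  }

  @Override
  public BytesRef next() throws IOException {
    return delegate.next();
  }

  @Override
  public Comparator<BytesRef> getComparator() {
    return delegate.getComparator();
  }

  @Override
  public long weight() {
    return delegate.weight();
  }

  @Override
  public BytesRef payload() {
    return null; // nullify the payload for old-style consumers
  }

  // lets consumers find out early that they never need to handle payloads
  public boolean hasPayloads() {
    return false;
  }
}
{code}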

 Make older Suggesters more accepting of TermFreqPayloadIterator
 ---

 Key: LUCENE-5260
 URL: https://issues.apache.org/jira/browse/LUCENE-5260
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Reporter: Areek Zillur

 As discussed in https://issues.apache.org/jira/browse/LUCENE-5251, it would 
 be nice to make the older suggesters accepting of TermFreqPayloadIterator and 
 throw an exception if payload is found (if it cannot be used). 
 This will also allow us to nuke most of the other interfaces for 
 BytesRefIterator. 



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5251) New Dictionary Implementation for Suggester consumption

2013-10-09 Thread Areek Zillur (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790079#comment-13790079
 ] 

Areek Zillur commented on LUCENE-5251:
--

Thanks for committing the patch, Michael! 

 New Dictionary Implementation for Suggester consumption
 ---

 Key: LUCENE-5251
 URL: https://issues.apache.org/jira/browse/LUCENE-5251
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/search
Reporter: Areek Zillur
 Fix For: 5.0, 4.6

 Attachments: LUCENE-5251.patch, LUCENE-5251.patch, LUCENE-5251.patch, 
 LUCENE-5251.patch


 With the vast array of new suggesters, it would be nice to have a dictionary 
 implementation that could feed the suggesters terms, weights and (optionally) 
 payloads from the Lucene index.
 The idea of this dictionary implementation is to grab stored documents from 
 the index and use user-configured fields for terms, weights and payloads.
 use-case: If you have a document with three fields 
- product_id
- product_name
- product_popularity_score
 then using this implementation would enable you to have a suggester for 
 product_name, weighted by product_popularity_score, that returns the payload 
 of product_id, which you can then process further (for example, to construct 
 a URL).
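 As a rough sketch of the intended usage (the class committed for this issue was DocumentDictionary; treat the exact constructor shape as an assumption that may differ by version):

 {code}
 import java.io.File;
 import org.apache.lucene.index.DirectoryReader;
 import org.apache.lucene.index.IndexReader;
 import org.apache.lucene.search.suggest.DocumentDictionary;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.store.FSDirectory;

 public class ProductDictionaryExample {
   public static DocumentDictionary openDictionary() throws Exception {
     Directory dir = FSDirectory.open(new File("/path/to/index")); // illustrative path
     IndexReader reader = DirectoryReader.open(dir);
     // terms come from product_name, weights from product_popularity_score,
     // payloads from product_id
     return new DocumentDictionary(reader,
         "product_name", "product_popularity_score", "product_id");
   }
 }
 {code}

 The dictionary's iterator can then be handed to a suggester's build() call.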



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5320) Multi level compositeId router

2013-10-09 Thread Anshum Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anshum Gupta updated SOLR-5320:
---

Remaining Estimate: 336h
 Original Estimate: 336h

 Multi level compositeId router
 --

 Key: SOLR-5320
 URL: https://issues.apache.org/jira/browse/SOLR-5320
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Anshum Gupta
   Original Estimate: 336h
  Remaining Estimate: 336h

 This would enable multi-level routing, as compared to the 2-level routing 
 available as of now. On the usage bit, here's an example:
 Document Id: myapp!dummyuser!doc
 myapp!dummyuser! can be used as the shard key for searching content for 
 dummyuser.
 myapp! can be used for searching across all users of myapp.
 I am looking at either 3 (or 4) level routing. The 32-bit hash would then 
 comprise 8 bits from each part (8x4, in the 4-level case).



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5320) Multi level compositeId router

2013-10-09 Thread Anshum Gupta (JIRA)
Anshum Gupta created SOLR-5320:
--

 Summary: Multi level compositeId router
 Key: SOLR-5320
 URL: https://issues.apache.org/jira/browse/SOLR-5320
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud
Reporter: Anshum Gupta


This would enable multi-level routing, as compared to the 2-level routing 
available as of now. On the usage bit, here's an example:

Document Id: myapp!dummyuser!doc
myapp!dummyuser! can be used as the shard key for searching content for 
dummyuser.
myapp! can be used for searching across all users of myapp.

I am looking at either 3 (or 4) level routing. The 32-bit hash would then 
comprise 8 bits from each part (8x4, in the 4-level case).
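To make the bit layout concrete, here is an illustrative sketch of composing a 32-bit hash from up to four '!'-separated parts, 8 bits per part. This is not Solr's actual implementation (which uses MurmurHash3); String.hashCode() stands in to keep the sketch self-contained:

{code}
public class CompositeHashSketch {
  // Illustrative only: take the top 8 bits of each part's hash,
  // highest-level part first.
  static int compositeHash(String id) {
    String[] parts = id.split("!", 4);
    int hash = 0;
    for (int i = 0; i < 4; i++) {
      int h = i < parts.length ? parts[i].hashCode() : 0;
      hash |= ((h >>> 24) & 0xFF) << (24 - 8 * i);
    }
    return hash;
  }

  public static void main(String[] args) {
    // documents sharing the myapp!dummyuser! prefix share the top 16 bits,
    // so they hash into the same shard range
    System.out.printf("%08x%n", compositeHash("myapp!dummyuser!doc"));
  }
}
{code}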



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Comment Edited] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13789958#comment-13789958
 ] 

Littlestar edited comment on LUCENE-5267 at 10/9/13 6:54 AM:
-

{noformat}
public static int decompress(DataInput compressed, int decompressedLen, byte[] dest, int dOff) throws IOException {
    final int destEnd = dest.length;

    do {
      // ...

      // copying a multiple of 8 bytes can make decompression from 5% to 10% faster
      final int fastLen = (matchLen + 7) & 0xFFFFFFF8;
      if (matchDec < matchLen || dOff + fastLen > destEnd) {
        // overlap -> naive incremental copy
        for (int ref = dOff - matchDec, end = dOff + matchLen; dOff < end; ++ref, ++dOff) {
          dest[dOff] = dest[ref];
        }
      } else {
        // no overlap -> arraycopy
        try {
          System.arraycopy(dest, dOff - matchDec, dest, dOff, fastLen);
        } catch (Throwable e) {
          System.out.println("dest.length=" + dest.length + ",dOff=" + dOff +
              ",matchDec=" + matchDec + ",matchLen=" + matchLen + ",fastLen=" + fastLen);
        }
        dOff += matchLen;
      }
    } while (dOff < decompressedLen);

    return dOff;
}
{noformat}



was (Author: cnstar9988):
{noformat}
public static int decompress(DataInput compressed, int decompressedLen, byte[] dest, int dOff) throws IOException {
    final int destEnd = dest.length;

    do {
      // ...

      // copying a multiple of 8 bytes can make decompression from 5% to 10% faster
      final int fastLen = (matchLen + 7) & 0xFFFFFFF8;
      if (matchDec < matchLen || dOff + fastLen > destEnd) {
        // overlap -> naive incremental copy
        for (int ref = dOff - matchDec, end = dOff + matchLen; dOff < end; ++ref, ++dOff) {
          dest[dOff] = dest[ref];
        }
      } else {
        // no overlap -> arraycopy
        // System.out.println("dest.length=" + dest.length + ",dOff=" + dOff + ",matchDec=" + matchDec + ",fastLen=" + fastLen);
        System.arraycopy(dest, dOff - matchDec, dest, dOff, fastLen); // here throws java.lang.ArrayIndexOutOfBoundsException
        dOff += matchLen;
      }
    } while (dOff < decompressedLen);

    return dOff;
}
{noformat}


 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5294) Pluggable Dictionary Implementation for Suggester

2013-10-09 Thread Areek Zillur (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790105#comment-13790105
 ] 

Areek Zillur commented on SOLR-5294:


Thanks for reviewing this, Robert!

{quote}
Should we think about fixing the spellchecker stuff too (which seems to have 
totally separate implementations like FileBased and so on) to just change the 
dictionary?
{quote}
This is an interesting point! After looking through the 
AbstractLuceneSpellChecker and all its implementations, it seems like it would 
be better to refactor those out too. I feel that should be considered for the 
dictionaryImpl setting to work as expected. 

{quote}
I am not sure if we want to keep spell and suggest entangled?
{quote}
It does make sense to untangle them, but I think that by itself is a bigger 
issue (I will open an issue for that and will be happy to work on it).

{quote}
Should we name the DictionaryFactoryBase something better (SuggestDictionary? 
SpellingDictionary?)
{quote}
Given the situation, it seems like the dictionary plugin will be shared among 
both suggest and spelling; maybe call it DictionaryFactory?

{quote}
Maybe we can simplify the base plugin class to suit more use cases, like remove 
the setCore() and just check if it implements CoreAware interface?
{quote}
That sounds good to me.

{quote}
I think it would be ideal if we could eliminate the additional hierarchy of 
FileBased* and IndexBased*: couldn't the FileBased impl just take its filename 
via a parameter in params, and IndexBased take its fieldname in params the 
same way, and we push up create(IndexSearcher) to the base plugin class (the 
file-based one just wouldn't use the IndexSearcher argument)?
{quote}
The reason for having the hierarchy was to separate out the two major types of 
dictionaries (index-based and file-based). I can change that, but at the cost 
of reduced enforcement.

I will upload another patch, incorporating your feedback!
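For reference, a minimal sketch of the kind of shared base plugin being discussed (all names hypothetical, not the committed API):

{code}
import org.apache.lucene.search.spell.Dictionary;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.core.SolrCore;
import org.apache.solr.search.SolrIndexSearcher;

// Hypothetical shared base for suggest/spelling dictionary plugins.
public abstract class DictionaryFactory {
  protected NamedList params; // settings from solrconfig.xml, e.g. a field or file name

  public void init(NamedList params) {
    this.params = params;
  }

  // File-based implementations would simply ignore the searcher argument.
  public abstract Dictionary create(SolrCore core, SolrIndexSearcher searcher);
}
{code}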


 Pluggable Dictionary Implementation for Suggester
 -

 Key: SOLR-5294
 URL: https://issues.apache.org/jira/browse/SOLR-5294
 Project: Solr
  Issue Type: Improvement
  Components: SearchComponents - other
Reporter: Areek Zillur
 Attachments: SOLR-5294.patch, SOLR-5294.patch


 It would be nice to have the option of plugging in Dictionary implementations 
 for the suggester to consume, like the existing lookupImpl setting that 
 allows users to specify which Lucene suggester to use. 
 This would allow easy addition of new dictionary implementations that the 
 Lucene suggesters can consume; new dictionary implementations like 
 https://issues.apache.org/jira/browse/LUCENE-5251 could be easily added. I 
 believe this would give users more control over what they want their Lucene 
 suggesters to consume.
 For the implementation, the user can add a new setting in the spellcheck 
 component in the solrconfig. The new setting would be a string identifying 
 the class path of the dictionary implementation to be used (very similar to 
 the existing lookupImpl). This setting would be used to call the relevant 
 DictionaryFactory.
 A sample solrconfig file would look as follows (note the new dictionaryImpl 
 setting):
 {code}
 <searchComponent class="solr.SpellCheckComponent" name="fuzzy_suggest_analyzing_with_lucene_dict">
   <lst name="spellchecker">
     <str name="name">fuzzy_suggest_analyzing_with_lucene_dict</str>
     <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
     <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FuzzyLookupFactory</str>
     <str name="dictionaryImpl">org.apache.solr.spelling.suggest.LuceneDictionaryFactory</str> <!-- new setting -->
     <str name="storeDir">fuzzy_suggest_analyzing</str>
     <str name="buildOnCommit">false</str>
     <!-- Suggester properties -->
     <bool name="exactMatchFirst">true</bool>
     <str name="suggestAnalyzerFieldType">text</str>
     <bool name="preserveSep">false</bool>
     <str name="fields">text</str>
   </lst>
 </searchComponent>
 {code}
  



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand reassigned LUCENE-5267:


Assignee: Adrien Grand

 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
Assignee: Adrien Grand
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790108#comment-13790108
 ] 

Adrien Grand commented on LUCENE-5267:
--

Thanks for the report. Can you check if there are disk-related issues in your 
system logs and share the .fdx and .fdt files of the broken segment?

 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790110#comment-13790110
 ] 

Adrien Grand commented on LUCENE-5267:
--

Can you also confirm that you are using Lucene42StoredFieldsFormat in your 
hybaseStd42x codec (and not, e.g., a customized CompressingStoredFieldsFormat)?

 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
Assignee: Adrien Grand
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790117#comment-13790117
 ] 

Littlestar commented on LUCENE-5267:


dOff - matchDec < 0, so it throws java.lang.ArrayIndexOutOfBoundsException:
dest.length=33288,dOff=3184,matchDec=34510,matchLen=15,fastLen=16
dest.length=33288,dOff=3213,matchDec=34724,matchLen=9,fastLen=16
dest.length=33288,dOff=3229,matchDec=45058,matchLen=12,fastLen=16
dest.length=33288,dOff=3255,matchDec=20482,matchLen=9,fastLen=16
dest.length=33288,dOff=3275,matchDec=26122,matchLen=12,fastLen=16
dest.length=33288,dOff=3570,matchDec=35228,matchLen=6,fastLen=8



 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
Assignee: Adrien Grand
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790131#comment-13790131
 ] 

Littlestar commented on LUCENE-5267:


{code}
// Lucene42Codec + LZ4
public final class Hybase42StandardCodec extends FilterCodec {
    public Hybase42StandardCodec() {
        super("hybaseStd42x", new Lucene42Codec());
    }
}
{code}

bq. disk-related issues in your system logs and share the .fdx and .fdt files of the broken segment
too big (5G)

 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
Assignee: Adrien Grand
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790134#comment-13790134
 ] 

Littlestar commented on LUCENE-5267:


When the ArrayIndexOutOfBoundsException is swallowed (caught and ignored), 
CheckIndex then reports:
ERROR [Invalid vLong detected (negative values disallowed)]
java.lang.RuntimeException: Invalid vLong detected (negative values disallowed)
        at org.apache.lucene.store.ByteArrayDataInput.readVLong(ByteArrayDataInput.java:152)
        at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:342)
        at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
        at org.apache.lucene.index.IndexReader.document(IndexReader.java:436)
        at org.apache.lucene.index.CheckIndex.testStoredFields(CheckIndex.java:1268)
        at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:626)
        at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:1903)
    test: term vectors........OK [0 total vector count; avg 0 term/freq vector fields per doc]
    test: docvalues...........OK [12 docvalues fields; 7 BINARY; 3 NUMERIC; 2 SORTED; 0 SORTED_SET]

 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
Assignee: Adrien Grand
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Need help regarding Boolean queries with queryparser

2013-10-09 Thread Devi pulaparti
In our search application, queries like "test & usage" do not return
correct results, but a query like "test AND usage" works fine.
We are using QueryParser with StandardAnalyzer. Could someone please help me?


RE: Need help regarding Boolean queries with queryparser

2013-10-09 Thread Uwe Schindler
Hi,

 

you have to write your own query parser. Look e.g. at the flexible query parser 
module, which can be customized.

 

Uwe

 

-

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: u...@thetaphi.de

 

From: Devi pulaparti [mailto:pvkd...@gmail.com] 
Sent: Wednesday, October 09, 2013 9:50 AM
To: dev@lucene.apache.org
Subject: Need help regarding Boolean queries with queryparser

 

In our search application, queries like "test & usage" do not return 
correct results, but a query like "test AND usage" works fine. We are using 
QueryParser with StandardAnalyzer. Could someone please help me?



Re: Need help regarding Boolean queries with queryparser

2013-10-09 Thread Devi pulaparti
Hi Uwe,
thanks a lot for the quick reply.
I am very new to Lucene. Could you please shed some light on the capabilities
of QueryParser? Why do we need a flexible query parser module for the symbol
"&" to work? Doesn't QueryParser handle this?
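For illustration (not from this thread): the classic QueryParser does not treat "&" as an operator, and StandardAnalyzer drops it, so "test & usage" parses as two optional clauses under the default OR operator. A minimal sketch of one workaround, making AND the default operator (Lucene 4.x class names):

{code}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

public class AndByDefault {
  public static void main(String[] args) throws Exception {
    QueryParser parser = new QueryParser(Version.LUCENE_44, "body",
        new StandardAnalyzer(Version.LUCENE_44));
    parser.setDefaultOperator(QueryParser.Operator.AND); // require all terms
    Query q = parser.parse("test usage");
    System.out.println(q); // +body:test +body:usage
  }
}
{code}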



On Wed, Oct 9, 2013 at 1:24 PM, Uwe Schindler u...@thetaphi.de wrote:

 Hi,


 you have to write your own query parser. Look e.g. at the flexible query
 parser module, which can be customized.


 Uwe


 -

 Uwe Schindler

 H.-H.-Meier-Allee 63, D-28213 Bremen

 http://www.thetaphi.de

 eMail: u...@thetaphi.de


 From: Devi pulaparti [mailto:pvkd...@gmail.com]
 Sent: Wednesday, October 09, 2013 9:50 AM
 To: dev@lucene.apache.org
 Subject: Need help regarding Boolean queries with queryparser

 In our search application, queries like "test & usage" do not return
 correct results, but a query like "test AND usage" works fine.
 We are using QueryParser with StandardAnalyzer. Could someone please help me?



[jira] [Resolved] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand resolved LUCENE-5267.
--

Resolution: Not A Problem

bq. dOff - matchDec < 0, so it throws java.lang.ArrayIndexOutOfBoundsException
bq. dest.length=33288,dOff=3184,matchDec=34510,matchLen=15,fastLen=16

Indeed, the lines you pasted make no sense, since matchDec should be lower 
than dOff. To me this really looks like your index got corrupted somehow. It 
could be a single corrupt byte that makes LZ4 read a length on 2 bytes instead 
of 1, and this shift makes LZ4 try to decompress bytes that make no sense at 
all, explaining why all the matchDecs are higher than dOff.

There are likely only a few chunks that are broken so if you want to try to get 
back as many documents as possible from the corrupt segment, the following 
piece of code may help https://gist.github.com/jpountz/6461246
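For background on why one bad byte shifts everything that follows, here is a sketch of LZ4-style length decoding, modeled on (not copied from) Lucene's LZ4.java:

{code}
import java.io.IOException;
import org.apache.lucene.store.DataInput;

public class Lz4LenSketch {
  // When the 4-bit length nibble is 15, extra bytes follow, each adding up
  // to 255, until a byte != 0xFF. A corrupt byte here changes how many bytes
  // the length occupies, so every later value (matchDec included) is read
  // from the wrong position.
  static int readLen(DataInput in, int nibble) throws IOException {
    int len = nibble;
    if (nibble == 0x0F) {
      byte next;
      do {
        next = in.readByte();
        len += next & 0xFF;
      } while (next == (byte) 0xFF);
    }
    return len;
  }
}
{code}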

 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
Assignee: Adrien Grand
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen

2013-10-09 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790153#comment-13790153
 ] 

Shalin Shekhar Mangar commented on SOLR-5319:
-

The doc router stored in the collection zk node is not used anywhere. We should 
just remove that code.

 Collection ZK nodes do not reflect the correct router chosen
 

 Key: SOLR-5319
 URL: https://issues.apache.org/jira/browse/SOLR-5319
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Shalin Shekhar Mangar
  Labels: solrcloud, zookeeper

 In ZkController.createCollectionZkNode, the doc router is determined by this 
 code snippet:
 if (collectionProps.get(DocCollection.DOC_ROUTER) == null) {
   Object numShards = collectionProps.get(ZkStateReader.NUM_SHARDS_PROP);
   if (numShards == null) {
     numShards = System.getProperty(ZkStateReader.NUM_SHARDS_PROP);
   }
   if (numShards == null) {
     collectionProps.put(DocCollection.DOC_ROUTER, ImplicitDocRouter.NAME);
   } else {
     collectionProps.put(DocCollection.DOC_ROUTER, DocRouter.DEFAULT_NAME);
   }
 }
 Since OverseerCollectionProcessor never passes on any params prefixed with 
 "collection." other than collection.configName in its create core commands, 
 collectionProps.get(DocCollection.DOC_ROUTER) will never be non-null. Thus, 
 it needs to figure out whether the router is implicit or compositeId based 
 on whether numShards is passed in. However, 
 collectionProps.get(ZkStateReader.NUM_SHARDS_PROP) will also always be null, 
 for the same reason collectionProps.get(DocCollection.DOC_ROUTER) is null, 
 and it isn't explicitly set in the code above, so the only way for numShards 
 not to be null is if it's passed in as a system property.
 As an example, here's a cluster state that was created with the compositeId 
 router, but whose collection ZK node says it's implicit:
 in clusterstate.json:
 "example":{
   "shards":{"shard1":{
     "range":"8000-7fff",
     "state":"active",
     "replicas":{"core_node1":{
       "state":"active",
       "core":"example_shard1_replica1",
       "node_name":"localhost:8983_solr",
       "base_url":"http://localhost:8983/solr",
       "leader":"true"}}}},
   "router":"compositeId"},
 in /collections/example data:
 {
   "configName":"myconf",
   "router":"implicit"}
 I'm not sure if the collection ZK node router info is actually used 
 anywhere, so it may not matter, but it's confusing.
 I think the best fix is for OverseerCollectionProcessor to pass on params 
 prefixed with "collection." to the core creation requests. Otherwise, 
 ZkController.createCollectionZkNode can explicitly set numShards in 
 collectionProps from cd.getNumShards() too.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-3069) Lucene should have an entirely memory resident term dictionary

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790156#comment-13790156
 ] 

ASF subversion and git services commented on LUCENE-3069:
-

Commit 1530520 from [~billy] in branch 'dev/trunk'
[ https://svn.apache.org/r1530520 ]

LUCENE-3069: add CHANGES, move new postingsformats to oal.codecs

 Lucene should have an entirely memory resident term dictionary
 --

 Key: LUCENE-3069
 URL: https://issues.apache.org/jira/browse/LUCENE-3069
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index, core/search
Affects Versions: 4.0-ALPHA
Reporter: Simon Willnauer
Assignee: Han Jiang
  Labels: gsoc2013
 Fix For: 4.6

 Attachments: df-ttf-estimate.txt, example.png, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, LUCENE-3069.patch, 
 LUCENE-3069.patch


 The FST-based TermDictionary has been a great improvement, yet it still uses 
 a delta codec file for scanning to terms. Some environments have enough 
 memory available to keep the entire FST-based term dict in memory. We should 
 add a TermDictionary implementation that encodes all needed information for 
 each term into the FST (custom fst.Output) and builds an FST from the entire 
 term, not just the delta.
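 For a feel of what building an FST keyed on the entire term looks like, a small self-contained example using the 4.x util.fst API (treat the details as approximate):

 {code}
 import org.apache.lucene.util.BytesRef;
 import org.apache.lucene.util.IntsRef;
 import org.apache.lucene.util.fst.Builder;
 import org.apache.lucene.util.fst.FST;
 import org.apache.lucene.util.fst.PositiveIntOutputs;
 import org.apache.lucene.util.fst.Util;

 public class WholeTermFst {
   public static void main(String[] args) throws Exception {
     PositiveIntOutputs outputs = PositiveIntOutputs.getSingleton();
     Builder<Long> builder = new Builder<Long>(FST.INPUT_TYPE.BYTE1, outputs);
     IntsRef scratch = new IntsRef();
     // terms must be added in sorted order; the whole term is the FST input
     builder.add(Util.toIntsRef(new BytesRef("lucene"), scratch), 42L);
     builder.add(Util.toIntsRef(new BytesRef("solr"), scratch), 7L);
     FST<Long> fst = builder.finish();
     System.out.println(Util.get(fst, new BytesRef("lucene"))); // prints 42
   }
 }
 {code}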



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790157#comment-13790157
 ] 

ASF subversion and git services commented on SOLR-5319:
---

Commit 1530521 from sha...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1530521 ]

SOLR-5319: Remove unused and incorrect router name from Collection ZK nodes

 Collection ZK nodes do not reflect the correct router chosen
 

 Key: SOLR-5319
 URL: https://issues.apache.org/jira/browse/SOLR-5319
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Shalin Shekhar Mangar
  Labels: solrcloud, zookeeper
 Fix For: 5.0, 4.6


 In ZkController.createCollectionZkNode, the doc router is determined by this 
 code snippet:
 if (collectionProps.get(DocCollection.DOC_ROUTER) == null) {
   Object numShards = collectionProps.get(ZkStateReader.NUM_SHARDS_PROP);
   if (numShards == null) {
     numShards = System.getProperty(ZkStateReader.NUM_SHARDS_PROP);
   }
   if (numShards == null) {
     collectionProps.put(DocCollection.DOC_ROUTER, ImplicitDocRouter.NAME);
   } else {
     collectionProps.put(DocCollection.DOC_ROUTER, DocRouter.DEFAULT_NAME);
   }
 }
 Since OverseerCollectionProcessor never passes on any params prefixed with 
 "collection." other than collection.configName in its create core commands, 
 collectionProps.get(DocCollection.DOC_ROUTER) will never be non-null. Thus, 
 it needs to figure out whether the router is implicit or compositeId based 
 on whether numShards is passed in. However, 
 collectionProps.get(ZkStateReader.NUM_SHARDS_PROP) will also always be null, 
 for the same reason collectionProps.get(DocCollection.DOC_ROUTER) is null, 
 and it isn't explicitly set in the code above, so the only way for numShards 
 not to be null is if it's passed in as a system property.
 As an example, here's a cluster state that was created with the compositeId 
 router, but whose collection ZK node says it's implicit:
 in clusterstate.json:
 "example":{
   "shards":{"shard1":{
     "range":"8000-7fff",
     "state":"active",
     "replicas":{"core_node1":{
       "state":"active",
       "core":"example_shard1_replica1",
       "node_name":"localhost:8983_solr",
       "base_url":"http://localhost:8983/solr",
       "leader":"true"}}}},
   "router":"compositeId"},
 in /collections/example data:
 {
   "configName":"myconf",
   "router":"implicit"}
 I'm not sure if the collection ZK node router info is actually used 
 anywhere, so it may not matter, but it's confusing.
 I think the best fix is for OverseerCollectionProcessor to pass on params 
 prefixed with "collection." to the core creation requests. Otherwise, 
 ZkController.createCollectionZkNode can explicitly set numShards in 
 collectionProps from cd.getNumShards() too.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790158#comment-13790158
 ] 

ASF subversion and git services commented on SOLR-5319:
---

Commit 1530523 from sha...@apache.org in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1530523 ]

SOLR-5319: Remove unused and incorrect router name from Collection ZK nodes

 Collection ZK nodes do not reflect the correct router chosen
 

 Key: SOLR-5319
 URL: https://issues.apache.org/jira/browse/SOLR-5319
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Shalin Shekhar Mangar
  Labels: solrcloud, zookeeper
 Fix For: 5.0, 4.6


 In ZkController.createCollectionZkNode, the doc router is determined by this 
 code snippet:
 if (collectionProps.get(DocCollection.DOC_ROUTER) == null) {
   Object numShards = collectionProps.get(ZkStateReader.NUM_SHARDS_PROP);
   if (numShards == null) {
     numShards = System.getProperty(ZkStateReader.NUM_SHARDS_PROP);
   }
   if (numShards == null) {
     collectionProps.put(DocCollection.DOC_ROUTER, ImplicitDocRouter.NAME);
   } else {
     collectionProps.put(DocCollection.DOC_ROUTER, DocRouter.DEFAULT_NAME);
   }
 }
 Since OverseerCollectionProcessor never passes on any params prefixed with 
 "collection." other than collection.configName in its create core commands, 
 collectionProps.get(DocCollection.DOC_ROUTER) will never be non-null. Thus, 
 it needs to figure out whether the router is implicit or compositeId based 
 on whether numShards is passed in. However, 
 collectionProps.get(ZkStateReader.NUM_SHARDS_PROP) will also always be null, 
 for the same reason collectionProps.get(DocCollection.DOC_ROUTER) is null, 
 and it isn't explicitly set in the code above, so the only way for numShards 
 not to be null is if it's passed in as a system property.
 As an example, here's a cluster state that was created with the compositeId 
 router, but whose collection ZK node says it's implicit:
 in clusterstate.json:
 "example":{
   "shards":{"shard1":{
     "range":"8000-7fff",
     "state":"active",
     "replicas":{"core_node1":{
       "state":"active",
       "core":"example_shard1_replica1",
       "node_name":"localhost:8983_solr",
       "base_url":"http://localhost:8983/solr",
       "leader":"true"}}}},
   "router":"compositeId"},
 in /collections/example data:
 {
   "configName":"myconf",
   "router":"implicit"}
 I'm not sure if the collection ZK node router info is actually used 
 anywhere, so it may not matter, but it's confusing.
 I think the best fix is for OverseerCollectionProcessor to pass on params 
 prefixed with "collection." to the core creation requests. Otherwise, 
 ZkController.createCollectionZkNode can explicitly set numShards in 
 collectionProps from cd.getNumShards() too.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5319) Collection ZK nodes do not reflect the correct router chosen

2013-10-09 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-5319.
-

   Resolution: Fixed
Fix Version/s: 4.6
   5.0

 Collection ZK nodes do not reflect the correct router chosen
 

 Key: SOLR-5319
 URL: https://issues.apache.org/jira/browse/SOLR-5319
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.4, 4.5, 5.0
Reporter: Jessica Cheng
Assignee: Shalin Shekhar Mangar
  Labels: solrcloud, zookeeper
 Fix For: 5.0, 4.6


 In ZkController.createCollectionZkNode, the doc router is determined by this 
 code snippet:
 if (collectionProps.get(DocCollection.DOC_ROUTER) == null) {
   Object numShards = collectionProps.get(ZkStateReader.NUM_SHARDS_PROP);
   if (numShards == null) {
     numShards = System.getProperty(ZkStateReader.NUM_SHARDS_PROP);
   }
   if (numShards == null) {
     collectionProps.put(DocCollection.DOC_ROUTER, ImplicitDocRouter.NAME);
   } else {
     collectionProps.put(DocCollection.DOC_ROUTER, DocRouter.DEFAULT_NAME);
   }
 }
 Since OverseerCollectionProcessor never passes on any params prefixed with 
 "collection." other than collection.configName in its create core commands, 
 collectionProps.get(DocCollection.DOC_ROUTER) will never be non-null. Thus, 
 it needs to figure out whether the router is implicit or compositeId based 
 on whether numShards is passed in. However, 
 collectionProps.get(ZkStateReader.NUM_SHARDS_PROP) will also always be null, 
 for the same reason collectionProps.get(DocCollection.DOC_ROUTER) is null, 
 and it isn't explicitly set in the code above, so the only way for numShards 
 not to be null is if it's passed in as a system property.
 As an example, here's a cluster state that was created with the compositeId 
 router, but whose collection ZK node says it's implicit:
 in clusterstate.json:
 "example":{
   "shards":{"shard1":{
     "range":"8000-7fff",
     "state":"active",
     "replicas":{"core_node1":{
       "state":"active",
       "core":"example_shard1_replica1",
       "node_name":"localhost:8983_solr",
       "base_url":"http://localhost:8983/solr",
       "leader":"true"}}}},
   "router":"compositeId"},
 in /collections/example data:
 {
   "configName":"myconf",
   "router":"implicit"}
 I'm not sure if the collection ZK node router info is actually used 
 anywhere, so it may not matter, but it's confusing.
 I think the best fix is for OverseerCollectionProcessor to pass on params 
 prefixed with "collection." to the core creation requests. Otherwise, 
 ZkController.createCollectionZkNode can explicitly set numShards in 
 collectionProps from cd.getNumShards() too.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5318) create command doesn't take into account the transient core property

2013-10-09 Thread olivier soyez (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

olivier soyez updated SOLR-5318:


Description: 
The create core admin command doesn't take the transient core property into 
account when the core is registered (so the core will never be closed by the 
transient core cache).

To reproduce:
set transientCacheSize=2 and start with no cores

Create 3 cores:
curl 
"http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"

Look at the status:
http://ip:port/solr/admin/cores?action=STATUS

All cores are still loaded.
One core should not be loaded (closed by the transient cache).

  was:
The create core admin command doesn't take the transient core property into 
account when the core is registered (so the core will never be closed by the 
transient core cache).



 create command doesn't take into account the transient core property
 --

 Key: SOLR-5318
 URL: https://issues.apache.org/jira/browse/SOLR-5318
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.6
Reporter: olivier soyez
Assignee: Erick Erickson
Priority: Minor
 Attachments: SOLR-5318.patch


 The create core admin command doesn't take the transient core property into 
 account when the core is registered (so the core will never be closed by 
 the transient core cache).
 To reproduce:
 set transientCacheSize=2 and start with no cores
 Create 3 cores:
 curl 
 "http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"
 Look at the status:
 http://ip:port/solr/admin/cores?action=STATUS
 All cores are still loaded.
 One core should not be loaded (closed by the transient cache).



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5318) create command doesn't take into account the transient core property

2013-10-09 Thread olivier soyez (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

olivier soyez updated SOLR-5318:


Description: 
The create core admin command doesn't take the transient core property into 
account when the core is registered (so the core will never be closed by the 
transient core cache).

To reproduce:
set transientCacheSize=2 and start with no cores

Create 3 cores:
curl 
"http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"

Look at the status:
http://ip:port/solr/admin/cores?action=STATUS

All cores are still loaded.
One core should not be loaded (closed by the transient cache)

  was:
The create core admin command doesn't take the transient core property into 
account when the core is registered (so the core will never be closed by the 
transient core cache).

To reproduce:
set transientCacheSize=2 and start with no cores

Create 3 cores:
curl 
"http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"

Look at the status:
http://ip:port/solr/admin/cores?action=STATUS

All cores are still loaded.
One core should not be loaded (closed by the transient cache).


 create command doesn't take into account the transient core property
 --

 Key: SOLR-5318
 URL: https://issues.apache.org/jira/browse/SOLR-5318
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.6
Reporter: olivier soyez
Assignee: Erick Erickson
Priority: Minor
 Attachments: SOLR-5318.patch


 The create core admin command doesn't take the transient core property into 
 account when the core is registered (so the core will never be closed by 
 the transient core cache).
 To reproduce:
 set transientCacheSize=2 and start with no cores
 Create 3 cores:
 curl 
 "http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"
 Look at the status:
 http://ip:port/solr/admin/cores?action=STATUS
 All cores are still loaded.
 One core should not be loaded (closed by the transient cache)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5318) create command doesn't take into account the transient core property

2013-10-09 Thread olivier soyez (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790169#comment-13790169
 ] 

olivier soyez commented on SOLR-5318:
-

We are using in production solr 4.2.1, but I also test solr 4.4 and the svn 
solr branch_4X : same issue
I complete the description and the way to reproduce the issue
Not correlated with SOLR-4862

 create command doesn't take into account the transient core property
 --

 Key: SOLR-5318
 URL: https://issues.apache.org/jira/browse/SOLR-5318
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.6
Reporter: olivier soyez
Assignee: Erick Erickson
Priority: Minor
 Attachments: SOLR-5318.patch


 The create core admin command doesn't take the transient core property into 
 account when the core is registered (so the core will never be closed by 
 the transient core cache).
 To reproduce:
 set transientCacheSize=2 and start with no cores
 Create 3 cores:
 curl 
 "http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"
 Look at the status:
 http://ip:port/solr/admin/cores?action=STATUS
 All cores are still loaded.
 One core should not be loaded (closed by the transient cache)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5318) create command doesn't take into account the transient core property

2013-10-09 Thread olivier soyez (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

olivier soyez updated SOLR-5318:


Affects Version/s: 4.4

 create command doesn't take into account the transient core property
 --

 Key: SOLR-5318
 URL: https://issues.apache.org/jira/browse/SOLR-5318
 Project: Solr
  Issue Type: Bug
  Components: multicore
Affects Versions: 4.4, 4.6
Reporter: olivier soyez
Assignee: Erick Erickson
Priority: Minor
 Attachments: SOLR-5318.patch


 The create core admin command doesn't take the transient core property into 
 account when the core is registered (so the core will never be closed by 
 the transient core cache).
 To reproduce:
 set transientCacheSize=2 and start with no cores
 Create 3 cores:
 curl 
 "http://ip:port/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instanceDir_coreX&dataDir=path_to_dataDir_coreX&loadOnStartup=false&transient=true"
 Look at the status:
 http://ip:port/solr/admin/cores?action=STATUS
 All cores are still loaded.
 One core should not be loaded (closed by the transient cache)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent

2013-10-09 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created SOLR-5321:
---

 Summary: Overseer.updateState tries to use router name from 
message but none is sent
 Key: SOLR-5321
 URL: https://issues.apache.org/jira/browse/SOLR-5321
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.5
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 5.0, 4.6


Overseer.updateSlice method has the following code:

{code}
String router = message.getStr(OverseerCollectionProcessor.ROUTER, DocRouter.DEFAULT_NAME);
List<String> shardNames = new ArrayList<String>();

// collection does not yet exist, create placeholders if num shards is specified
boolean collectionExists = state.getCollections().contains(collection);
if (!collectionExists && numShards != null) {
  if (ImplicitDocRouter.NAME.equals(router)) {
    getShardNames(shardNames, message.getStr("shards", null));
    numShards = shardNames.size();
  } else {
    getShardNames(numShards, shardNames);
  }
  state = createCollection(state, collection, shardNames, message);
}
{code}

Here it tries to read the router name from the message. Even if we ignore that 
the key used to look up the router is wrong here, the router name is never sent 
in a state message.

Considering that we don't even support creating a collection with the implicit 
router from the command line, we should stop expecting the parameter.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1530537 - in /lucene/dev/trunk/lucene: common-build.xml ivy-settings.xml

2013-10-09 Thread Robert Muir
Thanks for updating this!

I think we should merge this back to branch 4.x too? This way the
source code tar.gz is working from China for our next release?

2013/10/9  h...@apache.org:
 Author: han
 Date: Wed Oct  9 08:56:15 2013
 New Revision: 1530537

 URL: http://svn.apache.org/r1530537
 Log:
 update broken links for maven mirror

 Modified:
 lucene/dev/trunk/lucene/common-build.xml
 lucene/dev/trunk/lucene/ivy-settings.xml

 Modified: lucene/dev/trunk/lucene/common-build.xml
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/common-build.xml?rev=1530537&r1=1530536&r2=1530537&view=diff
 ==
 --- lucene/dev/trunk/lucene/common-build.xml (original)
 +++ lucene/dev/trunk/lucene/common-build.xml Wed Oct  9 08:56:15 2013
 @@ -360,7 +360,7 @@
    <property name="ivy_install_path" location="${user.home}/.ant/lib" />
    <property name="ivy_bootstrap_url1" value="http://repo1.maven.org/maven2"/>
    <!-- you might need to tweak this from china so it works -->
 -  <property name="ivy_bootstrap_url2" value="http://mirror.netcologne.de/maven2"/>
 +  <property name="ivy_bootstrap_url2" value="http://uk.maven.org/maven2"/>
    <property name="ivy_checksum_sha1" value="c5ebf1c253ad4959a29f4acfe696ee48cdd9f473"/>

    <target name="ivy-availability-check" unless="ivy.available">

 Modified: lucene/dev/trunk/lucene/ivy-settings.xml
 URL: 
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/ivy-settings.xml?rev=1530537&r1=1530536&r2=1530537&view=diff
 ==
 --- lucene/dev/trunk/lucene/ivy-settings.xml (original)
 +++ lucene/dev/trunk/lucene/ivy-settings.xml Wed Oct  9 08:56:15 2013
 @@ -35,7 +35,7 @@
   <ibiblio name="maven.restlet.org" root="http://maven.restlet.org" m2compatible="true" />

   <!-- you might need to tweak this from china so it works -->
 -  <ibiblio name="working-chinese-mirror" root="http://mirror.netcologne.de/maven2" m2compatible="true" />
 +  <ibiblio name="working-chinese-mirror" root="http://uk.maven.org/maven2" m2compatible="true" />

   <!-- temporary to try Clover 3.2.0 snapshots, see 
 https://issues.apache.org/jira/browse/LUCENE-5243, 
 https://jira.atlassian.com/browse/CLOV-1368 -->
   <ibiblio name="atlassian-clover-snapshots" root="https://maven.atlassian.com/content/repositories/atlassian-public-snapshot" m2compatible="true" />



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: svn commit: r1530537 - in /lucene/dev/trunk/lucene: common-build.xml ivy-settings.xml

2013-10-09 Thread Han Jiang
oh, yes, I'll do that!


On Wed, Oct 9, 2013 at 5:17 PM, Robert Muir rcm...@gmail.com wrote:

 Thanks for updating this!

 I think we should merge this back to branch 4.x too? This way the
 source code tar.gz is working from China for our next release?

 2013/10/9  h...@apache.org:
  Author: han
  Date: Wed Oct  9 08:56:15 2013
  New Revision: 1530537
 
  URL: http://svn.apache.org/r1530537
  Log:
  update broken links for maven mirror
 
  Modified:
  lucene/dev/trunk/lucene/common-build.xml
  lucene/dev/trunk/lucene/ivy-settings.xml
 
  Modified: lucene/dev/trunk/lucene/common-build.xml
  URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/common-build.xml?rev=1530537r1=1530536r2=1530537view=diff
 
 ==
  --- lucene/dev/trunk/lucene/common-build.xml (original)
  +++ lucene/dev/trunk/lucene/common-build.xml Wed Oct  9 08:56:15 2013
  @@ -360,7 +360,7 @@
  <property name="ivy_install_path" location="${user.home}/.ant/lib" />
  <property name="ivy_bootstrap_url1" value="http://repo1.maven.org/maven2"/>
  <!-- you might need to tweak this from china so it works -->
  -  <property name="ivy_bootstrap_url2" value="http://mirror.netcologne.de/maven2"/>
  +  <property name="ivy_bootstrap_url2" value="http://uk.maven.org/maven2"/>
  <property name="ivy_checksum_sha1" value="c5ebf1c253ad4959a29f4acfe696ee48cdd9f473"/>
 
  <target name="ivy-availability-check" unless="ivy.available">
 
  Modified: lucene/dev/trunk/lucene/ivy-settings.xml
  URL:
 http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/ivy-settings.xml?rev=1530537r1=1530536r2=1530537view=diff
 
 ==
  --- lucene/dev/trunk/lucene/ivy-settings.xml (original)
  +++ lucene/dev/trunk/lucene/ivy-settings.xml Wed Oct  9 08:56:15 2013
  @@ -35,7 +35,7 @@
   <ibiblio name="maven.restlet.org" root="http://maven.restlet.org" m2compatible="true" />
 
   <!-- you might need to tweak this from china so it works -->
  -    <ibiblio name="working-chinese-mirror" root="http://mirror.netcologne.de/maven2" m2compatible="true" />
  +    <ibiblio name="working-chinese-mirror" root="http://uk.maven.org/maven2" m2compatible="true" />
 
   <!-- temporary to try Clover 3.2.0 snapshots, see https://issues.apache.org/jira/browse/LUCENE-5243, https://jira.atlassian.com/browse/CLOV-1368 -->
   <ibiblio name="atlassian-clover-snapshots" root="https://maven.atlassian.com/content/repositories/atlassian-public-snapshot" m2compatible="true" />
 
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




[jira] [Commented] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.

2013-10-09 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790203#comment-13790203
 ] 

Adrien Grand commented on LUCENE-5264:
--

+1

 CommonTermsQuery ignores minMustMatch if only high freq terms are present.
 --

 Key: LUCENE-5264
 URL: https://issues.apache.org/jira/browse/LUCENE-5264
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/other
Affects Versions: 5.0, 4.5
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
 Fix For: 5.0, 4.6

 Attachments: LUCENE-5264.patch


 if we only have high freq terms we move to a pure conjunction and ignore the 
 min must match entirely if it is > 0.
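
For illustration, a hedged sketch of the kind of guard that restores the contract when only high-frequency terms remain (the method and local names here are assumptions; the attached patch may do this differently):

{code}
import java.util.List;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

// Hedged sketch, not the committed fix: build the high-frequency part of a
// CommonTermsQuery-style query while still honoring minimum-should-match.
final class HighFreqOnlySketch {
  static Query highFreqQuery(List<Term> highFreqTerms, float minNrShouldMatch, boolean disableCoord) {
    BooleanQuery bq = new BooleanQuery(disableCoord);
    for (Term t : highFreqTerms) {
      bq.add(new TermQuery(t), Occur.SHOULD); // SHOULD, not MUST
    }
    if (minNrShouldMatch > 0) {
      // the bug: this case used to degenerate into a pure conjunction,
      // silently dropping the minimum; applying it here restores it
      bq.setMinimumNumberShouldMatch((int) minNrShouldMatch);
    }
    return bq;
  }
}
{code}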



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5310) Add a collection admin command to remove a replica

2013-10-09 Thread Noble Paul (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noble Paul updated SOLR-5310:
-

Description: 
the only way a replica can be removed is by unloading the core. There is no way 
to remove a replica that is down, so the clusterstate will have unreferenced 
nodes if a few nodes go down over time.

We need a cluster admin command to clean that up.

e.g.: 
/admin/collections?action=DELETEREPLICA&collection=coll1&shard=shard1&replica=core_node3


The system would first see if the replica is active. If yes, a core UNLOAD 
command is fired, which would take care of deleting the replica from the 
clusterstate as well.

If the state is inactive, then the core or node may be down; in that case the 
entry is removed from the cluster state.
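
For illustration, a minimal client-side sketch of invoking the proposed command with plain JDK HTTP (host, port and names are placeholders taken from the example above):

{code}
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class DeleteReplicaCall {
  public static void main(String[] args) throws Exception {
    // placeholder host/port; action and parameters follow the example above
    URL url = new URL("http://localhost:8983/solr/admin/collections"
        + "?action=DELETEREPLICA&collection=coll1&shard=shard1&replica=core_node3");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    System.out.println("HTTP " + conn.getResponseCode());
    try (InputStream in = conn.getInputStream()) {
      int b;
      while ((b = in.read()) != -1) {
        System.out.write(b);   // echo the response body
      }
      System.out.flush();
    }
  }
}
{code}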


  was:
the only way a replica can be removed is by unloading the core. There is no way 
to remove a replica that is down, so the clusterstate will have unreferenced 
nodes if a few nodes go down over time.

We need a cluster admin command to clean that up.

e.g.: 
/admin/collections?action=REMOVEREPLICA&collection=coll1&shard=shard1&replica=core_node3






 Add a collection admin command to remove a replica
 --

 Key: SOLR-5310
 URL: https://issues.apache.org/jira/browse/SOLR-5310
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
   Original Estimate: 72h
  Remaining Estimate: 72h

 the only way a replica can be removed is by unloading the core. There is no way 
 to remove a replica that is down, so the clusterstate will have 
 unreferenced nodes if a few nodes go down over time.
 We need a cluster admin command to clean that up.
 e.g.: 
 /admin/collections?action=DELETEREPLICA&collection=coll1&shard=shard1&replica=core_node3
 The system would first see if the replica is active. If yes, a core UNLOAD 
 command is fired, which would take care of deleting the replica from the 
 clusterstate as well.
 If the state is inactive, then the core or node may be down; in that case 
 the entry is removed from the cluster state.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent

2013-10-09 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar resolved SOLR-5321.
-

Resolution: Fixed

 Overseer.updateState tries to use router name from message but none is sent
 ---

 Key: SOLR-5321
 URL: https://issues.apache.org/jira/browse/SOLR-5321
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.5
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 5.0, 4.6


 Overseer.updateSlice method has the following code:
 {code}
 String router = message.getStr(OverseerCollectionProcessor.ROUTER, DocRouter.DEFAULT_NAME);
 List<String> shardNames = new ArrayList<String>();
 // collection does not yet exist, create placeholders if num shards is specified
 boolean collectionExists = state.getCollections().contains(collection);
 if (!collectionExists && numShards != null) {
   if (ImplicitDocRouter.NAME.equals(router)) {
     getShardNames(shardNames, message.getStr("shards", null));
     numShards = shardNames.size();
   } else {
     getShardNames(numShards, shardNames);
   }
   state = createCollection(state, collection, shardNames, message);
 }
 {code}
 Here it tries to read the router name from the message. Even if we ignore 
 that the key used to look up the router is wrong here, the router name is never 
 sent in a state message.
 Considering that we don't even support creating a collection with the implicit 
 router from the command line, we should stop expecting the parameter.
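
A rough sketch of what the simplified block could look like once the dead router lookup is dropped (a sketch only, mirroring the snippet quoted above; not necessarily the committed diff):

{code}
List<String> shardNames = new ArrayList<String>();
// collection does not yet exist, create placeholders if num shards is specified
boolean collectionExists = state.getCollections().contains(collection);
if (!collectionExists && numShards != null) {
  // no router name is ever sent in a state message, so the implicit-router
  // branch is unreachable: always derive placeholder names from numShards
  getShardNames(numShards, shardNames);
  state = createCollection(state, collection, shardNames, message);
}
{code}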



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790223#comment-13790223
 ] 

ASF subversion and git services commented on SOLR-5321:
---

Commit 1530555 from sha...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1530555 ]

SOLR-5321: Remove unnecessary code in Overseer.updateState method which tries 
to use router name from message where none is ever sent

 Overseer.updateState tries to use router name from message but none is sent
 ---

 Key: SOLR-5321
 URL: https://issues.apache.org/jira/browse/SOLR-5321
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.5
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 5.0, 4.6


 Overseer.updateSlice method has the following code:
 {code}
 String router = message.getStr(OverseerCollectionProcessor.ROUTER, DocRouter.DEFAULT_NAME);
 List<String> shardNames = new ArrayList<String>();
 // collection does not yet exist, create placeholders if num shards is specified
 boolean collectionExists = state.getCollections().contains(collection);
 if (!collectionExists && numShards != null) {
   if (ImplicitDocRouter.NAME.equals(router)) {
     getShardNames(shardNames, message.getStr("shards", null));
     numShards = shardNames.size();
   } else {
     getShardNames(numShards, shardNames);
   }
   state = createCollection(state, collection, shardNames, message);
 }
 {code}
 Here it tries to read the router name from the message. Even if we ignore 
 that the key used to look up the router is wrong here, the router name is never 
 sent in a state message.
 Considering that we don't even support creating a collection with the implicit 
 router from the command line, we should stop expecting the parameter.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5321) Overseer.updateState tries to use router name from message but none is sent

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790225#comment-13790225
 ] 

ASF subversion and git services commented on SOLR-5321:
---

Commit 1530556 from sha...@apache.org in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1530556 ]

SOLR-5321: Remove unnecessary code in Overseer.updateState method which tries 
to use router name from message where none is ever sent

 Overseer.updateState tries to use router name from message but none is sent
 ---

 Key: SOLR-5321
 URL: https://issues.apache.org/jira/browse/SOLR-5321
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.5
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
Priority: Minor
 Fix For: 5.0, 4.6


 Overseer.updateSlice method has the following code:
 {code}
 String router = message.getStr(OverseerCollectionProcessor.ROUTER, DocRouter.DEFAULT_NAME);
 List<String> shardNames = new ArrayList<String>();
 // collection does not yet exist, create placeholders if num shards is specified
 boolean collectionExists = state.getCollections().contains(collection);
 if (!collectionExists && numShards != null) {
   if (ImplicitDocRouter.NAME.equals(router)) {
     getShardNames(shardNames, message.getStr("shards", null));
     numShards = shardNames.size();
   } else {
     getShardNames(numShards, shardNames);
   }
   state = createCollection(state, collection, shardNames, message);
 }
 {code}
 Here it tries to read the router name from the message. Even if we ignore 
 that the key used to look up the router is wrong here, the router name is never 
 sent in a state message.
 Considering that we don't even support creating a collection with the implicit 
 router from the command line, we should stop expecting the parameter.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5322) Permisions didn't check when call discoverUnder

2013-10-09 Thread Said Chavkin (JIRA)
Said Chavkin created SOLR-5322:
--

 Summary: Permisions didn't check when call discoverUnder
 Key: SOLR-5322
 URL: https://issues.apache.org/jira/browse/SOLR-5322
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.5
 Environment: Centos 6.4
tomcat6
Reporter: Said Chavkin


Hello.

When the solr/home directory contains a subdirectory that Solr does not have 
permission to read, Solr fails to start with an exception:
2108 [main] INFO org.apache.solr.core.CoresLocator - Looking for core 
definitions underneath /var/lib/solr
2109 [main] ERROR org.apache.solr.servlet.SolrDispatchFilter - Could not start 
Solr. Check solr/home property and the logs
2138 [main] ERROR org.apache.solr.core.SolrCore - 
null:java.lang.NullPointerException
at 
org.apache.solr.core.CorePropertiesLocator.discoverUnder(CorePropertiesLocator.java:121)
at 
org.apache.solr.core.CorePropertiesLocator.discoverUnder(CorePropertiesLocator.java:130)
at 
org.apache.solr.core.CorePropertiesLocator.discover(CorePropertiesLocator.java:113)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:226)
at 
org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:177)
at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:127)
at 
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295)
at 
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422)
at 
org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:115)
at 
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3838)
at 
org.apache.catalina.core.StandardContext.start(StandardContext.java:4488)
at 
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
at 
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526)
at 
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:637)
at 
org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:563)
at 
org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:498)
at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1277)
at 
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:321)
at 
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
at org.apache.catalina.core.StandardHost.start(StandardHost.java:722)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
at 
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
at 
org.apache.catalina.core.StandardService.start(StandardService.java:516)
at 
org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
at org.apache.catalina.startup.Catalina.start(Catalina.java:593)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)

2138 [main] INFO org.apache.solr.servlet.SolrDispatchFilter - 
SolrDispatchFilter.init() done

For example:
the Solr home is located at /var/lib/solr;
/var/lib/solr is a separate file system, so it has a lost+found directory.
As a result, Solr can't start.

Yours faithfully.
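
The NPE most likely comes from File.listFiles() returning null for a directory the process cannot read; a defensive sketch of such a discovery walk (illustrative names, not the actual CorePropertiesLocator code):

{code}
import java.io.File;

public final class SafeDiscovery {
  public static void discoverUnder(File dir) {
    File[] children = dir.listFiles();
    if (children == null) {
      // listFiles() returns null when the path is not a directory or cannot
      // be read (e.g. a root-owned lost+found) -- skip instead of crashing
      System.err.println("Skipping unreadable directory: " + dir);
      return;
    }
    for (File child : children) {
      if (child.isDirectory()) {
        discoverUnder(child);
      } else if ("core.properties".equals(child.getName())) {
        System.out.println("Found core definition: " + child);
      }
    }
  }
}
{code}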



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790236#comment-13790236
 ] 

Littlestar commented on LUCENE-5267:


Thanks, most of the records were recovered.
But why did the index get corrupted? Maybe the compression layer or the writer has a bug...


 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
Assignee: Adrien Grand
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (LUCENE-5236) Use broadword bit selection in EliasFanoDecoder

2013-10-09 Thread Adrien Grand (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand reassigned LUCENE-5236:


Assignee: Adrien Grand

 Use broadword bit selection in EliasFanoDecoder
 ---

 Key: LUCENE-5236
 URL: https://issues.apache.org/jira/browse/LUCENE-5236
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Paul Elschot
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-5236.patch, LUCENE-5236.patch, 
 TestDocIdSetBenchmark.java


 Try and speed up decoding



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5267) java.lang.ArrayIndexOutOfBoundsException on reading data

2013-10-09 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790248#comment-13790248
 ] 

Adrien Grand commented on LUCENE-5267:
--

Good question. I've had this issue myself once and the dmesg of the system was 
full of disk-related errors, so something really bad probably happened with 
the disk. I am actually thinking of adding some basic checksumming to the future 
stored fields format (4 bytes per chunk; this wouldn't hurt the compression 
ratio much) in order to be able to easily distinguish index corruptions from 
bugs in the stored fields format (and especially the compression layer).
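
As a sketch of that idea (illustration only, not the eventual implementation), a 4-byte checksum could be appended to each compressed chunk on write and verified before decompression:

{code}
import java.util.zip.CRC32;

public final class ChunkChecksum {
  // 4 bytes per chunk: the low 32 bits of a CRC32 over the compressed bytes
  static int checksum(byte[] chunk, int off, int len) {
    CRC32 crc = new CRC32();
    crc.update(chunk, off, len);
    return (int) crc.getValue();
  }

  // Recompute and compare before decompressing, so disk corruption surfaces
  // as a clear error instead of an ArrayIndexOutOfBoundsException in LZ4.
  static void verify(byte[] chunk, int off, int len, int stored) {
    if (checksum(chunk, off, len) != stored) {
      throw new RuntimeException("stored fields chunk is corrupt (checksum mismatch)");
    }
  }
}
{code}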

 java.lang.ArrayIndexOutOfBoundsException on reading data
 

 Key: LUCENE-5267
 URL: https://issues.apache.org/jira/browse/LUCENE-5267
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Affects Versions: 4.4
Reporter: Littlestar
Assignee: Adrien Grand
  Labels: LZ4

 java.lang.ArrayIndexOutOfBoundsException
   at org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:132)
   at 
 org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:135)
   at 
 org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:336)
   at 
 org.apache.lucene.index.SegmentReader.document(SegmentReader.java:133)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at 
 org.apache.lucene.index.SlowCompositeReaderWrapper.document(SlowCompositeReaderWrapper.java:212)
   at 
 org.apache.lucene.index.FilterAtomicReader.document(FilterAtomicReader.java:365)
   at 
 org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:110)
   at org.apache.lucene.index.IndexReader.document(IndexReader.java:447)
   at org.apache.lucene.search.IndexSearcher.doc(IndexSearcher.java:204)



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5261) add simple API to build queries from analysis chain

2013-10-09 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5261:


Attachment: LUCENE-5261.patch

Simplified patch:
* I removed get/set defaultOperator and slop, restoring these to the QPs (so 
fewer changes there, including no API impact)
* I removed the operator enum completely and just use Occur for that.
* instead, createFieldQuery just takes Occur and slop as parameters.
* added javadocs

On the direct-use side I just added 
createBooleanQuery(String,String,Occur) and 
createPhraseQuery(String,String,int).

I think this is much more intuitive; these parameters are really per-query 
anyway: they shouldn't be getters/setters on this class. (That's just brain 
damage from our crazy QP.)

I think this is ready.
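
For a feel of the API, a usage sketch built from the signatures above (the builder class name and a 4.6-ish trunk are assumptions here; only createBooleanQuery/createPhraseQuery and their parameters come from the patch notes):

{code}
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.QueryBuilder;
import org.apache.lucene.util.Version;

public class QueryBuilderSketch {
  public static void main(String[] args) {
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_46);
    QueryBuilder builder = new QueryBuilder(analyzer);
    // all analyzed terms required: body:foo AND body:bar
    Query conj = builder.createBooleanQuery("body", "foo bar", Occur.MUST);
    // phrase with slop 2: "jakarta apache"~2
    Query phrase = builder.createPhraseQuery("body", "jakarta apache", 2);
    System.out.println(conj + "\n" + phrase);
  }
}
{code}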

 add simple API to build queries from analysis chain
 ---

 Key: LUCENE-5261
 URL: https://issues.apache.org/jira/browse/LUCENE-5261
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Robert Muir
 Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch


 Currently this is pretty crazy stuff.
 Additionally its duplicated in like 3 or 4 places in our codebase (i noticed 
 it doing LUCENE-5259)
 We can solve that duplication, and make it easy to simply create queries from 
 an analyzer (its been asked on the user list), as well as make it easier to 
 build new queryparsers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2548) Multithreaded faceting

2013-10-09 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790257#comment-13790257
 ] 

Erick Erickson commented on SOLR-2548:
--

1. No. It could be extended to, I think, if you have the energy.
2. No.
3. Yes.
4. All.


 Multithreaded faceting
 --

 Key: SOLR-2548
 URL: https://issues.apache.org/jira/browse/SOLR-2548
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 3.1
Reporter: Janne Majaranta
Assignee: Erick Erickson
Priority: Minor
  Labels: facet
 Fix For: 4.5, 5.0

 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, 
 SOLR-2548_multithreaded_faceting,_dsmiley.patch, 
 SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, 
 SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, 
 SOLR-2548.patch


 Add multithreading support for faceting.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5322) Permisions didn't check when call discoverUnder

2013-10-09 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-5322.
--

Resolution: Invalid

Please raise this kind of issue on the user's list before raising a JIRA, to see 
if it's really a bug in Solr or a configuration issue.

You can reopen this if you think it's something Solr should manage.

What would you have Solr do? If it's not being run as a process that has 
permissions to a necessary directory, what can it do _but_ fail on startup? You, 
as the sysadmin, are responsible for permissions.


 Permisions didn't check when call discoverUnder
 ---

 Key: SOLR-5322
 URL: https://issues.apache.org/jira/browse/SOLR-5322
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.5
 Environment: Centos 6.4
 tomcat6
Reporter: Said Chavkin

 Hello.
 When the solr/home directory contains a subdirectory that Solr does not have 
 permission to read, Solr fails to start with an exception:
 2108 [main] INFO org.apache.solr.core.CoresLocator - Looking for core 
 definitions underneath /var/lib/solr
 2109 [main] ERROR org.apache.solr.servlet.SolrDispatchFilter - Could not 
 start Solr. Check solr/home property and the logs
 2138 [main] ERROR org.apache.solr.core.SolrCore - 
 null:java.lang.NullPointerException
 at 
 org.apache.solr.core.CorePropertiesLocator.discoverUnder(CorePropertiesLocator.java:121)
 at 
 org.apache.solr.core.CorePropertiesLocator.discoverUnder(CorePropertiesLocator.java:130)
 at 
 org.apache.solr.core.CorePropertiesLocator.discover(CorePropertiesLocator.java:113)
 at org.apache.solr.core.CoreContainer.load(CoreContainer.java:226)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:177)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:127)
 at 
 org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:295)
 at 
 org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:422)
 at 
 org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:115)
 at 
 org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3838)
 at 
 org.apache.catalina.core.StandardContext.start(StandardContext.java:4488)
 at 
 org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:791)
 at 
 org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:771)
 at 
 org.apache.catalina.core.StandardHost.addChild(StandardHost.java:526)
 at 
 org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:637)
 at 
 org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:563)
 at 
 org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:498)
 at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1277)
 at 
 org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:321)
 at 
 org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
 at 
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1053)
 at org.apache.catalina.core.StandardHost.start(StandardHost.java:722)
 at 
 org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1045)
 at 
 org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
 at 
 org.apache.catalina.core.StandardService.start(StandardService.java:516)
 at 
 org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
 at org.apache.catalina.startup.Catalina.start(Catalina.java:593)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:616)
 at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
 at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
 2138 [main] INFO org.apache.solr.servlet.SolrDispatchFilter - 
 SolrDispatchFilter.init() done
 For example: the Solr home is located at /var/lib/solr; /var/lib/solr is a 
 separate file system, so it has a lost+found directory.
 As a result, Solr can't start.
 Yours faithfully.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config

2013-10-09 Thread John Berryman (JIRA)
John Berryman created SOLR-5323:
---

 Summary: Solr requires -Dsolr.clustering.enabled=false when 
pointing at example config
 Key: SOLR-5323
 URL: https://issues.apache.org/jira/browse/SOLR-5323
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.5
 Environment: vanilla mac
Reporter: John Berryman
 Fix For: 5.0, 4.6


my typical use of Solr is something like this: 

cd SOLR_HOME/example
cp -r solr /myProjectDir/solr_home
java -jar -Dsolr.solr.home=/myProjectDir/solr_home  start.jar

But in solr 4.5.0 this fails to start successfully. I get an error:

org.apache.solr.common.SolrException: Error loading class 
'solr.clustering.ClusteringComponent'

The reason is because solr.clustering.enabled defaults to true now. I don't 
know why this might be the case.

you can get around it with 

java -jar -Dsolr.solr.home=/myProjectDir/solr_home 
-Dsolr.clustering.enabled=false start.jar

SOLR-4708 is when this became an issue.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config

2013-10-09 Thread Erik Hatcher (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Hatcher updated SOLR-5323:
---

Description: 
my typical use of Solr is something like this: 

{code}
cd SOLR_HOME/example
cp -r solr /myProjectDir/solr_home
java -jar -Dsolr.solr.home=/myProjectDir/solr_home  start.jar
{code}

But in solr 4.5.0 this fails to start successfully. I get an error:

{code}
org.apache.solr.common.SolrException: Error loading class 
'solr.clustering.ClusteringComponent'
{code}

The reason is because solr.clustering.enabled defaults to true now. I don't 
know why this might be the case.

you can get around it with 

{code}
java -jar -Dsolr.solr.home=/myProjectDir/solr_home 
-Dsolr.clustering.enabled=false start.jar
{code}

SOLR-4708 is when this became an issue.

  was:
my typical use of Solr is something like this: 

cd SOLR_HOME/example
cp -r solr /myProjectDir/solr_home
java -jar -Dsolr.solr.home=/myProjectDir/solr_home  start.jar

But in solr 4.5.0 this fails to start successfully. I get an error:

org.apache.solr.common.SolrException: Error loading class 
'solr.clustering.ClusteringComponent'

The reason is because solr.clustering.enabled defaults to true now. I don't 
know why this might be the case.

you can get around it with 

java -jar -Dsolr.solr.home=/myProjectDir/solr_home 
-Dsolr.clustering.enabled=false start.jar

SOLR-4708 is when this became an issue.


 Solr requires -Dsolr.clustering.enabled=false when pointing at example config
 -

 Key: SOLR-5323
 URL: https://issues.apache.org/jira/browse/SOLR-5323
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.5
 Environment: vanilla mac
Reporter: John Berryman
 Fix For: 5.0, 4.6


 my typical use of Solr is something like this: 
 {code}
 cd SOLR_HOME/example
 cp -r solr /myProjectDir/solr_home
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home  start.jar
 {code}
 But in solr 4.5.0 this fails to start successfully. I get an error:
 {code}
 org.apache.solr.common.SolrException: Error loading class 
 'solr.clustering.ClusteringComponent'
 {code}
 The reason is because solr.clustering.enabled defaults to true now. I don't 
 know why this might be the case.
 you can get around it with 
 {code}
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home 
 -Dsolr.clustering.enabled=false start.jar
 {code}
 SOLR-4708 is when this became an issue.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config

2013-10-09 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790362#comment-13790362
 ] 

Erik Hatcher commented on SOLR-5323:


I think we should have the lib elements in solrconfig.xml be something like 
this:

{code}
  <lib dir="${solr.install.dir}/contrib/clustering/lib/" regex=".*\.jar" />
{code}

where solr.install.dir is a property defined by Solr automatically at startup 
that holds the root of where Solr is installed.  I've done this manually by 
adjusting the configuration in this exact scenario (copying the example 
configuration, changing all the lib elements in this way, and defining 
solr.install.dir on the command line), but Solr should be able to do this 
better.
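
A hedged sketch of the startup half of that idea (the property name comes from the comment above; how the root is actually derived is a placeholder):

{code}
import java.io.File;

public final class InstallDirProperty {
  // Expose the installation root for ${solr.install.dir} substitution in
  // solrconfig.xml, unless the admin already set it on the command line.
  public static void ensureSet() {
    if (System.getProperty("solr.install.dir") == null) {
      // placeholder heuristic -- a real implementation would derive this
      // from the webapp / distribution location, not the working directory
      File root = new File(".").getAbsoluteFile().getParentFile();
      System.setProperty("solr.install.dir", root.getAbsolutePath());
    }
  }
}
{code}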

 Solr requires -Dsolr.clustering.enabled=false when pointing at example config
 -

 Key: SOLR-5323
 URL: https://issues.apache.org/jira/browse/SOLR-5323
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.5
 Environment: vanilla mac
Reporter: John Berryman
 Fix For: 5.0, 4.6


 my typical use of Solr is something like this: 
 {code}
 cd SOLR_HOME/example
 cp -r solr /myProjectDir/solr_home
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home  start.jar
 {code}
 But in solr 4.5.0 this fails to start successfully. I get an error:
 {code}
 org.apache.solr.common.SolrException: Error loading class 
 'solr.clustering.ClusteringComponent'
 {code}
 The reason is because solr.clustering.enabled defaults to true now. I don't 
 know why this might be the case.
 you can get around it with 
 {code}
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home 
 -Dsolr.clustering.enabled=false start.jar
 {code}
 SOLR-4708 is when this became an issue.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5266) Optimization of the direct PackedInts readers

2013-10-09 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5266:


Attachment: LUCENE-5266.patch

Here is a patch from playing around this morning.

I'm wary of specialization here, but this one should help at relatively low 
bpv, I think, by using readShort?
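
To make the idea concrete, a self-contained decode sketch for bitsPerValue <= 16 (pure-Java illustration, not the patch; a real direct reader would pull aligned shorts from the IndexInput instead of a byte[]):

{code}
public final class DirectUnpackSketch {
  // Decode the i-th value from big-endian packed data with bpv <= 16.
  // A value plus a 7-bit offset spans at most 23 bits (3 bytes), so the
  // array needs 2 padding bytes at the end.
  static int get(byte[] data, int bitsPerValue, int i) {
    long bitPos = (long) i * bitsPerValue;
    int bytePos = (int) (bitPos >>> 3);
    int bitOffset = (int) (bitPos & 7);
    int word = ((data[bytePos] & 0xFF) << 16)
             | ((data[bytePos + 1] & 0xFF) << 8)
             |  (data[bytePos + 2] & 0xFF);
    int shift = 24 - bitOffset - bitsPerValue;
    return (word >>> shift) & ((1 << bitsPerValue) - 1);
  }
}
{code}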

 Optimization of the direct PackedInts readers
 -

 Key: LUCENE-5266
 URL: https://issues.apache.org/jira/browse/LUCENE-5266
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Adrien Grand
Assignee: Adrien Grand
Priority: Minor
 Attachments: LUCENE-5266.patch


 Given that the initial focus for PackedInts readers was more on in-memory 
 readers (for storing stuff like the mapping from old to new doc IDs at 
 merging time), I never spent time trying to optimize the direct readers 
 although it could be beneficial now that they are used for disk-based doc 
 values.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5324) Make sub shard replica recovery and shard state switch asynchronous

2013-10-09 Thread Shalin Shekhar Mangar (JIRA)
Shalin Shekhar Mangar created SOLR-5324:
---

 Summary: Make sub shard replica recovery and shard state switch 
asynchronous
 Key: SOLR-5324
 URL: https://issues.apache.org/jira/browse/SOLR-5324
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 5.0, 4.6


Currently the shard split command waits for all replicas of all sub shards to 
recover and then switches the state of parent to inactive and sub-shards to 
active.

The problem is that shard split (ab)uses the CoreAdmin WaitForState action to 
ask the sub shard leader to wait until the replica states are active. This 
action is prone to timeout.

We should make the shard state switching asynchronous. Once all replicas of all 
sub-shards are 'active', the shard states should be switched automatically.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-1301) Add a Solr contrib that allows for building Solr indexes via Hadoop's Map-Reduce.

2013-10-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790411#comment-13790411
 ] 

Mark Miller commented on SOLR-1301:
---

I have a new patch I'm cleaning up that tackles some of the packaging:


* Split out solr-morphlines-core and solr-morphlines-cell into their own 
modules.

* Updated to trunk and the new modules are now using the new dependency version 
tracking system.

* Fixed an issue in the code around the TokenStream contract being violated - 
the latest code detected this and failed a test - end() and close() are now 
called (the required pattern is sketched after this list).

* Updated to use Morphlines from CDK 0.8.

* Set up the main class in the solr-mr jar manifest.

* I enabled an ignored test which exposed a few bugs because of the required 
solr.xml in Solr 5.0 - I addressed those bugs.

* Added a missing metrics health-check dependency that somehow popped up.

* I played around with naming the solr-mr artifact MapReduceIndexTool.jar, but 
the system really wants us to follow the rules of the artifacts and have 
something like solr-solr-mr-5.0.jar. Anything else has some random issues, such 
as with javadoc, and if your name does not start with solr-, it will be changed 
to start with lucene-. I'm not yet sure if it's worth the trouble to expand the 
system or use a different name, so for now it's still just using the default 
jar name based on the contrib module name (solr-mr).
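
For reference on the TokenStream point above, the consumption pattern the contract requires looks roughly like this (a minimal sketch; the failing code was presumably missing the end()/close() calls):

{code}
import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

public class TokenStreamContract {
  public static void main(String[] args) throws IOException {
    Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_46);
    TokenStream ts = analyzer.tokenStream("field", new StringReader("hello world"));
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    try {
      ts.reset();                          // mandatory before incrementToken()
      while (ts.incrementToken()) {
        System.out.println(term.toString());
      }
      ts.end();                            // mandatory: records final offsets
    } finally {
      ts.close();                          // mandatory: releases resources
    }
  }
}
{code}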

Besides the naming issue, there are a couple other things to button up:

* How we are going to set up the classpath - script, in the manifest, leave it 
up to the user and doc, etc.

* All dependencies are currently in solr-morphlines-core - this was a simple 
way to split out the modules since solr-mr and solr-morphlines-cell depend on 
solr-morphlines-core.

Finally, we will probably need some help from [~steve_rowe] to get the Maven 
build set up correctly.

I spent a bunch of time trying to use asm to work around the hacked test policy 
issue. There are multiple problems I ran into. One is that another module uses 
asm 4.1, but Hadoop brings in asm 3.1 - if you are doing some asm coding, this 
can cause compile issues with your ide (at least eclipse). It also ends up 
being really hard to get an injection in the right place because of how the 
yarn code is structured. After spending a bunch of time trying to get this to 
work, I'm backing out and considering other options.

 Add a Solr contrib that allows for building Solr indexes via Hadoop's 
 Map-Reduce.
 -

 Key: SOLR-1301
 URL: https://issues.apache.org/jira/browse/SOLR-1301
 Project: Solr
  Issue Type: New Feature
Reporter: Andrzej Bialecki 
Assignee: Mark Miller
 Fix For: 4.6

 Attachments: commons-logging-1.0.4.jar, 
 commons-logging-api-1.0.4.jar, hadoop-0.19.1-core.jar, 
 hadoop-0.20.1-core.jar, hadoop-core-0.20.2-cdh3u3.jar, hadoop.patch, 
 log4j-1.2.15.jar, README.txt, SOLR-1301-hadoop-0-20.patch, 
 SOLR-1301-hadoop-0-20.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, SOLR-1301.patch, 
 SOLR-1301.patch, SOLR-1301.patch, SolrRecordWriter.java


 This patch contains  a contrib module that provides distributed indexing 
 (using Hadoop) to Solr EmbeddedSolrServer. The idea behind this module is 
 twofold:
 * provide an API that is familiar to Hadoop developers, i.e. that of 
 OutputFormat
 * avoid unnecessary export and (de)serialization of data maintained on HDFS. 
 SolrOutputFormat consumes data produced by reduce tasks directly, without 
 storing it in intermediate files. Furthermore, by using an 
 EmbeddedSolrServer, the indexing task is split into as many parts as there 
 are reducers, and the data to be indexed is not sent over the network.
 Design
 --
 Key/value pairs produced by reduce tasks are passed to SolrOutputFormat, 
 which in turn uses SolrRecordWriter to write this data. SolrRecordWriter 
 instantiates an EmbeddedSolrServer, and it also instantiates an 
 implementation of SolrDocumentConverter, which is responsible for turning 
 Hadoop (key, value) into a SolrInputDocument. This data is then added to a 
 batch, which is periodically submitted to EmbeddedSolrServer. When reduce 
 task completes, and the OutputFormat is closed, SolrRecordWriter calls 
 commit() and optimize() on the EmbeddedSolrServer.
 The API provides facilities to specify an arbitrary existing solr.home 
 directory, from which the conf/ and lib/ files will be taken.
 This process results in the creation of as many partial Solr home directories 
 as there were reduce tasks. The output shards are placed in the output 
 directory on the default 

[jira] [Commented] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790413#comment-13790413
 ] 

ASF subversion and git services commented on LUCENE-5264:
-

Commit 1530651 from [~simonw] in branch 'dev/trunk'
[ https://svn.apache.org/r1530651 ]

LUCENE-5264: CommonTermsQuery ignores minMustMatch if only high freq terms are 
present

 CommonTermsQuery ignores minMustMatch if only high freq terms are present.
 --

 Key: LUCENE-5264
 URL: https://issues.apache.org/jira/browse/LUCENE-5264
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/other
Affects Versions: 5.0, 4.5
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
 Fix For: 5.0, 4.6

 Attachments: LUCENE-5264.patch


 if we only have high freq terms we move to a pure conjunction and ignore the 
 min must match entirely if it is > 0.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790421#comment-13790421
 ] 

ASF subversion and git services commented on LUCENE-5264:
-

Commit 1530657 from [~simonw] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1530657 ]

LUCENE-5264: CommonTermsQuery ignores minMustMatch if only high freq terms are 
present

 CommonTermsQuery ignores minMustMatch if only high freq terms are present.
 --

 Key: LUCENE-5264
 URL: https://issues.apache.org/jira/browse/LUCENE-5264
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/other
Affects Versions: 5.0, 4.5
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
 Fix For: 5.0, 4.6

 Attachments: LUCENE-5264.patch


 if we only have high freq terms we move to a pure conjunction and ignore the 
 min must match entirely if it is > 0.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2548) Multithreaded faceting

2013-10-09 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790425#comment-13790425
 ] 

Markus Jelsma commented on SOLR-2548:
-

I'm having a hard time measuring performance differences with and without 
facet.threads. On my development machine, there are no differences on warmed 
indexes; both measure around 1 ms. They're also almost identical after a 
stop/start of Jetty with no warm-up queries, around 40 ms; after that, fast 
again. We're faceting on four fields this time, and there are also four threads.
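
For anyone reproducing this, a minimal SolrJ sketch of the setup (URL and field names are placeholders; facet.threads is the parameter added by this patch):

{code}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class FacetThreadsSketch {
  public static void main(String[] args) throws Exception {
    HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
    SolrQuery q = new SolrQuery("*:*");
    q.setFacet(true);
    q.addFacetField("field_a", "field_b", "field_c", "field_d");
    q.set("facet.threads", 4);   // one worker per facet field in this test
    QueryResponse rsp = server.query(q);
    System.out.println("QTime=" + rsp.getQTime() + "ms");
  }
}
{code}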

 Multithreaded faceting
 --

 Key: SOLR-2548
 URL: https://issues.apache.org/jira/browse/SOLR-2548
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 3.1
Reporter: Janne Majaranta
Assignee: Erick Erickson
Priority: Minor
  Labels: facet
 Fix For: 4.5, 5.0

 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, 
 SOLR-2548_multithreaded_faceting,_dsmiley.patch, 
 SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, 
 SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, 
 SOLR-2548.patch


 Add multithreading support for faceting.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5324) Make sub shard replica recovery and shard state switch asynchronous

2013-10-09 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-5324:


Attachment: SOLR-5324.patch

Changes:
# A new shard state: 'recovery' is added
# After all sub-shard replicas have been created, the sub-shard state is set to 
'recovery'. If replication factor is 1 then the sub-shards are set to 'active'. 
The splitshard API returns at this point.
# The state change events in the overseer are used to track when all replicas 
of all sub-shards become 'active'. Once that happens, the parent shard is set 
to 'inactive' and the sub-shards are set to 'active' (sketched below).
# To facilitate the above, a slice property called 'parent' is introduced which 
is removed once the slice becomes 'active'.
# If the split is retried then we use the 'deleteshard' api to completely 
remove the sub-shards before starting the splitting process.
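
A hedged sketch of the state-switch check from point 3 (pseudo-real Java: Slice and Replica are real types, but the helper and the exact trigger point are assumptions):

{code}
// Run on each replica state change while the slice carries the temporary
// 'parent' property: once every replica of every sub-shard is active,
// flip parent -> inactive and sub-shards -> active in one step.
boolean allActive = true;
for (Slice sub : subShards) {
  for (Replica r : sub.getReplicas()) {
    if (!"active".equals(r.getStr("state"))) {
      allActive = false;
      break;
    }
  }
}
if (allActive) {
  setSliceState(parentSlice, "inactive");       // hypothetical helper
  for (Slice sub : subShards) {
    setSliceState(sub, "active");               // also drops the 'parent' property
  }
}
{code}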

 Make sub shard replica recovery and shard state switch asynchronous
 ---

 Key: SOLR-5324
 URL: https://issues.apache.org/jira/browse/SOLR-5324
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 5.0, 4.6

 Attachments: SOLR-5324.patch


 Currently the shard split command waits for all replicas of all sub shards to 
 recover and then switches the state of parent to inactive and sub-shards to 
 active.
 The problem is that shard split (ab)uses the CoreAdmin WaitForState action to 
 ask the sub shard leader to wait until the replica states are active. This 
 action is prone to timeout.
 We should make the shard state switching asynchronous. Once all replicas of 
 all sub-shards are 'active', the shard states should be switched 
 automatically.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2548) Multithreaded faceting

2013-10-09 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790435#comment-13790435
 ] 

Markus Jelsma commented on SOLR-2548:
-

Alright, I took another index and faceted on many more fields, and now I see a 
small improvement after startup of about 12%. It is not much; perhaps this 
machine is too fast in this case.

 Multithreaded faceting
 --

 Key: SOLR-2548
 URL: https://issues.apache.org/jira/browse/SOLR-2548
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 3.1
Reporter: Janne Majaranta
Assignee: Erick Erickson
Priority: Minor
  Labels: facet
 Fix For: 4.5, 5.0

 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, 
 SOLR-2548_multithreaded_faceting,_dsmiley.patch, 
 SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, 
 SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, 
 SOLR-2548.patch


 Add multithreading support for faceting.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-5264) CommonTermsQuery ignores minMustMatch if only high freq terms are present.

2013-10-09 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer resolved LUCENE-5264.
-

   Resolution: Fixed
Lucene Fields: New,Patch Available  (was: New)

 CommonTermsQuery ignores minMustMatch if only high freq terms are present.
 --

 Key: LUCENE-5264
 URL: https://issues.apache.org/jira/browse/LUCENE-5264
 Project: Lucene - Core
  Issue Type: Bug
  Components: modules/other
Affects Versions: 5.0, 4.5
Reporter: Simon Willnauer
Assignee: Simon Willnauer
Priority: Minor
 Fix For: 5.0, 4.6

 Attachments: LUCENE-5264.patch


 if we only have high freq terms we move to a pure conjunction and ignore the 
 min must match entirely if it is > 0.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config

2013-10-09 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790452#comment-13790452
 ] 

Erik Hatcher commented on SOLR-5323:


This isn't specific to the clustering component, except that it gets loaded 
non-lazily.  See these comments: 
https://issues.apache.org/jira/browse/SOLR-4708?focusedCommentId=13630567page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13630567

 Solr requires -Dsolr.clustering.enabled=false when pointing at example config
 -

 Key: SOLR-5323
 URL: https://issues.apache.org/jira/browse/SOLR-5323
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.5
 Environment: vanilla mac
Reporter: John Berryman
 Fix For: 5.0, 4.6


 my typical use of Solr is something like this: 
 {code}
 cd SOLR_HOME/example
 cp -r solr /myProjectDir/solr_home
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home  start.jar
 {code}
 But in solr 4.5.0 this fails to start successfully. I get an error:
 {code}
 org.apache.solr.common.SolrException: Error loading class 
 'solr.clustering.ClusteringComponent'
 {code}
 The reason is because solr.clustering.enabled defaults to true now. I don't 
 know why this might be the case.
 you can get around it with 
 {code}
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home 
 -Dsolr.clustering.enabled=false start.jar
 {code}
 SOLR-4708 is when this became an issue.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5255) Make DocumentsWriter reference final in IW

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790519#comment-13790519
 ] 

ASF subversion and git services commented on LUCENE-5255:
-

Commit 1530679 from [~simonw] in branch 'dev/trunk'
[ https://svn.apache.org/r1530679 ]

LUCENE-5255: Make DocumentsWriter reference final in IW

 Make DocumentsWriter reference final in IW
 --

 Key: LUCENE-5255
 URL: https://issues.apache.org/jira/browse/LUCENE-5255
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Affects Versions: 5.0, 4.6
Reporter: Simon Willnauer
 Fix For: 5.0, 4.6

 Attachments: LUCENE-5255.patch


 the DocumentWriter ref is nulled on close which seems unnecessary altogether. 
 We can just make it final instead.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5255) Make DocumentsWriter reference final in IW

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790529#comment-13790529
 ] 

ASF subversion and git services commented on LUCENE-5255:
-

Commit 1530685 from [~simonw] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1530685 ]

LUCENE-5255: Make DocumentsWriter reference final in IW

 Make DocumentsWriter reference final in IW
 --

 Key: LUCENE-5255
 URL: https://issues.apache.org/jira/browse/LUCENE-5255
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Affects Versions: 5.0, 4.6
Reporter: Simon Willnauer
 Fix For: 5.0, 4.6

 Attachments: LUCENE-5255.patch


 the DocumentWriter ref is nulled on close which seems unnecessary altogether. 
 We can just make it final instead.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2548) Multithreaded faceting

2013-10-09 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790544#comment-13790544
 ] 

David Smiley commented on SOLR-2548:


Multithreaded faceting is useful when your CPU core count is much greater than 
the number of Solr cores you have, and you have a ton of data and need to facet 
on multiple fields.  You could theoretically get similar results by sharding 
more, but you should limit sharding based on disk I/O capabilities (especially 
when there's so much data it won't fit in RAM), which isn't necessarily 
one-for-one with the CPU count.

 Multithreaded faceting
 --

 Key: SOLR-2548
 URL: https://issues.apache.org/jira/browse/SOLR-2548
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 3.1
Reporter: Janne Majaranta
Assignee: Erick Erickson
Priority: Minor
  Labels: facet
 Fix For: 4.5, 5.0

 Attachments: SOLR-2548_4.2.1.patch, SOLR-2548_for_31x.patch, 
 SOLR-2548_multithreaded_faceting,_dsmiley.patch, 
 SOLR-2548_multithreaded_faceting,_dsmiley.patch, SOLR-2548.patch, 
 SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, SOLR-2548.patch, 
 SOLR-2548.patch


 Add multithreading support for faceting.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5261) add simple API to build queries from analysis chain

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790547#comment-13790547
 ] 

ASF subversion and git services commented on LUCENE-5261:
-

Commit 1530693 from [~rcmuir] in branch 'dev/trunk'
[ https://svn.apache.org/r1530693 ]

LUCENE-5261: add simple API to build queries from analysis chain

 add simple API to build queries from analysis chain
 ---

 Key: LUCENE-5261
 URL: https://issues.apache.org/jira/browse/LUCENE-5261
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Robert Muir
 Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch


 Currently this is pretty crazy stuff.
 Additionally its duplicated in like 3 or 4 places in our codebase (i noticed 
 it doing LUCENE-5259)
 We can solve that duplication, and make it easy to simply create queries from 
 an analyzer (its been asked on the user list), as well as make it easier to 
 build new queryparsers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config

2013-10-09 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790551#comment-13790551
 ] 

Yonik Seeley commented on SOLR-5323:


Hmmm, I agree this is a bug.
My comment in SOLR-4708 was +1, provided that everything (except clustering) 
still works if you copy example somewhere else.

 Solr requires -Dsolr.clustering.enabled=false when pointing at example config
 -

 Key: SOLR-5323
 URL: https://issues.apache.org/jira/browse/SOLR-5323
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.5
 Environment: vanilla mac
Reporter: John Berryman
 Fix For: 5.0, 4.6


 my typical use of Solr is something like this: 
 {code}
 cd SOLR_HOME/example
 cp -r solr /myProjectDir/solr_home
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home  start.jar
 {code}
 But in solr 4.5.0 this fails to start successfully. I get an error:
 {code}
 org.apache.solr.common.SolrException: Error loading class 
 'solr.clustering.ClusteringComponent'
 {code}
 The reason is that solr.clustering.enabled now defaults to true. I don't 
 know why this might be the case.
 You can get around it with 
 {code}
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home 
 -Dsolr.clustering.enabled=false start.jar
 {code}
 SOLR-4708 is when this became an issue.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config

2013-10-09 Thread Erik Hatcher (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790568#comment-13790568
 ] 

Erik Hatcher commented on SOLR-5323:


bq.  My comment in SOLR-4708 was +1, provided that everything (except 
clustering) still works if you copy example somewhere else.

And that's the reason I didn't commit it before.  I thought somehow Dawid had 
worked some magic to alleviate this issue when he took it on.

Should we perhaps have lazy-loaded SearchComponents too? 
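
For context, the example solrconfig.xml gates the component on that system 
property, roughly like this (a sketch, not the exact file); defaulting it to 
false again, or lazy loading, would restore the copy-the-example workflow:

{code:xml}
<!-- sketch, not the exact example config -->
<searchComponent name="clustering"
                 enable="${solr.clustering.enabled:false}"
                 class="solr.clustering.ClusteringComponent"/>
{code}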

 Solr requires -Dsolr.clustering.enabled=false when pointing at example config
 -

 Key: SOLR-5323
 URL: https://issues.apache.org/jira/browse/SOLR-5323
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.5
 Environment: vanilla mac
Reporter: John Berryman
 Fix For: 5.0, 4.6


 my typical use of Solr is something like this: 
 {code}
 cd SOLR_HOME/example
 cp -r solr /myProjectDir/solr_home
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home  start.jar
 {code}
 But in solr 4.5.0 this fails to start successfully. I get an error:
 {code}
 org.apache.solr.common.SolrException: Error loading class 
 'solr.clustering.ClusteringComponent'
 {code}
 The reason is that solr.clustering.enabled now defaults to true. I don't 
 know why this might be the case.
 You can get around it with 
 {code}
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home 
 -Dsolr.clustering.enabled=false start.jar
 {code}
 SOLR-4708 is when this became an issue.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5324) Make sub shard replica recovery and shard state switch asynchronous

2013-10-09 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-5324:


Attachment: SOLR-5324.patch

# On unsuccessful replica recovery, the sub-shard state was incorrectly being 
set to active
# The split-by-route-field test should wait for the right collection to recover

 Make sub shard replica recovery and shard state switch asynchronous
 ---

 Key: SOLR-5324
 URL: https://issues.apache.org/jira/browse/SOLR-5324
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Reporter: Shalin Shekhar Mangar
Assignee: Shalin Shekhar Mangar
 Fix For: 5.0, 4.6

 Attachments: SOLR-5324.patch, SOLR-5324.patch


 Currently the shard split command waits for all replicas of all sub shards to 
 recover and then switches the state of parent to inactive and sub-shards to 
 active.
 The problem is that shard split (ab)uses the CoreAdmin WaitForState action to 
 ask the sub shard leader to wait until the replica states are active. This 
 action is prone to timeout.
 We should make the shard state switching asynchronous. Once all replicas of 
 all sub-shards are 'active', the shard states should be switched 
 automatically.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5325) zk connection loss causes overseer leader loss

2013-10-09 Thread Christine Poerschke (JIRA)
Christine Poerschke created SOLR-5325:
-

 Summary: zk connection loss causes overseer leader loss
 Key: SOLR-5325
 URL: https://issues.apache.org/jira/browse/SOLR-5325
 Project: Solr
  Issue Type: Bug
Reporter: Christine Poerschke






--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5325) zk connection loss causes overseer leader loss

2013-10-09 Thread Christine Poerschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated SOLR-5325:
--

  Description: 
The problem we saw was that when the solr overseer leader experienced temporary 
zk connectivity problems it stopped processing overseer queue events.

This first happened when quorum within the external zk ensemble was lost due to 
too many zookeepers being stopped (similar to SOLR-5199). The second time it 
happened when there was a sufficient number of zookeepers but they were holding 
zookeeper leadership elections and thus refused connections (the elections were 
taking several seconds, we were using the default zookeeper.cnxTimeout=5s value 
and it was hit for one ensemble member).

Affects Version/s: 4.3
   4.4

 zk connection loss causes overseer leader loss
 --

 Key: SOLR-5325
 URL: https://issues.apache.org/jira/browse/SOLR-5325
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.3, 4.4
Reporter: Christine Poerschke

 The problem we saw was that when the solr overseer leader experienced 
 temporary zk connectivity problems it stopped processing overseer queue 
 events.
 This first happened when quorum within the external zk ensemble was lost due 
 to too many zookeepers being stopped (similar to SOLR-5199). The second time 
 it happened when there was a sufficient number of zookeepers but they were 
 holding zookeeper leadership elections and thus refused connections (the 
 elections were taking several seconds, we were using the default 
 zookeeper.cnxTimeout=5s value and it was hit for one ensemble member).



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5261) add simple API to build queries from analysis chain

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790580#comment-13790580
 ] 

ASF subversion and git services commented on LUCENE-5261:
-

Commit 1530701 from [~rcmuir] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1530701 ]

LUCENE-5261: add simple API to build queries from analysis chain

 add simple API to build queries from analysis chain
 ---

 Key: LUCENE-5261
 URL: https://issues.apache.org/jira/browse/LUCENE-5261
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Robert Muir
 Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch


 Currently this is pretty crazy stuff.
 Additionally it's duplicated in like 3 or 4 places in our codebase (I noticed 
 it doing LUCENE-5259).
 We can solve that duplication, and make it easy to simply create queries from 
 an analyzer (it's been asked on the user list), as well as make it easier to 
 build new queryparsers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-5261) add simple API to build queries from analysis chain

2013-10-09 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-5261.
-

   Resolution: Fixed
Fix Version/s: 4.6
   5.0

 add simple API to build queries from analysis chain
 ---

 Key: LUCENE-5261
 URL: https://issues.apache.org/jira/browse/LUCENE-5261
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Robert Muir
 Fix For: 5.0, 4.6

 Attachments: LUCENE-5261.patch, LUCENE-5261.patch, LUCENE-5261.patch


 Currently this is pretty crazy stuff.
 Additionally it's duplicated in like 3 or 4 places in our codebase (I noticed 
 it doing LUCENE-5259).
 We can solve that duplication, and make it easy to simply create queries from 
 an analyzer (it's been asked on the user list), as well as make it easier to 
 build new queryparsers.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud

2013-10-09 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790579#comment-13790579
 ] 

Mark Miller commented on SOLR-5307:
---

Ouch - this sounds like a pretty bad bug.

 Solr 4.5 collection api ignores collection.configName when used in cloud
 

 Key: SOLR-5307
 URL: https://issues.apache.org/jira/browse/SOLR-5307
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.5
Reporter: Nathan Neulinger
  Labels: cloud, collection-api, zookeeper

 This worked properly in 4.4, but on 4.5, specifying collection.configName 
 when creating a collection doesn't work - it gets the default regardless of 
 what has been uploaded into zk. Explicitly linking config name to collection 
 ahead of time with zkcli.sh is a workaround I'm using for the moment, but 
 that did not appear to be necessary with 4.4 unless I was doing something 
 wrong and not realizing it.
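
For illustration, the zkcli.sh workaround referred to above looks like this 
(host and names are examples):

{code}
./zkcli.sh -zkhost localhost:2181 -cmd linkconfig \
    -collection mycollection -confname myconf
{code}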



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5325) zk connection loss causes overseer leader loss

2013-10-09 Thread Christine Poerschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christine Poerschke updated SOLR-5325:
--

Attachment: SOLR-5325.patch

Attaching Overseer.java patch for Solr 4.4.0; OverseerCollectionProcessor.java 
could be changed in a similar way.
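
The shape of the change, as a sketch rather than the attached patch 
(amILeader()/processQueueHead() are illustrative stand-ins for the real 
overseer logic):

{code}
import org.apache.zookeeper.KeeperException;

// Sketch: treat connection loss as retryable instead of letting it end the
// overseer's event loop; only session expiry should drop leadership.
abstract class RetryingOverseerLoop {
  abstract boolean amILeader();
  abstract void processQueueHead() throws KeeperException, InterruptedException;

  void run() throws InterruptedException {
    while (amILeader()) {
      try {
        processQueueHead(); // take one event off the overseer queue and handle it
      } catch (KeeperException.ConnectionLossException e) {
        Thread.sleep(500);  // transient: the zk client reconnects, so just retry
      } catch (KeeperException.SessionExpiredException e) {
        return;             // genuine leadership loss: exit and let re-election happen
      }
    }
  }
}
{code}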

 zk connection loss causes overseer leader loss
 --

 Key: SOLR-5325
 URL: https://issues.apache.org/jira/browse/SOLR-5325
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.3, 4.4
Reporter: Christine Poerschke
 Attachments: SOLR-5325.patch


 The problem we saw was that when the solr overseer leader experienced 
 temporary zk connectivity problems it stopped processing overseer queue 
 events.
 This first happened when quorum within the external zk ensemble was lost due 
 to too many zookeepers being stopped (similar to SOLR-5199). The second time 
 it happened when there was a sufficient number of zookeepers but they were 
 holding zookeeper leadership elections and thus refused connections (the 
 elections were taking several seconds, we were using the default 
 zookeeper.cnxTimeout=5s value and it was hit for one ensemble member).



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs

2013-10-09 Thread Christine Poerschke (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790590#comment-13790590
 ] 

Christine Poerschke commented on SOLR-5213:
---

Two occurrences of lost documents were seen. The one with the majority of 
documents lost was tracked down to operational error (shardX files were copied 
to be shardY files); the second loss was of only a few dozen documents, and for 
that one we never figured out whether it was operational or something else. 
Other shard splits since then were fine, i.e. no losses.

 collections?action=SPLITSHARD parent vs. sub-shards numDocs
 ---

 Key: SOLR-5213
 URL: https://issues.apache.org/jira/browse/SOLR-5213
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 4.4
Reporter: Christine Poerschke
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-5213.patch


 The problem we saw was that splitting a shard took a long time and at the end 
 of it the sub-shards contained fewer documents than the original shard.
 The root cause was eventually tracked down to the disappearing documents not 
 falling into the hash ranges of the sub-shards.
 Could SolrIndexSplitter split report per-segment numDocs for parent and 
 sub-shards, with at least a warning logged for any discrepancies (documents 
 falling into none of the sub-shards or documents falling into several 
 sub-shards)?
 Additionally, could a case be made for erroring out when discrepancies are 
 detected i.e. not proceeding with the shard split? Either to always error or 
 to have a verifyNumDocs=false/true optional parameter for the SPLITSHARD 
 action.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs

2013-10-09 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790608#comment-13790608
 ] 

Shalin Shekhar Mangar commented on SOLR-5213:
-

I'm seeing similar problems as well on the ShardSplitTest sporadically. I've 
opened SOLR-5309 to track it.

I'll review and commit your patch shortly.

 collections?action=SPLITSHARD parent vs. sub-shards numDocs
 ---

 Key: SOLR-5213
 URL: https://issues.apache.org/jira/browse/SOLR-5213
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 4.4
Reporter: Christine Poerschke
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-5213.patch


 The problem we saw was that splitting a shard took a long time and at the end 
 of it the sub-shards contained fewer documents than the original shard.
 The root cause was eventually tracked down to the disappearing documents not 
 falling into the hash ranges of the sub-shards.
 Could SolrIndexSplitter split report per-segment numDocs for parent and 
 sub-shards, with at least a warning logged for any discrepancies (documents 
 falling into none of the sub-shards or documents falling into several 
 sub-shards)?
 Additionally, could a case be made for erroring out when discrepancies are 
 detected i.e. not proceeding with the shard split? Either to always error or 
 to have a verifyNumDocs=false/true optional parameter for the SPLITSHARD 
 action.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud

2013-10-09 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790612#comment-13790612
 ] 

Shalin Shekhar Mangar commented on SOLR-5307:
-

bq. Ouch - this sounds like a pretty bad bug.

Yeah, SOLR-5317 too.

 Solr 4.5 collection api ignores collection.configName when used in cloud
 

 Key: SOLR-5307
 URL: https://issues.apache.org/jira/browse/SOLR-5307
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.5
Reporter: Nathan Neulinger
  Labels: cloud, collection-api, zookeeper

 This worked properly in 4.4, but on 4.5, specifying collection.configName 
 when creating a collection doesn't work - it gets the default regardless of 
 what has been uploaded into zk. Explicitly linking config name to collection 
 ahead of time with zkcli.sh is a workaround I'm using for the moment, but 
 that did not appear to be necessary with 4.4 unless I was doing something 
 wrong and not realizing it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud

2013-10-09 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-5307:
-

Assignee: Mark Miller

 Solr 4.5 collection api ignores collection.configName when used in cloud
 

 Key: SOLR-5307
 URL: https://issues.apache.org/jira/browse/SOLR-5307
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.5
Reporter: Nathan Neulinger
Assignee: Mark Miller
  Labels: cloud, collection-api, zookeeper

 This worked properly in 4.4, but on 4.5, specifying collection.configName 
 when creating a collection doesn't work - it gets the default regardless of 
 what has been uploaded into zk. Explicitly linking config name to collection 
 ahead of time with zkcli.sh is a workaround I'm using for the moment, but 
 that did not appear to be necessary with 4.4 unless I was doing something 
 wrong and not realizing it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-5306) can not create collection when have over one config

2013-10-09 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-5306:
-

Assignee: Mark Miller

 can not create collection when have over one config
 ---

 Key: SOLR-5306
 URL: https://issues.apache.org/jira/browse/SOLR-5306
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.5
 Environment: win7 jdk 7
Reporter: Liang Tianyu
Assignee: Mark Miller
Priority: Critical

 I have uploaded two configs to zookeeper: patent and applicant. I can not 
 create a collection with 
 http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent.
 It shows errors: patent_main_1_shard1_replica1: 
 org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException:
 Could not find configName for collection patent_main_1 found:[applicant, 
 patent]. In Solr 4.4 I can create it successfully.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5213) collections?action=SPLITSHARD parent vs. sub-shards numDocs

2013-10-09 Thread Christine Poerschke (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790657#comment-13790657
 ] 

Christine Poerschke commented on SOLR-5213:
---

A variation of the patch I uploaded here would be to 'rescue' (and log the 
id+hash of) any documents that would otherwise have been lost, e.g. always put 
them in the first sub-shard; they don't belong there, but at least that way they 
are not lost and could be analysed and dealt with later on.
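
In sketch form (illustrative names, not the attached patch):

{code}
import java.util.List;
import org.apache.solr.common.cloud.DocRouter;
import org.slf4j.Logger;

// Sketch of the 'rescue' variation: pick the matching sub-shard, or fall back
// to sub-shard 0 with a warning instead of dropping the document.
int pickSubShard(List<DocRouter.Range> subShardRanges, int hash, String id, Logger log) {
  for (int i = 0; i < subShardRanges.size(); i++) {
    if (subShardRanges.get(i).includes(hash)) {
      return i; // normal case: exactly one range should match
    }
  }
  log.warn("doc id=" + id + " hash=" + hash
      + " fits no sub-shard range; rescuing into sub-shard 0");
  return 0; // wrong home, but the doc survives and can be analysed later
}
{code}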

 collections?action=SPLITSHARD parent vs. sub-shards numDocs
 ---

 Key: SOLR-5213
 URL: https://issues.apache.org/jira/browse/SOLR-5213
 Project: Solr
  Issue Type: Improvement
  Components: update
Affects Versions: 4.4
Reporter: Christine Poerschke
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-5213.patch


 The problem we saw was that splitting a shard took a long time and at the end 
 of it the sub-shards contained fewer documents than the original shard.
 The root cause was eventually tracked down to the disappearing documents not 
 falling into the hash ranges of the sub-shards.
 Could SolrIndexSplitter split report per-segment numDocs for parent and 
 sub-shards, with at least a warning logged for any discrepancies (documents 
 falling into none of the sub-shards or documents falling into several 
 sub-shards)?
 Additionally, could a case be made for erroring out when discrepancies are 
 detected i.e. not proceeding with the shard split? Either to always error or 
 to have a verifyNumDocs=false/true optional parameter for the SPLITSHARD 
 action.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5326) admin/collections?action=SPLITSHARD support for multiple shards

2013-10-09 Thread Christine Poerschke (JIRA)
Christine Poerschke created SOLR-5326:
-

 Summary: admin/collections?action=SPLITSHARD support for multiple 
shards
 Key: SOLR-5326
 URL: https://issues.apache.org/jira/browse/SOLR-5326
 Project: Solr
  Issue Type: New Feature
Affects Versions: 4.4
Reporter: Christine Poerschke


The problem we saw was that splitting one shard took 'a long time' (around 4 
hours); with there being 'many' (8 at the time) shards to split, and the solr 
overseer serialising action=SPLITSHARD requests, a full collection split would 
have taken 'a very long time'.

Separately, shard splitting distributing replica2, replica3, etc. of each shard 
randomly across machines was not desirable, and, as in SOLR-5004, splitting into 
'n' rather than '2' sub-shards was useful.
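
As a sketch of the requested API (the second form is hypothetical, not 
implemented):

{code}
# Today: one request per shard, serialised by the overseer (URLs are examples):
curl "http://host:8983/solr/admin/collections?action=SPLITSHARD&collection=c1&shard=shard1"
curl "http://host:8983/solr/admin/collections?action=SPLITSHARD&collection=c1&shard=shard2"

# Hypothetical multi-shard form this issue could introduce (not implemented):
curl "http://host:8983/solr/admin/collections?action=SPLITSHARD&collection=c1&shard=shard1,shard2,shard3"
{code}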



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5263) Deletes may be silently lost if an IOException is hit and later not hit (e.g., disk fills up and then frees up)

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13790677#comment-13790677
 ] 

ASF subversion and git services commented on LUCENE-5263:
-

Commit 1530741 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1530741 ]

LUCENE-5263: remove extra deleter.checkpoint

 Deletes may be silently lost if an IOException is hit and later not hit 
 (e.g., disk fills up and then frees up)
 ---

 Key: LUCENE-5263
 URL: https://issues.apache.org/jira/browse/LUCENE-5263
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.6, 5.0

 Attachments: LUCENE-5263.patch, LUCENE-5263.patch


 This case is tricky to handle, yet I think realistic: disk fills up
 temporarily, causes an exception in writeLiveDocs, and then the app
 keeps using the IW instance.
 Meanwhile disk later frees up again, IW is closed successfully.  In
 certain cases, we can silently lose deletes in this case.
 I had already committed
 TestIndexWriterDeletes.testNoLostDeletesOnDiskFull, and Jenkins seems
 happy with it so far, but when I added fangs to the test (cutover to
 RandomIndexWriter from IndexWriter, allow IOE during getReader, add
 randomness to when exc is thrown, etc.), it uncovered some real/nasty
 bugs:
   * ReaderPool.dropAll was suppressing any exception it hit, because
 {code}if (priorE != null){code} should instead be {code}if (priorE == 
 null){code}
   * After a merge, we have to write deletes before committing the
 segment, because an exception when writing deletes means we need
 to abort the merge
   * Several places that were directly calling deleter.checkpoint must
 also increment the changeCount else on close IW thinks there are
 no changes and doesn't write a new segments file.
   * closeInternal was dropping pooled readers after writing the
 segments file, which would lose deletes still buffered due to a
 previous exc.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5268) Cutover more postings formats to the inverted pull API

2013-10-09 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-5268:
--

 Summary: Cutover more postings formats to the inverted pull API
 Key: LUCENE-5268
 URL: https://issues.apache.org/jira/browse/LUCENE-5268
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0


In LUCENE-5123, we added a new, more flexible, pull API for writing
postings.  This API allows the postings format to iterate the
fields/terms/postings more than once, and mirrors the API for writing
doc values.

But that was just the first step (only SimpleText was cutover to the
new API).  I want to cutover more components, so we can (finally)
e.g. play with different encodings depending on the term's postings,
such as using a bitset for high freq DOCS_ONLY terms (LUCENE-5052).
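
To make the 'iterate more than once' point concrete, a sketch of a two-pass 
write (simplified 4.x-style APIs; writeAsBitSet/writeAsPostings are 
hypothetical encoders):

{code}
// Sketch only: the pull API hands us the whole Fields, so terms can be
// iterated twice, once to gather stats and once to pick a per-term encoding.
@Override
public void write(Fields fields) throws IOException {
  for (String field : fields) {
    Terms terms = fields.terms(field);
    if (terms == null) continue;
    int maxDocFreq = 0;
    TermsEnum te = terms.iterator(null);          // pass 1: gather stats
    while (te.next() != null) {
      maxDocFreq = Math.max(maxDocFreq, te.docFreq());
    }
    te = terms.iterator(null);                    // pass 2: encode
    while (te.next() != null) {
      if (te.docFreq() >= maxDocFreq / 2) {
        writeAsBitSet(te);     // e.g. bitset for high-freq DOCS_ONLY terms
      } else {
        writeAsPostings(te);   // standard postings encoding
      }
    }
  }
}
{code}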




--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5268) Cutover more postings formats to the inverted pull API

2013-10-09 Thread Michael McCandless (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5268:
---

Attachment: LUCENE-5268.patch

Patch with these changes:

  * Cutover BlockTreeTermsWriter, BlockTermsWriter, FST/OrdTermsWriter
from PushFieldsConsumer to FieldsConsumer

  * Changed PostingsBaseWriter to a pull API, with a single method
to write the current term's postings, and then added a new
PushPostingsBaseWriter that has the push API.

  * Cutover some formats to new PostingsBaseWriter; pulsing and bloom
were nice cleanups.  For the rest I just switched them to
PushPostingsBaseWriter.

  * Only two PushFieldsConsumers remain: MemoryPF and RAMOnlyPF
(test-framework); I'm tempted to just cut those over and then
remove PushFieldsConsumer here.

Still a few nocommits but I think it's close ...


 Cutover more postings formats to the inverted pull API
 

 Key: LUCENE-5268
 URL: https://issues.apache.org/jira/browse/LUCENE-5268
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 5.0

 Attachments: LUCENE-5268.patch


 In LUCENE-5123, we added a new, more flexible, pull API for writing
 postings.  This API allows the postings format to iterate the
 fields/terms/postings more than once, and mirrors the API for writing
 doc values.
 But that was just the first step (only SimpleText was cutover to the
 new API).  I want to cutover more components, so we can (finally)
 e.g. play with different encodings depending on the term's postings,
 such as using a bitset for high freq DOCS_ONLY terms (LUCENE-5052).



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5248) Improve the data structure used in ReaderAndLiveDocs to hold the updates

2013-10-09 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera updated LUCENE-5248:
---

Attachment: LUCENE-5248.patch

Patch replaces MapNumericFieldUpdates with PackedNumericFieldUpdates, which holds 
the docs/values data in a PagedMutable and a PagedGrowableWriter respectively. It 
also holds a FixedBitSet the size of maxDoc to mark which documents have a 
numeric value (e.g. for unsetting a value from a document).
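
In sketch form, the rough shape is something like this (constructor arguments 
and the class wiring here are assumptions; see the patch for the real class):

{code}
import org.apache.lucene.util.FixedBitSet;
import org.apache.lucene.util.packed.PackedInts;
import org.apache.lucene.util.packed.PagedGrowableWriter;
import org.apache.lucene.util.packed.PagedMutable;

// Sketch only: packed parallel arrays of (doc, value) pairs plus a bitset
// marking which documents actually carry an updated value.
class PackedUpdatesSketch {
  final FixedBitSet docsWithField;   // marks docs that have an updated value
  final PagedMutable docs;           // packed doc ids, parallel to 'values'
  final PagedGrowableWriter values;  // packed values, grows bits as needed
  int size = 0;

  PackedUpdatesSketch(int maxDoc, int capacity) {
    docsWithField = new FixedBitSet(maxDoc);
    docs = new PagedMutable(capacity, 1024,
        PackedInts.bitsRequired(maxDoc - 1), PackedInts.COMPACT);
    values = new PagedGrowableWriter(capacity, 1024, 1, PackedInts.FAST);
  }

  void add(int doc, long value) {
    docs.set(size, doc);
    values.set(size, value);
    docsWithField.set(doc);  // lets a later pass distinguish "no update"
    size++;
  }
}
{code}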

 Improve the data structure used in ReaderAndLiveDocs to hold the updates
 

 Key: LUCENE-5248
 URL: https://issues.apache.org/jira/browse/LUCENE-5248
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch, 
 LUCENE-5248.patch


 Currently ReaderAndLiveDocs holds the updates in two structures:
 +Map<String,Map<Integer,Long>>+
 Holds a mapping from each field, to all docs that were updated and their 
 values. This structure is updated when applyDeletes is called, and needs to 
 satisfy several requirements:
 # Un-ordered writes: if a field f is updated by two terms, termA and termB, 
 in that order, and termA affects doc=100 and termB doc=2, then the updates 
 are applied in that order, meaning we cannot rely on updates coming in order.
 # Same document may be updated multiple times, either by same term (e.g. 
 several calls to IW.updateNDV) or by different terms. Last update wins.
 # Sequential read: when writing the updates to the Directory 
 (fieldsConsumer), we iterate on the docs in-order and for each one check if 
 it's updated and if not, pull its value from the current DV.
 # A single update may affect several million documents, and therefore needs to 
 be efficient w.r.t. memory consumption.
 +Map<Integer,Map<String,Long>>+
 Holds a mapping from a document, to all the fields that it was updated in and 
 the updated value for each field. This is used by IW.commitMergedDeletes to 
 apply the updates that came in while the segment was merging. The 
 requirements this structure needs to satisfy are:
 # Access in doc order: this is how commitMergedDeletes works.
 # One-pass: we visit a document once (currently) and so if we can, it's 
 better if we know all the fields in which it was updated. The updates are 
 applied to the merged ReaderAndLiveDocs (where they are stored in the first 
 structure mentioned above).
 Comments with proposals will follow next.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5027) Result Set Collapse and Expand Plugins

2013-10-09 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5027:
-

Attachment: SOLR-5027.patch

Added support for the QueryElevationComponent and a test case.

 Result Set Collapse and Expand Plugins
 --

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* 
 and the *ExpandComponent*.
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 The *ExpandComponent* is a search component that takes the collapsed docList 
 and expands the groups for a single page based on parameters provided.
 Initial syntax:
 expand=true   - Turns on the expand component.
 expand.field=<field> - Expands results for this field
 expand.limit=5 - Limits the documents for each expanded group.
 expand.sort=<sort spec> - The sort spec for the expanded documents. Default 
 is score.
 expand.rows=500 - The max number of expanded results to bring back. Default 
 is 500.
 *Note:* Recent patches don't contain the expand component. The July 16 patch 
 does. This will be brought back in when the collapse is finished, or possibly 
 moved to its own ticket.
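
For readers unfamiliar with PostFilters, the collapse idea in miniature (a 
rough sketch, not the actual plugin; groupOrd() is a stand-in for reading the 
collapse field's ord):

{code}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.search.Scorer;
import org.apache.solr.search.DelegatingCollector;

// Sketch: track the best score per group; the real implementation also
// remembers the winning doc per group and only forwards those to the
// delegate collectors once collection is done.
class CollapseSketchCollector extends DelegatingCollector {
  private final Map<Integer,Float> bestScorePerGroup = new HashMap<Integer,Float>();
  private Scorer myScorer;

  @Override
  public void setScorer(Scorer scorer) throws IOException {
    super.setScorer(scorer);
    this.myScorer = scorer;
  }

  @Override
  public void collect(int doc) throws IOException {
    int group = groupOrd(doc);
    float score = myScorer.score();
    Float best = bestScorePerGroup.get(group);
    if (best == null || score > best.floatValue()) {
      bestScorePerGroup.put(group, score);
    }
  }

  private int groupOrd(int doc) {
    return 0; // stand-in for looking up the collapse field's ord for this doc
  }
}
{code}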



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5307) Solr 4.5 collection api ignores collection.configName when used in cloud

2013-10-09 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller resolved SOLR-5307.
---

Resolution: Duplicate

 Solr 4.5 collection api ignores collection.configName when used in cloud
 

 Key: SOLR-5307
 URL: https://issues.apache.org/jira/browse/SOLR-5307
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.5
Reporter: Nathan Neulinger
Assignee: Mark Miller
  Labels: cloud, collection-api, zookeeper

 This worked properly in 4.4, but on 4.5, specifying collection.configName 
 when creating a collection doesn't work - it gets the default regardless of 
 what has been uploaded into zk. Explicitly linking config name to collection 
 ahead of time with zkcli.sh is a workaround I'm using for the moment, but 
 that did not appear to be necessary with 4.4 unless I was doing something 
 wrong and not realizing it.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5306) can not create collection when have over one config

2013-10-09 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5306:
--

Attachment: SOLR-5306.patch

 can not create collection when have over one config
 ---

 Key: SOLR-5306
 URL: https://issues.apache.org/jira/browse/SOLR-5306
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.5
 Environment: win7 jdk 7
Reporter: Liang Tianyu
Assignee: Mark Miller
Priority: Critical
 Attachments: SOLR-5306.patch


 I have uploaded two configs to zookeeper: patent and applicant. I can not 
 create a collection with 
 http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent.
 It shows errors: patent_main_1_shard1_replica1: 
 org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException:
 Could not find configName for collection patent_main_1 found:[applicant, 
 patent]. In Solr 4.4 I can create it successfully.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5306) can not create collection when have over one config

2013-10-09 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5306:
--

Fix Version/s: 5.0
   4.6
   4.5.1

 can not create collection when have over one config
 ---

 Key: SOLR-5306
 URL: https://issues.apache.org/jira/browse/SOLR-5306
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.5
 Environment: win7 jdk 7
Reporter: Liang Tianyu
Assignee: Mark Miller
Priority: Critical
 Fix For: 4.5.1, 4.6, 5.0

 Attachments: SOLR-5306.patch


 I have uploaded two configs to zookeeper: patent and applicant. I can not 
 create a collection with 
 http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent.
 It shows errors: patent_main_1_shard1_replica1: 
 org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException:
 Could not find configName for collection patent_main_1 found:[applicant, 
 patent]. In Solr 4.4 I can create it successfully.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5317) CoreAdmin API is not persisting data properly

2013-10-09 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5317:
--

Fix Version/s: 5.0
   4.6
   4.5.1

 CoreAdmin API is not persisting data properly
 -

 Key: SOLR-5317
 URL: https://issues.apache.org/jira/browse/SOLR-5317
 Project: Solr
  Issue Type: Bug
Reporter: Yago Riveiro
Priority: Critical
 Fix For: 4.5.1, 4.6, 5.0


 There is a regression between 4.4 and 4.5 with the CoreAdmin API: the command 
 doesn't save the result to solr.xml at the time it is executed.
 The full process is described here: https://gist.github.com/yriveiro/6883208
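
In miniature, the regression looks like this (illustrative names; the gist has 
the full sequence):

{code}
curl "http://localhost:8983/solr/admin/cores?action=CREATE&name=core1&instanceDir=core1"
# 4.4: core1 is written to solr.xml as part of the command.
# 4.5: solr.xml is left unchanged at this point (the regression).
{code}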



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter

2013-10-09 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5027:
-

Summary: Field Collapsing PostFilter  (was: Result Set Collapse and Expand 
Plugins)

 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* 
 and the *ExpandComponent*.
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 The *ExpandComponent* is a search component that takes the collapsed docList 
 and expands the groups for a single page based on parameters provided.
 Initial syntax:
 expand=true   - Turns on the expand component.
 expand.field=<field> - Expands results for this field
 expand.limit=5 - Limits the documents for each expanded group.
 expand.sort=<sort spec> - The sort spec for the expanded documents. Default 
 is score.
 expand.rows=500 - The max number of expanded results to bring back. Default 
 is 500.
 *Note:* Recent patches don't contain the expand component. The July 16 patch 
 does. This will be brought back in when the collapse is finished, or possibly 
 moved to its own ticket.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter

2013-10-09 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5027:
-

Description: 
This ticket introduces the *CollapsingQParserPlugin* 

The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
This is a high performance alternative to standard Solr field collapsing (with 
*ngroups*) when the number of distinct groups in the result set is high.

For example in one performance test, a search with 10 million full results and 
1 million collapsed groups:
Standard grouping with ngroups : 17 seconds.
CollapsingQParserPlugin: 300 milli-seconds.

Sample syntax:

Collapse based on the highest scoring document:

{code}
fq={!collapse field=field_name}
{code}

Collapse based on the min value of a numeric field:
{code}
fq={!collapse field=field_name min=field_name}
{code}

Collapse based on the max value of a numeric field:
{code}
fq={!collapse field=field_name max=field_name}
{code}

Collapse with a null policy:
{code}
fq={!collapse field=field_name nullPolicy=null_policy}
{code}
There are three null policies:
ignore : removes docs with a null value in the collapse field (default).
expand : treats each doc with a null value in the collapse field as a separate 
group.
collapse : collapses all docs with a null value into a single group using 
either highest score, or min/max.

*Note:*  The July 16 patch also includes an ExpandComponent that expands the 
collapsed groups for the current search result page. This functionality will be 
moved to its own ticket.






  was:
This ticket introduces two new Solr plugins, the *CollapsingQParserPlugin* and 
the *ExpandComponent*.


The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing.

Collapse based on the highest scoring document:

{code}
fq=(!collapse field=field_name}
{code}

Collapse based on the min value of a numeric field:
{code}
fq={!collapse field=field_name min=field_name}
{code}

Collapse based on the max value of a numeric field:
{code}
fq={!collapse field=field_name max=field_name}
{code}

Collapse with a null policy:
{code}
fq={!collapse field=field_name nullPolicy=null_policy}
{code}
There are three null policies:
ignore : removes docs with a null value in the collapse field (default).
expand : treats each doc with a null value in the collapse field as a separate 
group.
collapse : collapses all docs with a null value into a single group using 
either highest score, or min/max.









The *ExpandComponent* is a search component that takes the collapsed docList 
and expands the groups for a single page based on parameters provided.

Initial syntax:

expand=true   - Turns on the expand component.
expand.field=field - Expands results for this field
expand.limit=5 - Limits the documents for each expanded group.
expand.sort=sort spec - The sort spec for the expanded documents. Default is 
score.
expand.rows=500 - The max number of expanded results to bring back. Default is 
500.

*Note:* Recent patches don't contain the expand component. The July 16 patch 
does. This will be brought back in when the collapse is finished, or possible 
moved to it's own ticket.







 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces the *CollapsingQParserPlugin* 
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
 This is a high performance alternative to standard Solr field collapsing 
 (with *ngroups*) when the number of distinct groups in the result set is high.
 For example in one performance test, a search with 10 million full results 
 and 1 million collapsed groups:
 Standard grouping with ngroups : 17 seconds.
 CollapsingQParserPlugin: 300 milli-seconds.
 Sample syntax:
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 *Note:*  The July 16 patch also includes 

[jira] [Assigned] (SOLR-5027) Field Collapsing PostFilter

2013-10-09 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein reassigned SOLR-5027:


Assignee: Joel Bernstein

 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.6, 5.0

 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces the *CollapsingQParserPlugin* 
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
 This is a high performance alternative to standard Solr field collapsing 
 (with *ngroups*) when the number of distinct groups in the result set is high.
 For example in one performance test, a search with 10 million full results 
 and 1 million collapsed groups:
 Standard grouping with ngroups : 17 seconds.
 CollapsingQParserPlugin: 300 milli-seconds.
 Sample syntax:
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 *Note:*  The July 16 patch also includes an ExpandComponent that expands the 
 collapsed groups for the current search result page. This functionality will be 
 moved to its own ticket.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter

2013-10-09 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5027:
-

Fix Version/s: 5.0
   4.6

 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.6, 5.0

 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces the *CollapsingQParserPlugin* 
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
 This is a high performance alternative to standard Solr field collapsing 
 (with *ngroups*) when the number of distinct groups in the result set is high.
 For example in one performance test, a search with 10 million full results 
 and 1 million collapsed groups:
 Standard grouping with ngroups : 17 seconds.
 CollapsingQParserPlugin: 300 milli-seconds.
 Sample syntax:
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 *Note:*  The July 16 patch also includes an ExpandComponent that expands the 
 collapsed groups for the current search result page. This functionality will be 
 moved to its own ticket.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter

2013-10-09 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5027:
-

Description: 
This ticket introduces the *CollapsingQParserPlugin* 

The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
This is a high performance alternative to standard Solr field collapsing (with 
*ngroups*) when the number of distinct groups in the result set is high.

For example in one performance test, a search with 10 million full results and 
1 million collapsed groups:
Standard grouping with ngroups : 17 seconds.
CollapsingQParserPlugin: 300 milli-seconds.

Sample syntax:

Collapse based on the highest scoring document:

{code}
fq={!collapse field=field_name}
{code}

Collapse based on the min value of a numeric field:
{code}
fq={!collapse field=field_name min=field_name}
{code}

Collapse based on the max value of a numeric field:
{code}
fq={!collapse field=field_name max=field_name}
{code}

Collapse with a null policy:
{code}
fq={!collapse field=field_name nullPolicy=null_policy}
{code}
There are three null policies:
ignore : removes docs with a null value in the collapse field (default).
expand : treats each doc with a null value in the collapse field as a separate 
group.
collapse : collapses all docs with a null value into a single group using 
either highest score, or min/max.

The CollapsingQParserPlugin also fully supports the QueryElevationComponent

*Note:*  The July 16 patch also includes an ExpandComponent that expands the 
collapsed groups for the current search result page. This functionality will be 
moved to its own ticket.






  was:
This ticket introduces the *CollapsingQParserPlugin* 

The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
This is a high performance alternative to standard Solr field collapsing (with 
*ngroups*) when the number of distinct groups in the result set is high.

For example in one performance test, a search with 10 million full results and 
1 million collapsed groups:
Standard grouping with ngroups : 17 seconds.
CollapsingQParserPlugin: 300 milli-seconds.

Sample syntax:

Collapse based on the highest scoring document:

{code}
fq=(!collapse field=field_name}
{code}

Collapse based on the min value of a numeric field:
{code}
fq={!collapse field=field_name min=field_name}
{code}

Collapse based on the max value of a numeric field:
{code}
fq={!collapse field=field_name max=field_name}
{code}

Collapse with a null policy:
{code}
fq={!collapse field=field_name nullPolicy=null_policy}
{code}
There are three null policies:
ignore : removes docs with a null value in the collapse field (default).
expand : treats each doc with a null value in the collapse field as a separate 
group.
collapse : collapses all docs with a null value into a single group using 
either highest score, or min/max.

*Note:*  The July 16 patch also includes and ExpandComponent that expands the 
collapsed groups for the current search result page. This functionality will 
moved to it's own ticket.







 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.6, 5.0

 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces the *CollapsingQParserPlugin* 
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
 This is a high performance alternative to standard Solr field collapsing 
 (with *ngroups*) when the number of distinct groups in the result set is high.
 For example in one performance test, a search with 10 million full results 
 and 1 million collapsed groups:
 Standard grouping with ngroups : 17 seconds.
 CollapsingQParserPlugin: 300 milli-seconds.
 Sample syntax:
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 The CollapsingQParserPlugin also fully supports the QueryElevationComponent
 *Note:*  

[jira] [Updated] (SOLR-5027) Field Collapsing PostFilter

2013-10-09 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-5027:
-

Description: 
This ticket introduces the *CollapsingQParserPlugin* 

The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
This is a high performance alternative to standard Solr field collapsing (with 
*ngroups*) when the number of distinct groups in the result set is high.

For example in one performance test, a search with 10 million full results and 
1 million collapsed groups:
Standard grouping with ngroups : 17 seconds.
CollapsingQParserPlugin: 300 milli-seconds.

Sample syntax:

Collapse based on the highest scoring document:

{code}
fq={!collapse field=field_name}
{code}

Collapse based on the min value of a numeric field:
{code}
fq={!collapse field=field_name min=field_name}
{code}

Collapse based on the max value of a numeric field:
{code}
fq={!collapse field=field_name max=field_name}
{code}

Collapse with a null policy:
{code}
fq={!collapse field=field_name nullPolicy=null_policy}
{code}
There are three null policies:
ignore : removes docs with a null value in the collapse field (default).
expand : treats each doc with a null value in the collapse field as a separate 
group.
collapse : collapses all docs with a null value into a single group using 
either highest score, or min/max.

The CollapsingQParserPlugin also fully supports the QueryElevationComponent

*Note:*  The July 16 patch also includes an ExpandComponent that expands the 
collapsed groups for the current search result page. This functionality will be 
moved to its own ticket.






  was:
This ticket introduces the *CollapsingQParserPlugin* 

The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
This is a high performance alternative to standard Solr field collapsing (with 
*ngroups*) when the number of distinct groups in the result set is high.

For example in one performance test, a search with 10 million full results and 
1 million collapsed groups:
Standard grouping with ngroups : 17 seconds.
CollapsingQParserPlugin: 300 milli-seconds.

Sample syntax:

Collapse based on the highest scoring document:

{code}
fq={!collapse field=field_name}
{code}

Collapse based on the min value of a numeric field:
{code}
fq={!collapse field=field_name min=field_name}
{code}

Collapse based on the max value of a numeric field:
{code}
fq={!collapse field=field_name max=field_name}
{code}

Collapse with a null policy:
{code}
fq={!collapse field=field_name nullPolicy=null_policy}
{code}
There are three null policies:
ignore : removes docs with a null value in the collapse field (default).
expand : treats each doc with a null value in the collapse field as a separate 
group.
collapse : collapses all docs with a null value into a single group using 
either highest score, or min/max.

The CollapsingQParserPlugin also fully supports the QueryElevationComponent

*Note:*  The July 16 patch also includes an ExpandComponent that expands the 
collapsed groups for the current search result page. This functionality will 
be moved to its own ticket.







 Field Collapsing PostFilter
 ---

 Key: SOLR-5027
 URL: https://issues.apache.org/jira/browse/SOLR-5027
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Assignee: Joel Bernstein
Priority: Minor
 Fix For: 4.6, 5.0

 Attachments: SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch, 
 SOLR-5027.patch, SOLR-5027.patch, SOLR-5027.patch


 This ticket introduces the *CollapsingQParserPlugin* 
 The *CollapsingQParserPlugin* is a PostFilter that performs field collapsing. 
 This is a high performance alternative to standard Solr field collapsing 
 (with *ngroups*) when the number of distinct groups in the result set is high.
 For example in one performance test, a search with 10 million full results 
 and 1 million collapsed groups:
 Standard grouping with ngroups : 17 seconds.
 CollapsingQParserPlugin: 300 milli-seconds.
 Sample syntax:
 Collapse based on the highest scoring document:
 {code}
 fq={!collapse field=field_name}
 {code}
 Collapse based on the min value of a numeric field:
 {code}
 fq={!collapse field=field_name min=field_name}
 {code}
 Collapse based on the max value of a numeric field:
 {code}
 fq={!collapse field=field_name max=field_name}
 {code}
 Collapse with a null policy:
 {code}
 fq={!collapse field=field_name nullPolicy=null_policy}
 {code}
 There are three null policies:
 ignore : removes docs with a null value in the collapse field (default).
 expand : treats each doc with a null value in the collapse field as a 
 separate group.
 collapse : collapses all docs with a null value into a single group using 
 either highest score, or min/max.
 The 

[jira] [Commented] (SOLR-5306) can not create collection when have over one config

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790763#comment-13790763
 ] 

ASF subversion and git services commented on SOLR-5306:
---

Commit 1530772 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1530772 ]

SOLR-5306: Extra collection creation parameters like collection.configName are 
not being respected.
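
For anyone verifying the fix, a self-contained sketch of the kind of request 
involved (host, collection, and config names mirror the report below and are 
illustrative only):

{code}
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class CreateCollectionCheck {
  public static void main(String[] args) throws Exception {
    // Note the explicit collection.configName parameter that the bug ignored.
    URL url = new URL("http://localhost:8080/solr/admin/collections"
        + "?action=CREATE&name=patent_main_1&numShards=1"
        + "&collection.configName=patent");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    System.out.println("HTTP " + conn.getResponseCode());
    BufferedReader in = new BufferedReader(
        new InputStreamReader(conn.getInputStream(), "UTF-8"));
    for (String line; (line = in.readLine()) != null; ) {
      // With the fix, the response should not contain the ZooKeeperException
      // about a missing configName.
      System.out.println(line);
    }
    in.close();
  }
}
{code}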

 can not create collection when have over one config
 ---

 Key: SOLR-5306
 URL: https://issues.apache.org/jira/browse/SOLR-5306
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.5
 Environment: win7 jdk 7
Reporter: Liang Tianyu
Assignee: Mark Miller
Priority: Critical
 Fix For: 4.5.1, 4.6, 5.0

 Attachments: SOLR-5306.patch


 I have uploaded two configs to zookeeper: patent and applicant. I can not create 
 a collection with 
 http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent
 and it shows the error: patent_main_1_shard1_replica1: 
 org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException:
 Could not find configName for collection patent_main_1 found:[applicant, 
 patent]. In solr 4.4 I could create it successfully.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5306) can not create collection when have over one config

2013-10-09 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790765#comment-13790765
 ] 

ASF subversion and git services commented on SOLR-5306:
---

Commit 1530773 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1530773 ]

SOLR-5306: Extra collection creation parameters like collection.configName are 
not being respected.

 can not create collection when have over one config
 ---

 Key: SOLR-5306
 URL: https://issues.apache.org/jira/browse/SOLR-5306
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.5
 Environment: win7 jdk 7
Reporter: Liang Tianyu
Assignee: Mark Miller
Priority: Critical
 Fix For: 4.5.1, 4.6, 5.0

 Attachments: SOLR-5306.patch


 I have uploaded two configs to zookeeper: patent and applicant. I can not create 
 a collection with 
 http://localhost:8080/solr/admin/collections?action=CREATE&name=patent_main_1&numShards=1&collection.configName=patent
 and it shows the error: patent_main_1_shard1_replica1: 
 org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException:
 Could not find configName for collection patent_main_1 found:[applicant, 
 patent]. In solr 4.4 I could create it successfully.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5323) Solr requires -Dsolr.clustering.enabled=false when pointing at example config

2013-10-09 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790788#comment-13790788
 ] 

Dawid Weiss commented on SOLR-5323:
---

I can't remember, but I think the problem was that it wasn't possible to define 
install-dir-relative directories for the lib element. I'll take a look.

 Solr requires -Dsolr.clustering.enabled=false when pointing at example config
 -

 Key: SOLR-5323
 URL: https://issues.apache.org/jira/browse/SOLR-5323
 Project: Solr
  Issue Type: Bug
  Components: contrib - Clustering
Affects Versions: 4.5
 Environment: vanilla mac
Reporter: John Berryman
 Fix For: 4.6, 5.0


 My typical use of Solr is something like this: 
 {code}
 cd SOLR_HOME/example
 cp -r solr /myProjectDir/solr_home
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home  start.jar
 {code}
 But in solr 4.5.0 this fails to start successfully. I get an error:
 {code}
 org.apache.solr.common.SolrException: Error loading class 
 'solr.clustering.ClusteringComponent'
 {code}
 The reason is that solr.clustering.enabled now defaults to true. I don't 
 know why this might be the case.
 You can get around it with: 
 {code}
 java -jar -Dsolr.solr.home=/myProjectDir/solr_home 
 -Dsolr.clustering.enabled=false start.jar
 {code}
 SOLR-4708 is when this became an issue.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 400 - Still Failing

2013-10-09 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/400/

1 tests failed.
REGRESSION:  org.apache.lucene.analysis.core.TestRandomChains.testRandomChains

Error Message:
first posInc must be > 0

Stack Trace:
java.lang.IllegalStateException: first posInc must be > 0
at 
__randomizedtesting.SeedInfo.seed([D025BEA04DE60E8F:EDC497C10AF4134F]:0)
at 
org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:89)
at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:694)
at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605)
at 
org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506)
at 
org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:925)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
at java.lang.Thread.run(Thread.java:679)




Build Log:
[...truncated 4359 lines...]
   [junit4] Suite: org.apache.lucene.analysis.core.TestRandomChains
   [junit4]   2 TEST FAIL: useCharFilter=false text='\ucd6f\u8537\uab05d\uf3cd 
qkt  \u0136'
   

[jira] [Commented] (LUCENE-5248) Improve the data structure used in ReaderAndLiveDocs to hold the updates

2013-10-09 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790834#comment-13790834
 ] 

Robert Muir commented on LUCENE-5248:
-

Hi Shai:

Should UpdatesIterator implement DISI? It seems like it might be a good fit.

{code}
+private final FixedBitSet docsWithField;
+private PagedMutable docs;
+private PagedGrowableWriter values;
{code}

When we have multiple related structures like this, maybe we can add a comment 
as to what each is?
Something like:
{code}
// bit per docid: set if the value is real
// TODO: is bitset(maxdoc) really needed since usually its sparse? why not an 
openbitset parallel with docs?
private final FixedBitSet docsWithField;
// holds a list of documents.
// TODO: do these really need to be absolute-encoded?
private PagedMutable docs;
// holds a list of values, parallel with docs
private PagedGrowableWriter values;
{code}

{code}
+  docsWithField = new FixedBitSet(maxDoc);
+  docsWithField.clear(0, maxDoc)
{code}

The clear should be unnecessary!

{code}
+public void add(int doc, Long value) {
+  assert value != null;
+  if (size == Integer.MAX_VALUE) {
+throw new IllegalStateException("cannot support more than 
Integer.MAX_VALUE doc/value entries");
+  }
{code}

Is this really a limitation?

{code}
+@Override
+protected int compare(int i, int j) {
+  return (int) (docs.get(i) - docs.get(j));
+}
{code}

Can we just use Long.compare? This subtraction may be safe... but it would 
smell better.
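
To make the smell concrete, a tiny self-contained demo (not from the patch) of 
why subtraction-based compares are fragile once the two values can differ by 
more than Integer.MAX_VALUE:

{code}
public class CompareDemo {
  public static void main(String[] args) {
    long a = 3L;
    long b = Integer.MAX_VALUE + 10L;
    // The difference is below Integer.MIN_VALUE, so the (int) cast wraps
    // and the result comes out positive, wrongly claiming a > b.
    System.out.println((int) (a - b));      // prints a positive number
    // Long.compare (Java 7+) is correct for any pair of longs.
    System.out.println(Long.compare(a, b)); // prints -1
  }
}
{code}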

 Improve the data structure used in ReaderAndLiveDocs to hold the updates
 

 Key: LUCENE-5248
 URL: https://issues.apache.org/jira/browse/LUCENE-5248
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/index
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5248.patch, LUCENE-5248.patch, LUCENE-5248.patch, 
 LUCENE-5248.patch


 Currently ReaderAndLiveDocs holds the updates in two structures:
 +Map<String,Map<Integer,Long>>+
 Holds a mapping from each field, to all docs that were updated and their 
 values. This structure is updated when applyDeletes is called, and needs to 
 satisfy several requirements:
 # Un-ordered writes: if a field f is updated by two terms, termA and termB, 
 in that order, and termA affects doc=100 and termB doc=2, then the updates 
 are applied in that order, meaning we cannot rely on updates coming in order.
 # Same document may be updated multiple times, either by same term (e.g. 
 several calls to IW.updateNDV) or by different terms. Last update wins.
 # Sequential read: when writing the updates to the Directory 
 (fieldsConsumer), we iterate on the docs in-order and for each one check if 
 it's updated and if not, pull its value from the current DV.
 # A single update may affect several million documents, therefore need to be 
 efficient w.r.t. memory consumption.
 +Map<Integer,Map<String,Long>>+
 Holds a mapping from a document, to all the fields that it was updated in and 
 the updated value for each field. This is used by IW.commitMergedDeletes to 
 apply the updates that came in while the segment was merging. The 
 requirements this structure needs to satisfy are:
 # Access in doc order: this is how commitMergedDeletes works.
 # One-pass: we visit a document once (currently) and so if we can, it's 
 better if we know all the fields in which it was updated. The updates are 
 applied to the merged ReaderAndLiveDocs (where they are stored in the first 
 structure mentioned above).
 Comments with proposals will follow next.
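
As a reference point, the two structures described above in their naive form (a 
sketch of what the issue wants to improve on, not the proposed replacement):

{code}
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class NaiveUpdates {
  // field -> (docid -> updated value); filled while applyDeletes runs
  final Map<String,Map<Integer,Long>> byField =
      new HashMap<String,Map<Integer,Long>>();
  // docid -> (field -> updated value); a TreeMap gives the doc-ordered
  // iteration that commitMergedDeletes needs
  final Map<Integer,Map<String,Long>> byDoc =
      new TreeMap<Integer,Map<String,Long>>();
}
{code}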



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [JENKINS] Lucene-Solr-NightlyTests-4.x - Build # 400 - Still Failing

2013-10-09 Thread Robert Muir
I will investigate. Looks like fun.

On Wed, Oct 9, 2013 at 4:18 PM, Apache Jenkins Server
jenk...@builds.apache.org wrote:
 Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-4.x/400/

 1 tests failed.
 REGRESSION:  org.apache.lucene.analysis.core.TestRandomChains.testRandomChains

 Error Message:
 first posInc must be > 0

 Stack Trace:
 java.lang.IllegalStateException: first posInc must be > 0
 at 
 __randomizedtesting.SeedInfo.seed([D025BEA04DE60E8F:EDC497C10AF4134F]:0)
 at 
 org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:89)
 at 
 org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:694)
 at 
 org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:605)
 at 
 org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:506)
 at 
 org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:925)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:616)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:737)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:773)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:787)
 at 
 org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
 at 
 org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:782)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:442)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:746)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:648)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:682)
 at 
 com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
 at 
 org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
 at 
 org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
 at 
 com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
 at 
 org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
 at 
 org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
 at 
 org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
 at 
 com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
 at 
 com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
 at 

[JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 900 - Failure!

2013-10-09 Thread Policeman Jenkins Server
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/900/
Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseParallelGC

All tests passed

Build Log:
[...truncated 10176 lines...]
   [junit4] ERROR: JVM J0 ended with an exception, command line: 
/Library/Java/JavaVirtualMachines/jdk1.7.0_40.jdk/Contents/Home/jre/bin/java 
-XX:+UseCompressedOops -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps 
-Dtests.prefix=tests -Dtests.seed=EE974D36626AC16B -Xmx512M -Dtests.iters= 
-Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random 
-Dtests.postingsformat=random -Dtests.docvaluesformat=random 
-Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random 
-Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 
-Dtests.cleanthreads=perClass 
-Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties
 -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true 
-Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. 
-Djava.io.tmpdir=. 
-Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp
 
-Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db
 -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
-Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy
 -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Dtests.disableHdfs=true -Dfile.encoding=ISO-8859-1 
-classpath 

Re: [JENKINS] Lucene-Solr-trunk-MacOSX (64bit/jdk1.7.0) - Build # 900 - Failure!

2013-10-09 Thread Robert Muir
malloc/free bug.

On Wed, Oct 9, 2013 at 4:47 PM, Policeman Jenkins Server
jenk...@thetaphi.de wrote:
 Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-MacOSX/900/
 Java: 64bit/jdk1.7.0 -XX:+UseCompressedOops -XX:+UseParallelGC

 All tests passed

 Build Log:
 [...truncated 10176 lines...]
[junit4] ERROR: JVM J0 ended with an exception, command line: 
 /Library/Java/JavaVirtualMachines/jdk1.7.0_40.jdk/Contents/Home/jre/bin/java 
 -XX:+UseCompressedOops -XX:+UseParallelGC -XX:+HeapDumpOnOutOfMemoryError 
 -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/heapdumps 
 -Dtests.prefix=tests -Dtests.seed=EE974D36626AC16B -Xmx512M -Dtests.iters= 
 -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random 
 -Dtests.postingsformat=random -Dtests.docvaluesformat=random 
 -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random 
 -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=5.0 
 -Dtests.cleanthreads=perClass 
 -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/logging.properties
  -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true 
 -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. 
 -Djava.io.tmpdir=. 
 -Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/solr/build/solr-core/test/temp
  
 -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/build/clover/db
  -Djava.security.manager=org.apache.lucene.util.TestSecurityManager 
 -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-trunk-MacOSX/lucene/tools/junit4/tests.policy
  -Dlucene.version=5.0-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 
 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
 -Djava.awt.headless=true -Dtests.disableHdfs=true -Dfile.encoding=ISO-8859-1 
 -classpath 
 

[jira] [Created] (LUCENE-5269) TestRandomChains failure

2013-10-09 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5269:
---

 Summary: TestRandomChains failure
 Key: LUCENE-5269
 URL: https://issues.apache.org/jira/browse/LUCENE-5269
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


One of EdgeNGramTokenizer, ShingleFilter, or NGramTokenFilter is buggy, or 
possibly it is only the combination of them conspiring together.






--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5269) TestRandomChains failure

2013-10-09 Thread Robert Muir (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-5269:


Attachment: LUCENE-5269_test.patch

Here's a test. For whatever reason the exact text from jenkins wouldn't reproduce 
with checkAnalysisConsistency under the exact configuration.

However, the random seed reproduces in jenkins easily. I suspect maybe there is 
something not being reset, and the linedocs file is triggering it???

If I blast random data at the configuration, it fails the same way.

I then removed various harmless filters and so on until I was left with these 
three, and it was still failing...
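
For reference, a sketch of the reduced chain in BaseTokenStreamTestCase terms 
(the n-gram and shingle settings here are illustrative guesses, not the failing 
seed's exact configuration):

{code}
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.BaseTokenStreamTestCase;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.ngram.EdgeNGramTokenizer;
import org.apache.lucene.analysis.ngram.NGramTokenFilter;
import org.apache.lucene.analysis.shingle.ShingleFilter;

public class TestReducedChain extends BaseTokenStreamTestCase {
  public void testReducedChain() throws Exception {
    Analyzer a = new Analyzer() {
      @Override
      protected TokenStreamComponents createComponents(String field, Reader reader) {
        // EdgeNGramTokenizer -> ShingleFilter -> NGramTokenFilter,
        // the three suspects left after removing harmless filters.
        Tokenizer tok = new EdgeNGramTokenizer(TEST_VERSION_CURRENT, reader, 2, 4);
        TokenStream ts = new ShingleFilter(tok);
        ts = new NGramTokenFilter(TEST_VERSION_CURRENT, ts, 1, 2);
        return new TokenStreamComponents(tok, ts);
      }
    };
    // expected to trip the "first posInc must be > 0" check
    checkRandomData(random(), a, 1000);
  }
}
{code}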

 TestRandomChains failure
 

 Key: LUCENE-5269
 URL: https://issues.apache.org/jira/browse/LUCENE-5269
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5269_test.patch


 One of EdgeNGramTokenizer, ShingleFilter, or NGramTokenFilter is buggy, or 
 possibly it is only the combination of them conspiring together.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (LUCENE-5270) add Terms.hasFreqs

2013-10-09 Thread Michael McCandless (JIRA)
Michael McCandless created LUCENE-5270:
--

 Summary: add Terms.hasFreqs
 Key: LUCENE-5270
 URL: https://issues.apache.org/jira/browse/LUCENE-5270
 Project: Lucene - Core
  Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.6, 5.0


While working on LUCENE-5268, I realized we have hasPositions/Offsets/Payloads 
methods in Terms but not hasFreqs ...
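
A rough sketch of what the addition might look like, mirroring the existing 
accessors (the javadoc and exact semantics here are assumptions until a patch 
lands):

{code}
// Hypothetical addition to org.apache.lucene.index.Terms; the other existing
// abstract methods are elided.
public abstract class Terms {
  /** Returns true if documents in this field store per-document term
   *  frequency (IndexOptions.DOCS_AND_FREQS or higher). */
  public abstract boolean hasFreqs();

  // existing accessors, for symmetry:
  public abstract boolean hasOffsets();
  public abstract boolean hasPositions();
  public abstract boolean hasPayloads();
}
{code}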



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Assigned] (SOLR-5317) CoreAdmin API is not persisting data properly

2013-10-09 Thread Mark Miller (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller reassigned SOLR-5317:
-

Assignee: Mark Miller

 CoreAdmin API is not persisting data properly
 -

 Key: SOLR-5317
 URL: https://issues.apache.org/jira/browse/SOLR-5317
 Project: Solr
  Issue Type: Bug
Reporter: Yago Riveiro
Assignee: Mark Miller
Priority: Critical
 Fix For: 4.5.1, 4.6, 5.0


 There is a regression between 4.4 and 4.5 in the CoreAdmin API: the command 
 doesn't save its result to solr.xml at the time it is executed.
 The full process is described here: https://gist.github.com/yriveiro/6883208



--
This message was sent by Atlassian JIRA
(v6.1#6144)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org


