[jira] [Commented] (SOLR-5104) Remove Default Core

2013-08-01 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727053#comment-13727053
 ] 

Mark Miller commented on SOLR-5104:
---

We certainly want to do this, and I don't consider it removing a feature. It's 
purely an improvement IMO. The main reason I have not pushed for it yet was 
that it killed the admin UI - but now that that is fixed, this is the next 
step.

This is a relic from the pre-multi-core days - when Solr was one index and that 
was it. Backcompat sludge is what has kept it around IMO - we want to act like 
most systems and start empty. It should be very simple for a user to create his 
first collection, but he should be the one to name it. As Grant mentions, this 
is certainly what you want for scriptability, and it's more consistent with 
other systems for users as well.

A new user should:

1. Start Solr
2. Name and create their first collection.

When they want more collections, repeat step 2, a step they learned right away.
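For illustration, a minimal hedged sketch of what step 2 could look like programmatically against the Collections API (the collection name, host, and shard count below are example values, not defaults):

{code}
// Hedged sketch only: creates a user-named collection via the Collections API.
// "mycollection", localhost:8983 and numShards=1 are illustrative values.
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class CreateFirstCollection {
  public static void main(String[] args) throws Exception {
    URL url = new URL("http://localhost:8983/solr/admin/collections"
        + "?action=CREATE&name=mycollection&numShards=1");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    try (InputStream in = conn.getInputStream()) {
      System.out.println("Collections API returned HTTP " + conn.getResponseCode());
    }
  }
}
{code}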

 Remove Default Core
 ---

 Key: SOLR-5104
 URL: https://issues.apache.org/jira/browse/SOLR-5104
 Project: Solr
  Issue Type: Sub-task
Reporter: Grant Ingersoll
 Fix For: 5.0


 I see no reason to maintain the notion of a default Core/Collection.  We can 
 either default to Collection1, or simply create a core on the fly based 
 on the client's request.  Thus, all APIs that access a core would 
 require the core to be in the address path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5104) Remove Default Core

2013-08-01 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727060#comment-13727060
 ] 

Mark Miller commented on SOLR-5104:
---

Took me a moment to realize this is not referring to removing the core 
that ships with Solr, but to the default core feature. I want to remove the actual 
default core that is set up, so certainly +1 on dropping this. I think we 
already discussed it some for 5.0.

 Remove Default Core
 ---

 Key: SOLR-5104
 URL: https://issues.apache.org/jira/browse/SOLR-5104
 Project: Solr
  Issue Type: Sub-task
Reporter: Grant Ingersoll
 Fix For: 5.0


 I see no reason to maintain the notion of a default Core/Collection.  We can 
 either default to Collection1, or simply create a core on the fly based 
 on the client's request.  Thus, all APIs that access a core would 
 require the core to be in the address path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Mark Miller
Welcome!

- Mark

On Jul 31, 2013, at 6:47 PM, Robert Muir rcm...@gmail.com wrote:

 I'm pleased to announce that Cassandra Targett has accepted to join our ranks 
 as a committer.
 
 Cassandra worked on the donation of the new Solr Reference Guide [1] and 
 getting things in order for its first official release [2].
 Cassandra, it is tradition that you introduce yourself with a brief bio.
 
 Welcome!
 
 P.S. As soon as your SVN access is setup, you should then be able to add 
 yourself to the committers list on the website as well.
 
 [1] 
 https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
 [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
 



[jira] [Created] (LUCENE-5156) CompressingTermVectors termsEnum should probably not support seek-by-ord

2013-08-01 Thread Robert Muir (JIRA)
Robert Muir created LUCENE-5156:
---

 Summary: CompressingTermVectors termsEnum should probably not 
support seek-by-ord
 Key: LUCENE-5156
 URL: https://issues.apache.org/jira/browse/LUCENE-5156
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


Just like term vectors before it, it has an O(n) seek-by-term. 

But this one also advertises seek-by-ord, only this is also O(n).

This could cause e.g. CheckIndex to be very slow, because if a TermsEnum supports 
ord it does a bunch of seeking tests. (Another solution would be to leave it, 
and add a boolean so CheckIndex never does seeking tests for term vectors, only 
real fields.)

However, I think it's also kind of a trap: if seek-by-ord is 
supported anywhere, you kind of expect it to be faster than linear time...?
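For illustration, a hedged sketch (not the actual CompressingTermVectors code) of what opting out of ord support looks like for a TermsEnum, so consumers such as CheckIndex skip their ord-based seeking tests:

{code}
// Hedged sketch, assuming the usual TermsEnum contract: an enum that cannot
// seek by ordinal in sub-linear time simply refuses to support ord at all.
import org.apache.lucene.index.TermsEnum;

abstract class NoOrdTermsEnum extends TermsEnum {
  @Override
  public long ord() {
    throw new UnsupportedOperationException("ord() not supported");
  }

  @Override
  public void seekExact(long ord) {
    throw new UnsupportedOperationException("seek-by-ord not supported");
  }
}
{code}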

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Kranti Parisa
Awesome!
Congrats & Welcome Cassandra!

Thanks & Regards,
Kranti K Parisa
http://www.linkedin.com/in/krantiparisa



On Thu, Aug 1, 2013 at 7:23 PM, Mark Miller markrmil...@gmail.com wrote:

 Welcome!

 - Mark

 On Jul 31, 2013, at 6:47 PM, Robert Muir rcm...@gmail.com wrote:

 I'm pleased to announce that Cassandra Targett has accepted to join our
 ranks as a committer.

 Cassandra worked on the donation of the new Solr Reference Guide [1] and
 getting things in order for its first official release [2].
 Cassandra, it is tradition that you introduce yourself with a brief bio.

 Welcome!

 P.S. As soon as your SVN access is setup, you should then be able to add
 yourself to the committers list on the website as well.

 [1]
 https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
 [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/





[jira] [Updated] (SOLR-2570) randomize indexwriter settings in solr tests

2013-08-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-2570:
---

Attachment: SOLR-2570.patch

Updated patch taking into account the work already done in SOLR-4942 and 
SOLR-4951.

In my limited testing so far, I haven't seen any obvious failures -- so I'd 
like to commit soon and then move forward with using the XML include snippet in 
more configs (SOLR-4952).

 randomize indexwriter settings in solr tests
 

 Key: SOLR-2570
 URL: https://issues.apache.org/jira/browse/SOLR-2570
 Project: Solr
  Issue Type: Sub-task
  Components: Build
Reporter: Robert Muir
Assignee: Robert Muir
 Fix For: 4.5, 5.0

 Attachments: SOLR-2570.patch, SOLR-2570.patch


 we should randomize indexwriter settings like lucene tests do, to vary # of 
 segments and such.
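 For illustration, a hedged sketch of the kind of per-run randomization Lucene-style test frameworks apply (the names and value ranges here are made up, not the attached patch):

{code}
// Hedged sketch (illustrative values, not the SOLR-2570 patch): vary
// IndexWriter flush settings per test run so segment counts differ.
import java.util.Random;
import org.apache.lucene.index.IndexWriterConfig;

class RandomizedIWSettings {
  static IndexWriterConfig randomize(IndexWriterConfig cfg, Random random) {
    cfg.setMaxBufferedDocs(2 + random.nextInt(1000));         // flush by doc count
    cfg.setRAMBufferSizeMB(0.1 + random.nextDouble() * 64.0); // flush by RAM usage
    return cfg;
  }
}
{code}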

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Tommaso Teofili
Welcome onboard Cassandra!
Tommaso


2013/8/1 Robert Muir rcm...@gmail.com

 I'm pleased to announce that Cassandra Targett has accepted to join our
 ranks as a committer.

 Cassandra worked on the donation of the new Solr Reference Guide [1] and
 getting things in order for its first official release [2].
 Cassandra, it is tradition that you introduce yourself with a brief bio.

 Welcome!

 P.S. As soon as your SVN access is setup, you should then be able to add
 yourself to the committers list on the website as well.

 [1]
 https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
 [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/




Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Christian Moen
Welcome Cassandra!

Christian

On Aug 1, 2013, at 7:47 AM, Robert Muir rcm...@gmail.com wrote:

 I'm pleased to announce that Cassandra Targett has accepted to join our ranks 
 as a committer.
 
 Cassandra worked on the donation of the new Solr Reference Guide [1] and 
 getting things in order for its first official release [2].
 Cassandra, it is tradition that you introduce yourself with a brief bio.
 
 Welcome!
 
 P.S. As soon as your SVN access is setup, you should then be able to add 
 yourself to the committers list on the website as well.
 
 [1] 
 https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
 [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
 



[jira] [Created] (SOLR-5100) java.lang.OutOfMemoryError: Requested array size exceeds VM limit

2013-08-01 Thread Grzegorz Sobczyk (JIRA)
Grzegorz Sobczyk created SOLR-5100:
--

 Summary: java.lang.OutOfMemoryError: Requested array size exceeds 
VM limit
 Key: SOLR-5100
 URL: https://issues.apache.org/jira/browse/SOLR-5100
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2.1
 Environment: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 
GNU/Linux
Java 7, Tomcat, ZK standalone
Reporter: Grzegorz Sobczyk


Today I found this exception in the log (lmsiprse01):

{code}
sie 01, 2013 5:27:26 AM org.apache.solr.core.SolrCore execute
INFO: [products] webapp=/solr path=/select 
params={facet=true&start=0&q=&facet.limit=-1&facet.field=attribute_u-typ&facet.field=attribute_u-gama-kolorystyczna&facet.field=brand_name&wt=javabin&fq=node_id:1056&version=2&rows=0}
 hits=1241 status=0 QTime=33 
sie 01, 2013 5:27:26 AM org.apache.solr.common.SolrException log
SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Requested 
array size exceeds VM limit
at 
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at org.apache.lucene.util.PriorityQueue.<init>(PriorityQueue.java:64)
at org.apache.lucene.util.PriorityQueue.<init>(PriorityQueue.java:37)
at 
org.apache.solr.handler.component.ShardFieldSortedHitQueue.<init>(ShardDoc.java:113)
at 
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:766)
at 
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:625)
at 
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:604)
at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
... 13 more

{code}


We have: 
* 3x standalone ZK
* 3x Solr 4.2.1 on Tomcat

The exception shows up after the leader was stopped:

* lmsiprse01:
[2013-08-01 05:23:43]: /etc/init.d/tomcat6-1 stop
[2013-08-01 05:25:09]: /etc/init.d/tomcat6-1 start
* lmsiprse02 (leader):
[2013-08-01 05:27:21]: /etc/init.d/tomcat6-1 stop
[2013-08-01 05:29:31]: /etc/init.d/tomcat6-1 start
* lmsiprse03:
[2013-08-01 05:25:48]: /etc/init.d/tomcat6-1 stop
[2013-08-01 05:26:42]: /etc/init.d/tomcat6-1 start
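
As a side note on the error itself, a tiny hedged illustration (not Solr code) of when the JVM reports this particular OutOfMemoryError: it is thrown when a single array allocation exceeds the VM's per-array limit, regardless of how much heap is actually free.

{code}
// Hedged illustration only: HotSpot rejects a single oversized array
// allocation with "Requested array size exceeds VM limit".
public class ArraySizeLimitDemo {
  public static void main(String[] args) {
    Object[] tooBig = new Object[Integer.MAX_VALUE]; // throws the same OOM message
  }
}
{code}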


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5084) new field type - EnumField

2013-08-01 Thread Elran Dvir (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elran Dvir updated SOLR-5084:
-

Attachment: Solr-5084.patch

 new field type - EnumField
 --

 Key: SOLR-5084
 URL: https://issues.apache.org/jira/browse/SOLR-5084
 Project: Solr
  Issue Type: New Feature
Reporter: Elran Dvir
 Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, 
 Solr-5084.patch


 We have encountered a use case in our system where we have a few fields 
 (Severity, Risk, etc.) with a closed set of values, where the sort order for 
 these values is pre-determined but not lexicographic (Critical is higher than 
 High). Generically, this is very close to how enums work.
 To implement this, I have prototyped a new type of field: EnumField, where the 
 inputs are a closed, predefined set of strings in a special configuration 
 file (similar to currency.xml).
 The code is based on 4.2.1.
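 For illustration, a hedged sketch of the ordering problem being described (the value names are examples, not from the attached patch): lexicographic order and the desired severity ranking disagree.

{code}
// Hedged illustration: the desired ranking follows the enum's ordinal,
// not the lexicographic order of the labels.
public class SeverityOrderingDemo {
  enum Severity { LOW, MEDIUM, HIGH, CRITICAL } // ordinal encodes the ranking

  public static void main(String[] args) {
    System.out.println("Critical".compareTo("High") < 0);               // true: lexicographically lower
    System.out.println(Severity.CRITICAL.compareTo(Severity.HIGH) > 0); // true: ranked higher
  }
}
{code}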

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5084) new field type - EnumField

2013-08-01 Thread Elran Dvir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726151#comment-13726151
 ] 

Elran Dvir commented on SOLR-5084:
--

I reformatted the code.
I hope it's OK now.

Thanks. 

 new field type - EnumField
 --

 Key: SOLR-5084
 URL: https://issues.apache.org/jira/browse/SOLR-5084
 Project: Solr
  Issue Type: New Feature
Reporter: Elran Dvir
 Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, 
 Solr-5084.patch


 We have encountered a use case in our system where we have a few fields 
 (Severity, Risk, etc.) with a closed set of values, where the sort order for 
 these values is pre-determined but not lexicographic (Critical is higher than 
 High). Generically, this is very close to how enums work.
 To implement this, I have prototyped a new type of field: EnumField, where the 
 inputs are a closed, predefined set of strings in a special configuration 
 file (similar to currency.xml).
 The code is based on 4.2.1.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Adrien Grand
Welcome Cassandra!

-- 
Adrien

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Martijn v Groningen
Welcome!


On 1 August 2013 09:06, Adrien Grand jpou...@gmail.com wrote:

 Welcome Cassandra!

 --
 Adrien

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-- 
Met vriendelijke groet,

Martijn van Groningen


Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Simon Willnauer
welcome!

On Thu, Aug 1, 2013 at 9:18 AM, Martijn v Groningen
martijn.v.gronin...@gmail.com wrote:
 Welcome!


 On 1 August 2013 09:06, Adrien Grand jpou...@gmail.com wrote:

 Welcome Cassandra!

 --
 Adrien

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




 --
 Met vriendelijke groet,

 Martijn van Groningen

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2013-08-01 Thread Joern Kottmann (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726191#comment-13726191
 ] 

Joern Kottmann commented on LUCENE-2899:


Stanford NLP is licensed under GPLv2; this license is not compatible with the 
AL 2.0, and therefore such a component can't be contributed to an Apache project 
directly.

 Add OpenNLP Analysis capabilities as a module
 -

 Key: LUCENE-2899
 URL: https://issues.apache.org/jira/browse/LUCENE-2899
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 5.0, 4.5

 Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, 
 LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, 
 LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, 
 LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, 
 OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch


 Now that OpenNLP is an ASF project and has a nice license, it would be nice 
 to have a submodule (under analysis) that exposed capabilities for it. Drew 
 Farris, Tom Morton and I have code that does:
 * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
 would have to change slightly to buffer tokens)
 * NamedEntity recognition as a TokenFilter
 We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
 either payloads (PartOfSpeechAttribute?) on a token or at the same position.
 I'd propose it go under:
 modules/analysis/opennlp

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Alan Woodward
Welcome Cassandra!

Alan Woodward
www.flax.co.uk


On 31 Jul 2013, at 23:47, Robert Muir wrote:

 I'm pleased to announce that Cassandra Targett has accepted to join our ranks 
 as a committer.
 
 Cassandra worked on the donation of the new Solr Reference Guide [1] and 
 getting things in order for its first official release [2].
 Cassandra, it is tradition that you introduce yourself with a brief bio.
 
 Welcome!
 
 P.S. As soon as your SVN access is setup, you should then be able to add 
 yourself to the committers list on the website as well.
 
 [1] 
 https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
 [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
 



[jira] [Assigned] (SOLR-5099) The core.properties not created during collection creation

2013-08-01 Thread Alan Woodward (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Woodward reassigned SOLR-5099:
---

Assignee: Alan Woodward

 The core.properties not created during collection creation
 --

 Key: SOLR-5099
 URL: https://issues.apache.org/jira/browse/SOLR-5099
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.5, 5.0
Reporter: Herb Jiang
Assignee: Alan Woodward
Priority: Critical
 Attachments: CorePropertiesLocator.java.patch


 When using the new solr.xml structure, the core auto-discovery mechanism 
 tries to find core.properties. 
 But I found that core.properties cannot be created when I dynamically create a 
 collection.
 The root issue is that the CorePropertiesLocator tries to create the properties 
 file before the instanceDir is created. 
 The collection creation process completes and looks fine at runtime, but it 
 will cause issues later (cores are not auto-discovered after server restart).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module

2013-08-01 Thread Andrew Janowczyk (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726221#comment-13726221
 ] 

Andrew Janowczyk commented on LUCENE-2899:
--

Ahhh, thanks for the info. I found a relevant link discussing the licenses which 
clearly explains the details 
[here|http://www.apache.org/licenses/GPL-compatibility.html]. Oh well, it was 
worth a try :)

 Add OpenNLP Analysis capabilities as a module
 -

 Key: LUCENE-2899
 URL: https://issues.apache.org/jira/browse/LUCENE-2899
 Project: Lucene - Core
  Issue Type: New Feature
  Components: modules/analysis
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
Priority: Minor
 Fix For: 5.0, 4.5

 Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, 
 LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, 
 LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, 
 LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, 
 OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch


 Now that OpenNLP is an ASF project and has a nice license, it would be nice 
 to have a submodule (under analysis) that exposed capabilities for it. Drew 
 Farris, Tom Morton and I have code that does:
 * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it 
 would have to change slightly to buffer tokens)
 * NamedEntity recognition as a TokenFilter
 We are also planning a Tokenizer/TokenFilter that can put parts of speech as 
 either payloads (PartOfSpeechAttribute?) on a token or at the same position.
 I'd propose it go under:
 modules/analysis/opennlp

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: VInt block length in Lucene 4.1 postings format

2013-08-01 Thread Han Jiang
Hi Aleksandra,

The PostingsReader uses a skip list to determine the start file
pointer of each block (both FOR-packed and vInt-encoded). The
information is currently maintained by Lucene41SkipReader.

The tricky part is that, for each term, the skip data is exactly at the end
of the TermFreqs blocks, so if you fetch the startFP for the vInt block, and
know the docTermStartOffset and skipOffset for the current term, you can
calculate what you need.

http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/codecs/lucene41/Lucene41PostingsFormat.html#Frequencies
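
Roughly, the arithmetic looks like the following hedged sketch (the names are illustrative, not actual Lucene41PostingsReader fields, and it assumes skipOffset is relative to the term's doc start file pointer with the skip data directly after the trailing vInt block):

{code}
// Hedged sketch of the byte-length arithmetic described above; names and
// values are illustrative only.
public class VIntBlockLengthSketch {
  static long vIntBlockLength(long docTermStartFP, long skipOffset, long vIntBlockStartFP) {
    long skipDataFP = docTermStartFP + skipOffset;  // skip data sits right after TermFreqs
    return skipDataFP - vIntBlockStartFP;           // bytes to copy without re-encoding
  }
}
{code}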

On Thu, Aug 1, 2013 at 4:20 PM, Aleksandra Woźniak
aleksandra.k.wozn...@gmail.com wrote:
 Hi all,

 recently I wanted to try out some modifications of Lucene's postings format
 (namely, copying blocks that have no deletions without int-decoding/encoding
 -- this is similar to what was described here:
 https://issues.apache.org/jira/browse/LUCENE-2082). I started with changing
 Lucene 4.1 postings format to check what can be done there.

 I came across the following problem: in Lucene41PostingsReader the length
 (number of bytes) of the last, vInt-encoded, block of postings is not known
 before all individual postings are read and decoded. When reading this block
 we only know the number of postings that should be read and decoded -- since
 vInts have different sizes by definition.

 If I want to copy the whole block without vInt decoding/encoding, I need
 to know how many bytes I have to read from the postings index input. So, my
 question is: is there a clean way to determine the length of this block (i.e.
 the number of bytes that this block has)? Is the number of bytes in a
 posting list tracked somewhere in the Lucene 4.1 postings format?

 Thanks,
 Aleksandra



-- 
Han Jiang

Team of Search Engine and Web Mining,
School of Electronic Engineering and Computer Science,
Peking University, China

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation

2013-08-01 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726244#comment-13726244
 ] 

Alan Woodward commented on SOLR-5099:
-

This is because creating a core in normal mode requires that the instance dir 
is already present, but creation via SolrCloud allows you to load all config 
from zookeeper, and so doesn't need an actual instance dir.  Nice catch.

I'll add a test for the Collections API as well.
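
For illustration, a hedged sketch of the fix direction (not the attached patch): make sure the instance directory exists before core.properties is written.

{code}
// Hedged sketch only; class and method names are illustrative.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Properties;

class CorePropertiesWriterSketch {
  static void persist(File instanceDir, Properties coreProps) throws IOException {
    if (!instanceDir.exists() && !instanceDir.mkdirs()) {  // create the dir tree first
      throw new IOException("Could not create instance dir: " + instanceDir);
    }
    FileOutputStream out = new FileOutputStream(new File(instanceDir, "core.properties"));
    try {
      coreProps.store(out, "written after creating instanceDir");
    } finally {
      out.close();
    }
  }
}
{code}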

 The core.properties not created during collection creation
 --

 Key: SOLR-5099
 URL: https://issues.apache.org/jira/browse/SOLR-5099
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.5, 5.0
Reporter: Herb Jiang
Assignee: Alan Woodward
Priority: Critical
 Attachments: CorePropertiesLocator.java.patch


 When using the new solr.xml structure, the core auto-discovery mechanism 
 tries to find core.properties. 
 But I found that core.properties cannot be created when I dynamically create a 
 collection.
 The root issue is that the CorePropertiesLocator tries to create the properties 
 file before the instanceDir is created. 
 The collection creation process completes and looks fine at runtime, but it 
 will cause issues later (cores are not auto-discovered after server restart).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf

2013-08-01 Thread Gilad Barkai (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726262#comment-13726262
 ] 

Gilad Barkai commented on LUCENE-5155:
--

Patch looks good.
+1 for commit.

Perhaps also document that FRNode is now comparable?


 Add OrdinalValueResolver in favor of FacetRequest.getValueOf
 

 Key: LUCENE-5155
 URL: https://issues.apache.org/jira/browse/LUCENE-5155
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5155.patch


 FacetRequest.getValueOf is responsible for resolving an ordinal's value. It 
 is given FacetArrays, and typically does something like 
 {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this 
 method is to allow special requests, e.g. average, to do some post processing 
 on the values, that couldn't be done during aggregation.
 I feel that getValueOf is in the wrong place -- the calls to 
 getInt/FloatArray are really redundant. Also, if an aggregator maintains some 
 statistics by which it needs to correct the aggregated values, it's not 
 trivial to pass it from the aggregator to the request.
 Therefore I would like to make the following changes:
 * Remove FacetRequest.getValueOf and .getFacetArraysSource
 * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays 
 and has a simple API .valueOf(ordinal).
 * Modify the FacetResultHandlers to use OrdValResolver.
 This allows an OVR to initialize the right array instance(s) in the ctor, and 
 return the value of the requested ordinal, without doing arrays.getArray() 
 calls.
 Will post a patch shortly.
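 For illustration, a hedged sketch of the direction described (names follow the issue text, not necessarily the committed code): the resolver fetches the right array once, so per-ordinal lookups avoid repeated arrays.getIntArray() calls.

{code}
// Hedged sketch only, not the committed LUCENE-5155 code.
interface OrdinalValueResolverSketch {
  double valueOf(int ordinal);
}

class IntArrayValueResolver implements OrdinalValueResolverSketch {
  private final int[] values;

  IntArrayValueResolver(int[] values) { // e.g. the result of facetArrays.getIntArray()
    this.values = values;               // fetched once, in the constructor
  }

  @Override
  public double valueOf(int ordinal) {
    return values[ordinal];             // no per-ordinal getIntArray() call
  }
}
{code}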

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf

2013-08-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726279#comment-13726279
 ] 

ASF subversion and git services commented on LUCENE-5155:
-

Commit 1509152 from [~shaie] in branch 'dev/trunk'
[ https://svn.apache.org/r1509152 ]

LUCENE-5155: add OrdinalValueResolver

 Add OrdinalValueResolver in favor of FacetRequest.getValueOf
 

 Key: LUCENE-5155
 URL: https://issues.apache.org/jira/browse/LUCENE-5155
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5155.patch


 FacetRequest.getValueOf is responsible for resolving an ordinal's value. It 
 is given FacetArrays, and typically does something like 
 {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this 
 method is to allow special requests, e.g. average, to do some post processing 
 on the values, that couldn't be done during aggregation.
 I feel that getValueOf is in the wrong place -- the calls to 
 getInt/FloatArray are really redundant. Also, if an aggregator maintains some 
 statistics by which it needs to correct the aggregated values, it's not 
 trivial to pass it from the aggregator to the request.
 Therefore I would like to make the following changes:
 * Remove FacetRequest.getValueOf and .getFacetArraysSource
 * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays 
 and has a simple API .valueOf(ordinal).
 * Modify the FacetResultHandlers to use OrdValResolver.
 This allows an OVR to initialize the right array instance(s) in the ctor, and 
 return the value of the requested ordinal, without doing arrays.getArray() 
 calls.
 Will post a patch shortly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf

2013-08-01 Thread Shai Erera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shai Erera resolved LUCENE-5155.


   Resolution: Fixed
Fix Version/s: 4.5
   5.0

Thanks Gilad, added a comment and committed.

 Add OrdinalValueResolver in favor of FacetRequest.getValueOf
 

 Key: LUCENE-5155
 URL: https://issues.apache.org/jira/browse/LUCENE-5155
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5155.patch


 FacetRequest.getValueOf is responsible for resolving an ordinal's value. It 
 is given FacetArrays, and typically does something like 
 {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this 
 method is to allow special requests, e.g. average, to do some post processing 
 on the values, that couldn't be done during aggregation.
 I feel that getValueOf is in the wrong place -- the calls to 
 getInt/FloatArray are really redundant. Also, if an aggregator maintains some 
 statistics by which it needs to correct the aggregated values, it's not 
 trivial to pass it from the aggregator to the request.
 Therefore I would like to make the following changes:
 * Remove FacetRequest.getValueOf and .getFacetArraysSource
 * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays 
 and has a simple API .valueOf(ordinal).
 * Modify the FacetResultHandlers to use OrdValResolver.
 This allows an OVR to initialize the right array instance(s) in the ctor, and 
 return the value of the requested ordinal, without doing arrays.getArray() 
 calls.
 Will post a patch shortly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf

2013-08-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726283#comment-13726283
 ] 

ASF subversion and git services commented on LUCENE-5155:
-

Commit 1509154 from [~shaie] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1509154 ]

LUCENE-5155: add OrdinalValueResolver

 Add OrdinalValueResolver in favor of FacetRequest.getValueOf
 

 Key: LUCENE-5155
 URL: https://issues.apache.org/jira/browse/LUCENE-5155
 Project: Lucene - Core
  Issue Type: Improvement
  Components: modules/facet
Reporter: Shai Erera
Assignee: Shai Erera
 Attachments: LUCENE-5155.patch


 FacetRequest.getValueOf is responsible for resolving an ordinal's value. It 
 is given FacetArrays, and typically does something like 
 {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this 
 method is to allow special requests, e.g. average, to do some post processing 
 on the values, that couldn't be done during aggregation.
 I feel that getValueOf is in the wrong place -- the calls to 
 getInt/FloatArray are really redundant. Also, if an aggregator maintains some 
 statistics by which it needs to correct the aggregated values, it's not 
 trivial to pass it from the aggregator to the request.
 Therefore I would like to make the following changes:
 * Remove FacetRequest.getValueOf and .getFacetArraysSource
 * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays 
 and has a simple API .valueOf(ordinal).
 * Modify the FacetResultHandlers to use OrdValResolver.
 This allows an OVR to initialize the right array instance(s) in the ctor, and 
 return the value of the requested ordinal, without doing arrays.getArray() 
 calls.
 Will post a patch shortly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5091) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation

2013-08-01 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726285#comment-13726285
 ] 

Markus Jelsma commented on SOLR-5091:
-

Can you include SOLR-4018 if you're replacing the dispatch filter, or I'll have 
to keep updating it as trunk progresses :)

 Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation
 --

 Key: SOLR-5091
 URL: https://issues.apache.org/jira/browse/SOLR-5091
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Fix For: 5.0


 This is an issue to track a series of sub issues related to deprecated and 
 crufty Servlet/REST API code.  I'll create sub-tasks to manage them.
 # Clean up all the old UI stuff (old redirects)
 # Kill/Simplify SolrDispatchFilter -- for instance, why not make the user 
 always have a core name in 5.0?  i.e. /collection1 is the default core
 ## I'd like to move to just using Guice's servlet extension to do this, 
 which, I think will also make it easier to run Solr in other containers (i.e. 
 non-servlet environments) due to the fact that you don't have to tie the 
 request handling logic specifically to a Servlet.
 # Simplify the creation and testing of REST and other APIs via Guice + 
 Restlet, which I've done on a number of occasions.
 ## It might be also possible to move all of the APIs onto Restlet and 
 maintain back compat through a simple restlet proxy (still exploring this).  
 This would also have the benefit of abstracting the core request processing 
 out of the Servlet context and make that an implementation detail.
 ## Moving to Guice, IMO, will make it easier to isolate and test individual 
 components by being able to inject mocks easier.
 I am close to a working patch for some of this.  I will post incremental 
 updates/issues as I move forward on this, but I think we should take 5.x as 
 an opportunity to be more agnostic of container and I believe the approach I 
 have in mind will do so.
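 For illustration, a hedged sketch of the Guice servlet-extension style alluded to above (SolrCoreServlet is a hypothetical placeholder, not a real Solr class): routing is declared in a module, and every request path carries an explicit core name.

{code}
// Hedged sketch only; assumes the Guice servlet extension (guice-servlet) is
// on the classpath. SolrCoreServlet is a made-up placeholder class.
import com.google.inject.Singleton;
import com.google.inject.servlet.ServletModule;
import javax.servlet.http.HttpServlet;

public class SolrApiModuleSketch extends ServletModule {
  @Singleton
  public static class SolrCoreServlet extends HttpServlet {} // placeholder

  @Override
  protected void configureServlets() {
    // e.g. /collection1/select -- the core name is always explicit in the path
    serveRegex("/[^/]+/select").with(SolrCoreServlet.class);
  }
}
{code}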

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5091) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation

2013-08-01 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726287#comment-13726287
 ] 

Grant Ingersoll commented on SOLR-5091:
---

I'll see what I can do.

 Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation
 --

 Key: SOLR-5091
 URL: https://issues.apache.org/jira/browse/SOLR-5091
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Fix For: 5.0


 This is an issue to track a series of sub issues related to deprecated and 
 crufty Servlet/REST API code.  I'll create sub-tasks to manage them.
 # Clean up all the old UI stuff (old redirects)
 # Kill/Simplify SolrDispatchFilter -- for instance, why not make the user 
 always have a core name in 5.0?  i.e. /collection1 is the default core
 ## I'd like to move to just using Guice's servlet extension to do this, 
 which, I think will also make it easier to run Solr in other containers (i.e. 
 non-servlet environments) due to the fact that you don't have to tie the 
 request handling logic specifically to a Servlet.
 # Simplify the creation and testing of REST and other APIs via Guice + 
 Restlet, which I've done on a number of occasions.
 ## It might be also possible to move all of the APIs onto Restlet and 
 maintain back compat through a simple restlet proxy (still exploring this).  
 This would also have the benefit of abstracting the core request processing 
 out of the Servlet context and make that an implementation detail.
 ## Moving to Guice, IMO, will make it easier to isolate and test individual 
 components by being able to inject mocks easier.
 I am close to a working patch for some of this.  I will post incremental 
 updates/issues as I move forward on this, but I think we should take 5.x as 
 an opportunity to be more agnostic of container and I believe the approach I 
 have in mind will do so.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5102) Simplify Solr Home

2013-08-01 Thread Grant Ingersoll (JIRA)
Grant Ingersoll created SOLR-5102:
-

 Summary: Simplify Solr Home
 Key: SOLR-5102
 URL: https://issues.apache.org/jira/browse/SOLR-5102
 Project: Solr
  Issue Type: Bug
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Fix For: 5.0


I think for 5.0, we should re-think some of the variations we support around 
things like Solr Home, etc.  We have a fair bit of code that I suspect could 
just go away if we make things simpler by assuming there is a single Solr home 
where everything lives.  The notion of making that stuff configurable has 
outlived its usefulness.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5103) Plugin Improvements

2013-08-01 Thread Grant Ingersoll (JIRA)
Grant Ingersoll created SOLR-5103:
-

 Summary: Plugin Improvements
 Key: SOLR-5103
 URL: https://issues.apache.org/jira/browse/SOLR-5103
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Fix For: 5.0


I think for 5.0, we should make it easier to add plugins by defining a plugin 
package, a la a Hadoop job jar, which is a self-contained archive of a plugin 
that can be easily installed (even from the UI!) and configured 
programmatically.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5104) Remove Default Core

2013-08-01 Thread Grant Ingersoll (JIRA)
Grant Ingersoll created SOLR-5104:
-

 Summary: Remove Default Core
 Key: SOLR-5104
 URL: https://issues.apache.org/jira/browse/SOLR-5104
 Project: Solr
  Issue Type: Sub-task
Reporter: Grant Ingersoll
 Fix For: 5.0


I see no reason to maintain the notion of a default Core/Collection.  We can 
either default to Collection1, or simply create a core on the fly based on 
the client's request.  Thus, all APIs that access a core would require 
the core to be in the address path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4860) MoreLikeThisHandler doesn't work with numeric or date fields in 4.x

2013-08-01 Thread Mike (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726314#comment-13726314
 ] 

Mike commented on SOLR-4860:


I came across this issue as well; I wanted to use the _val_ hook and numeric field 
values for boosting an MLT query via the mlt.fl parameter. For regular search (via bf) 
this approach works just fine.
Do you plan to fix this, or should I start working on a different solution for my 
MLT query? What's the probability it will be fixed this year? :)
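
For context, a hedged minimal reproduction of the underlying exception from the report below (standalone Lucene, not Solr): MoreLikeThis asks each analyzed field's TokenStream for a CharTermAttribute, which NumericTokenStream refuses to provide.

{code}
// Hedged minimal reproduction (illustrative only): NumericTokenStream rejects
// CharTermAttribute, producing the same IllegalArgumentException as the trace below.
import org.apache.lucene.analysis.NumericTokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class NumericMltRepro {
  public static void main(String[] args) {
    NumericTokenStream ts = new NumericTokenStream();
    ts.setIntValue(42);
    ts.addAttribute(CharTermAttribute.class); // throws "NumericTokenStream does not support CharTermAttribute."
  }
}
{code}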

 MoreLikeThisHandler doesn't work with numeric or date fields in 4.x
 ---

 Key: SOLR-4860
 URL: https://issues.apache.org/jira/browse/SOLR-4860
 Project: Solr
  Issue Type: Bug
  Components: MoreLikeThis
Affects Versions: 4.2
Reporter: Thomas Seidl

 After upgrading to Solr 4.2 (from 3.x), I realized that my MLT queries no 
 longer work. It happens if I pass an integer ({{solr.TrieIntField}}), float 
 ({{solr.TrieFloatField}}) or date ({{solr.DateField}}) field as part of the 
 {{mlt.fl}} parameter. The field's {{multiValued}} setting doesn't seem to 
 matter.
 This is the error I get:
 {noformat}
 NumericTokenStream does not support CharTermAttribute.
 java.lang.IllegalArgumentException: NumericTokenStream does not support 
 CharTermAttribute.
   at 
 org.apache.lucene.analysis.NumericTokenStream$NumericAttributeFactory.createAttributeInstance(NumericTokenStream.java:136)
   at 
 org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:271)
   at 
 org.apache.lucene.queries.mlt.MoreLikeThis.addTermFrequencies(MoreLikeThis.java:781)
   at 
 org.apache.lucene.queries.mlt.MoreLikeThis.retrieveTerms(MoreLikeThis.java:724)
   at 
 org.apache.lucene.queries.mlt.MoreLikeThis.like(MoreLikeThis.java:578)
   at 
 org.apache.solr.handler.MoreLikeThisHandler$MoreLikeThisHelper.getMoreLikeThis(MoreLikeThisHandler.java:348)
   at 
 org.apache.solr.handler.MoreLikeThisHandler.handleRequestBody(MoreLikeThisHandler.java:167)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
   at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
   at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
   at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
   at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.eclipse.jetty.server.Server.handle(Server.java:365)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
   at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
   at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:679)
 {noformat}
 The 

[jira] [Commented] (SOLR-5103) Plugin Improvements

2013-08-01 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726362#comment-13726362
 ] 

Grant Ingersoll commented on SOLR-5103:
---

https://code.google.com/p/google-guice/wiki/Multibindings has some baseline 
good ideas in it, see SOLR-5091 as well for how Guice gets brought in.
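
For illustration, a hedged sketch of the Multibindings idea (the plugin interface and implementation names are made up): each plugin contributes a binding into a shared set that the core can inject without knowing the concrete classes.

{code}
// Hedged sketch only; assumes guice-multibindings is on the classpath.
import com.google.inject.AbstractModule;
import com.google.inject.multibindings.Multibinder;

public class PluginModuleSketch extends AbstractModule {
  interface SolrPlugin {}                                  // hypothetical plugin SPI
  static class HighlightingPlugin implements SolrPlugin {} // hypothetical plugin

  @Override
  protected void configure() {
    Multibinder<SolrPlugin> plugins =
        Multibinder.newSetBinder(binder(), SolrPlugin.class);
    plugins.addBinding().to(HighlightingPlugin.class);     // one plugin registers itself
  }
}
{code}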

 Plugin Improvements
 ---

 Key: SOLR-5103
 URL: https://issues.apache.org/jira/browse/SOLR-5103
 Project: Solr
  Issue Type: Improvement
Reporter: Grant Ingersoll
Assignee: Grant Ingersoll
 Fix For: 5.0


 I think for 5.0, we should make it easier to add plugins by defining a plugin 
 package, a la a Hadoop job jar, which is a self-contained archive of a plugin 
 that can be easily installed (even from the UI!) and configured 
 programmatically.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Measuring SOLR performance

2013-08-01 Thread Dmitry Kan
Hi Roman,

When I try to run with -q
/home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries

here is what is reported:
Traceback (most recent call last):
  File solrjmeter.py, line 1390, in module
main(sys.argv)
  File solrjmeter.py, line 1309, in main
tests = find_tests(options)
  File solrjmeter.py, line 461, in find_tests
with changed_dir(pattern):
  File /usr/lib/python2.7/contextlib.py, line 17, in __enter__
return self.gen.next()
  File solrjmeter.py, line 229, in changed_dir
os.chdir(new)
OSError: [Errno 20] Not a directory:
'/home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries'

Best,

Dmitry



On Wed, Jul 31, 2013 at 7:21 PM, Roman Chyla roman.ch...@gmail.com wrote:

 Hi Dmitry,
 probably a mistake in the readme; try calling it with -q
 /home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries

 as for the base_url, I was testing it on solr4.0, where it tries contacting
 /solr/admin/system - is it different for 4.3? I guess I should make it
 configurable (it already is, the endpoint is set in check_options())

 thanks

 roman


 On Wed, Jul 31, 2013 at 10:01 AM, Dmitry Kan solrexp...@gmail.com wrote:

  Ok, got the error fixed by modifying the base Solr URL in solrjmeter.py
  (added core name after /solr part).
  Next error is:
 
  WARNING: no test name(s) supplied nor found in:
  ['/home/dmitry/projects/lab/solrjmeter/demo/queries/demo.queries']
 
  It is a 'slow start with new tool' symptom I guess.. :)
 
 
  On Wed, Jul 31, 2013 at 4:39 PM, Dmitry Kan solrexp...@gmail.com
 wrote:
 
  Hi Roman,
 
  What  version and config of SOLR does the tool expect?
 
  Tried to run, but got:
 
  **ERROR**
File solrjmeter.py, line 1390, in module
  main(sys.argv)
File solrjmeter.py, line 1296, in main
  check_prerequisities(options)
File solrjmeter.py, line 351, in check_prerequisities
  error('Cannot contact: %s' % options.query_endpoint)
File solrjmeter.py, line 66, in error
  traceback.print_stack()
  Cannot contact: http://localhost:8983/solr
 
 
  complains about URL, clicking which leads properly to the admin page...
  solr 4.3.1, 2 cores shard
 
  Dmitry
 
 
  On Wed, Jul 31, 2013 at 3:59 AM, Roman Chyla roman.ch...@gmail.com
 wrote:
 
  Hello,
 
  I have been wanting some tools for measuring performance of SOLR,
 similar
  to Mike McCandles' lucene benchmark.
 
  so yet another monitor was born, is described here:
 
 http://29min.wordpress.com/2013/07/31/measuring-solr-query-performance/
 
  I tested it on the problem of garbage collectors (see the blogs for
  details) and so far I can't conclude whether highly customized G1 is
  better
  than highly customized CMS, but I think interesting details can be seen
  there.
 
  Hope this helps someone, and of course, feel free to improve the tool
 and
  share!
 
  roman
 
 
 
 



[jira] [Resolved] (SOLR-5100) java.lang.OutOfMemoryError: Requested array size exceeds VM limit

2013-08-01 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-5100.
--

Resolution: Invalid

Please raise this on the user's list. OOM errors are common when one has not 
allocated enough heap to the JVM or otherwise tries to do too much with too few 
resources. The user's list will offer lots of help to change your setup to no 
longer OOM.

 java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 -

 Key: SOLR-5100
 URL: https://issues.apache.org/jira/browse/SOLR-5100
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.2.1
 Environment: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 
 GNU/Linux
 Java 7, Tomcat, ZK standalone
Reporter: Grzegorz Sobczyk

 Today I found this exception in the log (lmsiprse01):
 {code}
 sie 01, 2013 5:27:26 AM org.apache.solr.core.SolrCore execute
 INFO: [products] webapp=/solr path=/select 
 params={facet=true&start=0&q=&facet.limit=-1&facet.field=attribute_u-typ&facet.field=attribute_u-gama-kolorystyczna&facet.field=brand_name&wt=javabin&fq=node_id:1056&version=2&rows=0}
  hits=1241 status=0 QTime=33 
 sie 01, 2013 5:27:26 AM org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: 
 Requested array size exceeds VM limit
 at 
 org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
 at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
 at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
 at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
 at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
 at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
 at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
 at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
 at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
 at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
 at 
 org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602)
 at 
 org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
 at java.lang.Thread.run(Thread.java:724)
 Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
 at org.apache.lucene.util.PriorityQueue.<init>(PriorityQueue.java:64)
 at org.apache.lucene.util.PriorityQueue.<init>(PriorityQueue.java:37)
 at 
 org.apache.solr.handler.component.ShardFieldSortedHitQueue.<init>(ShardDoc.java:113)
 at 
 org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:766)
 at 
 org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:625)
 at 
 org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:604)
 at 
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)
 at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
 at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
 ... 13 more
 {code}
 We have: 
 * 3x standalone ZK
 * 3x Solr 4.2.1 on Tomcat
 The exception shows up after the leader was stopped:
 * lmsiprse01:
 [2013-08-01 05:23:43]: /etc/init.d/tomcat6-1 stop
 [2013-08-01 05:25:09]: /etc/init.d/tomcat6-1 start
 * lmsiprse02 (leader):
 [2013-08-01 05:27:21]: /etc/init.d/tomcat6-1 stop
 [2013-08-01 05:29:31]: /etc/init.d/tomcat6-1 start
 * lmsiprse03:
 [2013-08-01 05:25:48]: /etc/init.d/tomcat6-1 stop
 [2013-08-01 05:26:42]: /etc/init.d/tomcat6-1 start

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-5101) Invalid UTF-8 character 0xfffe during shard update

2013-08-01 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-5101.
--

Resolution: Invalid

Please raise this on the user's list and verify that it is indeed a bug before 
raising a JIRA. Offhand this sounds like a configuration error in your servlet 
container, but that's just a guess.
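
Until the root cause is found, a common client-side workaround is to strip 
characters that are illegal in XML (U+FFFE among them) from field values before 
sending documents. A minimal sketch, with a made-up helper class that is not 
part of Solr:

{code}
// Hypothetical helper, not part of Solr: removes characters that are not
// legal in XML 1.0 (this includes U+FFFE and U+FFFF) before a field value
// is added to a SolrInputDocument.
public final class XmlCharFilterUtil {

  public static String stripInvalidXmlChars(String in) {
    StringBuilder out = new StringBuilder(in.length());
    int i = 0;
    while (i < in.length()) {
      int cp = in.codePointAt(i);
      i += Character.charCount(cp);
      boolean legal = cp == 0x9 || cp == 0xA || cp == 0xD
          || (cp >= 0x20 && cp <= 0xD7FF)
          || (cp >= 0xE000 && cp <= 0xFFFD)     // excludes U+FFFE / U+FFFF
          || (cp >= 0x10000 && cp <= 0x10FFFF);
      if (legal) {
        out.appendCodePoint(cp);
      }
    }
    return out.toString();
  }
}
{code}

Treat this as a stopgap; the real fix may belong in how the source data is 
extracted in the first place.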

 Invalid UTF-8 character 0xfffe during shard update
 --

 Key: SOLR-5101
 URL: https://issues.apache.org/jira/browse/SOLR-5101
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.3
 Environment: Ubuntu 12.04.2
 java version 1.6.0_27
 OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1)
 OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
Reporter: Federico Chiacchiaretta

 On data import from a PostgreSQL db, I get the following error in solr.log:
 ERROR - 2013-08-01 09:51:00.217; org.apache.solr.common.SolrException; shard 
 update error RetryNode: 
 http://172.16.201.173:8983/solr/archive/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
  Invalid UTF-8 character 0xfffe at char #416, byte #127)
at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:402)
at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
 org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332)
at 
 org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
 This prevents the document from being successfully added to the index, and a 
 few documents targeting the same shard are also missing.
 This happens silently, because the data import completes successfully, and the 
 total number of documents reported as added includes those that failed (and 
 are actually lost).
 Is there a known workaround for this issue?
 Regards,
 Federico Chiacchiaretta

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Michael McCandless
Welcome Cassandra!

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jul 31, 2013 at 6:47 PM, Robert Muir rcm...@gmail.com wrote:
 I'm pleased to announce that Cassandra Targett has accepted to join our
 ranks as a committer.

 Cassandra worked on the donation of the new Solr Reference Guide [1] and
 getting things in order for its first official release [2].
 Cassandra, it is tradition that you introduce yourself with a brief bio.

 Welcome!

 P.S. As soon as your SVN access is setup, you should then be able to add
 yourself to the committers list on the website as well.

 [1]
 https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
 [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5105) Merge CoreAdmin and Collections API

2013-08-01 Thread Alan Woodward (JIRA)
Alan Woodward created SOLR-5105:
---

 Summary: Merge CoreAdmin and Collections API
 Key: SOLR-5105
 URL: https://issues.apache.org/jira/browse/SOLR-5105
 Project: Solr
  Issue Type: Improvement
Reporter: Alan Woodward
 Fix For: 5.0


For 5.0, we should remove the distinction between the Core Admin API and the 
Collections API.  It's confusing for users, and adds unnecessary complexity and 
duplication to the core code.

* Under the hood, the AdminHandlers should just be deserializing the various 
core parameters and then passing them onto the CoreContainer to do the actual 
work.
* The CoreContainer API can be cleaned up (need a distinction between loading 
existing cores and creating new ones, remove the various 'registerCore' 
methods) 
* ZkContainer should become a subclass of CoreContainer (maybe 
CloudCoreContainer?) and deal with the zookeeper interactions, while the base 
class deals with local cores.
* The CoreContainer should be dealing with all core name logic (aliases, 
collections, etc).  This will have the nice side-effect of simplifying the core 
dispatch logic as well.
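
Roughly, the shape being proposed might look like this (a sketch only; every 
name below is a placeholder, not code from Solr or any patch):

{code}
// All names below are placeholders sketched from the description above.
class SolrCoreHandle {}  // stand-in for SolrCore

class CoreContainer {
  // Loading an existing core is kept distinct from creating a new one.
  SolrCoreHandle loadCore(String name) { return new SolrCoreHandle(); }

  SolrCoreHandle createCore(String name, String instanceDir) { return new SolrCoreHandle(); }

  // Alias/collection resolution lives here so dispatch code has one entry point.
  String resolveCoreName(String nameOrAlias) { return nameOrAlias; }
}

// The cloud-aware subclass layers the ZooKeeper interactions on top of the
// purely local base class.
class CloudCoreContainer extends CoreContainer {
  // register with ZK, watch cluster state, handle leader election, etc.
}
{code}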

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5152) Lucene FST is not immutale

2013-08-01 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-5152:


Attachment: LUCENE-5152.patch

Here is a patch that adds a #deepCopy method to Outputs, which lets me do a 
deep copy if the arc that is returned is a cached root arc. I think we should 
never return a pointer into the root arcs though - that is way too dangerous! I 
haven't run any perf tests; I will once I am back at my workstation. If somebody 
beats me to it, go ahead!
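
Independent of the patch, callers can avoid the problem by never mutating a 
BytesRef handed back by the FST. A minimal sketch (Lucene 4.x assumed; not 
taken from the attached patch):

{code}
import org.apache.lucene.util.BytesRef;

// Illustration only, not the attached patch: a caller that wants to edit an
// output handed back by an FST (e.g. a suggester payload) should work on a
// private copy, never on the instance the FST may be caching for its root arcs.
final class FstOutputCopyExample {
  static BytesRef ownCopyForEditing(BytesRef fromFst) {
    // deepCopyOf allocates fresh bytes, so later edits to the returned ref
    // cannot leak back into the FST's cached outputs.
    return BytesRef.deepCopyOf(fromFst);
  }
}
{code}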

 Lucene FST is not immutale
 --

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch


 A spinoff from LUCENE-5120, where the analyzing suggester modified a returned 
 output from an FST (BytesRef), which caused side effects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5101) Invalid UTF-8 character 0xfffe during shard update

2013-08-01 Thread Federico Chiacchiaretta (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726408#comment-13726408
 ] 

Federico Chiacchiaretta commented on SOLR-5101:
---

Hi Erick,
I'll post this on the user's list and I'll be back here when I have an update.
Regarding servlet container config, I'm using included jetty stock 
configuration.

 Invalid UTF-8 character 0xfffe during shard update
 --

 Key: SOLR-5101
 URL: https://issues.apache.org/jira/browse/SOLR-5101
 Project: Solr
  Issue Type: Bug
  Components: clients - java
Affects Versions: 4.3
 Environment: Ubuntu 12.04.2
 java version 1.6.0_27
 OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1)
 OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode)
Reporter: Federico Chiacchiaretta

 On data import from a PostgreSQL db, I get the following error in solr.log:
 ERROR - 2013-08-01 09:51:00.217; org.apache.solr.common.SolrException; shard 
 update error RetryNode: 
 http://172.16.201.173:8983/solr/archive/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
  Invalid UTF-8 character 0xfffe at char #416, byte #127)
at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:402)
at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
 org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332)
at 
 org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
 This prevents the document from being successfully added to the index, and a 
 few documents targeting the same shard are also missing.
 This happens silently, because the data import completes successfully, and the 
 total number of documents reported as added includes those that failed (and 
 are actually lost).
 Is there a known workaround for this issue?
 Regards,
 Federico Chiacchiaretta

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



FlushPolicy and maxBufDelTerm

2013-08-01 Thread Shai Erera
Hi

I'm a little confused about FlushPolicy and
IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy
jdocs say:

 * Segments are traditionally flushed by:
 * <ul>
 * <li>RAM consumption - configured via
...
 * <li>*Number of buffered delete terms/queries* - configured via
 * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}</li>
 * </ul>

Yet IWC.setMaxBufDelTerm says:

NOTE: This setting won't trigger a segment flush.

And FlushByRamOrCountPolicy says:

 * <li>{@link #onDelete(DocumentsWriterFlushControl,
DocumentsWriterPerThreadPool.ThreadState)} - flushes
 * based on the global number of buffered delete terms iff
 * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled</li>

Confused, I wrote a short unit test:

  public void testMaxBufDelTerm() throws Exception {
    Directory dir = new RAMDirectory();
    IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new
MockAnalyzer(random()));
    conf.setMaxBufferedDeleteTerms(1);
    conf.setMaxBufferedDocs(10);
    conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
    conf.setInfoStream(new PrintStreamInfoStream(System.out));
    IndexWriter writer = new IndexWriter(dir, conf);
    int numDocs = 4;
    for (int i = 0; i < numDocs; i++) {
      Document doc = new Document();
      doc.add(new StringField("id", "doc-" + i, Store.NO));
      writer.addDocument(doc);
    }

    System.out.println("before delete");
    for (String f : dir.listAll()) System.out.println(f);

    writer.deleteDocuments(new Term("id", "doc-0"));
    writer.deleteDocuments(new Term("id", "doc-1"));

    System.out.println("\nafter delete");
    for (String f : dir.listAll()) System.out.println(f);

    writer.close();
    dir.close();
  }

When InfoStream is turned on, I can see messages regarding terms flushing
(vs. if I comment out the .setMaxBufDelTerm line), so I know this setting takes
effect.
Yet both before and after the delete operations, the dir.list() returns
only the fdx and fdt files.

So is this a bug that a segment isn't flushed? If not (and I'm ok with
that), is it a documentation inconsistency?
Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer
size, a new segment will be deleted?

Slightly unrelated to FlushPolicy, but do I understand correctly that
maxBufDelTerm does not apply to delete-by-query operations?
BufferedDeletes doesn't increment any counter on addQuery(), so is it
correct to assume that if I only delete-by-query, this setting has no
effect?
And the delete queries are buffered until the next segment is flushed due
to other operations (constraints, commit, NRT-reopen)?

Shai


Re: FlushPolicy and maxBufDelTerm

2013-08-01 Thread Shai Erera
bq. a new segment will be deleted?

I mean a new segment will be flushed :).

Shai


On Thu, Aug 1, 2013 at 4:03 PM, Shai Erera ser...@gmail.com wrote:

 Hi

 I'm a little confused about FlushPolicy and
 IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy
 jdocs say:

  * Segments are traditionally flushed by:
  * ul
  * liRAM consumption - configured via
 ...
  * li*Number of buffered delete terms/queries* - configured via
  * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li
  * /ul

 Yet IWC.setMaxBufDelTerm says:

 NOTE: This setting won't trigger a segment flush.

 And FlushByRamOrCountPolicy says:

  * li{@link #onDelete(DocumentsWriterFlushControl,
 DocumentsWriterPerThreadPool.ThreadState)} - flushes
  * based on the global number of buffered delete terms iff
  * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li

 Confused, I wrote a short unit test:

   public void testMaxBufDelTerm() throws Exception {
 Directory dir = new RAMDirectory();
 IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT,
 new MockAnalyzer(random()));
 conf.setMaxBufferedDeleteTerms(1);
 conf.setMaxBufferedDocs(10);
 conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
 conf.setInfoStream(new PrintStreamInfoStream(System.out));
 IndexWriter writer = new IndexWriter(dir, conf );
 int numDocs = 4;
 for (int i = 0; i  numDocs; i++) {
   Document doc = new Document();
   doc.add(new StringField(id, doc- + i, Store.NO));
   writer.addDocument(doc);
 }

 System.out.println(before delete);
 for (String f : dir.listAll()) System.out.println(f);

 writer.deleteDocuments(new Term(id, doc-0));
 writer.deleteDocuments(new Term(id, doc-1));

 System.out.println(\nafter delete);
 for (String f : dir.listAll()) System.out.println(f);

 writer.close();
 dir.close();
   }

 When InfoStream is turned on, I can see messages regarding terms flushing
 (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes
 effect.
 Yet both before and after the delete operations, the dir.list() returns
 only the fdx and fdt files.

 So is this a bug that a segment isn't flushed? If not (and I'm ok with
 that), is it a documentation inconsistency?
 Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer
 size, a new segment will be deleted?

 Slightly unrelated to FlushPolicy, but do I understand correctly that
 maxBufDelTerm does not apply to delete-by-query operations?
 BufferedDeletes doesn't increment any counter on addQuery(), so is it
 correct to assume that if I only delete-by-query, this setting has no
 effect?
 And the delete queries are buffered until the next segment is flushed due
 to other operations (constraints, commit, NRT-reopen)?

 Shai



[jira] [Updated] (SOLR-5057) queryResultCache should not related with the order of fq's list

2013-08-01 Thread huangfeihong (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

huangfeihong updated SOLR-5057:
---

Attachment: SOLR-5057.patch

 queryResultCache should not related with the order of fq's list
 ---

 Key: SOLR-5057
 URL: https://issues.apache.org/jira/browse/SOLR-5057
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 4.0, 4.1, 4.2, 4.3
Reporter: Feihong Huang
Assignee: Erick Erickson
Priority: Minor
 Attachments: SOLR-5057.patch, SOLR-5057.patch, SOLR-5057.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 There are two queries with the same meaning below, but case 2 can't use 
 the queryResultCache after case 1 is executed.
 case1: q=*:*&fq=field1:value1&fq=field2:value2
 case2: q=*:*&fq=field2:value2&fq=field1:value1
 I think the queryResultCache should not depend on the order of the fq list.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5057) queryResultCache should not related with the order of fq's list

2013-08-01 Thread huangfeihong (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726425#comment-13726425
 ] 

huangfeihong commented on SOLR-5057:


Patch attached. Just renamed several variable names, following Yonik's code.
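
A simplified sketch of the idea (not the attached patch; the filters are plain 
strings here for brevity) is to normalize the fq list before hashCode/equals 
are computed, so every permutation of the same filters maps to the same cache 
key:

{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch only: sort a copy of the fq values so the key is order-insensitive.
final class OrderInsensitiveResultKey {
  private final String q;
  private final List<String> filters;

  OrderInsensitiveResultKey(String q, List<String> fqs) {
    this.q = q;
    this.filters = new ArrayList<String>(fqs);
    Collections.sort(this.filters);  // order no longer matters
  }

  @Override
  public int hashCode() {
    return 31 * q.hashCode() + filters.hashCode();
  }

  @Override
  public boolean equals(Object other) {
    if (!(other instanceof OrderInsensitiveResultKey)) return false;
    OrderInsensitiveResultKey o = (OrderInsensitiveResultKey) other;
    return q.equals(o.q) && filters.equals(o.filters);
  }
}
{code}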

 queryResultCache should not related with the order of fq's list
 ---

 Key: SOLR-5057
 URL: https://issues.apache.org/jira/browse/SOLR-5057
 Project: Solr
  Issue Type: Improvement
  Components: search
Affects Versions: 4.0, 4.1, 4.2, 4.3
Reporter: Feihong Huang
Assignee: Erick Erickson
Priority: Minor
 Attachments: SOLR-5057.patch, SOLR-5057.patch, SOLR-5057.patch

   Original Estimate: 48h
  Remaining Estimate: 48h

 There are two queries with the same meaning below, but case 2 can't use 
 the queryResultCache after case 1 is executed.
 case1: q=*:*&fq=field1:value1&fq=field2:value2
 case2: q=*:*&fq=field2:value2&fq=field1:value1
 I think the queryResultCache should not depend on the order of the fq list.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutale

2013-08-01 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726429#comment-13726429
 ] 

Jack Krupansky commented on LUCENE-5152:


bq. immutale

Is that the Latin term for immutable??

(spelling in summary line)


 Lucene FST is not immutale
 --

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch


 A spinoff from LUCENE-5120, where the analyzing suggester modified a returned 
 output from an FST (BytesRef), which caused side effects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2013-08-01 Thread Stein J. Gran (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726461#comment-13726461
 ] 

Stein J. Gran commented on SOLR-2894:
-

I have now re-tested the scenarios I used on April 10th (see my comment above 
from that date), and all of those issues I found then are now resolved :-) I 
applied the July 25th patch to the lucene_solr_4_4 branch (Github) and 
performed the tests on this version.

Well done Andrew :-)  Thumbs up from me.

 Implement distributed pivot faceting
 

 Key: SOLR-2894
 URL: https://issues.apache.org/jira/browse/SOLR-2894
 Project: Solr
  Issue Type: Improvement
Reporter: Erik Hatcher
 Fix For: 4.5

 Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch


 Following up on SOLR-792, pivot faceting currently only supports 
 undistributed mode.  Distributed pivot faceting needs to be implemented.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5106) Grouping on multi-valued fields

2013-08-01 Thread Reinier Battenberg (JIRA)
Reinier Battenberg created SOLR-5106:


 Summary: Grouping on multi-valued fields
 Key: SOLR-5106
 URL: https://issues.apache.org/jira/browse/SOLR-5106
 Project: Solr
  Issue Type: Improvement
Reporter: Reinier Battenberg
Priority: Minor


The Wiki page for FieldCollapsing mentions that Support for grouping on a 
multi-valued field has not yet been implemented.

This issue is to document that implementation.

http://wiki.apache.org/solr/FieldCollapsing#line-158

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5104) Remove Default Core

2013-08-01 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726483#comment-13726483
 ] 

Jack Krupansky commented on SOLR-5104:
--

Minor procedural nit... If you intend to remove a feature, deprecate it first 
(like, in 4.5.) Thanks!

 Remove Default Core
 ---

 Key: SOLR-5104
 URL: https://issues.apache.org/jira/browse/SOLR-5104
 Project: Solr
  Issue Type: Sub-task
Reporter: Grant Ingersoll
 Fix For: 5.0


 I see no reason to maintain the notion of a default Core/Collection.  We can 
 either default to Collection1, or just simply create a core on the fly based 
 on the client's request.  Thus, all APIs that are accessing a core would 
 require the core to be in the address path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: FlushPolicy and maxBufDelTerm

2013-08-01 Thread Michael McCandless
First off, it's bad that you don't see .del files when
conf.setMaxBufferedDeleteTerms is 1.

But, it could be that newIndexWriterConfig turned on readerPooling
which would mean the deletes are held in the SegmentReader and not
flushed to disk.  Can you make sure that's off?

Second off, I think the doc is correct: a segment will not be flushed;
rather, new .del files should appear against older segments.

And yes, if RAM usage of the buffered del Term/Query s is too high,
then a segment is flushed along with the deletes being applied
(creating the .del files).

I think buffered delete Querys are not counted towards
setMaxBufferedDeleteTerms; so they are only flushed by RAM usage
(rough rough estimate) or by other ops (merging, NRT reopen, commit,
etc.).

Mike McCandless

http://blog.mikemccandless.com


On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera ser...@gmail.com wrote:
 Hi

 I'm a little confused about FlushPolicy and
 IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs
 say:

  * Segments are traditionally flushed by:
  * ul
  * liRAM consumption - configured via
 ...
  * liNumber of buffered delete terms/queries - configured via
  * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li
  * /ul

 Yet IWC.setMaxBufDelTerm says:

 NOTE: This setting won't trigger a segment flush.

 And FlushByRamOrCountPolicy says:

  * li{@link #onDelete(DocumentsWriterFlushControl,
 DocumentsWriterPerThreadPool.ThreadState)} - flushes
  * based on the global number of buffered delete terms iff
  * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li

 Confused, I wrote a short unit test:

   public void testMaxBufDelTerm() throws Exception {
 Directory dir = new RAMDirectory();
 IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new
 MockAnalyzer(random()));
 conf.setMaxBufferedDeleteTerms(1);
 conf.setMaxBufferedDocs(10);
 conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
 conf.setInfoStream(new PrintStreamInfoStream(System.out));
 IndexWriter writer = new IndexWriter(dir, conf );
 int numDocs = 4;
 for (int i = 0; i  numDocs; i++) {
   Document doc = new Document();
   doc.add(new StringField(id, doc- + i, Store.NO));
   writer.addDocument(doc);
 }

 System.out.println(before delete);
 for (String f : dir.listAll()) System.out.println(f);

 writer.deleteDocuments(new Term(id, doc-0));
 writer.deleteDocuments(new Term(id, doc-1));

 System.out.println(\nafter delete);
 for (String f : dir.listAll()) System.out.println(f);

 writer.close();
 dir.close();
   }

 When InfoStream is turned on, I can see messages regarding terms flushing
 (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes
 effect.
 Yet both before and after the delete operations, the dir.list() returns only
 the fdx and fdt files.

 So is this a bug that a segment isn't flushed? If not (and I'm ok with
 that), is it a documentation inconsistency?
 Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer
 size, a new segment will be deleted?

 Slightly unrelated to FlushPolicy, but do I understand correctly that
 maxBufDelTerm does not apply to delete-by-query operations?
 BufferedDeletes doesn't increment any counter on addQuery(), so is it
 correct to assume that if I only delete-by-query, this setting has no
 effect?
 And the delete queries are buffered until the next segment is flushed due to
 other operations (constraints, commit, NRT-reopen)?

 Shai

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Created] (SOLR-5107) LukeRequestHandler throws NullPointerException when numTerms=0

2013-08-01 Thread Ahmet Arslan (JIRA)
Ahmet Arslan created SOLR-5107:
--

 Summary: LukeRequestHandler throws NullPointerException when 
numTerms=0
 Key: SOLR-5107
 URL: https://issues.apache.org/jira/browse/SOLR-5107
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Ahmet Arslan
Priority: Minor


Defaults example 
http://localhost:8983/solr/collection1/admin/luke?fl=cat&numTerms=0 yields 
{code}
ERROR org.apache.solr.core.SolrCore  – java.lang.NullPointerException
at 
org.apache.solr.handler.admin.LukeRequestHandler.getDetailedFieldInfo(LukeRequestHandler.java:610)
at 
org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeRequestHandler.java:378)
at 
org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:160)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1845)
at 
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:666)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:369)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at 
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at 
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
at java.lang.Thread.run(Thread.java:722)
{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5107) LukeRequestHandler throws NullPointerException when numTerms=0

2013-08-01 Thread Ahmet Arslan (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Arslan updated SOLR-5107:
---

Attachment: SOLR-5107.patch

 LukeRequestHandler throws NullPointerException when numTerms=0
 --

 Key: SOLR-5107
 URL: https://issues.apache.org/jira/browse/SOLR-5107
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
Reporter: Ahmet Arslan
Priority: Minor
 Attachments: SOLR-5107.patch


 Defaults example 
 http://localhost:8983/solr/collection1/admin/luke?fl=cat&numTerms=0 yields 
 {code}
 ERROR org.apache.solr.core.SolrCore  – java.lang.NullPointerException
   at 
 org.apache.solr.handler.admin.LukeRequestHandler.getDetailedFieldInfo(LukeRequestHandler.java:610)
   at 
 org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeRequestHandler.java:378)
   at 
 org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:160)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1845)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:666)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:369)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
   at 
 org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
   at 
 org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
   at 
 org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
   at 
 org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
   at 
 org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
   at 
 org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
   at 
 org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
   at 
 org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
   at 
 org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
   at 
 org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
   at org.eclipse.jetty.server.Server.handle(Server.java:368)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
   at 
 org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
   at 
 org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
   at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
   at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
   at 
 org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
   at 
 org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
   at 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
   at java.lang.Thread.run(Thread.java:722)
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Shawn Heisey
On 7/31/2013 4:47 PM, Robert Muir wrote:
 I'm pleased to announce that Cassandra Targett has accepted to join our
 ranks as a committer.

Welcome to the project!


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5104) Remove Default Core

2013-08-01 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726495#comment-13726495
 ] 

Jack Krupansky commented on SOLR-5104:
--

bq. I see no reason to maintain the notion of a default Core/Collection.

Okay, here is the reason...

It is a great convenience and shortens URLs to make them more readable and 
easier to type.

It greatly facilitates prototyping and experimentation and learning of the 
basics of Solr.

And... compatibility with existing apps.

So, this notion that there isn't any reason is complete nonsense.

OTOH, maybe you are trying to suggest that there is some reason or valuable 
benefit to be gained by requiring explicit collection/core name in the URL 
path. But, you have not done so. Not a hint of any reason or benefit. So, if 
you do have a reason or perceived benefit for eliminating a great convenience 
feature, please disclose it.

Or... is this not so much a matter of reason as of some code or tool change you 
are contemplating that does not support the kind of flexible URL syntax Solr 
supports? Well, if the benefits of the change in technology outweigh 
the loss of a valuable feature, then that is worth considering, but as of this 
moment no positive tradeoff has been proposed or established.

OTOH, if there were a determined effort to give Solr a full-blown true REST API 
and THAT was the motive for explicit collection name, I'd be 100% all for it.

Side note: Maybe collection1 should become example to make it clear that 
real apps should assign an app-meaningful name rather than leaving it as 
collection1.


 Remove Default Core
 ---

 Key: SOLR-5104
 URL: https://issues.apache.org/jira/browse/SOLR-5104
 Project: Solr
  Issue Type: Sub-task
Reporter: Grant Ingersoll
 Fix For: 5.0


 I see no reason to maintain the notion of a default Core/Collection.  We can 
 either default to Collection1, or just simply create a core on the fly based 
 on the client's request.  Thus, all APIs that are accessing a core would 
 require the core to be in the address path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Welcome Cassandra Targett as Lucene/Solr committer

2013-08-01 Thread Cassandra Targett
Thanks everyone. I'm very excited to join you all.

I don't know how brief this is, but here's a bit about me:

I discovered Solr through my work at LucidWorks. I've had a few
different roles there, but most recently I've been the tech writer.
The Solr Reference Guide became part of the stuff I work on and I had
to learn Solr. I like to figure as much out on my own as I can, so to
learn I tried things out, I read the Jira issues, I tried to interpret
the Javadocs (sometimes following the trail deep into darkness and
getting it wrong). It's the same way many people passionate about Solr
get started, I think - we had a job to do, in one way or another, and
that's how we learned.

I'm technical but not a developer (I think the last real program I
wrote was in computer camp for girls in 1984, where we wrote Basic in
the morning and Jazzercized in the afternoon), but even though I don't
write code, I can understand very technical concepts and can sometimes
read code. I'm a librarian, so I spend time thinking about how to
organize information. As an undergrad I got a BA in creative writing,
and tech writing has become a really lovely pairing of two skills and
passions.

What else? I grew up in New Hampshire and after school, I moved to
Boston and at some point I decided that a) I wanted to work on the
internet and b) the best way to do that was to get an MS in Library
Science. It sounds sort of random, now, but that's what I did.

A couple years ago I left Boston and now live in Northwest Florida
(vaguely halfway between Pensacola and Tallahassee), only a couple
miles from the beach. Until that point, I (loudly and often) vowed I
would never live in Florida. But it turns out that I really love being
able to go to the beach every day of the year, and on my second day in
town I met my boyfriend and we're now sort of officially engaged. So,
even though right now it's hotter-than-Hades and
wetter-than-sunken-Atlantis, I stay.

Lastly, in my spare time, I make mosaic art. I still have more ideas
than pieces, but I'm getting there. Eventually I'll get some photos of
my stuff online for all to see.

On Wed, Jul 31, 2013 at 3:47 PM, Robert Muir rcm...@gmail.com wrote:
 I'm pleased to announce that Cassandra Targett has accepted to join our
 ranks as a committer.

 Cassandra worked on the donation of the new Solr Reference Guide [1] and
 getting things in order for its first official release [2].
 Cassandra, it is tradition that you introduce yourself with a brief bio.

 Welcome!

 P.S. As soon as your SVN access is setup, you should then be able to add
 yourself to the committers list on the website as well.

 [1]
 https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
 [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5104) Remove Default Core

2013-08-01 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726503#comment-13726503
 ] 

Grant Ingersoll commented on SOLR-5104:
---

My reason is b/c SolrDispatchFilter is filled with legacy cruft, and this is one 
piece of it.  The simpler and more standard we can make all path handling, the 
better.  I don't really care much about shorter URLs and I don't buy the 
prototyping/learning factor.  In fact, I'd argue that it is harder b/c of it, 
since you have a magic core and then all of your other cores.  If you just 
make the name of the collection part of the path always, there is no more 
guessing.  

The less legacy plumbing code we carry forward in 5, the better off Solr 
will be. 

And yes, I am working on making a full-blown REST API possible.  See 
SOLR-5091.



 Remove Default Core
 ---

 Key: SOLR-5104
 URL: https://issues.apache.org/jira/browse/SOLR-5104
 Project: Solr
  Issue Type: Sub-task
Reporter: Grant Ingersoll
 Fix For: 5.0


 I see no reason to maintain the notion of a default Core/Collection.  We can 
 either default to Collection1, or just simply create a core on the fly based 
 on the client's request.  Thus, all APIs that are accessing a core would 
 require the core to be in the address path.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: FlushPolicy and maxBufDelTerm

2013-08-01 Thread Shai Erera
 I think the doc is correct

Wait, one of the docs is wrong. I guess, according to what you write, it's
FlushPolicy, as a new segment is not flushed per this setting?
Or perhaps they should be clarified to say that the deletes are flushed == applied
to existing segments?

I disabled reader pooling and I still don't see .del files. But I think
that's explained by there being no segments in the index yet.
All documents are still in the RAM buffer, and according to what you write,
I shouldn't see any segment because of delTerms?

Shai


On Thu, Aug 1, 2013 at 5:40 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 First off, it's bad that you don't see .del files when
 conf.setMaxBufferedDeleteTerms is 1.

 But, it could be that newIndexWriterConfig turned on readerPooling
 which would mean the deletes are held in the SegmentReader and not
 flushed to disk.  Can you make sure that's off?

 Second off, I think the doc is correct: a segment will not be flushed;
 rather, new .del files should appear against older segments.

 And yes, if RAM usage of the buffered del Term/Query s is too high,
 then a segment is flushed along with the deletes being applied
 (creating the .del files).

 I think buffered delete Querys are not counted towards
 setMaxBufferedDeleteTerms; so they are only flushed by RAM usage
 (rough rough estimate) or by other ops (merging, NRT reopen, commit,
 etc.).

 Mike McCandless

 http://blog.mikemccandless.com


 On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera ser...@gmail.com wrote:
  Hi
 
  I'm a little confused about FlushPolicy and
  IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy
 jdocs
  say:
 
   * Segments are traditionally flushed by:
   * ul
   * liRAM consumption - configured via
  ...
   * liNumber of buffered delete terms/queries - configured via
   * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li
   * /ul
 
  Yet IWC.setMaxBufDelTerm says:
 
  NOTE: This setting won't trigger a segment flush.
 
  And FlushByRamOrCountPolicy says:
 
   * li{@link #onDelete(DocumentsWriterFlushControl,
  DocumentsWriterPerThreadPool.ThreadState)} - flushes
   * based on the global number of buffered delete terms iff
   * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li
 
  Confused, I wrote a short unit test:
 
public void testMaxBufDelTerm() throws Exception {
  Directory dir = new RAMDirectory();
  IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT,
 new
  MockAnalyzer(random()));
  conf.setMaxBufferedDeleteTerms(1);
  conf.setMaxBufferedDocs(10);
  conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
  conf.setInfoStream(new PrintStreamInfoStream(System.out));
  IndexWriter writer = new IndexWriter(dir, conf );
  int numDocs = 4;
  for (int i = 0; i  numDocs; i++) {
Document doc = new Document();
doc.add(new StringField(id, doc- + i, Store.NO));
writer.addDocument(doc);
  }
 
  System.out.println(before delete);
  for (String f : dir.listAll()) System.out.println(f);
 
  writer.deleteDocuments(new Term(id, doc-0));
  writer.deleteDocuments(new Term(id, doc-1));
 
  System.out.println(\nafter delete);
  for (String f : dir.listAll()) System.out.println(f);
 
  writer.close();
  dir.close();
}
 
  When InfoStream is turned on, I can see messages regarding terms flushing
  (vs if I comment the .setMaxBufDelTerm line), so I know this settings
 takes
  effect.
  Yet both before and after the delete operations, the dir.list() returns
 only
  the fdx and fdt files.
 
  So is this a bug that a segment isn't flushed? If not (and I'm ok with
  that), is it a documentation inconsistency?
  Strangely, I think, if the delTerms RAM accounting exhausts
 max-RAM-buffer
  size, a new segment will be deleted?
 
  Slightly unrelated to FlushPolicy, but do I understand correctly that
  maxBufDelTerm does not apply to delete-by-query operations?
  BufferedDeletes doesn't increment any counter on addQuery(), so is it
  correct to assume that if I only delete-by-query, this setting has no
  effect?
  And the delete queries are buffered until the next segment is flushed
 due to
  other operations (constraints, commit, NRT-reopen)?
 
  Shai

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




Re: FlushPolicy and maxBufDelTerm

2013-08-01 Thread Shai Erera
I set maxBufDocs=2 so that a segment gets flushed, and indeed after the delete
I see _0.del.

So I guess this is just a documentation inconsistency. I'll clarify the FlushPolicy docs.
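
A self-contained version of that setup, as a sketch (plain Lucene 4.x API
assumed rather than the test framework; not the exact test above):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class DelFileDemo {
  public static void main(String[] args) throws Exception {
    Directory dir = new RAMDirectory();
    IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_44,
        new StandardAnalyzer(Version.LUCENE_44));
    conf.setMaxBufferedDeleteTerms(1);
    conf.setMaxBufferedDocs(2); // low enough that _0 is flushed while adding docs
    IndexWriter writer = new IndexWriter(dir, conf);
    for (int i = 0; i < 4; i++) {
      Document doc = new Document();
      doc.add(new StringField("id", "doc-" + i, Store.NO));
      writer.addDocument(doc);
    }
    // doc-0 lives in the already-flushed segment _0, so applying this delete
    // should produce a .del file for _0 rather than flushing a new segment.
    writer.deleteDocuments(new Term("id", "doc-0"));
    for (String f : dir.listAll()) {
      System.out.println(f);
    }
    writer.close();
    dir.close();
  }
}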

Shai


On Thu, Aug 1, 2013 at 6:24 PM, Shai Erera ser...@gmail.com wrote:

  I think the doc is correct

 Wait, one of the docs is wrong. I guess according to what you write, it's
 FlushPolicy, as a new segment is not flushed per this setting?
 Or perhaps they should be clarified that the deletes are flushed ==
 applied on existing segments?

 I disabled reader pooling and I still don't see .del files. But I think
 that's explained due to there are no segments in the index yet.
 All documents are still in the RAM buffer, and according to what you
 write, I shouldn't see any segment cause of delTerms?

 Shai


 On Thu, Aug 1, 2013 at 5:40 PM, Michael McCandless 
 luc...@mikemccandless.com wrote:

 First off, it's bad that you don't see .del files when
 conf.setMaxBufferedDeleteTerms is 1.

 But, it could be that newIndexWriterConfig turned on readerPooling
 which would mean the deletes are held in the SegmentReader and not
 flushed to disk.  Can you make sure that's off?

 Second off, I think the doc is correct: a segment will not be flushed;
 rather, new .del files should appear against older segments.

 And yes, if RAM usage of the buffered del Term/Query s is too high,
 then a segment is flushed along with the deletes being applied
 (creating the .del files).

 I think buffered delete Querys are not counted towards
 setMaxBufferedDeleteTerms; so they are only flushed by RAM usage
 (rough rough estimate) or by other ops (merging, NRT reopen, commit,
 etc.).

 Mike McCandless

 http://blog.mikemccandless.com


 On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera ser...@gmail.com wrote:
  Hi
 
  I'm a little confused about FlushPolicy and
  IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy
 jdocs
  say:
 
   * Segments are traditionally flushed by:
   * ul
   * liRAM consumption - configured via
  ...
   * liNumber of buffered delete terms/queries - configured via
   * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li
   * /ul
 
  Yet IWC.setMaxBufDelTerm says:
 
  NOTE: This setting won't trigger a segment flush.
 
  And FlushByRamOrCountPolicy says:
 
   * li{@link #onDelete(DocumentsWriterFlushControl,
  DocumentsWriterPerThreadPool.ThreadState)} - flushes
   * based on the global number of buffered delete terms iff
   * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li
 
  Confused, I wrote a short unit test:
 
public void testMaxBufDelTerm() throws Exception {
  Directory dir = new RAMDirectory();
  IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT,
 new
  MockAnalyzer(random()));
  conf.setMaxBufferedDeleteTerms(1);
  conf.setMaxBufferedDocs(10);
  conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH);
  conf.setInfoStream(new PrintStreamInfoStream(System.out));
  IndexWriter writer = new IndexWriter(dir, conf );
  int numDocs = 4;
  for (int i = 0; i  numDocs; i++) {
Document doc = new Document();
doc.add(new StringField(id, doc- + i, Store.NO));
writer.addDocument(doc);
  }
 
  System.out.println(before delete);
  for (String f : dir.listAll()) System.out.println(f);
 
  writer.deleteDocuments(new Term(id, doc-0));
  writer.deleteDocuments(new Term(id, doc-1));
 
  System.out.println(\nafter delete);
  for (String f : dir.listAll()) System.out.println(f);
 
  writer.close();
  dir.close();
}
 
  When InfoStream is turned on, I can see messages regarding terms
 flushing
  (vs if I comment the .setMaxBufDelTerm line), so I know this settings
 takes
  effect.
  Yet both before and after the delete operations, the dir.list() returns
 only
  the fdx and fdt files.
 
  So is this a bug that a segment isn't flushed? If not (and I'm ok with
  that), is it a documentation inconsistency?
  Strangely, I think, if the delTerms RAM accounting exhausts
 max-RAM-buffer
  size, a new segment will be deleted?
 
  Slightly unrelated to FlushPolicy, but do I understand correctly that
  maxBufDelTerm does not apply to delete-by-query operations?
  BufferedDeletes doesn't increment any counter on addQuery(), so is it
  correct to assume that if I only delete-by-query, this setting has no
  effect?
  And the delete queries are buffered until the next segment is flushed
 due to
  other operations (constraints, commit, NRT-reopen)?
 
  Shai

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org





[jira] [Commented] (LUCENE-5152) Lucene FST is not immutale

2013-08-01 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726569#comment-13726569
 ] 

Robert Muir commented on LUCENE-5152:
-

I guess one question would be whether it's the FST's job to defend against BytesRef bugs.

This issue was driven by a BytesRef bug with suggester payloads.
The same kind of bug could happen, e.g. if someone uses DirectPostings and 
modifies the payload coming back from the postings lists.

Should we clone payload bytes in the postings lists too? What about term 
dictionaries?

At some point BytesRef becomes useless as a reference class because of a few 
bad apples trying to use it as a ByteBuffer.
Ideally we would remove code that abuses BytesRef as a ByteBuffer instead. 

I don't mean to pick on your issue Simon, and it doesn't mean I object to the 
patch (though I wonder about performance implications); I just see this as one 
of many instances of a larger issue.


 Lucene FST is not immutale
 --

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch


 A spinoff from LUCENE-5120, where the analyzing suggester modified a returned 
 output from an FST (BytesRef), which caused side effects in later execution. 
 I added an assertion into the FST that checks if a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we find more places. We need to think about how to make this less 
 trappy since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: FlushPolicy and maxBufDelTerm

2013-08-01 Thread Michael McCandless
On Thu, Aug 1, 2013 at 11:24 AM, Shai Erera ser...@gmail.com wrote:
 I think the doc is correct

 Wait, one of the docs is wrong. I guess according to what you write, it's
 FlushPolicy, as a new segment is not flushed per this setting?
 Or perhaps they should be clarified that the deletes are flushed == applied
 on existing segments?

Ahh, right.  OK I think we should fix FlushPolicy to say deletes are
applied?  Let's try to leave the verb flushed to mean a new segment
is written to disk, I think?

 I disabled reader pooling and I still don't see .del files. But I think
 that's explained due to there are no segments in the index yet.
 All documents are still in the RAM buffer, and according to what you write,
 I shouldn't see any segment cause of delTerms?

Right!  OK so that explains it.

Mike McCandless

http://blog.mikemccandless.com

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: FlushPolicy and maxBufDelTerm

2013-08-01 Thread Simon Willnauer
Thanks for clarifying this - I agree the wording is tricky, and we
should use the term "apply" here! Sorry for the confusion!

simon

On Thu, Aug 1, 2013 at 7:39 PM, Michael McCandless
luc...@mikemccandless.com wrote:
 On Thu, Aug 1, 2013 at 11:24 AM, Shai Erera ser...@gmail.com wrote:
 I think the doc is correct

 Wait, one of the docs is wrong. I guess according to what you write, it's
 FlushPolicy, as a new segment is not flushed per this setting?
 Or perhaps they should be clarified that the deletes are flushed == applied
 on existing segments?

 Ahh, right.  OK I think we should fix FlushPolicy to say deletes are
 applied?  Let's try to leave the verb flushed to mean a new segment
 is written to disk, I think?

 I disabled reader pooling and I still don't see .del files. But I think
 that's explained due to there are no segments in the index yet.
 All documents are still in the RAM buffer, and according to what you write,
 I shouldn't see any segment cause of delTerms?

 Right!  OK so that explains it.

 Mike McCandless

 http://blog.mikemccandless.com

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found

2013-08-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726695#comment-13726695
 ] 

ASF subversion and git services commented on SOLR-4953:
---

Commit 1509359 from hoss...@apache.org in branch 'dev/trunk'
[ https://svn.apache.org/r1509359 ]

SOLR-4953: Make XML Configuration parsing fail if an xpath matches multiple 
nodes when only a single value is expected.

 Config XML parsing should fail hard if an xpath is expected to match at most 
 one node/string/int/boolean and multiple values are found
 

 Key: SOLR-4953
 URL: https://issues.apache.org/jira/browse/SOLR-4953
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: SOLR-4953.patch, SOLR-4953.patch


 While reviewing some code, I think I noticed that if there are multiple 
 {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are 
 ignored.
 This should be a hard failure situation, and we should have a TestBadConfig 
 method to verify it.
 ---
 Broadened the goal of this issue: fail if the configuration contains multiple 
 nodes/values for any option where only one value is expected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
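
For readers who want to see the shape of the new behavior, here is a rough, self-contained sketch (plain JDK XPath, not Solr's actual Config class; the class and method names are invented for illustration): an xpath that is expected to match at most one node now fails hard when several nodes match, instead of letting one silently win.

{code:java}
// Illustrative only; not the real Solr config-parsing code.
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class SingleNodeXPathSketch {

  // Returns the single node matching 'path', null if none, and fails hard on several.
  static Node getSingleNode(Document doc, String path) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    NodeList nodes = (NodeList) xpath.evaluate(path, doc, XPathConstants.NODESET);
    if (nodes.getLength() > 1) {
      throw new RuntimeException(nodes.getLength() + " nodes match " + path
          + " but at most one was expected");
    }
    return nodes.getLength() == 0 ? null : nodes.item(0);
  }

  public static void main(String[] args) throws Exception {
    // Two <indexConfig/> blocks: previously one silently "won", now this throws.
    String xml = "<config><indexConfig/><indexConfig/></config>";
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    getSingleNode(doc, "/config/indexConfig");
  }
}
{code}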



[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #404: POMs out of sync

2013-08-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/404/

1 tests failed.
REGRESSION:  
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch

Error Message:
IOException occured when talking to server at: http://127.0.0.1:28478/sqt

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: http://127.0.0.1:28478/sqt
at 
org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:129)
at 
org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
at 
org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
at 
org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
at 
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCustomCollectionsAPI(CollectionsAPIDistributedZkTest.java:764)
at 
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:159)




Build Log:
[...truncated 24519 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (SOLR-5108) plugin loading should fail if more than one instance of a singleton plugin is found

2013-08-01 Thread Hoss Man (JIRA)
Hoss Man created SOLR-5108:
--

 Summary: plugin loading should fail if more than one instance of a 
singleton plugin is found
 Key: SOLR-5108
 URL: https://issues.apache.org/jira/browse/SOLR-5108
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man


Continuing from the config parsing/validation work done in SOLR-4953, we should 
improve SolrConfig so that parsing fails if multiple instances of a plugin 
are found for types of plugins where only one is allowed to be used at a time.

At the moment, {{SolrConfig.loadPluginInfo}} happily initializes a 
{{List<PluginInfo>}} for whatever xpath it's given, and then later code can 
either call {{List<PluginInfo> getPluginInfos(String)}} or {{PluginInfo 
getPluginInfo(String)}} (the latter just being shorthand for getting the first 
item in the list).

We could make {{getPluginInfo(String)}} throw an error if the list has multiple 
items, but I think we should also change the signature of {{loadPluginInfo}} to 
be explicit about how many instances we expect to find, so we can error 
earlier and have a redundant check.



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
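
A hypothetical sketch of the check proposed here (class and method shapes are invented for illustration; the real SolrConfig manages its plugin lists differently): the single-instance accessor should refuse to silently return the first entry when the configuration actually declared several instances of a singleton plugin.

{code:java}
// Hypothetical illustration of the proposed singleton check; not Solr's actual code.
import java.util.Collections;
import java.util.List;

public class SingletonPluginSketch {

  static class PluginInfo {
    final String className;
    PluginInfo(String className) { this.className = className; }
  }

  // Shorthand accessor that fails early instead of letting one declaration "win".
  static PluginInfo getPluginInfo(String type, List<PluginInfo> infos) {
    if (infos == null || infos.isEmpty()) {
      return null;
    }
    if (infos.size() > 1) {
      throw new IllegalStateException("Found " + infos.size() + " configurations for "
          + type + " but only one instance is allowed");
    }
    return infos.get(0);
  }

  public static void main(String[] args) {
    List<PluginInfo> single = Collections.singletonList(new PluginInfo("solr.FooConverter"));
    System.out.println(getPluginInfo("queryConverter", single).className); // ok
    // Declaring two queryConverters would throw IllegalStateException instead of
    // silently picking the first one (compare SOLR-4304).
  }
}
{code}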



[jira] [Commented] (SOLR-5108) plugin loading should fail if more than one instance of a singleton plugin is found

2013-08-01 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726739#comment-13726739
 ] 

Jack Krupansky commented on SOLR-5108:
--

Sounds like this might resolve SOLR-4304 - NPE in Solr SpellCheckComponent if 
more than one QueryConverter.

 plugin loading should fail if more than one instance of a singleton plugin is 
 found
 --

 Key: SOLR-5108
 URL: https://issues.apache.org/jira/browse/SOLR-5108
 Project: Solr
  Issue Type: Bug
Reporter: Hoss Man
Assignee: Hoss Man

 Continuing from the config parsing/validation work done in SOLR-4953, we 
 should improve SolrConfig so that parsing fails if multiple instances of a 
 plugin are found for types of plugins where only one is allowed to be used 
 at a time.
 At the moment, {{SolrConfig.loadPluginInfo}} happily initializes a 
 {{List<PluginInfo>}} for whatever xpath it's given, and then later code can 
 either call {{List<PluginInfo> getPluginInfos(String)}} or {{PluginInfo 
 getPluginInfo(String)}} (the latter just being shorthand for getting the first 
 item in the list).
 We could make {{getPluginInfo(String)}} throw an error if the list has 
 multiple items, but I think we should also change the signature of 
 {{loadPluginInfo}} to be explicit about how many instances we expect to find, 
 so we can error earlier and have a redundant check.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-08-01 Thread Noble Paul (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726746#comment-13726746
 ] 

Noble Paul commented on SOLR-5081:
--

[~mikeschrag] Could you get any more thread dumps?

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 18 
 node SolrCloud cluster, I can deadlock Solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: FlushPolicy and maxBufDelTerm

2013-08-01 Thread Shai Erera
OK, I committed some improvements there and in some other places.
Thanks, guys, for clarifying this!

Shai


On Thu, Aug 1, 2013 at 8:55 PM, Simon Willnauer
simon.willna...@gmail.comwrote:

 thanks for clarifying this  - I agree the wording is tricky here and
 we should use the term apply here! sorry for the confusion!

 simon

 On Thu, Aug 1, 2013 at 7:39 PM, Michael McCandless
 luc...@mikemccandless.com wrote:
  On Thu, Aug 1, 2013 at 11:24 AM, Shai Erera ser...@gmail.com wrote:
  I think the doc is correct
 
  Wait, one of the docs is wrong. I guess according to what you write,
 it's
  FlushPolicy, as a new segment is not flushed per this setting?
  Or perhaps they should be clarified that the deletes are flushed ==
 applied
  on existing segments?
 
  Ahh, right.  OK I think we should fix FlushPolicy to say deletes are
  applied?  Let's try to leave the verb flushed to mean a new segment
  is written to disk, I think?
 
  I disabled reader pooling and I still don't see .del files. But I think
  that's explained due to there are no segments in the index yet.
  All documents are still in the RAM buffer, and according to what you
 write,
  I shouldn't see any segment cause of delTerms?
 
  Right!  OK so that explains it.
 
  Mike McCandless
 
  http://blog.mikemccandless.com
 
  -
  To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: dev-h...@lucene.apache.org
 

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




[jira] [Commented] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found

2013-08-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726782#comment-13726782
 ] 

ASF subversion and git services commented on SOLR-4953:
---

Commit 1509390 from hoss...@apache.org in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1509390 ]

SOLR-4953: Make XML Configuration parsing fail if an xpath matches multiple 
nodes when only a single value is expected. (merge r1509359)

 Config XML parsing should fail hard if an xpath is expected to match at most 
 one node/string/int/boolean and multiple values are found
 

 Key: SOLR-4953
 URL: https://issues.apache.org/jira/browse/SOLR-4953
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
 Attachments: SOLR-4953.patch, SOLR-4953.patch


 While reviewing some code, I think I noticed that if there are multiple 
 {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are 
 ignored.
 This should be a hard failure situation, and we should have a TestBadConfig 
 method to verify it.
 ---
 Broadened the goal of this issue: fail if the configuration contains multiple 
 nodes/values for any option where only one value is expected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found

2013-08-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved SOLR-4953.


   Resolution: Fixed
Fix Version/s: 5.0
   4.5

 Config XML parsing should fail hard if an xpath is expected to match at most 
 one node/string/int/boolean and multiple values are found
 

 Key: SOLR-4953
 URL: https://issues.apache.org/jira/browse/SOLR-4953
 Project: Solr
  Issue Type: Improvement
Reporter: Hoss Man
Assignee: Hoss Man
 Fix For: 4.5, 5.0

 Attachments: SOLR-4953.patch, SOLR-4953.patch


 While reviewing some code, I think I noticed that if there are multiple 
 {{indexConfig/}} blocks in solrconfig.xml, one just wins and the rest are 
 ignored.
 This should be a hard failure situation, and we should have a TestBadConfig 
 method to verify it.
 ---
 Broadened the goal of this issue: fail if the configuration contains multiple 
 nodes/values for any option where only one value is expected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-08-01 Thread Mike Schrag (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726801#comment-13726801
 ] 

Mike Schrag commented on SOLR-5081:
---

I grabbed more and they all look basically the same as the attached, which is 
to say, it sort of looks like Solr isn't doing ANYTHING. I'm going to look into 
whether I'm crushing ZooKeeper, and maybe my requests aren't even getting to 
Solr.

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 18 
 node SolrCloud cluster, I can deadlock Solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-08-01 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726831#comment-13726831
 ] 

Erick Erickson commented on SOLR-5081:
--

Yeah, that is odd. The stack traces you sent basically showed no deadlocks, 
nothing interesting at all. I suspect pursuing whether anything is getting to 
Solr or not is a good idea.

Hmmm, a blunt-instrument test for when the cluster is hung: what happens if you, 
say, submit a query directly to one of the nodes? Does it respond, or do you see 
anything in the Solr log on that node? Tip: adding distrib=false to the 
_query_ keeps it from sending sub-queries to other shards.

And I wonder what happens if you, say, use post.jar (comes with the example) to 
try to send a doc to Solr when it's hung, anything?

Clearly I'm grasping at straws here, but I'm kind of out of good ideas.

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 18 
 node SolrCloud cluster, I can deadlock Solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
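
One way to run that blunt-instrument check from code (a sketch assuming SolrJ 4.x; the URL and collection name below are placeholders for one of the hung nodes): query a single node directly with distrib=false so the request is answered locally and never fans out to other shards.

{code:java}
// Sketch only; point the base URL at one of the actual nodes.
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class SingleNodeProbe {
  public static void main(String[] args) throws Exception {
    // Placeholder node/collection URL.
    HttpSolrServer node = new HttpSolrServer("http://localhost:8983/solr/collection1");
    try {
      SolrQuery q = new SolrQuery("*:*");
      q.set("distrib", "false"); // answer locally; do not send sub-queries to other shards
      QueryResponse rsp = node.query(q);
      System.out.println("docs visible on this node only: " + rsp.getResults().getNumFound());
    } finally {
      node.shutdown();
    }
  }
}
{code}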



[jira] [Updated] (LUCENE-5152) Lucene FST is not immutable

2013-08-01 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-5152:


Summary: Lucene FST is not immutable  (was: Lucene FST is not immutale)

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch


 a spin-off from LUCENE-5120, where the analyzing suggester modified a returned 
 output from an FST (BytesRef), which caused side effects in later execution. 
 I added an assertion into the FST that checks whether a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we will find more places. We need to think about how to make this less 
 trappy, since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable

2013-08-01 Thread Simon Willnauer (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726847#comment-13726847
 ] 

Simon Willnauer commented on LUCENE-5152:
-

bq. Should we clone payload bytes in the postings lists too? What about term 
dictionaries?
I agree we can be less conservative here and just use the payload and copy it 
into a new BytesRef, or whatever is needed. I will bring up a new patch.

bq. At some point then BytesRef is useless as a reference class because of a 
few bad apples trying to use it as a ByteBuffer. Ideally we would remove code 
that abuses BytesRef as a ByteBuffer instead.

Agreed again. We just need to make sure that we have asserts in place that 
check for that.

bq. I don't mean to pick on your issue Simon, and it doesn't mean I object to 
the patch (though I wonder about performance implications), I just see this as 
one of many in a larger issue.

No worries. I am really concerned about this since it took me forever to figure 
out the problems this caused. I just want to have infrastructure in place that 
catches those problems. I am more concerned about users who get bitten by this. 
I agree we should figure out the bigger problem eventually, but let's make sure 
that we fix the bad apples first.
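
As a small illustration of the copy-instead-of-mutate pattern being discussed (an assumed example, not the actual patch): code that wants to keep or extend bytes handed back from an FST lookup should work on a private copy rather than mutating the cached BytesRef in place.

{code:java}
// Illustration of defensively copying an FST output; not the code from the patch.
import java.util.Arrays;
import org.apache.lucene.util.BytesRef;

public class FstOutputCopySketch {

  // Returns a new BytesRef containing fstOutput's bytes plus one extra byte,
  // leaving the (possibly cached) fstOutput untouched.
  static BytesRef appendByte(BytesRef fstOutput, byte extra) {
    BytesRef copy = BytesRef.deepCopyOf(fstOutput); // private copy, offset reset to 0
    copy.bytes = Arrays.copyOf(copy.bytes, copy.length + 1);
    copy.bytes[copy.length] = extra;
    copy.length++;
    return copy;
  }
}
{code}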

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch


 a spin-off from LUCENE-5120, where the analyzing suggester modified a returned 
 output from an FST (BytesRef), which caused side effects in later execution. 
 I added an assertion into the FST that checks whether a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we will find more places. We need to think about how to make this less 
 trappy, since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud

2013-08-01 Thread Mike Schrag (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726848#comment-13726848
 ] 

Mike Schrag commented on SOLR-5081:
---

I actually did this exact test when I was in this state originally, and the 
insert _worked_, which totally confused the situation for me. However, in light 
of seeing nothing in the traces, it supports the theory that the cluster isn't 
hung, but rather that I'm somehow not even getting that far in the Hadoop cluster. 
ZK was my best guess as something that could be an earlier-stage failure, 
but even that I would expect to have hung the test insert. So I need to do a 
little more forensics here and see if I can get a better picture of what is 
going on.

 Highly parallel document insertion hangs SolrCloud
 --

 Key: SOLR-5081
 URL: https://issues.apache.org/jira/browse/SOLR-5081
 Project: Solr
  Issue Type: Bug
  Components: SolrCloud
Affects Versions: 4.3.1
Reporter: Mike Schrag
 Attachments: threads.txt


 If I do a highly parallel document load using a Hadoop cluster into an 18 
 node SolrCloud cluster, I can deadlock Solr every time.
 The ulimits on the nodes are:
 core file size  (blocks, -c) 0
 data seg size   (kbytes, -d) unlimited
 scheduling priority (-e) 0
 file size   (blocks, -f) unlimited
 pending signals (-i) 1031181
 max locked memory   (kbytes, -l) unlimited
 max memory size (kbytes, -m) unlimited
 open files  (-n) 32768
 pipe size(512 bytes, -p) 8
 POSIX message queues (bytes, -q) 819200
 real-time priority  (-r) 0
 stack size  (kbytes, -s) 10240
 cpu time   (seconds, -t) unlimited
 max user processes  (-u) 515590
 virtual memory  (kbytes, -v) unlimited
 file locks  (-x) unlimited
 The open file count is only around 4000 when this happens.
 If I bounce all the servers, things start working again, which makes me think 
 this is Solr and not ZK.
 I'll attach the stack trace from one of the servers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-5152) Lucene FST is not immutable

2013-08-01 Thread Simon Willnauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Simon Willnauer updated LUCENE-5152:


Attachment: LUCENE-5152.patch

This patch only adds the assert and fixes the problems in MemoryPostings. It 
could solve the immediate issue, and it adds some more asserts to make sure we 
realise if something modifies the arcs' outputs.

 Lucene FST is not immutable
 ---

 Key: LUCENE-5152
 URL: https://issues.apache.org/jira/browse/LUCENE-5152
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/FSTs
Affects Versions: 4.4
Reporter: Simon Willnauer
Priority: Blocker
 Fix For: 5.0, 4.5

 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch


 a spin-off from LUCENE-5120, where the analyzing suggester modified a returned 
 output from an FST (BytesRef), which caused side effects in later execution. 
 I added an assertion into the FST that checks whether a cached root arc is 
 modified, and in fact this happens, for instance, in our MemoryPostingsFormat, 
 and I bet we will find more places. We need to think about how to make this less 
 trappy, since it can cause bugs that are super hard to find.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #926: POMs out of sync

2013-08-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/926/

1 tests failed.
FAILED:  org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch

Error Message:
IOException occured when talking to server at: http://127.0.0.1:26547

Stack Trace:
org.apache.solr.client.solrj.SolrServerException: IOException occured when 
talking to server at: http://127.0.0.1:26547
at 
org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:129)
at 
org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180)
at 
org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
at 
org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645)
at 
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCustomCollectionsAPI(CollectionsAPIDistributedZkTest.java:764)
at 
org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:159)




Build Log:
[...truncated 24120 lines...]



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2013-08-01 Thread jamon camisso (JIRA)
jamon camisso created SOLR-5109:
---

 Summary: Solr 4.4 will not deploy in Glassfish 4.x
 Key: SOLR-5109
 URL: https://issues.apache.org/jira/browse/SOLR-5109
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Glassfish 4.x
Reporter: jamon camisso
Priority: Blocker


The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x.

This failure is a known issue with upstream Guava and is described here:
https://code.google.com/p/guava-libraries/issues/detail?id=1433

Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr 
allows for a successful deployment.

Until the Guava developers release version 15, using their HEAD or even an RC 
tag seems like the only way to resolve this.

This is frustrating since it was proposed that Guava be removed as a dependency 
before Solr 4.0 was released and yet it remains and blocks upgrading: 
https://issues.apache.org/jira/browse/SOLR-3601


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2013-08-01 Thread jamon camisso (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jamon camisso updated SOLR-5109:


Attachment: guava-15.0-SNAPSHOT.jar

 Solr 4.4 will not deploy in Glassfish 4.x
 -

 Key: SOLR-5109
 URL: https://issues.apache.org/jira/browse/SOLR-5109
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Glassfish 4.x
Reporter: jamon camisso
Priority: Blocker
  Labels: guava
 Attachments: guava-15.0-SNAPSHOT.jar


 The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x.
 This failure is a known issue with upstream Guava and is described here:
 https://code.google.com/p/guava-libraries/issues/detail?id=1433
 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr 
 allows for a successful deployment.
 Until the Guava developers release version 15, using their HEAD or even an RC 
 tag seems like the only way to resolve this.
 This is frustrating since it was proposed that Guava be removed as a 
 dependency before Solr 4.0 was released and yet it remains and blocks 
 upgrading: https://issues.apache.org/jira/browse/SOLR-3601

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2013-08-01 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726976#comment-13726976
 ] 

Uwe Schindler commented on SOLR-5109:
-

Hi,

we cannot bundle JAR files with our source code, and in the case of releasing 
the binary Solr package we need to download all dependencies from Maven 
Central. So we cannot solve this problem; Guava has to release a newer version 
first.

 Solr 4.4 will not deploy in Glassfish 4.x
 -

 Key: SOLR-5109
 URL: https://issues.apache.org/jira/browse/SOLR-5109
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Glassfish 4.x
Reporter: jamon camisso
Priority: Blocker
  Labels: guava
 Attachments: guava-15.0-SNAPSHOT.jar


 The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x.
 This failure is a known issue with upstream Guava and is described here:
 https://code.google.com/p/guava-libraries/issues/detail?id=1433
 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr 
 allows for a successful deployment.
 Until the Guava developers release version 15, using their HEAD or even an RC 
 tag seems like the only way to resolve this.
 This is frustrating since it was proposed that Guava be removed as a 
 dependency before Solr 4.0 was released and yet it remains and blocks 
 upgrading: https://issues.apache.org/jira/browse/SOLR-3601

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2013-08-01 Thread jamon camisso (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726978#comment-13726978
 ] 

jamon camisso commented on SOLR-5109:
-

Can Guava be removed as a core dependency, as was proposed in SOLR-3601?

 Solr 4.4 will not deploy in Glassfish 4.x
 -

 Key: SOLR-5109
 URL: https://issues.apache.org/jira/browse/SOLR-5109
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Glassfish 4.x
Reporter: jamon camisso
Priority: Blocker
  Labels: guava
 Attachments: guava-15.0-SNAPSHOT.jar


 The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x.
 This failure is a known issue with upstream Guava and is described here:
 https://code.google.com/p/guava-libraries/issues/detail?id=1433
 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr 
 allows for a successful deployment.
 Until the Guava developers release version 15, using their HEAD or even an RC 
 tag seems like the only way to resolve this.
 This is frustrating since it was proposed that Guava be removed as a 
 dependency before Solr 4.0 was released and yet it remains and blocks 
 upgrading: https://issues.apache.org/jira/browse/SOLR-3601

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x

2013-08-01 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726992#comment-13726992
 ] 

Uwe Schindler commented on SOLR-5109:
-

Please reopen the corresponding issue and maybe provide a patch removing this 
dependency.

I would be happy to remove Guava, but other developers may have other plans.

In general, I would not run Solr inside Glassfish, as Solr has very different 
resource usage than conventional enterprise webapps. This is one reason why 
Solr may no longer be a WAR in the future. Solr is a separate server, like 
MySQL, and should run in an isolated process.

 Solr 4.4 will not deploy in Glassfish 4.x
 -

 Key: SOLR-5109
 URL: https://issues.apache.org/jira/browse/SOLR-5109
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.4
 Environment: Glassfish 4.x
Reporter: jamon camisso
Priority: Blocker
  Labels: guava
 Attachments: guava-15.0-SNAPSHOT.jar


 The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x.
 This failure is a known issue with upstream Guava and is described here:
 https://code.google.com/p/guava-libraries/issues/detail?id=1433
 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr 
 allows for a successful deployment.
 Until the Guava developers release version 15, using their HEAD or even an RC 
 tag seems like the only way to resolve this.
 This is frustrating since it was proposed that Guava be removed as a 
 dependency before Solr 4.0 was released and yet it remains and blocks 
 upgrading: https://issues.apache.org/jira/browse/SOLR-3601

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org