[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727053#comment-13727053 ] Mark Miller commented on SOLR-5104: --- We certainly want to do this, and I don't consider it removing a feature. It's purely an improvement IMO. The main reason I have not pushed for it yet was that it killed the admin UI - but now that that is fixed, this is the next step. This is a relic from the pre-multi-core days - when Solr was one index and that is it. Backcompat sludge is what has kept it around IMO - we want to act like most systems and start empty. It should be very simple for a user to create his first collection, but he should be the one to name it. As Grant mentions, this is certainly what you want for scriptability, and it's more consistent with other systems for users as well. A new user should: 1. Start Solr 2. Name and create their first collection. When they want more collections, repeat step 2, a step they learned right away. Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that access a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727060#comment-13727060 ] Mark Miller commented on SOLR-5104: --- Took me a moment to realize that this is not referring to removing the core that ships with Solr, but to the default core feature. I want to remove the actual default core that is set up, so certainly +1 on dropping this. I think we already discussed it some for 5.0. Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome! - Mark On Jul 31, 2013, at 6:47 PM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
[jira] [Created] (LUCENE-5156) CompressingTermVectors termsEnum should probably not support seek-by-ord
Robert Muir created LUCENE-5156: --- Summary: CompressingTermVectors termsEnum should probably not support seek-by-ord Key: LUCENE-5156 URL: https://issues.apache.org/jira/browse/LUCENE-5156 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Just like term vectors before it, it has an O(n) seek-by-term. But this one also advertises a seek-by-ord, only this is also O(n). This could cause e.g. CheckIndex to be very slow, because if the termsEnum supports ord, it does a bunch of seeking tests. (Another solution would be to leave it, and add a boolean so CheckIndex never does seeking tests for term vectors, only real fields.) However, I think it's also kind of a trap; in my opinion, if seek-by-ord is supported anywhere, you kind of expect it to be faster than linear time...? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
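To make the shape of the change concrete: a TermsEnum that opts out of ord support typically just throws UnsupportedOperationException from the ord-based methods, which is what keeps callers like CheckIndex from running the linear-time seeking tests. The sketch below is a hypothetical illustration (class name invented), not the actual patch.

{code}
// Hypothetical sketch: a term-vectors TermsEnum that opts out of seek-by-ord
// instead of advertising an O(n) implementation. Only the ord-related methods
// are shown; a real subclass would also implement next(), term(), docFreq(), etc.
public abstract class NoOrdTermVectorsTermsEnum extends org.apache.lucene.index.TermsEnum {

  @Override
  public void seekExact(long ord) {
    // Callers probe for ord support; throwing here signals "not supported"
    // so they skip ord-based seeking entirely.
    throw new UnsupportedOperationException();
  }

  @Override
  public long ord() {
    throw new UnsupportedOperationException();
  }
}
{code}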
Re: Welcome Cassandra Targett as Lucene/Solr committer
Awesome! Congrats Welcome Cassandra! Thanks Regards, Kranti K Parisa http://www.linkedin.com/in/krantiparisa On Thu, Aug 1, 2013 at 7:23 PM, Mark Miller markrmil...@gmail.com wrote: Welcome! - Mark On Jul 31, 2013, at 6:47 PM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
[jira] [Updated] (SOLR-2570) randomize indexwriter settings in solr tests
[ https://issues.apache.org/jira/browse/SOLR-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-2570: --- Attachment: SOLR-2570.patch Updated patch taking into account the work already done in SOLR-4942 and SOLR-4951. In my limited testing so far, I haven't seen any obvious failures -- so I'd like to commit soon and then move forward with using the XML include snippet in more configs (SOLR-4952). randomize indexwriter settings in solr tests Key: SOLR-2570 URL: https://issues.apache.org/jira/browse/SOLR-2570 Project: Solr Issue Type: Sub-task Components: Build Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.5, 5.0 Attachments: SOLR-2570.patch, SOLR-2570.patch We should randomize IndexWriter settings like the Lucene tests do, to vary the # of segments and such. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
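For readers unfamiliar with what "randomizing indexwriter settings" means in practice, here is a minimal, hypothetical sketch of the kind of per-run variation the Lucene test framework applies (the actual patch wires this through solrconfig XML includes rather than Java code; the class and method names below are invented, and Lucene 4.4 APIs are assumed).

{code}
// Hypothetical sketch of randomized IndexWriter settings, in the spirit of
// LuceneTestCase: each test run gets different flush and merge behavior so
// tests exercise many small segments on some runs and few large ones on others.
import java.util.Random;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LogDocMergePolicy;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.util.Version;

public class RandomizedIndexWriterSettings {
  public static IndexWriterConfig newRandomConfig(Random random) {
    IndexWriterConfig iwc =
        new IndexWriterConfig(Version.LUCENE_44, new StandardAnalyzer(Version.LUCENE_44));
    // Vary how often segments are flushed.
    iwc.setMaxBufferedDocs(2 + random.nextInt(100));
    iwc.setRAMBufferSizeMB(0.1 + random.nextDouble() * 64);
    // Alternate merge policies to exercise different segment geometries.
    iwc.setMergePolicy(random.nextBoolean() ? new TieredMergePolicy() : new LogDocMergePolicy());
    return iwc;
  }
}
{code}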
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome onboard Cassandra! Tommaso 2013/8/1 Robert Muir rcm...@gmail.com I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome Cassandra! Christian On Aug 1, 2013, at 7:47 AM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
[jira] [Created] (SOLR-5100) java.lang.OutOfMemoryError: Requested array size exceeds VM limit
Grzegorz Sobczyk created SOLR-5100: -- Summary: java.lang.OutOfMemoryError: Requested array size exceeds VM limit Key: SOLR-5100 URL: https://issues.apache.org/jira/browse/SOLR-5100 Project: Solr Issue Type: Bug Affects Versions: 4.2.1 Environment: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux Java 7, Tomcat, ZK standalone Reporter: Grzegorz Sobczyk Today I found exception in log (lmsiprse01): {code} sie 01, 2013 5:27:26 AM org.apache.solr.core.SolrCore execute INFO: [products] webapp=/solr path=/select params={facet=truestart=0q=facet.limit=-1facet.field=attribute_u-typfacet.field=attribute_u-gama-kolorystycznafacet.field=brand_namewt=javabinfq=node_id:1056version=2rows=0} hits=1241 status=0 QTime=33 sie 01, 2013 5:27:26 AM org.apache.solr.common.SolrException log SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:724) Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:64) at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:37) at org.apache.solr.handler.component.ShardFieldSortedHitQueue.init(ShardDoc.java:113) at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:766) at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:625) at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:604) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) ... 
13 more {code} We have: * 3x standalone zK * 3x Solr 4.2.1 on Tomcat Exception shows up after leader was stopped: * lmsiprse01: [2013-08-01 05:23:43]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:25:09]: /etc/init.d/tomcat6-1 start * lmsiprse02 (leader): 2013-08-01 05:27:21]: /etc/init.d/tomcat6-1 stop 2013-08-01 05:29:31]: /etc/init.d/tomcat6-1 start * lmsiprse03: [2013-08-01 05:25:48]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:26:42]: /etc/init.d/tomcat6-1 start -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
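As an aside on the error message itself (an illustration only, not a diagnosis of this report): Lucene's PriorityQueue allocates its entire backing array in the constructor, so if the queue size computed during shard-response merging ends up absurdly large, the allocation fails immediately with "Requested array size exceeds VM limit" regardless of how much heap is configured. A minimal demonstration, with the oversized value hard-coded:

{code}
// Illustration only: an array request near Integer.MAX_VALUE produces
// "java.lang.OutOfMemoryError: Requested array size exceeds VM limit"
// on HotSpot, independent of the configured heap size.
public class ArraySizeLimitDemo {
  public static void main(String[] args) {
    int requestedQueueSize = Integer.MAX_VALUE; // stand-in for an oversized merge queue size
    Object[] heap = new Object[requestedQueueSize]; // throws the error above
    System.out.println(heap.length);
  }
}
{code}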
[jira] [Updated] (SOLR-5084) new field type - EnumField
[ https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elran Dvir updated SOLR-5084: - Attachment: Solr-5084.patch new field type - EnumField -- Key: SOLR-5084 URL: https://issues.apache.org/jira/browse/SOLR-5084 Project: Solr Issue Type: New Feature Reporter: Elran Dvir Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, Solr-5084.patch We have encountered a use case in our system where we have a few fields (Severity, Risk, etc.) with a closed set of values, where the sort order for these values is pre-determined but not lexicographic (Critical is higher than High). Generically this is very close to how enums work. To implement, I have prototyped a new type of field: EnumField, where the inputs are a closed, predefined set of strings in a special configuration file (similar to currency.xml). The code is based on 4.2.1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
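As a rough illustration of the idea (not the attached patch), an enum-like field boils down to mapping each configured label to an integer that reflects the configured order, indexing and sorting on that integer, and mapping back to the label for display. The class below is hypothetical, with the labels hard-coded instead of read from a config file like the attached enumsConfig.xml.

{code}
// Hypothetical sketch of the core idea behind an EnumField: labels get a
// configured ordinal (not lexicographic order), sorting happens on the
// ordinal, and the ordinal is translated back to the label for display.
import java.util.LinkedHashMap;
import java.util.Map;

public class EnumMapping {
  private final Map<String, Integer> labelToOrd = new LinkedHashMap<>();
  private final Map<Integer, String> ordToLabel = new LinkedHashMap<>();

  public EnumMapping(String... labelsInOrder) {
    for (int ord = 0; ord < labelsInOrder.length; ord++) {
      labelToOrd.put(labelsInOrder[ord], ord);
      ordToLabel.put(ord, labelsInOrder[ord]);
    }
  }

  public int toOrdinal(String label) { return labelToOrd.get(label); }
  public String toLabel(int ord) { return ordToLabel.get(ord); }

  public static void main(String[] args) {
    EnumMapping severity = new EnumMapping("Low", "Medium", "High", "Critical");
    // "Critical" sorts above "High" even though it is lexicographically smaller.
    System.out.println(severity.toOrdinal("Critical") > severity.toOrdinal("High")); // true
  }
}
{code}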
[jira] [Commented] (SOLR-5084) new field type - EnumField
[ https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726151#comment-13726151 ] Elran Dvir commented on SOLR-5084: -- I reformatted the code. I hope it's OK now. Thanks. new field type - EnumField -- Key: SOLR-5084 URL: https://issues.apache.org/jira/browse/SOLR-5084 Project: Solr Issue Type: New Feature Reporter: Elran Dvir Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, Solr-5084.patch We have encountered a use case in our system where we have a few fields (Severity. Risk etc) with a closed set of values, where the sort order for these values is pre-determined but not lexicographic (Critical is higher than High). Generically this is very close to how enums work. To implement, I have prototyped a new type of field: EnumField where the inputs are a closed predefined set of strings in a special configuration file (similar to currency.xml). The code is based on 4.2.1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome Cassandra! -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome! On 1 August 2013 09:06, Adrien Grand jpou...@gmail.com wrote: Welome Cassandra! -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Met vriendelijke groet, Martijn van Groningen
Re: Welcome Cassandra Targett as Lucene/Solr committer
welcome! On Thu, Aug 1, 2013 at 9:18 AM, Martijn v Groningen martijn.v.gronin...@gmail.com wrote: Welcome! On 1 August 2013 09:06, Adrien Grand jpou...@gmail.com wrote: Welome Cassandra! -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Met vriendelijke groet, Martijn van Groningen - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726191#comment-13726191 ] Joern Kottmann commented on LUCENE-2899: Stanford NLP is licensed under GPLv2, this license is not compatible with the AL 2.0 and therefore such a component can't be contributed to an Apache project directly. Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 5.0, 4.5 Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome Cassandra! Alan Woodward www.flax.co.uk On 31 Jul 2013, at 23:47, Robert Muir wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
[jira] [Assigned] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward reassigned SOLR-5099: --- Assignee: Alan Woodward The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Alan Woodward Priority: Critical Attachments: CorePropertiesLocator.java.patch When using the new solr.xml structure. The core auto discovery mechanism trying to find core.properties. But I found the core.properties cannot be create when I dynamically create a collection. The root issue is the CorePropertiesLocator trying to create properties before the instanceDir is created. And collection creation process will done and looks fine at runtime, but it will cause issues (cores are not auto discovered after server restart). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726221#comment-13726221 ] Andrew Janowczyk commented on LUCENE-2899: -- ahhh thanks for the info. i found a relevant link discussing the licenses which clearly explains the details [here|http://www.apache.org/licenses/GPL-compatibility.html]. oh well, it was worth a try :) Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 5.0, 4.5 Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: VInt block length in Lucene 4.1 postings format
Hi Aleksandra, The PostingsReader uses a skip list to determine the start file pointer of each block (both FOR packed and vInt encoded). The information is currently maintained by Lucene41SkipReader. The tricky part is that, for each term, the skip data is exactly at the end of the TermFreqs blocks, so if you fetch the startFP for the vInt block, and know the docTermStartOffset + skipOffset for the current term, you can calculate what you need. http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/codecs/lucene41/Lucene41PostingsFormat.html#Frequencies On Thu, Aug 1, 2013 at 4:20 PM, Aleksandra Woźniak aleksandra.k.wozn...@gmail.com wrote: Hi all, recently I wanted to try out some modifications of Lucene's postings format (namely, copying blocks that have no deletions without int-decoding/encoding -- this is similar to what was described here: https://issues.apache.org/jira/browse/LUCENE-2082). I started with changing the Lucene 4.1 postings format to check what can be done there. I came across the following problem: in Lucene41PostingsReader the length (number of bytes) of the last, vInt-encoded block of postings is not known before all individual postings are read and decoded. When reading this block we only know the number of postings that should be read and decoded -- since vInts have different sizes by definition. If I want to copy the whole block without vInt decoding/encoding, I need to know how many bytes I have to read from the postings index input. So, my question is: is there a clean way to determine the length of this block (i.e. the number of bytes that this block has)? Is the number of bytes in a posting list tracked somewhere in the Lucene 4.1 postings format? Thanks, Aleksandra -- Han Jiang Team of Search Engine and Web Mining, School of Electronic Engineering and Computer Science, Peking University, China - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
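For what it's worth, here is a sketch of the arithmetic being described, assuming the per-term .doc layout is packed FOR blocks, then the vInt-encoded tail block, then the skip data (which is only present for terms long enough to have skip data). This is an interpretation of the comment above, not code from Lucene, and all names are hypothetical.

{code}
// Sketch of the calculation described above - an interpretation of the
// comment, not actual Lucene code; all names are hypothetical.
//
// Assumed per-term layout of the .doc file in the Lucene 4.1 postings format:
//   [packed FOR blocks ...][vInt-encoded tail block][skip data]
public final class VIntTailBlockLength {

  /**
   * @param docTermStartFP   file pointer where this term's postings begin
   * @param skipOffset       offset of the skip data, relative to docTermStartFP
   * @param vIntBlockStartFP file pointer where the vInt tail block begins
   *                         (known once the packed blocks have been read or skipped)
   * @return number of bytes in the vInt tail block, so it could be copied
   *         without decoding the individual vInts
   */
  public static long tailBlockLength(long docTermStartFP, long skipOffset, long vIntBlockStartFP) {
    long skipDataStartFP = docTermStartFP + skipOffset;
    return skipDataStartFP - vIntBlockStartFP;
  }
}
{code}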
[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726244#comment-13726244 ] Alan Woodward commented on SOLR-5099: - This is because creating a core in normal mode requires that the instance dir is already present, but creation via SolrCloud allows you to load all config from zookeeper, and so doesn't need an actual instance dir. Nice catch. I'll add a test for the Collections API as well. The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Alan Woodward Priority: Critical Attachments: CorePropertiesLocator.java.patch When using the new solr.xml structure, the core auto-discovery mechanism tries to find core.properties. But I found that core.properties cannot be created when I dynamically create a collection. The root issue is that CorePropertiesLocator tries to create the properties file before the instanceDir is created. The collection creation process completes and looks fine at runtime, but it will cause issues (cores are not auto-discovered after server restart). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
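The attached patch is not reproduced here, but the essence of the fix described above is presumably to make sure the instance directory exists before writing core.properties into it. A minimal, hypothetical sketch (class and method names invented):

{code}
// Hypothetical sketch of the kind of fix described above, not the attached
// CorePropertiesLocator.java.patch: create the instance directory (and any
// missing parents) before trying to write core.properties into it.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Properties;

public class CorePropertiesWriter {
  public static void writeCoreProperties(File instanceDir, Properties coreProps) throws IOException {
    // Collection creation via SolrCloud can reach this point before the
    // instance dir exists on disk, so create it first.
    if (!instanceDir.exists() && !instanceDir.mkdirs()) {
      throw new IOException("Could not create instance dir " + instanceDir);
    }
    File propsFile = new File(instanceDir, "core.properties");
    try (OutputStream out = new FileOutputStream(propsFile)) {
      coreProps.store(out, "Written by CorePropertiesWriter (sketch)");
    }
  }
}
{code}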
[jira] [Commented] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf
[ https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726262#comment-13726262 ] Gilad Barkai commented on LUCENE-5155: -- Patch looks good. +1 for commit. Perhaps also document that FRNode is now comparable? Add OrdinalValueResolver in favor of FacetRequest.getValueOf Key: LUCENE-5155 URL: https://issues.apache.org/jira/browse/LUCENE-5155 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5155.patch FacetRequest.getValueOf is responsible for resolving an ordinal's value. It is given FacetArrays, and typically does something like {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this method is to allow special requests, e.g. average, to do some post processing on the values, that couldn't be done during aggregation. I feel that getValueOf is in the wrong place -- the calls to getInt/FloatArray are really redundant. Also, if an aggregator maintains some statistics by which it needs to correct the aggregated values, it's not trivial to pass it from the aggregator to the request. Therefore I would like to make the following changes: * Remove FacetRequest.getValueOf and .getFacetArraysSource * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays and has a simple API .valueOf(ordinal). * Modify the FacetResultHandlers to use OrdValResolver. This allows an OVR to initialize the right array instance(s) in the ctor, and return the value of the requested ordinal, without doing arrays.getArray() calls. Will post a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
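To make the proposed API concrete, here is a rough, hypothetical sketch of what an OrdinalValueResolver produced by a FacetsAggregator might look like; it is an illustration of the description above, not the committed patch. The point is that the backing array is fetched once in the constructor, so valueOf(ordinal) is a plain array lookup rather than a per-ordinal getIntArray()/getFloatArray() call.

{code}
// Hypothetical sketch of the proposed API (not the committed patch): the
// resolver captures the backing array once, up front, so valueOf(ordinal)
// needs no further calls into FacetArrays.
public abstract class OrdinalValueResolver {

  /** Returns the aggregated value of the given ordinal. */
  public abstract double valueOf(int ordinal);

  /** Resolver over an int[] of counts, e.g. from a counting aggregator. */
  public static final class IntValueResolver extends OrdinalValueResolver {
    private final int[] values;

    public IntValueResolver(int[] values) {
      // Capture the array once instead of fetching it per ordinal.
      this.values = values;
    }

    @Override
    public double valueOf(int ordinal) {
      return values[ordinal];
    }
  }
}
{code}

A post-processing aggregator (average, for example) could return a resolver that divides by a statistic it tracked during aggregation, which is exactly the value-correction problem the description says was awkward to express through FacetRequest.getValueOf.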
[jira] [Commented] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf
[ https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726279#comment-13726279 ] ASF subversion and git services commented on LUCENE-5155: - Commit 1509152 from [~shaie] in branch 'dev/trunk' [ https://svn.apache.org/r1509152 ] LUCENE-5155: add OrdinalValueResolver Add OrdinalValueResolver in favor of FacetRequest.getValueOf Key: LUCENE-5155 URL: https://issues.apache.org/jira/browse/LUCENE-5155 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5155.patch FacetRequest.getValueOf is responsible for resolving an ordinal's value. It is given FacetArrays, and typically does something like {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this method is to allow special requests, e.g. average, to do some post processing on the values, that couldn't be done during aggregation. I feel that getValueOf is in the wrong place -- the calls to getInt/FloatArray are really redundant. Also, if an aggregator maintains some statistics by which it needs to correct the aggregated values, it's not trivial to pass it from the aggregator to the request. Therefore I would like to make the following changes: * Remove FacetRequest.getValueOf and .getFacetArraysSource * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays and has a simple API .valueOf(ordinal). * Modify the FacetResultHandlers to use OrdValResolver. This allows an OVR to initialize the right array instance(s) in the ctor, and return the value of the requested ordinal, without doing arrays.getArray() calls. Will post a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf
[ https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5155. Resolution: Fixed Fix Version/s: 4.5 5.0 Thanks Gilad, added a comment and committed. Add OrdinalValueResolver in favor of FacetRequest.getValueOf Key: LUCENE-5155 URL: https://issues.apache.org/jira/browse/LUCENE-5155 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 5.0, 4.5 Attachments: LUCENE-5155.patch FacetRequest.getValueOf is responsible for resolving an ordinal's value. It is given FacetArrays, and typically does something like {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this method is to allow special requests, e.g. average, to do some post processing on the values, that couldn't be done during aggregation. I feel that getValueOf is in the wrong place -- the calls to getInt/FloatArray are really redundant. Also, if an aggregator maintains some statistics by which it needs to correct the aggregated values, it's not trivial to pass it from the aggregator to the request. Therefore I would like to make the following changes: * Remove FacetRequest.getValueOf and .getFacetArraysSource * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays and has a simple API .valueOf(ordinal). * Modify the FacetResultHandlers to use OrdValResolver. This allows an OVR to initialize the right array instance(s) in the ctor, and return the value of the requested ordinal, without doing arrays.getArray() calls. Will post a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf
[ https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726283#comment-13726283 ] ASF subversion and git services commented on LUCENE-5155: - Commit 1509154 from [~shaie] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1509154 ] LUCENE-5155: add OrdinalValueResolver Add OrdinalValueResolver in favor of FacetRequest.getValueOf Key: LUCENE-5155 URL: https://issues.apache.org/jira/browse/LUCENE-5155 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5155.patch FacetRequest.getValueOf is responsible for resolving an ordinal's value. It is given FacetArrays, and typically does something like {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this method is to allow special requests, e.g. average, to do some post processing on the values, that couldn't be done during aggregation. I feel that getValueOf is in the wrong place -- the calls to getInt/FloatArray are really redundant. Also, if an aggregator maintains some statistics by which it needs to correct the aggregated values, it's not trivial to pass it from the aggregator to the request. Therefore I would like to make the following changes: * Remove FacetRequest.getValueOf and .getFacetArraysSource * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays and has a simple API .valueOf(ordinal). * Modify the FacetResultHandlers to use OrdValResolver. This allows an OVR to initialize the right array instance(s) in the ctor, and return the value of the requested ordinal, without doing arrays.getArray() calls. Will post a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5091) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation
[ https://issues.apache.org/jira/browse/SOLR-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726285#comment-13726285 ] Markus Jelsma commented on SOLR-5091: - Can you include SOLR-4018 if you're replacing the dispatch filter or i'll have to keep updating it as trunk progresses :) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation -- Key: SOLR-5091 URL: https://issues.apache.org/jira/browse/SOLR-5091 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 This is an issue to track a series of sub issues related to deprecated and crufty Servlet/REST API code. I'll create sub-tasks to manage them. # Clean up all the old UI stuff (old redirects) # Kill/Simplify SolrDispatchFilter -- for instance, why not make the user always have a core name in 5.0? i.e. /collection1 is the default core ## I'd like to move to just using Guice's servlet extension to do this, which, I think will also make it easier to run Solr in other containers (i.e. non-servlet environments) due to the fact that you don't have to tie the request handling logic specifically to a Servlet. # Simplify the creation and testing of REST and other APIs via Guice + Restlet, which I've done on a number of occasions. ## It might be also possible to move all of the APIs onto Restlet and maintain back compat through a simple restlet proxy (still exploring this). This would also have the benefit of abstracting the core request processing out of the Servlet context and make that an implementation detail. ## Moving to Guice, IMO, will make it easier to isolate and test individual components by being able to inject mocks easier. I am close to a working patch for some of this. I will post incremental updates/issues as I move forward on this, but I think we should take 5.x as an opportunity to be more agnostic of container and I believe the approach I have in mind will do so. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5091) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation
[ https://issues.apache.org/jira/browse/SOLR-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726287#comment-13726287 ] Grant Ingersoll commented on SOLR-5091: --- I'll see what I can do. Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation -- Key: SOLR-5091 URL: https://issues.apache.org/jira/browse/SOLR-5091 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 This is an issue to track a series of sub issues related to deprecated and crufty Servlet/REST API code. I'll create sub-tasks to manage them. # Clean up all the old UI stuff (old redirects) # Kill/Simplify SolrDispatchFilter -- for instance, why not make the user always have a core name in 5.0? i.e. /collection1 is the default core ## I'd like to move to just using Guice's servlet extension to do this, which, I think will also make it easier to run Solr in other containers (i.e. non-servlet environments) due to the fact that you don't have to tie the request handling logic specifically to a Servlet. # Simplify the creation and testing of REST and other APIs via Guice + Restlet, which I've done on a number of occasions. ## It might be also possible to move all of the APIs onto Restlet and maintain back compat through a simple restlet proxy (still exploring this). This would also have the benefit of abstracting the core request processing out of the Servlet context and make that an implementation detail. ## Moving to Guice, IMO, will make it easier to isolate and test individual components by being able to inject mocks easier. I am close to a working patch for some of this. I will post incremental updates/issues as I move forward on this, but I think we should take 5.x as an opportunity to be more agnostic of container and I believe the approach I have in mind will do so. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5102) Simplify Solr Home
Grant Ingersoll created SOLR-5102: - Summary: Simplify Solr Home Key: SOLR-5102 URL: https://issues.apache.org/jira/browse/SOLR-5102 Project: Solr Issue Type: Bug Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 I think for 5.0, we should re-think some of the variations we support around things like Solr Home, etc. We have a fair bit of code, I suspect that could just go away if make it easier by assuming there is a single solr home where everything lives. The notion of making that stuff configurable has outlived its usefulness -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5103) Plugin Improvements
Grant Ingersoll created SOLR-5103: - Summary: Plugin Improvements Key: SOLR-5103 URL: https://issues.apache.org/jira/browse/SOLR-5103 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 I think for 5.0, we should make it easier to add plugins by defining a plugin package, à la a Hadoop job jar, which is a self-contained archive of a plugin that can be easily installed (even from the UI!) and configured programmatically. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5104) Remove Default Core
Grant Ingersoll created SOLR-5104: - Summary: Remove Default Core Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4860) MoreLikeThisHandler doesn't work with numeric or date fields in 4.x
[ https://issues.apache.org/jira/browse/SOLR-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726314#comment-13726314 ] Mike commented on SOLR-4860: I came across this issue as well, I wanted to use _val_ hook and numeric field values for boosting mlt query via mlt.fl parameter. For regular search (via bf) this approach works just fine. Do you plan to fix this, or I should start working on different solution for my mlt query? What's the probability it will be fixed this year? :) MoreLikeThisHandler doesn't work with numeric or date fields in 4.x --- Key: SOLR-4860 URL: https://issues.apache.org/jira/browse/SOLR-4860 Project: Solr Issue Type: Bug Components: MoreLikeThis Affects Versions: 4.2 Reporter: Thomas Seidl After upgrading to Solr 4.2 (from 3.x), I realized that my MLT queries no longer work. It happens if I pass an integer ({{solr.TrieIntField}}), float ({{solr.TrieFloatField}}) or date ({{solr.DateField}}) field as part of the {{mlt.fl}} parameter. The field's {{multiValued}} setting doesn't seem to matter. This is the error I get: {noformat} NumericTokenStream does not support CharTermAttribute. java.lang.IllegalArgumentException: NumericTokenStream does not support CharTermAttribute. at org.apache.lucene.analysis.NumericTokenStream$NumericAttributeFactory.createAttributeInstance(NumericTokenStream.java:136) at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:271) at org.apache.lucene.queries.mlt.MoreLikeThis.addTermFrequencies(MoreLikeThis.java:781) at org.apache.lucene.queries.mlt.MoreLikeThis.retrieveTerms(MoreLikeThis.java:724) at org.apache.lucene.queries.mlt.MoreLikeThis.like(MoreLikeThis.java:578) at org.apache.solr.handler.MoreLikeThisHandler$MoreLikeThisHelper.getMoreLikeThis(MoreLikeThisHandler.java:348) at org.apache.solr.handler.MoreLikeThisHandler.handleRequestBody(MoreLikeThisHandler.java:167) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:365) at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:679) {noformat} The
[jira] [Commented] (SOLR-5103) Plugin Improvements
[ https://issues.apache.org/jira/browse/SOLR-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726362#comment-13726362 ] Grant Ingersoll commented on SOLR-5103: --- https://code.google.com/p/google-guice/wiki/Multibindings has some baseline good ideas in it, see SOLR-5091 as well for how Guice gets brought in. Plugin Improvements --- Key: SOLR-5103 URL: https://issues.apache.org/jira/browse/SOLR-5103 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 I think for 5.0, we should make it easier to add plugins by defining a plugin package, ala a Hadoop Job jar, which is a self--contained archive of a plugin that can be easily installed (even from the UI!) and configured programmatically. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
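For context on the Multibindings link above: Guice's Multibinder lets independently installed modules contribute implementations to a shared set, which is one plausible way for a dropped-in plugin jar to register its components without touching core configuration. The sketch below is hypothetical; SolrPlugin and the plugin classes are invented names, not Solr APIs.

{code}
// Minimal, hypothetical sketch of Guice Multibindings for plugin registration;
// SolrPlugin and the concrete plugin classes are invented names.
import com.google.inject.AbstractModule;
import com.google.inject.Guice;
import com.google.inject.Injector;
import com.google.inject.Key;
import com.google.inject.TypeLiteral;
import com.google.inject.multibindings.Multibinder;
import java.util.Set;

interface SolrPlugin { String name(); }

class HighlightingPlugin implements SolrPlugin { public String name() { return "highlighting"; } }
class SpellCheckPlugin implements SolrPlugin { public String name() { return "spellcheck"; } }

// Each plugin package could ship a module like this; installing it adds the
// plugins to the shared Set<SolrPlugin> without any central registry edits.
class PluginModule extends AbstractModule {
  @Override protected void configure() {
    Multibinder<SolrPlugin> plugins = Multibinder.newSetBinder(binder(), SolrPlugin.class);
    plugins.addBinding().to(HighlightingPlugin.class);
    plugins.addBinding().to(SpellCheckPlugin.class);
  }
}

public class PluginBootstrap {
  public static void main(String[] args) {
    Injector injector = Guice.createInjector(new PluginModule());
    Set<SolrPlugin> plugins =
        injector.getInstance(Key.get(new TypeLiteral<Set<SolrPlugin>>() {}));
    for (SolrPlugin p : plugins) {
      System.out.println("registered plugin: " + p.name());
    }
  }
}
{code}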
Re: Measuring SOLR performance
Hi Roman, When I try to run with -q /home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries here what is reported: Traceback (most recent call last): File solrjmeter.py, line 1390, in module main(sys.argv) File solrjmeter.py, line 1309, in main tests = find_tests(options) File solrjmeter.py, line 461, in find_tests with changed_dir(pattern): File /usr/lib/python2.7/contextlib.py, line 17, in __enter__ return self.gen.next() File solrjmeter.py, line 229, in changed_dir os.chdir(new) OSError: [Errno 20] Not a directory: '/home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries' Best, Dmitry On Wed, Jul 31, 2013 at 7:21 PM, Roman Chyla roman.ch...@gmail.com wrote: Hi Dmitry, probably mistake in the readme, try calling it with -q /home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries as for the base_url, i was testing it on solr4.0, where it tries contactin /solr/admin/system - is it different for 4.3? I guess I should make it configurable (it already is, the endpoint is set at the check_options()) thanks roman On Wed, Jul 31, 2013 at 10:01 AM, Dmitry Kan solrexp...@gmail.com wrote: Ok, got the error fixed by modifying the base solr ulr in solrjmeter.py (added core name after /solr part). Next error is: WARNING: no test name(s) supplied nor found in: ['/home/dmitry/projects/lab/solrjmeter/demo/queries/demo.queries'] It is a 'slow start with new tool' symptom I guess.. :) On Wed, Jul 31, 2013 at 4:39 PM, Dmitry Kan solrexp...@gmail.com wrote: Hi Roman, What version and config of SOLR does the tool expect? Tried to run, but got: **ERROR** File solrjmeter.py, line 1390, in module main(sys.argv) File solrjmeter.py, line 1296, in main check_prerequisities(options) File solrjmeter.py, line 351, in check_prerequisities error('Cannot contact: %s' % options.query_endpoint) File solrjmeter.py, line 66, in error traceback.print_stack() Cannot contact: http://localhost:8983/solr complains about URL, clicking which leads properly to the admin page... solr 4.3.1, 2 cores shard Dmitry On Wed, Jul 31, 2013 at 3:59 AM, Roman Chyla roman.ch...@gmail.com wrote: Hello, I have been wanting some tools for measuring performance of SOLR, similar to Mike McCandles' lucene benchmark. so yet another monitor was born, is described here: http://29min.wordpress.com/2013/07/31/measuring-solr-query-performance/ I tested it on the problem of garbage collectors (see the blogs for details) and so far I can't conclude whether highly customized G1 is better than highly customized CMS, but I think interesting details can be seen there. Hope this helps someone, and of course, feel free to improve the tool and share! roman
[jira] [Resolved] (SOLR-5100) java.lang.OutOfMemoryError: Requested array size exceeds VM limit
[ https://issues.apache.org/jira/browse/SOLR-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5100. -- Resolution: Invalid Please raise this on the user's list. OOM errors are common when one has not allocated enough heap to the JVM or otherwise tries to do too much with too few resources. The user's list will offer lots of help to change your setup to no longer OOM. java.lang.OutOfMemoryError: Requested array size exceeds VM limit - Key: SOLR-5100 URL: https://issues.apache.org/jira/browse/SOLR-5100 Project: Solr Issue Type: Bug Affects Versions: 4.2.1 Environment: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux Java 7, Tomcat, ZK standalone Reporter: Grzegorz Sobczyk Today I found exception in log (lmsiprse01): {code} sie 01, 2013 5:27:26 AM org.apache.solr.core.SolrCore execute INFO: [products] webapp=/solr path=/select params={facet=truestart=0q=facet.limit=-1facet.field=attribute_u-typfacet.field=attribute_u-gama-kolorystycznafacet.field=brand_namewt=javabinfq=node_id:1056version=2rows=0} hits=1241 status=0 QTime=33 sie 01, 2013 5:27:26 AM org.apache.solr.common.SolrException log SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:724) Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:64) at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:37) at org.apache.solr.handler.component.ShardFieldSortedHitQueue.init(ShardDoc.java:113) at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:766) at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:625) at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:604) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) ... 
13 more {code} We have: * 3x standalone zK * 3x Solr 4.2.1 on Tomcat Exception shows up after leader was stopped: * lmsiprse01: [2013-08-01 05:23:43]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:25:09]: /etc/init.d/tomcat6-1 start * lmsiprse02 (leader): 2013-08-01 05:27:21]: /etc/init.d/tomcat6-1 stop 2013-08-01 05:29:31]: /etc/init.d/tomcat6-1 start * lmsiprse03: [2013-08-01 05:25:48]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:26:42]: /etc/init.d/tomcat6-1 start -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5101) Invalid UTF-8 character 0xfffe during shard update
[ https://issues.apache.org/jira/browse/SOLR-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5101. -- Resolution: Invalid Please raise this on the user's list and verify that it is indeed a bug before raising a JIRA. Offhand this sounds like a configuration error in your servlet container, but that's just a guess. Invalid UTF-8 character 0xfffe during shard update -- Key: SOLR-5101 URL: https://issues.apache.org/jira/browse/SOLR-5101 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.3 Environment: Ubuntu 12.04.2 java version 1.6.0_27 OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1) OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode) Reporter: Federico Chiacchiaretta On data import from a PostgreSQL db, I get the following error in solr.log: ERROR - 2013-08-01 09:51:00.217; org.apache.solr.common.SolrException; shard update error RetryNode: http://172.16.201.173:8983/solr/archive/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Invalid UTF-8 character 0xfffe at char #416, byte #127) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:402) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332) at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679) This prevents the document from being successfully added to the index, and a few documents targeting the same shard are also missing. This happens silently, because data import completes successfully, and the whole number of documents reported as Added includes those who failed (and are actually lost). Is there a known workaround for this issue? Regards, Federico Chiacchiaretta -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
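As an aside for readers hitting the same error: this is not a fix for whatever mangles the data upstream, but a common client-side workaround is to strip characters that are not legal in XML payloads (such as U+FFFE) from field values before sending documents. A minimal, hypothetical sketch:

{code}
// Hypothetical client-side workaround sketch (not a Solr fix): drop
// non-characters such as U+FFFE/U+FFFF from field values before they are
// sent to Solr in an XML update request.
public class XmlSafeStrings {
  public static String stripInvalidChars(String value) {
    if (value == null) {
      return null;
    }
    StringBuilder cleaned = new StringBuilder(value.length());
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (c != '\uFFFE' && c != '\uFFFF') {
        cleaned.append(c);
      }
    }
    return cleaned.toString();
  }

  public static void main(String[] args) {
    String dirty = "broken\uFFFEvalue";
    System.out.println(stripInvalidChars(dirty)); // prints "brokenvalue"
  }
}
{code}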
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome Cassandra! Mike McCandless http://blog.mikemccandless.com On Wed, Jul 31, 2013 at 6:47 PM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5105) Merge CoreAdmin and Collections API
Alan Woodward created SOLR-5105: --- Summary: Merge CoreAdmin and Collections API Key: SOLR-5105 URL: https://issues.apache.org/jira/browse/SOLR-5105 Project: Solr Issue Type: Improvement Reporter: Alan Woodward Fix For: 5.0 For 5.0, we should remove the distinction between the Core Admin API and the Collections API. It's confusing for users, and adds unnecessary complexity and duplication to the core code. * Under the hood, the AdminHandlers should just be deserializing the various core parameters and then passing them onto the CoreContainer to do the actual work. * The CoreContainer API can be cleaned up (need a distinction between loading existing cores and creating new ones, remove the various 'registerCore' methods) * ZkContainer should become a subclass of CoreContainer (maybe CloudCoreContainer?) and deal with the zookeeper interactions, while the base class deals with local cores. * The CoreContainer should be dealing with all core name logic (aliases, collections, etc). This will have the nice side-effect of simplifying the core dispatch logic as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5152) Lucene FST is not immutale
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5152: Attachment: LUCENE-5152.patch Here is a patch that adds a #deepCopy method to Outputs that allows me to do a deep copy if the actual arc that is returned is a cached root arc. I think we should never return a pointer into the root arcs though. This is way too dangerous! I haven't run any perf tests; will do once I am on my workstation again.. if somebody beats me go ahead! Lucene FST is not immutale -- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned output from and FST (BytesRef) which caused sideffects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in-fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
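As a rough illustration of the defensive-copy idea above (a sketch under assumed names, not the attached LUCENE-5152 patch, which adds #deepCopy to Outputs): hand callers their own bytes instead of a pointer into the cached root arcs.
{code}
// Sketch only: 'arc' is a hypothetical cached root arc whose output is a BytesRef.
BytesRef cached = arc.output;                 // bytes shared with the FST's root-arc cache
BytesRef copy = BytesRef.deepCopyOf(cached);  // fresh byte[] owned by the caller
copy.bytes[copy.offset] = (byte) 0x7f;        // mutating the copy...
// ...leaves 'cached' untouched, so later lookups still see the original output
{code}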
[jira] [Commented] (SOLR-5101) Invalid UTF-8 character 0xfffe during shard update
[ https://issues.apache.org/jira/browse/SOLR-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726408#comment-13726408 ] Federico Chiacchiaretta commented on SOLR-5101: --- Hi Erick, I'll post this on the user's list and I'll be back here when I have an update. Regarding servlet container config, I'm using included jetty stock configuration. Invalid UTF-8 character 0xfffe during shard update -- Key: SOLR-5101 URL: https://issues.apache.org/jira/browse/SOLR-5101 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.3 Environment: Ubuntu 12.04.2 java version 1.6.0_27 OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1) OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode) Reporter: Federico Chiacchiaretta On data import from a PostgreSQL db, I get the following error in solr.log: ERROR - 2013-08-01 09:51:00.217; org.apache.solr.common.SolrException; shard update error RetryNode: http://172.16.201.173:8983/solr/archive/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Invalid UTF-8 character 0xfffe at char #416, byte #127) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:402) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332) at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679) This prevents the document from being successfully added to the index, and a few documents targeting the same shard are also missing. This happens silently, because data import completes successfully, and the whole number of documents reported as Added includes those who failed (and are actually lost). Is there a known workaround for this issue? Regards, Federico Chiacchiaretta -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
FlushPolicy and maxBufDelTerm
Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * <ul> * <li>RAM consumption - configured via ... * <li>*Number of buffered delete terms/queries* - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}</li> * </ul> Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * <li>{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled</li> Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf); int numDocs = 4; for (int i = 0; i < numDocs; i++) { Document doc = new Document(); doc.add(new StringField("id", "doc-" + i, Store.NO)); writer.addDocument(doc); } System.out.println("before delete"); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term("id", "doc-0")); writer.deleteDocuments(new Term("id", "doc-1")); System.out.println("\nafter delete"); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this setting takes effect. Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai
Re: FlushPolicy and maxBufDelTerm
bq. a new segment will be deleted? I mean a new segment will be flushed :). Shai On Thu, Aug 1, 2013 at 4:03 PM, Shai Erera ser...@gmail.com wrote: Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * ul * liRAM consumption - configured via ... * li*Number of buffered delete terms/queries* - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li * /ul Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * li{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf ); int numDocs = 4; for (int i = 0; i numDocs; i++) { Document doc = new Document(); doc.add(new StringField(id, doc- + i, Store.NO)); writer.addDocument(doc); } System.out.println(before delete); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term(id, doc-0)); writer.deleteDocuments(new Term(id, doc-1)); System.out.println(\nafter delete); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes effect. Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai
[jira] [Updated] (SOLR-5057) queryResultCache should not related with the order of fq's list
[ https://issues.apache.org/jira/browse/SOLR-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huangfeihong updated SOLR-5057: --- Attachment: SOLR-5057.patch queryResultCache should not related with the order of fq's list --- Key: SOLR-5057 URL: https://issues.apache.org/jira/browse/SOLR-5057 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.0, 4.1, 4.2, 4.3 Reporter: Feihong Huang Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5057.patch, SOLR-5057.patch, SOLR-5057.patch Original Estimate: 48h Remaining Estimate: 48h There are two case query with the same meaning below. But the case2 can't use the queryResultCache when case1 is executed. case1: q=*:*fq=field1:value1fq=field2:value2 case2: q=*:*fq=field2:value2fq=field1:value1 I think queryResultCache should not be related with the order of fq's list. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5057) queryResultCache should not be related with the order of fq's list
[ https://issues.apache.org/jira/browse/SOLR-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726425#comment-13726425 ] huangfeihong commented on SOLR-5057: Patch attached. Just renamed several variable names, using Yonik's code. queryResultCache should not be related with the order of fq's list --- Key: SOLR-5057 URL: https://issues.apache.org/jira/browse/SOLR-5057 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.0, 4.1, 4.2, 4.3 Reporter: Feihong Huang Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5057.patch, SOLR-5057.patch, SOLR-5057.patch Original Estimate: 48h Remaining Estimate: 48h There are two cases of queries with the same meaning below. But case2 can't use the queryResultCache when case1 is executed. case1: q=*:*&fq=field1:value1&fq=field2:value2 case2: q=*:*&fq=field2:value2&fq=field1:value1 I think queryResultCache should not be related with the order of fq's list. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
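Very roughly, the idea behind the fix (an illustrative sketch with hypothetical helper names, assuming java.util.List and org.apache.lucene.search.Query; the attached patch may differ in detail) is to hash and compare the fq list as an unordered collection, so that q=*:*&fq=field1:value1&fq=field2:value2 and q=*:*&fq=field2:value2&fq=field1:value1 produce the same queryResultCache key:
{code}
// Sketch only: order-insensitive hash and equality over the filter list.
static int unorderedFiltersHash(List<Query> filters) {
  int h = 0;
  for (Query fq : filters) {
    h += fq.hashCode(); // addition is commutative, so fq order is irrelevant
  }
  return h;
}

static boolean unorderedFiltersEqual(List<Query> a, List<Query> b) {
  // fq lists are small and normally duplicate-free, so size plus containment
  // is enough for a sketch; a strict multiset comparison would be safer.
  return a.size() == b.size() && a.containsAll(b);
}
{code}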
[jira] [Commented] (LUCENE-5152) Lucene FST is not immutale
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726429#comment-13726429 ] Jack Krupansky commented on LUCENE-5152: bq. immutale Is that the Latin term for immutable?? (spelling in summary line) Lucene FST is not immutale -- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned output from and FST (BytesRef) which caused sideffects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in-fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726461#comment-13726461 ] Stein J. Gran commented on SOLR-2894: - I have now re-tested the scenarios I used on April 10th (see my comment above from that date), and all of those issues I found then are now resolved :-) I applied the July 25th patch to the lucene_solr_4_4 branch (Github) and performed the tests on this version. Well done Andrew :-) Thumbs up from me. Implement distributed pivot faceting Key: SOLR-2894 URL: https://issues.apache.org/jira/browse/SOLR-2894 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Fix For: 4.5 Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5106) Grouping on multi-valued fields
Reinier Battenberg created SOLR-5106: Summary: Grouping on multi-valued fields Key: SOLR-5106 URL: https://issues.apache.org/jira/browse/SOLR-5106 Project: Solr Issue Type: Improvement Reporter: Reinier Battenberg Priority: Minor The Wiki page for FieldCollapsing mentions that Support for grouping on a multi-valued field has not yet been implemented. This issue is to document that implementation. http://wiki.apache.org/solr/FieldCollapsing#line-158 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726483#comment-13726483 ] Jack Krupansky commented on SOLR-5104: -- Minor procedural nit... If you intend to remove a feature, deprecate it first (like, in 4.5.) Thanks! Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
First off, it's bad that you don't see .del files when conf.setMaxBufferedDeleteTerms is 1. But, it could be that newIndexWriterConfig turned on readerPooling which would mean the deletes are held in the SegmentReader and not flushed to disk. Can you make sure that's off? Second off, I think the doc is correct: a segment will not be flushed; rather, new .del files should appear against older segments. And yes, if RAM usage of the buffered del Term/Query s is too high, then a segment is flushed along with the deletes being applied (creating the .del files). I think buffered delete Querys are not counted towards setMaxBufferedDeleteTerms; so they are only flushed by RAM usage (rough rough estimate) or by other ops (merging, NRT reopen, commit, etc.). Mike McCandless http://blog.mikemccandless.com On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera ser...@gmail.com wrote: Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * ul * liRAM consumption - configured via ... * liNumber of buffered delete terms/queries - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li * /ul Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * li{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf ); int numDocs = 4; for (int i = 0; i numDocs; i++) { Document doc = new Document(); doc.add(new StringField(id, doc- + i, Store.NO)); writer.addDocument(doc); } System.out.println(before delete); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term(id, doc-0)); writer.deleteDocuments(new Term(id, doc-1)); System.out.println(\nafter delete); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes effect. Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5107) LukeRequestHandler throws NullPointerException when numTerms=0
Ahmet Arslan created SOLR-5107: -- Summary: LukeRequestHandler throws NullPointerException when numTerms=0 Key: SOLR-5107 URL: https://issues.apache.org/jira/browse/SOLR-5107 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Ahmet Arslan Priority: Minor Defaults example http://localhost:8983/solr/collection1/admin/luke?fl=catnumTerms=0 yields {code} ERROR org.apache.solr.core.SolrCore – java.lang.NullPointerException at org.apache.solr.handler.admin.LukeRequestHandler.getDetailedFieldInfo(LukeRequestHandler.java:610) at org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeRequestHandler.java:378) at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:160) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1845) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:666) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:369) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:722) {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5107) LukeRequestHandler throws NullPointerException when numTerms=0
[ https://issues.apache.org/jira/browse/SOLR-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-5107: --- Attachment: SOLR-5107.patch LukeRequestHandler throws NullPointerException when numTerms=0 -- Key: SOLR-5107 URL: https://issues.apache.org/jira/browse/SOLR-5107 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Ahmet Arslan Priority: Minor Attachments: SOLR-5107.patch Defaults example http://localhost:8983/solr/collection1/admin/luke?fl=catnumTerms=0 yields {code} ERROR org.apache.solr.core.SolrCore – java.lang.NullPointerException at org.apache.solr.handler.admin.LukeRequestHandler.getDetailedFieldInfo(LukeRequestHandler.java:610) at org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeRequestHandler.java:378) at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:160) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1845) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:666) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:369) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:722) {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
On 7/31/2013 4:47 PM, Robert Muir wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Welcome to the project! - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726495#comment-13726495 ] Jack Krupansky commented on SOLR-5104: -- bq. I see no reason to maintain the notion of a default Core/Collection. Okay, here is the reason... It is a great convenience and shortens URLs to make them more readable and easier to type. It greatly facilitates prototyping and experimentation and learning of the basics of Solr. And... compatibility with existing apps. So, this notion that there isn't any reason is complete nonsense. OTOH, maybe you are trying to suggest that there is some reason or valuable benefit to be gained by requiring explicit collection/core name in the URL path. But, you have not done so. Not a hint of any reason or benefit. So, if you do have a reason or perceived benefit for eliminating a great convenience feature, please disclose it. Or... is this not so much an issue of reason as because some code or tool change you are contemplating does not support the kind of flexible URL syntax that Solr supports? Well, if the benefits of the change in technology outweigh the loss of a valuable feature, then that is worth considering, but as of this moment no positive tradeoff has been proposed or established. OTOH, if there were a determined effort to give Solr a full-blown true REST API and THAT was the motive for explicit collection name, I'd be 100% all for it. Side note: Maybe collection1 should become example to make it clear that real apps should assign an app-meaningful name rather than leaving it as collection1. Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
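For readers following along, the convenience under debate looks like this with the stock 4.x example configuration (exact URLs depend on your setup; collection1 is the shipped default core):
{code}
# with a default core configured, both forms reach the same core
http://localhost:8983/solr/select?q=*:*
http://localhost:8983/solr/collection1/select?q=*:*

# what SOLR-5104 proposes: only the explicit form stays valid
http://localhost:8983/solr/collection1/select?q=*:*
{code}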
Re: Welcome Cassandra Targett as Lucene/Solr committer
Thanks everyone. I'm very excited to join you all. I don't know how brief this is, but here's a bit about me: I discovered Solr through my work at LucidWorks. I've had a few different roles there, but most recently I've been the tech writer. The Solr Reference Guide became part of the stuff I work on and I had to learn Solr. I like to figure as much out on my own as I can, so to learn I tried things out, I read the Jira issues, I tried to interpret the Javadocs (sometimes following the trail deep into darkness and getting it wrong). It's the same way many people passionate about Solr get started, I think - we had a job to do, in one way or another, and that's how we learned. I'm technical but not a developer (I think the last real program I wrote was in computer camp for girls in 1984, where we wrote Basic in the morning and Jazzercized in the afternoon), but even though I don't write code, I can understand very technical concepts and can sometimes read code. I'm a librarian, so I spend time thinking about how to organize information. As an undergrad I got a BA in creative writing, and tech writing has become a really lovely pairing of two skills and passions. What else? I grew up in New Hampshire and after school, I moved to Boston and at some point I decided that a) I wanted to work on the internet and b) the best way to do that was to get an MS in Library Science. It sounds sort of random, now, but that's what I did. A couple years ago I left Boston and now live in Northwest Florida (vaguely halfway between Pensacola and Tallahassee), only a couple miles from the beach. Until that point, I (loudly and often) vowed I would never live in Florida. But it turns out that I really love being able to go to the beach every day of the year, and on my second day in town I met my boyfriend and we're now sort of officially engaged. So, even though right now it's hotter-than-Hades and wetter-than-sunken-Atlantis, I stay. Lastly, in my spare time, I make mosaic art. I still have more ideas than pieces, but I'm getting there. Eventually I'll get some photos of my stuff online for all to see. On Wed, Jul 31, 2013 at 3:47 PM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726503#comment-13726503 ] Grant Ingersoll commented on SOLR-5104: --- My reason is b/c SolrDispatchFilter is filled with legacy cruft, this being one of them. The simpler and more standard we can make all path handling, the better. I don't really care much about shorter URLs and I don't buy the prototyping/learning factor. In fact, I'd argue that it is harder b/c of it, since you have a magic core and then all of your other cores. If you just make the name of the collection part of the path always, there is no more guessing. The less legacy code for plumbing we carry forward in 5, the better off Solr will be. And yes, I am working on making a full blown REST API possible. See SOLR-5091. Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? I disabled reader pooling and I still don't see .del files. But I think that's explained due to there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment cause of delTerms? Shai On Thu, Aug 1, 2013 at 5:40 PM, Michael McCandless luc...@mikemccandless.com wrote: First off, it's bad that you don't see .del files when conf.setMaxBufferedDeleteTerms is 1. But, it could be that newIndexWriterConfig turned on readerPooling which would mean the deletes are held in the SegmentReader and not flushed to disk. Can you make sure that's off? Second off, I think the doc is correct: a segment will not be flushed; rather, new .del files should appear against older segments. And yes, if RAM usage of the buffered del Term/Query s is too high, then a segment is flushed along with the deletes being applied (creating the .del files). I think buffered delete Querys are not counted towards setMaxBufferedDeleteTerms; so they are only flushed by RAM usage (rough rough estimate) or by other ops (merging, NRT reopen, commit, etc.). Mike McCandless http://blog.mikemccandless.com On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera ser...@gmail.com wrote: Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * ul * liRAM consumption - configured via ... * liNumber of buffered delete terms/queries - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li * /ul Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * li{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf ); int numDocs = 4; for (int i = 0; i numDocs; i++) { Document doc = new Document(); doc.add(new StringField(id, doc- + i, Store.NO)); writer.addDocument(doc); } System.out.println(before delete); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term(id, doc-0)); writer.deleteDocuments(new Term(id, doc-1)); System.out.println(\nafter delete); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes effect. Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? 
Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
I set maxBufDocs=2 so that I get a segment flushed, and indeed after delete I see _0.del. So I guess this is just docs inconsistency. I'll clarify FlushPolicy docs. Shai On Thu, Aug 1, 2013 at 6:24 PM, Shai Erera ser...@gmail.com wrote: I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? I disabled reader pooling and I still don't see .del files. But I think that's explained due to there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment cause of delTerms? Shai On Thu, Aug 1, 2013 at 5:40 PM, Michael McCandless luc...@mikemccandless.com wrote: First off, it's bad that you don't see .del files when conf.setMaxBufferedDeleteTerms is 1. But, it could be that newIndexWriterConfig turned on readerPooling which would mean the deletes are held in the SegmentReader and not flushed to disk. Can you make sure that's off? Second off, I think the doc is correct: a segment will not be flushed; rather, new .del files should appear against older segments. And yes, if RAM usage of the buffered del Term/Query s is too high, then a segment is flushed along with the deletes being applied (creating the .del files). I think buffered delete Querys are not counted towards setMaxBufferedDeleteTerms; so they are only flushed by RAM usage (rough rough estimate) or by other ops (merging, NRT reopen, commit, etc.). Mike McCandless http://blog.mikemccandless.com On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera ser...@gmail.com wrote: Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * ul * liRAM consumption - configured via ... * liNumber of buffered delete terms/queries - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li * /ul Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * li{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf ); int numDocs = 4; for (int i = 0; i numDocs; i++) { Document doc = new Document(); doc.add(new StringField(id, doc- + i, Store.NO)); writer.addDocument(doc); } System.out.println(before delete); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term(id, doc-0)); writer.deleteDocuments(new Term(id, doc-1)); System.out.println(\nafter delete); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes effect. 
Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
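A minimal variant of the test that matches this outcome (a sketch with an assumed analyzer and Version, inside a throws-Exception test method; not a committed test): keep maxBufferedDocs low so a segment is flushed first, then delete, and the buffered delete term is applied to that segment as a .del file rather than flushing a new segment.
{code}
// Sketch only: deletes are *applied* to an already-flushed segment.
Directory dir = new RAMDirectory();
IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_44, new KeywordAnalyzer());
conf.setMaxBufferedDocs(2);          // force a segment flush after two docs
conf.setMaxBufferedDeleteTerms(1);   // apply deletes as soon as one term is buffered
conf.setReaderPooling(false);        // make sure .del files reach the Directory
IndexWriter writer = new IndexWriter(dir, conf);
for (int i = 0; i < 4; i++) {
  Document doc = new Document();
  doc.add(new StringField("id", "doc-" + i, Field.Store.NO));
  writer.addDocument(doc);
}
writer.deleteDocuments(new Term("id", "doc-0")); // targets an already-flushed segment
for (String f : dir.listAll()) {
  if (f.endsWith(".del")) {
    System.out.println("deletes applied: " + f); // the delete shows up against the old segment
  }
}
writer.close();
dir.close();
{code}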
[jira] [Commented] (LUCENE-5152) Lucene FST is not immutale
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726569#comment-13726569 ] Robert Muir commented on LUCENE-5152: - I guess one question would be if it's the FST's job to defend against bytesref bugs. This issue was driven because there was a bytesref bug for suggester payloads. The same kind of bug could happen, e.g. if someone uses DirectPostings and modifies the payload coming back from the postings lists. Should we clone payload bytes in the postings lists too? What about term dictionaries? At some point then BytesRef is useless as a reference class because of a few bad apples trying to use it as a ByteBuffer. Ideally we would remove code that abuses BytesRef as a ByteBuffer instead. I don't mean to pick on your issue Simon, and it doesn't mean I object to the patch (though I wonder about performance implications), I just see this as one of many in a larger issue. Lucene FST is not immutale -- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned output from and FST (BytesRef) which caused sideffects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in-fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
On Thu, Aug 1, 2013 at 11:24 AM, Shai Erera ser...@gmail.com wrote: I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? Ahh, right. OK I think we should fix FlushPolicy to say deletes are applied? Let's try to leave the verb flushed to mean a new segment is written to disk, I think? I disabled reader pooling and I still don't see .del files. But I think that's explained due to there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment cause of delTerms? Right! OK so that explains it. Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
thanks for clarifying this - I agree the wording is tricky here and we should use the term apply here! sorry for the confusion! simon On Thu, Aug 1, 2013 at 7:39 PM, Michael McCandless luc...@mikemccandless.com wrote: On Thu, Aug 1, 2013 at 11:24 AM, Shai Erera ser...@gmail.com wrote: I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? Ahh, right. OK I think we should fix FlushPolicy to say deletes are applied? Let's try to leave the verb flushed to mean a new segment is written to disk, I think? I disabled reader pooling and I still don't see .del files. But I think that's explained due to there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment cause of delTerms? Right! OK so that explains it. Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726695#comment-13726695 ] ASF subversion and git services commented on SOLR-4953: --- Commit 1509359 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1509359 ] SOLR-4953: Make XML Configuration parsing fail if an xpath matches multiple nodes when only a single value is expected. Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{<indexConfig/>}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
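In rough terms, the committed check does something like the following (a hypothetical snippet over javax.xml.xpath and an assumed DOM Document 'doc'; names and error wording are illustrative, not the actual SolrConfig code): evaluate the xpath as a node set and fail hard when more than one node matches instead of silently taking the first.
{code}
// Sketch only: enforce "at most one node" for a config xpath.
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("/config/indexConfig", doc, XPathConstants.NODESET);
if (nodes.getLength() > 1) {
  throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
      "Found " + nodes.getLength() + " occurrences of /config/indexConfig when at most one is allowed");
}
Node node = (nodes.getLength() == 1) ? nodes.item(0) : null; // null means "use defaults"
{code}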
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #404: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/404/ 1 tests failed. REGRESSION: org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch Error Message: IOException occured when talking to server at: http://127.0.0.1:28478/sqt Stack Trace: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:28478/sqt at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:129) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCustomCollectionsAPI(CollectionsAPIDistributedZkTest.java:764) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:159) Build Log: [...truncated 24519 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5108) plugin loading should fail if more than one instance of a singleton plugin is found
Hoss Man created SOLR-5108: -- Summary: plugin loading should fail if more than one instance of a singleton plugin is found Key: SOLR-5108 URL: https://issues.apache.org/jira/browse/SOLR-5108 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Continuing from the config parsing/validation work done in SOLR-4953, we should improve SolrConfig so that parsing fails if multiple instances of a plugin are found for types of plugins where only one is allowed to be used at a time. at the moment, {{SolrConfig.loadPluginInfo}} happily initializes a {{List<PluginInfo>}} for whatever xpath it's given, and then later code can either call {{List<PluginInfo> getPluginInfos(String)}} or {{PluginInfo getPluginInfo(String)}} (the latter just being shorthand for getting the first item in the list). we could make {{getPluginInfo(String)}} throw an error if the list has multiple items, but i think we should also change the signature of {{loadPluginInfo}} to be explicit about how many instances we expect to find, so we can error earlier, and have a redundant check. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
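A sketch of the stricter accessor described in the issue (method shapes taken from the description above; the body is hypothetical rather than actual SolrConfig code):
{code}
// Sketch only: refuse to silently pick the first of several plugin sections.
public PluginInfo getPluginInfo(String type) {
  List<PluginInfo> infos = getPluginInfos(type);
  if (infos.isEmpty()) {
    return null;
  }
  if (infos.size() > 1) {
    throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
        "Found " + infos.size() + " configuration sections for " + type
            + " when at most one is allowed");
  }
  return infos.get(0);
}
{code}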
[jira] [Commented] (SOLR-5108) plugin loading should fail if more than one instance of a singleton plugin is found
[ https://issues.apache.org/jira/browse/SOLR-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726739#comment-13726739 ] Jack Krupansky commented on SOLR-5108: -- Sounds like this might resolve SOLR-4304 - NPE in Solr SpellCheckComponent if more than one QueryConverter. plugin loading should fail if more than one instance of a singleton plugin is found -- Key: SOLR-5108 URL: https://issues.apache.org/jira/browse/SOLR-5108 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Continuing from the config parsing/validation work done in SOLR-4953, we should improve SolrConfig so that parsing fails if multiple instances of a plugin are found for types of plugins where only one is allowed to be used at a time. at the moment, {{SolrConfig.loadPluginInfo}} happily initializes a {{List<PluginInfo>}} for whatever xpath it's given, and then later code can either call {{List<PluginInfo> getPluginInfos(String)}} or {{PluginInfo getPluginInfo(String)}} (the latter just being shorthand for getting the first item in the list). we could make {{getPluginInfo(String)}} throw an error if the list has multiple items, but i think we should also change the signature of {{loadPluginInfo}} to be explicit about how many instances we expect to find, so we can error earlier, and have a redundant check. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726746#comment-13726746 ] Noble Paul commented on SOLR-5081: -- [~mikeschrag] COuld you get any more thread dumps? Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18 node solrcloud cluster, I can deadlock solr every time. The ulimits on the nodes are: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 1031181 max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 32768 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 515590 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
OK, I committed some improvements there and in some other places. Thanks guys for clarifying this! Shai On Thu, Aug 1, 2013 at 8:55 PM, Simon Willnauer simon.willna...@gmail.com wrote: thanks for clarifying this - I agree the wording is tricky here and we should use the term apply here! sorry for the confusion! simon On Thu, Aug 1, 2013 at 7:39 PM, Michael McCandless luc...@mikemccandless.com wrote: On Thu, Aug 1, 2013 at 11:24 AM, Shai Erera ser...@gmail.com wrote: I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? Ahh, right. OK I think we should fix FlushPolicy to say deletes are applied? Let's try to leave the verb flushed to mean a new segment is written to disk, I think? I disabled reader pooling and I still don't see .del files. But I think that's explained by the fact that there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment because of delTerms? Right! OK so that explains it. Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
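A minimal sketch against the Lucene 4.x API of the setting under discussion; the threshold values below are arbitrary. As clarified in the thread, reaching maxBufferedDeleteTerms applies the buffered deletes; it does not by itself flush a new segment, so no .del files appear while everything is still in the RAM buffer.

{code:java}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class DeleteTermBuffering {
  public static void main(String[] args) throws Exception {
    IndexWriterConfig iwc =
        new IndexWriterConfig(Version.LUCENE_44, new StandardAnalyzer(Version.LUCENE_44));
    iwc.setMaxBufferedDeleteTerms(10); // buffered deletes are applied once 10 terms pile up
    iwc.setRAMBufferSizeMB(64.0);      // segment flushing is governed separately

    IndexWriter writer = new IndexWriter(new RAMDirectory(), iwc);
    try {
      // Each call buffers a delete term; crossing the threshold applies the
      // deletes, but no segments (and hence no .del files) are written until
      // there is actually something to flush or commit.
      for (int i = 0; i < 25; i++) {
        writer.deleteDocuments(new Term("id", Integer.toString(i)));
      }
      writer.commit();
    } finally {
      writer.close();
    }
  }
}
{code}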
[jira] [Commented] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726782#comment-13726782 ] ASF subversion and git services commented on SOLR-4953: --- Commit 1509390 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1509390 ] SOLR-4953: Make XML Configuration parsing fail if an xpath matches multiple nodes when only a single value is expected. (merge r1509359) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{<indexConfig/>}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
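The committed change lives in SolrConfig itself; purely as an illustration of the fail-hard semantics (this is not the actual Solr code), an xpath lookup that is expected to match at most one node can reject multiple matches like this:

{code:java}
// Illustration only: throw when an xpath that should match at most one node
// (e.g. <indexConfig/> in solrconfig.xml) matches several.
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class StrictXPathLookup {

  /** Returns the single matching node, null if none matched, or throws if several matched. */
  public static Node getSingleNode(Document config, String xpathExpr) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, config, XPathConstants.NODESET);
    if (nodes.getLength() > 1) {
      throw new RuntimeException(xpathExpr + " matched " + nodes.getLength()
          + " nodes, but at most one is allowed");
    }
    return nodes.getLength() == 1 ? nodes.item(0) : null;
  }
}
{code}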
[jira] [Resolved] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-4953. Resolution: Fixed Fix Version/s: 5.0 4.5 Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Fix For: 4.5, 5.0 Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{<indexConfig/>}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726801#comment-13726801 ] Mike Schrag commented on SOLR-5081: --- I grabbed more and they all look basically the same as the attached, which is to say, it sort of looks like Solr isn't doing ANYTHING. I'm going to look into whether I'm crushing ZooKeeper, and maybe my requests aren't even getting to Solr. Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18 node solrcloud cluster, I can deadlock solr every time. The ulimits on the nodes are: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 1031181 max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 32768 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 515590 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726831#comment-13726831 ] Erick Erickson commented on SOLR-5081: -- Yeah, that is odd. The stack traces you sent basically showed no deadlocks, nothing interesting at all. I suspect pursuing whether anything is getting to Solr or not is a good idea. Hmmm, a blunt-instrument test when the cluster is hung: what happens if you, say, submit a query directly to one of the nodes? Does it respond, or do you see anything in the Solr log on that node? Tip: adding distrib=false to the _query_ will not try to send sub-queries to other shards. And I wonder what happens if you, say, use post.jar (comes with the example) to try to send a doc to Solr when it's hung, anything? Clearly I'm grasping at straws here, but I'm kind of out of good ideas. Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18 node solrcloud cluster, I can deadlock solr every time. The ulimits on the nodes are: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 1031181 max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 32768 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 515590 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
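A sketch of the non-distributed probe suggested above, written against the SolrJ 4.x HttpSolrServer client; the node URL and collection name are placeholders, not values from this issue.

{code:java}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DirectNodeProbe {
  public static void main(String[] args) throws Exception {
    // Placeholder URL: point this at one node and one core/collection directly.
    HttpSolrServer node = new HttpSolrServer("http://node01:8983/solr/collection1");
    try {
      SolrQuery q = new SolrQuery("*:*");
      q.set("distrib", "false"); // do not fan sub-queries out to other shards
      QueryResponse rsp = node.query(q);
      System.out.println("numFound on this node: " + rsp.getResults().getNumFound());
    } finally {
      node.shutdown();
    }
  }
}
{code}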
[jira] [Updated] (LUCENE-5152) Lucene FST is not immutable
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5152: Summary: Lucene FST is not immutable (was: Lucene FST is not immutale) Lucene FST is not immutable --- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spin-off from LUCENE-5120 where the analyzing suggester modified a returned output from an FST (BytesRef), which caused side effects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726847#comment-13726847 ] Simon Willnauer commented on LUCENE-5152: - bq. Should we clone payload bytes in the postings lists too? what about term dictionaries? I agree we can be less conservative here and just use the payload and copy it into a new BytesRef or whatever is needed. I will bring up a new patch. bq. At some point then BytesRef is useless as a reference class because of a few bad apples trying to use it as a ByteBuffer. Ideally we would remove code that abuses BytesRef as a ByteBuffer instead. agreed again. We just need to make sure that we have asserts in place that check for that. bq. I don't mean to pick on your issue Simon, and it doesn't mean I object to the patch (though I wonder about performance implications), I just see this as one of many in a larger issue. no worries. I am really concerned about this since it took me forever to figure out the problems this caused. I just wanna have an infra in place that catches those problems. I am more concerned about users that get bitten by this. I agree we should figure out the bigger problem eventually but let's make sure that we fix the bad apples first. Lucene FST is not immutable --- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spin-off from LUCENE-5120 where the analyzing suggester modified a returned output from an FST (BytesRef), which caused side effects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
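The general remedy discussed here is to copy before modifying. Sketched below is that idiom using BytesRef.deepCopyOf; this illustrates the idiom only and is not the code from the attached patch.

{code:java}
import org.apache.lucene.util.BytesRef;

public class CopyBeforeModify {

  /** A private, modifiable copy; the shared FST output (or cached root arc) stays untouched. */
  public static BytesRef privateCopy(BytesRef sharedOutput) {
    return BytesRef.deepCopyOf(sharedOutput);
  }

  /** Appends a byte into brand-new storage instead of scribbling on the shared bytes. */
  public static BytesRef appendByte(BytesRef sharedOutput, byte extra) {
    byte[] bytes = new byte[sharedOutput.length + 1];
    System.arraycopy(sharedOutput.bytes, sharedOutput.offset, bytes, 0, sharedOutput.length);
    bytes[sharedOutput.length] = extra;
    return new BytesRef(bytes);
  }
}
{code}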
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726848#comment-13726848 ] Mike Schrag commented on SOLR-5081: --- I actually did this exact test when I was in this state originally, and the insert _worked_, which totally confused the situation for me. However, in light of seeing nothing in the traces, it supports the theory that the cluster isn't hung, but rather I'm somehow not even getting that far in the Hadoop cluster. ZK was my best guess as something that maybe could be an earlier stage failure, but even that I would expect to have hang the test-insert. So I need to do a little more forensics here and see if I can get a better picture of wtf is going on. Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18 node solrcloud cluster, I can deadlock solr every time. The ulimits on the nodes are: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 1031181 max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 32768 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 515590 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5152) Lucene FST is not immutable
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5152: Attachment: LUCENE-5152.patch this patch only adds the assert and fixes the problems in MemoryPostings. This could solve the immediate issue and adds some more asserts to make sure we realise if something modifies the arcs' outputs. Lucene FST is not immutable --- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch a spin-off from LUCENE-5120 where the analyzing suggester modified a returned output from an FST (BytesRef), which caused side effects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #926: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/926/ 1 tests failed. FAILED: org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch Error Message: IOException occured when talking to server at: http://127.0.0.1:26547 Stack Trace: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:26547 at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:129) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCustomCollectionsAPI(CollectionsAPIDistributedZkTest.java:764) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:159) Build Log: [...truncated 24120 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
jamon camisso created SOLR-5109: --- Summary: Solr 4.4 will not deploy in Glassfish 4.x Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jamon camisso updated SOLR-5109: Attachment: guava-15.0-SNAPSHOT.jar Solr 4.4 will not deploy in Glassfish 4.x - Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker Labels: guava Attachments: guava-15.0-SNAPSHOT.jar The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726976#comment-13726976 ] Uwe Schindler commented on SOLR-5109: - Hi, we cannot bundle JAR files with our source code, and in the case of releasing the binary Solr package we need to download all dependencies from Maven Central. So we cannot solve this problem. Guava has to release a newer version first. Solr 4.4 will not deploy in Glassfish 4.x - Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker Labels: guava Attachments: guava-15.0-SNAPSHOT.jar The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726978#comment-13726978 ] jamon camisso commented on SOLR-5109: - Can Guava be removed as a core dependency, as was proposed in SOLR-3601? Solr 4.4 will not deploy in Glassfish 4.x - Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker Labels: guava Attachments: guava-15.0-SNAPSHOT.jar The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726992#comment-13726992 ] Uwe Schindler commented on SOLR-5109: - Please reopen the corresponding issue and maybe provide a patch removing this dependency. I would be happy to remove Guava, but other developers may have other plans. In general, I would not run Solr inside Glassfish, as Solr has very different resource usage than conventional enterprise webapps. This is one reason why Solr may no longer be a WAR in the future. Solr is a separate server like MySQL and should run in an isolated process. Solr 4.4 will not deploy in Glassfish 4.x - Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker Labels: guava Attachments: guava-15.0-SNAPSHOT.jar The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org