[jira] [Updated] (SOLR-3218) Range faceting support for CurrencyField
[ https://issues.apache.org/jira/browse/SOLR-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitaliy Zhovtyuk updated SOLR-3218: --- Attachment: SOLR-3218.patch Updated to latest trunk. Added range facet tests to org.apache.solr.schema.AbstractCurrencyFieldTest. Moved org.apache.solr.schema.CurrencyValue back out of the nested class in org.apache.solr.schema.CurrencyField into a separate class, since CurrencyValue is used outside it, in org.apache.solr.request.SimpleFacets and other classes. It is probably worth wrapping and encapsulating it in org.apache.solr.schema.CurrencyField. Range faceting support for CurrencyField Key: SOLR-3218 URL: https://issues.apache.org/jira/browse/SOLR-3218 Project: Solr Issue Type: Improvement Components: Schema and Analysis Reporter: Jan Høydahl Fix For: 4.7 Attachments: SOLR-3218-1.patch, SOLR-3218-2.patch, SOLR-3218.patch, SOLR-3218.patch, SOLR-3218.patch, SOLR-3218.patch Spinoff from SOLR-2202. Need to add range faceting capabilities for CurrencyField. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[GitHub] lucene-solr pull request: Lucene 5092 pull 1
Github user PaulElschot closed the pull request at: https://github.com/apache/lucene-solr/pull/24 If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. To do so, please top-post your response. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
[GitHub] lucene-solr pull request: Lucene 5092 pull 1
Github user PaulElschot commented on the pull request: https://github.com/apache/lucene-solr/pull/24#issuecomment-35196892 Closed because of some conflicts after LUCENE-5044.
[jira] [Commented] (LUCENE-5092) join: don't expect all filters to be FixedBitSet instances
[ https://issues.apache.org/jira/browse/LUCENE-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902714#comment-13902714 ] Paul Elschot commented on LUCENE-5092: -- As expected, after LUCENE-5440, the patch/pull request has a few conflicts. I'll resolve these. join: don't expect all filters to be FixedBitSet instances -- Key: LUCENE-5092 URL: https://issues.apache.org/jira/browse/LUCENE-5092 Project: Lucene - Core Issue Type: Improvement Components: modules/join Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5092.patch The join module throws exceptions when the parents filter isn't a FixedBitSet. The reason is that the join module relies on prevSetBit to find the first child document given a parent ID. As suggested by Uwe and Paul Elschot on LUCENE-5081, we could fix it by exposing methods in the iterators to iterate backwards. When the join module gets an iterator which isn't able to iterate backwards, it would just need to dump its content into another DocIdSet that supports backward iteration, FixedBitSet for example.
[jira] [Commented] (LUCENE-5044) wasted work in AllGroupHeadsCollectorTest.arrayContains()
[ https://issues.apache.org/jira/browse/LUCENE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902715#comment-13902715 ] ASF GitHub Bot commented on LUCENE-5044: Github user PaulElschot commented on the pull request: https://github.com/apache/lucene-solr/pull/24#issuecomment-35196892 Closed because of some conflicts after LUCENE-5044. wasted work in AllGroupHeadsCollectorTest.arrayContains() - Key: LUCENE-5044 URL: https://issues.apache.org/jira/browse/LUCENE-5044 Project: Lucene - Core Issue Type: Bug Affects Versions: 4.3 Environment: any Reporter: Adrian Nistor Priority: Minor Labels: patch, performance Fix For: 4.4, 5.0 Attachments: patch.diff, patchAll.diff The problem appears in version 4.3.0 and in revision 1490286. I attached a one-line patch that fixes it. In method AllGroupHeadsCollectorTest.arrayContains, the loop over actual should break immediately after found is set to true. All the iterations after found is set to true do not perform any useful work; at best they just set found again to true. Method processWord in class CapitalizationFilter has a similar loop (over prefix), and that loop breaks immediately after match is set to false, just like in the proposed patch. Other methods (e.g., Step.apply, JapaneseTokenizer.computePenalty, CompressingStoredFieldsWriter.saveInts, FieldQuery.checkOverlap) also have similar loops with similar breaks, just like in the proposed patch.
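The fix described above is essentially a one-line break. A minimal, self-contained sketch of the loop shape (not the actual code from AllGroupHeadsCollectorTest, whose types and comparisons differ):

```java
// Simplified stand-in for the arrayContains check discussed above.
public class ArrayContains {
    public static boolean contains(int[] actual, int value) {
        boolean found = false;
        for (int e : actual) {
            if (e == value) {
                found = true;
                break; // the proposed fix: stop as soon as a match is found
            }
        }
        return found;
    }
}
```

Without the break, every remaining iteration just re-tests (and possibly re-sets) found, doing no useful work.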
[jira] [Commented] (SOLR-5721) ConnectionManager can become stuck in likeExpired
[ https://issues.apache.org/jira/browse/SOLR-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902718#comment-13902718 ] Ramkumar Aiyengar commented on SOLR-5721: - Wouldn't using System.currentTimeMillis for time deltas lead to errors due to NTP sync or DST? It's not guaranteed to be monotonic. See http://stackoverflow.com/questions/2978598/will-sytem-currenttimemillis-always-return-a-value-previous-calls System.nanoTime seems to provide a better alternative, at least when the OS supports a monotonic clock. ConnectionManager can become stuck in likeExpired - Key: SOLR-5721 URL: https://issues.apache.org/jira/browse/SOLR-5721 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Reporter: Gregory Chanan Assignee: Mark Miller Fix For: 4.7, 5.0 Attachments: SOLR-5721.patch, SOLR-5721test.patch Here is the sequence of events: - we disconnect - The disconnect timer begins to run (so it is no longer scheduled), but doesn't set likelyExpired yet - We connect, and set likelyExpired = false - The disconnect thread runs and sets likelyExpired to true, and it is never set back to false (note that we cancel the disconnect thread, but that only cancels scheduled tasks, not running tasks). This is pretty difficult to reproduce without doing more work in the disconnect thread. It's easy to reproduce by adding sleeps in various places -- I have a test that I'll attach that does that. The most straightforward way to solve this would be to grab the synchronization lock on ConnectionManager in the disconnect thread, ensure we aren't actually connected, and only then set likelyExpired to true. In code: {code} synchronized (ConnectionManager.this) { if (!connected) likelyExpired = true; } {code} but this is all pretty subtle and error prone. It's easier to just get rid of the disconnect thread and record the last time we disconnected.
Then, when we check likelyExpired, we just do a quick calculation to see if we are likely expired. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
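The timestamp-based alternative described in the last paragraph could look roughly like this. Names like ExpiryTracker, and the explicit passing of the current time, are illustrative only, not the actual ConnectionManager code:

```java
// Hypothetical sketch: record when we disconnected and compute
// "likely expired" on demand, instead of having a disconnect thread
// flip a flag that can race with reconnection.
public class ExpiryTracker {
    private final long timeoutNanos;
    private volatile boolean connected = true;
    private volatile long disconnectedAtNanos;

    public ExpiryTracker(long timeoutMillis) {
        this.timeoutNanos = timeoutMillis * 1_000_000L;
    }

    public synchronized void disconnected(long nowNanos) {
        connected = false;
        disconnectedAtNanos = nowNanos;
    }

    public synchronized void connected() {
        connected = true;
    }

    public boolean isLikelyExpired(long nowNanos) {
        // No background thread: just a quick calculation at check time.
        return !connected && (nowNanos - disconnectedAtNanos) >= timeoutNanos;
    }
}
```

Because the expiry state is derived from the timestamp rather than set by a racing thread, reconnecting can never leave a stale "expired" flag behind.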
[jira] [Commented] (SOLR-5514) atomic update throws exception if the schema contains uuid fields: Invalid UUID String: 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670'
[ https://issues.apache.org/jira/browse/SOLR-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902720#comment-13902720 ] Dirk Reuss commented on SOLR-5514: --- Here is another hint at what might cause the error. Just seen it in our log files. This time the error occurs when a document is read from the RealTimeGet component. I'm pretty sure we only store well-formed uuid values. The error happens when I send the following command: <add overwrite="true"><doc><field name="..."/></doc>...<doc>...</doc></add> The add command contains about 100 docs, each with about 33 fields. I have to examine this problem later; maybe it is a new error. <lst name="responseHeader"><int name="status">500</int><int name="QTime">2</int></lst><lst name="error"><str name="msg">For input string: 000&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;</str><str name="trace">java.lang.NumberFormatException: For input string: 000&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0;&#0; at java.lang.NumberFormatException.forInputString(NumberFormatException.java:76) at java.lang.Long.parseLong(Long.java:452) at java.lang.Long.valueOf(Long.java:524) at java.lang.Long.decode(Long.java:676) at java.util.UUID.fromString(UUID.java:217) at org.apache.solr.schema.UUIDField.toObject(UUIDField.java:103) at org.apache.solr.schema.UUIDField.toObject(UUIDField.java:49) at org.apache.solr.handler.component.RealTimeGetComponent.toSolrInputDocument(RealTimeGetComponent.java:263) at org.apache.solr.handler.component.RealTimeGetComponent.getInputDocument(RealTimeGetComponent.java:244) at org.apache.solr.update.processor.DistributedUpdateProcessor.getUpdatedDocument(DistributedUpdateProcessor.java:726) at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:635) at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:435) at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247) at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174) at
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447) atomic update throws exception if the schema contains uuid fields: Invalid UUID String: 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670' - Key: SOLR-5514 URL: https://issues.apache.org/jira/browse/SOLR-5514 Project: Solr Issue Type: Bug Affects Versions: 4.5.1 Environment: unix and windows Reporter: Dirk Reuss Assignee: Shalin Shekhar Mangar I am updating an existing document with the statement <add><doc><field name='name' update='set'>newvalue</field> All fields
are stored and I have several UUID fields. About 10-20% of the update commands will fail with the message (example): Invalid UUID String: 'java.util.UUID:532c9353-d391-4a04-8618-dc2fa1ef8b35' The point is that java.util.UUID seems to be prepended to the original uuid stored in the field, and when the value is written this error occurs. I tried to check if this specific uuid field was the problem and added the uuid
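The reported symptom — a value with "java.util.UUID:" prepended failing to parse — can be reproduced directly against java.util.UUID. This demo only illustrates the failure mode; it is not Solr's code path, and the prefix-stripping at the end is purely diagnostic, not a proposed fix:

```java
import java.util.UUID;

// UUID.fromString rejects the prefixed value from the error message,
// while the bare UUID after the colon parses fine.
public class UuidPrefixDemo {
    public static boolean parses(String s) {
        try {
            UUID.fromString(s);
            return true;
        } catch (RuntimeException e) {
            // NumberFormatException on older JDKs, IllegalArgumentException on newer ones
            return false;
        }
    }
}
```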
[jira] [Commented] (SOLR-5721) ConnectionManager can become stuck in likeExpired
[ https://issues.apache.org/jira/browse/SOLR-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902747#comment-13902747 ] Mark Miller commented on SOLR-5721: --- Yeah, sounds like a good improvement. ConnectionManager can become stuck in likeExpired - Key: SOLR-5721 URL: https://issues.apache.org/jira/browse/SOLR-5721 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Reporter: Gregory Chanan Assignee: Mark Miller Fix For: 4.7, 5.0 Attachments: SOLR-5721.patch, SOLR-5721test.patch
[jira] [Updated] (SOLR-1913) QParserPlugin plugin for Search Results Filtering Based on Bitwise Operations on Integer Fields
[ https://issues.apache.org/jira/browse/SOLR-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitaliy Zhovtyuk updated SOLR-1913: --- Attachment: SOLR-1913.patch Changed packages for BitwiseFilter: org.apache.lucene.search.BitwiseFilter, and for BitwiseQueryParserPlugin: org.apache.solr.search.BitwiseQueryParserPlugin. Added Lucene tests for BitwiseFilter, and Solr tests checking bitwise parser queries for BitwiseQueryParserPlugin. QParserPlugin plugin for Search Results Filtering Based on Bitwise Operations on Integer Fields --- Key: SOLR-1913 URL: https://issues.apache.org/jira/browse/SOLR-1913 Project: Solr Issue Type: New Feature Components: search Reporter: Israel Ekpo Fix For: 4.7 Attachments: SOLR-1913-src.tar.gz, SOLR-1913.bitwise.tar.gz, SOLR-1913.patch, WEB-INF lib.jpg, bitwise_filter_plugin.jar, solr-bitwise-plugin.jar Original Estimate: 1h Remaining Estimate: 1h BitwiseQueryParserPlugin is a org.apache.solr.search.QParserPlugin that allows users to filter the documents returned from a query by performing bitwise operations between a particular integer field in the index and the specified value.
This Solr plugin is based on the BitwiseFilter in LUCENE-2460; see https://issues.apache.org/jira/browse/LUCENE-2460 for more details. This is the syntax for searching in Solr: http://localhost:8983/path/to/solr/select/?q={!bitwise field=fieldname op=OPERATION_NAME source=sourcevalue negate=boolean}remainder of query Example: http://localhost:8983/solr/bitwise/select/?q={!bitwise field=user_permissions op=AND source=3 negate=true}state:FL The field parameter is the name of the integer field. The op parameter is the name of the operation; one of {AND, OR, XOR}. The source parameter is the specified integer value. The negate parameter is optional; it is a boolean indicating whether or not to negate the results of the bitwise operation. To test out this plugin, simply copy the jar file containing the plugin classes into your $SOLR_HOME/lib directory and then add the following to your solrconfig.xml file after the dismax request handler: <queryParser name="bitwise" class="org.apache.solr.bitwise.BitwiseQueryParserPlugin" basedOn="dismax"/> Restart your servlet container.
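One plausible reading of the per-document match semantics behind the syntax above, written as a standalone predicate. This is an assumption for illustration only — the actual LUCENE-2460 BitwiseFilter semantics may differ (e.g. in how AND/OR/XOR results are interpreted as a match):

```java
// Hypothetical per-document bitwise match: combine the stored integer
// field value with the source value, then interpret the result.
// Assumed interpretation: AND matches when all source bits are set in
// the field value; OR/XOR match when the combined result is non-zero.
public class BitwiseMatch {
    public enum Op { AND, OR, XOR }

    public static boolean matches(int fieldValue, Op op, int source, boolean negate) {
        boolean result;
        switch (op) {
            case AND: result = (fieldValue & source) == source; break;
            case OR:  result = (fieldValue | source) != 0;      break;
            case XOR: result = (fieldValue ^ source) != 0;      break;
            default:  throw new AssertionError(op);
        }
        return negate ? !result : result;
    }
}
```

Under this reading, the example query above (op=AND source=3 negate=true) would keep documents whose user_permissions field does NOT have both of the low two bits set.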
[jira] [Created] (SOLR-5734) We should use System.nanoTime rather than System.currentTimeMillis when calculating elapsed time.
Mark Miller created SOLR-5734: - Summary: We should use System.nanoTime rather than System.currentTimeMillis when calculating elapsed time. Key: SOLR-5734 URL: https://issues.apache.org/jira/browse/SOLR-5734 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.7, 5.0
[jira] [Commented] (SOLR-5721) ConnectionManager can become stuck in likeExpired
[ https://issues.apache.org/jira/browse/SOLR-5721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902766#comment-13902766 ] Mark Miller commented on SOLR-5721: --- I filed SOLR-5734 as this kind of spans the system. ConnectionManager can become stuck in likeExpired - Key: SOLR-5721 URL: https://issues.apache.org/jira/browse/SOLR-5721 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.6.1 Reporter: Gregory Chanan Assignee: Mark Miller Fix For: 4.7, 5.0 Attachments: SOLR-5721.patch, SOLR-5721test.patch
[jira] [Updated] (SOLR-5734) We should use System.nanoTime rather than System.currentTimeMillis when calculating elapsed time.
[ https://issues.apache.org/jira/browse/SOLR-5734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-5734: -- Description: As brought up by [~andyetitmoves] in SOLR-5721. We should use System.nanoTime rather than System.currentTimeMillis when calculating elapsed time. -- Key: SOLR-5734 URL: https://issues.apache.org/jira/browse/SOLR-5734 Project: Solr Issue Type: Bug Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.7, 5.0 As brought up by [~andyetitmoves] in SOLR-5721.
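A minimal sketch of the pattern being proposed: System.nanoTime has no meaningful absolute value, but differences between two calls are monotonic, which is exactly what elapsed-time measurement needs. This is an illustration of the idiom, not the actual Solr change:

```java
import java.util.concurrent.TimeUnit;

// Measure elapsed time with the monotonic System.nanoTime rather than
// System.currentTimeMillis, which can jump under NTP/DST adjustments.
public class Elapsed {
    public static long elapsedMillis(long startNanos, long endNanos) {
        return TimeUnit.NANOSECONDS.toMillis(endNanos - startNanos);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // ... do some work ...
        long elapsed = elapsedMillis(start, System.nanoTime());
        System.out.println("elapsed ms: " + elapsed);
    }
}
```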
[jira] [Commented] (SOLR-5688) Allow updating of soft and hard commit parameters using HTTP API
[ https://issues.apache.org/jira/browse/SOLR-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902767#comment-13902767 ] Rafał Kuć commented on SOLR-5688: - A small 'ping' from me. Do we need anything more here? Any comments? Allow updating of soft and hard commit parameters using HTTP API Key: SOLR-5688 URL: https://issues.apache.org/jira/browse/SOLR-5688 Project: Solr Issue Type: Improvement Affects Versions: 4.6.1 Reporter: Rafał Kuć Fix For: 5.0 Attachments: SOLR-5688-single_api_call.patch, SOLR-5688.patch Right now, to update the values (max time and max docs) for hard and soft autocommits, one has to alter the configuration and reload the core. I think it may be nice to expose an API to do that in a way that the configuration is not updated, so the change is not persistent. There may be various reasons for doing that - for example, one may know that the application will send a large amount of data and want to prepare for that.
[jira] [Commented] (LUCENE-5408) SerializedDVStrategy -- match geometries in DocValues
[ https://issues.apache.org/jira/browse/LUCENE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902783#comment-13902783 ] ASF subversion and git services commented on LUCENE-5408: - Commit 1568807 from [~dsmiley] in branch 'dev/trunk' [ https://svn.apache.org/r1568807 ] LUCENE-5408: Spatial SerializedDVStrategy SerializedDVStrategy -- match geometries in DocValues - Key: LUCENE-5408 URL: https://issues.apache.org/jira/browse/LUCENE-5408 Project: Lucene - Core Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.7 Attachments: LUCENE-5408_GeometryStrategy.patch, LUCENE-5408_SerializedDVStrategy.patch I've started work on a new SpatialStrategy implementation I'm tentatively calling SerializedDVStrategy. It's similar to the [JtsGeoStrategy in Spatial-Solr-Sandbox|https://github.com/ryantxu/spatial-solr-sandbox/tree/master/LSE/src/main/java/org/apache/lucene/spatial/pending/jts] but a little different in the details -- certainly faster. Using Spatial4j 0.4's BinaryCodec, it'll serialize the shape to bytes (for polygons this is internally the WKB format) and the strategy will put it in a BinaryDocValuesField. In practice the shape is likely a polygon, but it needn't be. Then I'll implement a Filter that returns a DocIdSetIterator that evaluates a given document (passed via advance(docid)) to see if the query shape matches a shape in DocValues. It's improper usage for it to be used in a situation where it will evaluate every document id via nextDoc(). And in practice the DocValues format chosen should be a disk-resident one, since each value tends to be kind of big. This spatial strategy in and of itself has no _index_; it's O(N) where N is the number of documents that get passed through it. So it should be placed last in the query/filter tree so that the other queries limit the documents it needs to see.
At a minimum, another query/filter to use in conjunction is another SpatialStrategy like RecursivePrefixTreeStrategy. Eventually, once the PrefixTree grid encoding has a little bit more metadata, it will be possible to further combine the grid with this strategy in such a way that many documents won't need to be checked against the serialized geometry.
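The access pattern described above — an expensive per-document check consulted only for docids a cheaper filter has already let through via advance(docid), never iterated exhaustively with nextDoc() — can be sketched generically. All names here are illustrative; the IntPredicate stands in for "deserialize the stored shape and relate it to the query shape" and none of this is the actual patch code:

```java
import java.util.function.IntPredicate;

// Hypothetical, simplified sketch of the "match on advance" pattern:
// the slow per-document test runs lazily, only for candidate docids.
public class MatchOnAdvance {
    private final IntPredicate slowPerDocCheck;
    private int docId = -1;

    public MatchOnAdvance(IntPredicate slowPerDocCheck) {
        this.slowPerDocCheck = slowPerDocCheck;
    }

    /** Jump to the candidate doc and report whether it matches. */
    public boolean advanceAndMatch(int target) {
        docId = target;                      // position on the candidate doc
        return slowPerDocCheck.test(docId);  // expensive geometry test, done lazily
    }
}
```

The point of placing such a filter last in the query tree is that advanceAndMatch is only ever called for the (hopefully few) documents the other clauses admit.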
[JENKINS] Lucene-Solr-4.x-MacOSX (64bit/jdk1.7.0) - Build # 1304 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/1304/ Java: 64bit/jdk1.7.0 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC All tests passed Build Log: [...truncated 9912 lines...] [junit4] JVM J0: stderr was not empty, see: /Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-core/test/temp/junit4-J0-20140216_193116_238.syserr [junit4] JVM J0: stderr (verbatim) [junit4] java(1575,0x13ce57000) malloc: *** error for object 0x13ce44f10: pointer being freed was not allocated [junit4] *** set a breakpoint in malloc_error_break to debug [junit4] JVM J0: EOF [...truncated 1 lines...] [junit4] ERROR: JVM J0 ended with an exception, command line: /Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/jre/bin/java -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/heapdumps -Dtests.prefix=tests -Dtests.seed=6B5378DCCDC69524 -Xmx512M -Dtests.iters= -Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random -Dtests.postingsformat=random -Dtests.docvaluesformat=random -Dtests.locale=random -Dtests.timezone=random -Dtests.directory=random -Dtests.linedocsfile=europarl.lines.txt.gz -Dtests.luceneMatchVersion=4.7 -Dtests.cleanthreads=perClass -Djava.util.logging.config.file=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/tools/junit4/logging.properties -Dtests.nightly=false -Dtests.weekly=false -Dtests.slow=true -Dtests.asserts.gracious=false -Dtests.multiplier=1 -DtempDir=. -Djava.io.tmpdir=. 
-Djunit4.tempDir=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/solr/build/solr-core/test/temp -Dclover.db.dir=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/build/clover/db -Djava.security.manager=org.apache.lucene.util.TestSecurityManager -Djava.security.policy=/Users/jenkins/workspace/Lucene-Solr-4.x-MacOSX/lucene/tools/junit4/tests.policy -Dlucene.version=4.7-SNAPSHOT -Djetty.testMode=1 -Djetty.insecurerandom=1 -Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory -Djava.awt.headless=true -Djdk.map.althashing.threshold=0 -Dtests.disableHdfs=true -Dfile.encoding=UTF-8 -classpath
[jira] [Commented] (SOLR-5688) Allow updating of soft and hard commit parameters using HTTP API
[ https://issues.apache.org/jira/browse/SOLR-5688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902802#comment-13902802 ] Mark Miller commented on SOLR-5688: --- Hopefully the folks working on a bunch of other REST API stuff can comment. I think we all want to pull in the same direction on that - but I haven't followed progress closely enough to comment helpfully yet. Allow updating of soft and hard commit parameters using HTTP API Key: SOLR-5688 URL: https://issues.apache.org/jira/browse/SOLR-5688 Project: Solr Issue Type: Improvement Affects Versions: 4.6.1 Reporter: Rafał Kuć Fix For: 5.0 Attachments: SOLR-5688-single_api_call.patch, SOLR-5688.patch
[jira] [Commented] (LUCENE-5408) SerializedDVStrategy -- match geometries in DocValues
[ https://issues.apache.org/jira/browse/LUCENE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902805#comment-13902805 ] ASF subversion and git services commented on LUCENE-5408: - Commit 1568817 from [~dsmiley] in branch 'dev/trunk' [ https://svn.apache.org/r1568817 ] LUCENE-5408: fixed tests; some strategies require DocValues SerializedDVStrategy -- match geometries in DocValues - Key: LUCENE-5408 URL: https://issues.apache.org/jira/browse/LUCENE-5408 Project: Lucene - Core Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.7 Attachments: LUCENE-5408_GeometryStrategy.patch, LUCENE-5408_SerializedDVStrategy.patch
[jira] [Commented] (LUCENE-5408) SerializedDVStrategy -- match geometries in DocValues
[ https://issues.apache.org/jira/browse/LUCENE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902806#comment-13902806 ] ASF subversion and git services commented on LUCENE-5408: - Commit 1568818 from [~dsmiley] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1568818 ] LUCENE-5408: Spatial SerializedDVStrategy SerializedDVStrategy -- match geometries in DocValues - Key: LUCENE-5408 URL: https://issues.apache.org/jira/browse/LUCENE-5408 Project: Lucene - Core Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 4.7 Attachments: LUCENE-5408_GeometryStrategy.patch, LUCENE-5408_SerializedDVStrategy.patch I've started work on a new SpatialStrategy implementation I'm tentatively calling SerializedDVStrategy. It's similar to the [JtsGeoStrategy in Spatial-Solr-Sandbox|https://github.com/ryantxu/spatial-solr-sandbox/tree/master/LSE/src/main/java/org/apache/lucene/spatial/pending/jts] but a little different in the details -- certainly faster. Using Spatial4j 0.4's BinaryCodec, it'll serialize the shape to bytes (for polygons this in internally WKB format) and the strategy will put it in a BinaryDocValuesField. In practice the shape is likely a polygon but it needn't be. Then I'll implement a Filter that returns a DocIdSetIterator that evaluates a given document passed via advance(docid)) to see if the query shape matches a shape in DocValues. It's improper usage for it to be used in a situation where it will evaluate every document id via nextDoc(). And in practice the DocValues format chosen should be a disk resident one since each value tends to be kind of big. This spatial strategy in and of itself has no _index_; it's O(N) where N is the number of documents that get passed thru it. So it should be placed last in the query/filter tree so that the other queries limit the documents it needs to see. 
At a minimum, another query/filter to use in conjunction is another SpatialStrategy like RecursivePrefixTreeStrategy. Eventually once the PrefixTree grid encoding has a little bit more metadata, it will be possible to further combine the grid with this strategy in such a way that many documents won't need to be checked against the serialized geometry. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
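The "place it last" advice above can be sketched without Lucene's APIs. The following is a hypothetical, self-contained illustration (the class name and the two predicates are invented for the sketch): one predicate stands in for a cheap indexed filter such as RecursivePrefixTreeStrategy, the other for the expensive per-document deserialize-and-compare geometry check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical sketch of the two-stage pattern described above: a cheap
// indexed filter narrows the candidates, and only the survivors pay the
// per-document cost of the serialized-geometry check.
class TwoStageSpatialSketch {
    static List<Integer> match(int maxDoc,
                               IntPredicate cheapIndexedFilter,
                               IntPredicate exactGeometryCheck) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            // The exact check is O(N) in the documents it sees, so it is
            // evaluated last, only for candidates the cheap filter passes.
            if (cheapIndexedFilter.test(doc) && exactGeometryCheck.test(doc)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

The short-circuiting `&&` is the whole point: documents rejected by the cheap filter never reach the expensive check, which mirrors placing this strategy last in the query/filter tree.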
[jira] [Commented] (SOLR-5727) LBHttpSolrServer should only retry on Connection exceptions when sending updates.
[ https://issues.apache.org/jira/browse/SOLR-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902814#comment-13902814 ] Mark Miller commented on SOLR-5727: --- There seem to be some unrelated failures that make my patch for this hard to test, but once that gets worked out, I'll post a patch and commit. I want to get this into jenkins to see the effects on chaosmonkey tests. I think SOLR-5593 was hiding / protecting against some issues around this. It also fits with some failures from even before that, which I was trying to figure out and which seemed to make no sense unless the client was resending the same update even while we were internally retrying to send an update to a leader. LBHttpSolrServer should only retry on Connection exceptions when sending updates. - Key: SOLR-5727 URL: https://issues.apache.org/jira/browse/SOLR-5727 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.7, 5.0 You don't know if the request was successful or not, and so it's better to error to the user than retry, especially because forwards to a shard leader can be retried internally. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5722) Add catenateShingles option to WordDelimiterFilter
[ https://issues.apache.org/jira/browse/SOLR-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902824#comment-13902824 ] Greg Pendlebury commented on SOLR-5722: --- I don't think it does. It has been a while since we looked into it, and that link is currently returning 503 for me, but my understanding was that the HyphenatedWordsFilter put two tokens back together when a hyphen was found on the end of the first token. The catenateShingles option we are using addresses the scenario where multiple hyphens are found internal to a single token. Add catenateShingles option to WordDelimiterFilter -- Key: SOLR-5722 URL: https://issues.apache.org/jira/browse/SOLR-5722 Project: Solr Issue Type: Improvement Reporter: Greg Pendlebury Priority: Minor Labels: filter, newbie, patch Attachments: WDFconcatShingles.patch Apologies if I put this in the wrong spot. I'm attaching a patch (against current trunk) that adds support for a 'catenateShingles' option to the WordDelimiterFilter. We (National Library of Australia - NLA) are currently maintaining this as an internal modification to the Filter, but I believe it is generic enough to contribute upstream. Description:
{code}
/**
 * NLA Modification to the standard word delimiter to support various
 * hyphenation use cases. Primarily driven by requirements for
 * newspapers where words are often broken across line endings.
 *
 * eg. hyphenated-surname is printed across a line ending and
 * turns out like hyphen-ated-surname or hyphenated-sur-name.
 *
 * In this scenario the stock filter, with 'catenateAll' turned on, will
 * generate individual tokens plus one combined token, but not
 * sub-tokens like "hyphenated surname" and "hyphenatedsur name".
 *
 * So we add a new 'catenateShingles' to achieve this.
 */
{code}
Includes unit tests, and as is noted in one of them, CATENATE_WORDS and CATENATE_SHINGLES are logically considered mutually exclusive for sensible usage and can cause duplicate tokens (although they should have the same positions etc). I'm happy to work on it more if anyone finds problems with it. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
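For illustration, here is a self-contained sketch of the token combinations catenateShingles is meant to add. This is not the patch's TokenFilter code (which operates on a TokenStream); the class name, the plain split on '-', and the method shape are assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the catenateShingles idea: for a token with
// multiple internal hyphens, emit every contiguous concatenation of two
// or more parts, not only the single full catenation that catenateAll
// produces. (The real patch does this inside WordDelimiterFilter.)
class CatenateShinglesSketch {
    static List<String> shingles(String token) {
        String[] parts = token.split("-");
        List<String> out = new ArrayList<>();
        for (int from = 0; from < parts.length; from++) {
            StringBuilder sb = new StringBuilder(parts[from]);
            for (int to = from + 1; to < parts.length; to++) {
                sb.append(parts[to]);
                out.add(sb.toString()); // each contiguous run of parts
            }
        }
        return out;
    }
}
```

For "hyphenated-sur-name" this yields "hyphenatedsur", "hyphenatedsurname" and "surname", i.e. the sub-token concatenations the issue description asks for, in addition to the full catenation.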
[jira] [Commented] (LUCENE-5440) Add LongFixedBitSet and replace usage of OpenBitSet
[ https://issues.apache.org/jira/browse/LUCENE-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902825#comment-13902825 ] ASF subversion and git services commented on LUCENE-5440: - Commit 1568824 from [~shaie] in branch 'dev/trunk' [ https://svn.apache.org/r1568824 ] LUCENE-5440: fix bug in FacetComponent Add LongFixedBitSet and replace usage of OpenBitSet --- Key: LUCENE-5440 URL: https://issues.apache.org/jira/browse/LUCENE-5440 Project: Lucene - Core Issue Type: Improvement Components: core/search Reporter: Shai Erera Assignee: Shai Erera Fix For: 5.0, 4.7 Attachments: LUCENE-5440-solr.patch, LUCENE-5440-solr.patch, LUCENE-5440-solr.patch, LUCENE-5440.patch, LUCENE-5440.patch, LUCENE-5440.patch, LUCENE-5440.patch, LUCENE-5440.patch Spinoff from here: http://lucene.markmail.org/thread/35gw3amo53dsqsqj. I wrote a LongFixedBitSet which behaves like FixedBitSet, but allows managing more than 2.1B bits. It overcomes some issues I've encountered with OpenBitSet, such as the use of set/fastSet as well as the implementation of DocIdSet. I'll post a patch shortly and describe it in more detail. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1568824 - /lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java
Thanks! - Mark http://about.me/markrmiller

On Feb 16, 2014, at 3:37 PM, sh...@apache.org wrote:

Author: shaie
Date: Sun Feb 16 20:37:28 2014
New Revision: 1568824
URL: http://svn.apache.org/r1568824
Log: LUCENE-5440: fix bug in FacetComponent

Modified: lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java?rev=1568824&r1=1568823&r2=1568824&view=diff
==============================================================================
--- lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java (original)
+++ lucene/dev/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java Sun Feb 16 20:37:28 2014
@@ -451,7 +451,7 @@ public class FacetComponent extends Sear
     long maxCount = sfc.count;
     for (int shardNum=0; shardNum<rb.shards.length; shardNum++) {
       FixedBitSet fbs = dff.counted[shardNum];
-      if (fbs!=null && !fbs.get(sfc.termNum)) { // fbs can be null if a shard request failed
+      if (fbs!=null && (sfc.termNum >= fbs.length() || !fbs.get(sfc.termNum))) { // fbs can be null if a shard request failed
         // if missing from this shard, add the max it could be
         maxCount += dff.maxPossible(sfc,shardNum);
       }
@@ -466,7 +466,7 @@ public class FacetComponent extends Sear
     // add a query for each shard missing the term that needs refinement
     for (int shardNum=0; shardNum<rb.shards.length; shardNum++) {
       FixedBitSet fbs = dff.counted[shardNum];
-      if(fbs!=null && !fbs.get(sfc.termNum) && dff.maxPossible(sfc,shardNum)>0) {
+      if(fbs!=null && (sfc.termNum >= fbs.length() || !fbs.get(sfc.termNum)) && dff.maxPossible(sfc,shardNum)>0) {
         dff.needRefinements = true;
         List<String> lst = dff._toRefine[shardNum];
         if (lst == null) {

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5440) Add LongFixedBitSet and replace usage of OpenBitSet
[ https://issues.apache.org/jira/browse/LUCENE-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902832#comment-13902832 ] ASF subversion and git services commented on LUCENE-5440: - Commit 1568825 from [~shaie] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1568825 ] LUCENE-5440: fix bug in FacetComponent Add LongFixedBitSet and replace usage of OpenBitSet --- Key: LUCENE-5440 URL: https://issues.apache.org/jira/browse/LUCENE-5440 Project: Lucene - Core Issue Type: Improvement Components: core/search Reporter: Shai Erera Assignee: Shai Erera Fix For: 5.0, 4.7 Attachments: LUCENE-5440-solr.patch, LUCENE-5440-solr.patch, LUCENE-5440-solr.patch, LUCENE-5440.patch, LUCENE-5440.patch, LUCENE-5440.patch, LUCENE-5440.patch, LUCENE-5440.patch Spinoff from here: http://lucene.markmail.org/thread/35gw3amo53dsqsqj. I wrote a LongFixedBitSet which behaves like FixedBitSet, but allows managing more than 2.1B bits. It overcomes some issues I've encountered with OpenBitSet, such as the use of set/fastSet as well as the implementation of DocIdSet. I'll post a patch shortly and describe it in more detail. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5735) ChaosMonkey test timeouts.
Mark Miller created SOLR-5735: - Summary: ChaosMonkey test timeouts. Key: SOLR-5735 URL: https://issues.apache.org/jira/browse/SOLR-5735 Project: Solr Issue Type: Task Reporter: Mark Miller Assignee: Mark Miller Priority: Critical Fix For: 4.7, 5.0 This started showing up in jenkins runs a while back. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Welcome Anshum Gupta as Lucene/Solr Committer!
Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome Anshum! Dawid On Sun, Feb 16, 2014 at 11:33 PM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[GitHub] lucene-solr pull request: LUCENE-5092, 2nd try
GitHub user PaulElschot opened a pull request: https://github.com/apache/lucene-solr/pull/33 LUCENE-5092, 2nd try In core introduce DocBlocksIterator. Use this in FixedBitSet, in EliasFanoDocIdSet and in join module ToChild... and ToParent... Also change BaseDocIdSetTestCase to test DocBlocksIterator.advanceToJustBefore. This was simplified a lot by LUCENE-5441 and LUCENE-5440. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/lucene-solr LUCENE-5092-pull-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/33.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #33 commit 4f8eae48ff0441b86a0fdb130e564f646dffcc43 Author: Paul Elschot paul.j.elsc...@gmail.com Date: 2014-02-16T22:31:58Z Squashed commit for LUCENE-5092 If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. To do so, please top-post your response. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Hey Anshum, welcome aboard! Uwe On 16. Februar 2014 23:33:11 MEZ, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller -- Uwe Schindler H.-H.-Meier-Allee 63, 28213 Bremen http://www.thetaphi.de
[jira] [Commented] (LUCENE-5441) Decouple DocIdSet from OpenBitSet and FixedBitSet
[ https://issues.apache.org/jira/browse/LUCENE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902865#comment-13902865 ] ASF GitHub Bot commented on LUCENE-5441: GitHub user PaulElschot opened a pull request: https://github.com/apache/lucene-solr/pull/33 LUCENE-5092, 2nd try In core introduce DocBlocksIterator. Use this in FixedBitSet, in EliasFanoDocIdSet and in join module ToChild... and ToParent... Also change BaseDocIdSetTestCase to test DocBlocksIterator.advanceToJustBefore. This was simplified a lot by LUCENE-5441 and LUCENE-5440. You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/lucene-solr LUCENE-5092-pull-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/lucene-solr/pull/33.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #33 commit 4f8eae48ff0441b86a0fdb130e564f646dffcc43 Author: Paul Elschot paul.j.elsc...@gmail.com Date: 2014-02-16T22:31:58Z Squashed commit for LUCENE-5092 Decouple DocIdSet from OpenBitSet and FixedBitSet - Key: LUCENE-5441 URL: https://issues.apache.org/jira/browse/LUCENE-5441 Project: Lucene - Core Issue Type: Task Components: core/other Affects Versions: 4.6.1 Reporter: Uwe Schindler Fix For: 5.0 Attachments: LUCENE-5441.patch, LUCENE-5441.patch, LUCENE-5441.patch Back from the times of Lucene 2.4 when DocIdSet was introduced, we somehow kept the stupid filters can return a BitSet directly in the code. So lots of Filters return just FixedBitSet, because this is the superclass (ideally interface) of FixedBitSet. We should decouple that and *not* implement that abstract interface directly by FixedBitSet. This leads to bugs e.g. in BlockJoin, because it used Filters in a wrong way, just because it was always returning Bitsets. But some filters actually don't do this. 
I propose to let FixedBitSet (only in trunk, because that's a major backwards break) just have a method {{asDocIdSet()}}, which returns an anonymous instance of DocIdSet: bits() returns the FixedBitSet itself, iterator() returns a new Iterator (like it always did) and the cost/cacheable methods return static values. Filters in trunk would need to be changed like that:
{code:java}
FixedBitSet bits = ...
return bits;
{code}
becomes:
{code:java}
FixedBitSet bits = ...
return bits.asDocIdSet();
{code}
As this method returns an anonymous DocIdSet, calling code can no longer rely on or check whether the implementation behind it is a FixedBitSet. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
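The proposed decoupling can be sketched with stand-in types. These are deliberately not the real Lucene classes; the names with a Sketch suffix and the reduced interfaces are invented purely for the illustration.

```java
// Minimal stand-in types sketching the proposed asDocIdSet() idea.
interface Bits {
    boolean get(int index);
}

abstract class DocIdSetSketch {
    abstract Bits bits();
}

class FixedBitSetSketch implements Bits {
    private final java.util.BitSet impl = new java.util.BitSet();

    public void set(int index) { impl.set(index); }

    @Override
    public boolean get(int index) { return impl.get(index); }

    // The proposal: instead of FixedBitSet *being* a DocIdSet, it exposes an
    // anonymous DocIdSet view, so callers can no longer downcast the result
    // back to the bit set implementation.
    DocIdSetSketch asDocIdSet() {
        return new DocIdSetSketch() {
            @Override
            Bits bits() { return FixedBitSetSketch.this; }
        };
    }
}
```

The point is the anonymous class: the object a Filter hands back is DocIdSet-shaped only, which is exactly what makes the "it is always a FixedBitSet" assumption impossible to rely on.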
[jira] [Commented] (LUCENE-5441) Decouple DocIdSet from OpenBitSet and FixedBitSet
[ https://issues.apache.org/jira/browse/LUCENE-5441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13902866#comment-13902866 ] Paul Elschot commented on LUCENE-5441: -- I'm sorry that pull request #33 ended up here; I think I should have mentioned LUCENE-5092 as the first issue in the comment body at the pull request. Decouple DocIdSet from OpenBitSet and FixedBitSet - Key: LUCENE-5441 URL: https://issues.apache.org/jira/browse/LUCENE-5441 Project: Lucene - Core Issue Type: Task Components: core/other Affects Versions: 4.6.1 Reporter: Uwe Schindler Fix For: 5.0 Attachments: LUCENE-5441.patch, LUCENE-5441.patch, LUCENE-5441.patch Back from the times of Lucene 2.4 when DocIdSet was introduced, we somehow kept the stupid "filters can return a BitSet directly" approach in the code. So lots of Filters return just FixedBitSet, because DocIdSet is the superclass (ideally interface) of FixedBitSet. We should decouple that and *not* implement that abstract interface directly by FixedBitSet. This leads to bugs e.g. in BlockJoin, because it used Filters in a wrong way, just because it was always returning Bitsets. But some filters actually don't do this. I propose to let FixedBitSet (only in trunk, because that's a major backwards break) just have a method {{asDocIdSet()}}, which returns an anonymous instance of DocIdSet: bits() returns the FixedBitSet itself, iterator() returns a new Iterator (like it always did) and the cost/cacheable methods return static values. Filters in trunk would need to be changed like that:
{code:java}
FixedBitSet bits = ...
return bits;
{code}
becomes:
{code:java}
FixedBitSet bits = ...
return bits.asDocIdSet();
{code}
As this method returns an anonymous DocIdSet, calling code can no longer rely on or check whether the implementation behind it is a FixedBitSet. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome Anshum ! On Sun, Feb 16, 2014 at 11:46 PM, Uwe Schindler u...@thetaphi.de wrote: Hey Anshum, welcome aboard! Uwe On 16. Februar 2014 23:33:11 MEZ, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- Uwe Schindler H.-H.-Meier-Allee 63, 28213 Bremen http://www.thetaphi.de -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5092) join: don't expect all filters to be FixedBitSet instances
[ https://issues.apache.org/jira/browse/LUCENE-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902867#comment-13902867 ] Paul Elschot commented on LUCENE-5092: -- A new pull request is here: https://github.com/apache/lucene-solr/pull/33 The automated message for the pull request ended up at LUCENE-5441. join: don't expect all filters to be FixedBitSet instances -- Key: LUCENE-5092 URL: https://issues.apache.org/jira/browse/LUCENE-5092 Project: Lucene - Core Issue Type: Improvement Components: modules/join Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5092.patch The join module throws exceptions when the parents filter isn't a FixedBitSet. The reason is that the join module relies on prevSetBit to find the first child document given a parent ID. As suggested by Uwe and Paul Elschot on LUCENE-5081, we could fix it by exposing methods in the iterators to iterate backwards. When the join modules gets an iterator which isn't able to iterate backwards, it would just need to dump its content into another DocIdSet that supports backward iteration, FixedBitSet for example. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Congrats! On Feb 17, 2014, at 7:33 AM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller
[jira] [Commented] (SOLR-5727) LBHttpSolrServer should only retry on Connection exceptions when sending updates.
[ https://issues.apache.org/jira/browse/SOLR-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902878#comment-13902878 ] ASF subversion and git services commented on SOLR-5727: --- Commit 1568859 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1568859 ] SOLR-5727: LBHttpSolrServer should only retry on Connection exceptions when sending updates. Affects CloudSolrServer. LBHttpSolrServer should only retry on Connection exceptions when sending updates. - Key: SOLR-5727 URL: https://issues.apache.org/jira/browse/SOLR-5727 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.7, 5.0 You don't know if the request was successful or not and so its better to error to the user than retry, especially because forwards to a shard leader can be retried internally. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5727) LBHttpSolrServer should only retry on Connection exceptions when sending updates.
[ https://issues.apache.org/jira/browse/SOLR-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13902876#comment-13902876 ] ASF subversion and git services commented on SOLR-5727: --- Commit 1568857 from [~markrmil...@gmail.com] in branch 'dev/trunk' [ https://svn.apache.org/r1568857 ] SOLR-5727: LBHttpSolrServer should only retry on Connection exceptions when sending updates. Affects CloudSolrServer. LBHttpSolrServer should only retry on Connection exceptions when sending updates. - Key: SOLR-5727 URL: https://issues.apache.org/jira/browse/SOLR-5727 Project: Solr Issue Type: Bug Components: SolrCloud Reporter: Mark Miller Assignee: Mark Miller Fix For: 4.7, 5.0 You don't know if the request was successful or not and so its better to error to the user than retry, especially because forwards to a shard leader can be retried internally. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome Anshum! On Sun, Feb 16, 2014 at 5:55 PM, Christian Moen c...@atilika.com wrote: Congrats! On Feb 17, 2014, at 7:33 AM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome Anshum! Mike McCandless http://blog.mikemccandless.com On Sun, Feb 16, 2014 at 5:33 PM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Congrats Anshum! koji -- http://soleami.com/blog/mahout-and-machine-learning-training-course-is-here.html (14/02/17 7:33), Mark Miller wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome Anshum! On Mon, Feb 17, 2014 at 6:33 AM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller -- Han Jiang Team of Search Engine and Web Mining, School of Electronic Engineering and Computer Science, Peking University, China
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome Anshum! On Sun, Feb 16, 2014 at 5:02 PM, Han Jiang h...@apache.org wrote: Welcome Anshum! On Mon, Feb 17, 2014 at 6:33 AM, Mark Miller markrmil...@gmail.comwrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller -- Han Jiang Team of Search Engine and Web Mining, School of Electronic Engineering and Computer Science, Peking University, China
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome Anshum! Joel Bernstein Search Engineer at Heliosearch On Sun, Feb 16, 2014 at 8:03 PM, Erick Erickson erickerick...@gmail.comwrote: Welcome Anshum! On Sun, Feb 16, 2014 at 5:02 PM, Han Jiang h...@apache.org wrote: Welcome Anshum! On Mon, Feb 17, 2014 at 6:33 AM, Mark Miller markrmil...@gmail.comwrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller -- Han Jiang Team of Search Engine and Web Mining, School of Electronic Engineering and Computer Science, Peking University, China
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome Anshum! On Sun, Feb 16, 2014 at 5:33 PM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Thanks Mark. I spent most of my life in New Delhi, India other than short stints in different parts of the country (including living in a beach house on a tropical island for 3 years when I was young). After spending the last 3 years in Bangalore, I just relocated to San Francisco to be at the LucidWorks office in the Bay Area. Prior to this I've been a part of the search teams at A9 (CloudSearch), Cleartrip.com and Naukri.com where I was involved in designing and developing search and recommendation engines. These days, I love contributing stuff to Solr, primarily around SolrCloud and hope to continue to be at least as active towards it. In my free time I love photography, traveling, eating out and drinking my beer. On Sun, Feb 16, 2014 at 2:33 PM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller -- Anshum Gupta http://www.anshumgupta.net
[jira] [Resolved] (LUCENE-5408) SerializedDVStrategy -- match geometries in DocValues
[ https://issues.apache.org/jira/browse/LUCENE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Smiley resolved LUCENE-5408. -- Resolution: Fixed Fix Version/s: 5.0 SerializedDVStrategy -- match geometries in DocValues - Key: LUCENE-5408 URL: https://issues.apache.org/jira/browse/LUCENE-5408 Project: Lucene - Core Issue Type: New Feature Components: modules/spatial Reporter: David Smiley Assignee: David Smiley Fix For: 5.0, 4.7 Attachments: LUCENE-5408_GeometryStrategy.patch, LUCENE-5408_SerializedDVStrategy.patch I've started work on a new SpatialStrategy implementation I'm tentatively calling SerializedDVStrategy. It's similar to the [JtsGeoStrategy in Spatial-Solr-Sandbox|https://github.com/ryantxu/spatial-solr-sandbox/tree/master/LSE/src/main/java/org/apache/lucene/spatial/pending/jts] but a little different in the details -- certainly faster. Using Spatial4j 0.4's BinaryCodec, it'll serialize the shape to bytes (for polygons this is internally in WKB format) and the strategy will put it in a BinaryDocValuesField. In practice the shape is likely a polygon but it needn't be. Then I'll implement a Filter that returns a DocIdSetIterator that evaluates a given document passed via advance(docid) to see if the query shape matches a shape in DocValues. It's improper usage for it to be used in a situation where it will evaluate every document id via nextDoc(). And in practice the DocValues format chosen should be a disk-resident one since each value tends to be kind of big. This spatial strategy in and of itself has no _index_; it's O(N) where N is the number of documents that get passed thru it. So it should be placed last in the query/filter tree so that the other queries limit the documents it needs to see. At a minimum, another query/filter to use in conjunction is another SpatialStrategy like RecursivePrefixTreeStrategy.
Eventually once the PrefixTree grid encoding has a little bit more metadata, it will be possible to further combine the grid with this strategy in such a way that many documents won't need to be checked against the serialized geometry. -- This message was sent by Atlassian JIRA (v6.1.5#6160) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Ensuring a test uses a codec supporting DocValues
I wrote a test that requires DocValues. It failed on me once because the Codec randomization chose Lucene3x, which doesn’t support DocValues. What’s the best way to adjust my test to ensure this doesn’t happen? What I ended up doing was this: indexWriterConfig.setCodec( _TestUtil.alwaysDocValuesFormat(new Lucene45DocValuesFormat())); But I don’t like that I hard-coded a particular format. (FYI the source file is an abstract base test class: SpatialTestCase, method newIndexWriterConfig ) Another approach might be to call: assumeTrue(defaultCodecSupportsDocValues()) Although then the test sometimes won’t be run at all, instead of preferably forcing a compatible format. Thoughts? ~ David
Only highlight terms that caused a search hit/match
Hello, I have recently been given a requirement to improve document highlights within our system. Unfortunately, the current functionality gives more of a best guess at what terms to highlight rather than the actual terms that performed the match. A couple of examples of issues that were found:

Nested boolean clause with a term that doesn’t exist ANDed with a term that does highlights the ignored term in the query
Text: a b c
Logical Query: a OR (b AND z)
Result: *a* *b* c
Expected: *a* b c

Nested span query doesn’t maintain the proper positions and offsets
Text: y z x y z a
Logical Query: (“x y z”, a) span near 10
Result: *y* *z* *x* *y* *z* *a*
Expected: y z *x* *y* *z* *a*

I am currently using the Highlighter with a QueryScorer and a SimpleSpanFragmenter. While looking through the code it looks like the entire query structure is dropped in the WeightedSpanTermExtractor by just grabbing any positive TermQuery and flattening them all into a simple Map, which is then passed on to highlight all of those terms. I believe this over-simplification of term extraction is the crux of the issue and needs to be modified in order to produce more “exact” highlights. I was brainstorming with a colleague and thought perhaps we can spin up a MemoryIndex to index that one document and start performing a depth-first search of all queries within the overall Lucene query graph. At that point we can start querying the MemoryIndex for leaf queries and start walking back up the tree, pruning branches that don’t result in a search hit, which results in a map of the actually matched query terms. This approach seems pretty painful but will hopefully produce better matches. I would like to see what the experts on the mailing list would have to say about this approach, or is there a better way to retrieve the query term positions that produced the match?
Or perhaps there is a different Highlighter implementation that should be used, though our user queries are extremely complex with a lot of nested queries of various types. Thanks, -- View this message in context: http://lucene.472066.n3.nabble.com/Only-highlight-terms-that-caused-a-search-hit-match-tp4117692.html Sent from the Lucene - Java Developer mailing list archive at Nabble.com. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
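The pruning idea in the message above can be sketched without Lucene at all. The following is a minimal, self-contained model (the Node/term/and/or names are hypothetical illustrations, not Lucene or MemoryIndex APIs): each branch of the query tree either fails to match or returns the terms it matched, so non-matching branches contribute nothing to the highlight set.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal model of pruning a boolean query tree: only terms on branches that
// actually match contribute to the highlight set. All names are hypothetical.
public class ExactHighlight {
    // A branch returns the terms it matched, or null if it does not match.
    interface Node { Set<String> match(Set<String> doc); }

    static Node term(String t) {
        return doc -> doc.contains(t) ? Set.of(t) : null;
    }

    static Node and(Node a, Node b) {
        return doc -> {
            Set<String> ra = a.match(doc), rb = b.match(doc);
            if (ra == null || rb == null) return null; // prune: AND needs both sides
            Set<String> out = new HashSet<>(ra);
            out.addAll(rb);
            return out;
        };
    }

    static Node or(Node a, Node b) {
        return doc -> {
            Set<String> ra = a.match(doc), rb = b.match(doc);
            if (ra == null && rb == null) return null;
            Set<String> out = new HashSet<>();
            if (ra != null) out.addAll(ra); // keep only the branches that matched
            if (rb != null) out.addAll(rb);
            return out;
        };
    }

    public static void main(String[] args) {
        Set<String> doc = Set.of("a", "b", "c");           // text: "a b c"
        Node q = or(term("a"), and(term("b"), term("z"))); // a OR (b AND z)
        System.out.println(q.match(doc));                  // only "a" survives pruning
    }
}
```

For the first example from the message, the AND branch fails (no "z" in the document), so "b" is pruned and only "a" is highlighted, matching the expected output. A real implementation would additionally need positions and offsets (e.g. from a MemoryIndex), which this term-set model deliberately omits.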
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome on board Anshum, Looking forward to more exciting days --Noble On Mon, Feb 17, 2014 at 8:44 AM, Anshum Gupta ans...@anshumgupta.net wrote: Thanks Mark. I spent most of my life in New Delhi, India other than short stints in different parts of the country (including living in a beach house on a tropical island for 3 years when I was young). After spending the last 3 years in Bangalore, I just relocated to San Francisco to be at the LucidWorks office in the Bay Area. Prior to this I've been a part of the search teams at A9 (CloudSearch), Cleartrip.com and Naukri.com where I was involved in designing and developing search and recommendation engines. These days, I love contributing stuff to Solr, primarily around SolrCloud and hope to continue to be at least as active towards it. In my free time I love photography, traveling, eating out and drinking my beer. On Sun, Feb 16, 2014 at 2:33 PM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller -- Anshum Gupta http://www.anshumgupta.net -- - Noble Paul
Re: Welcome Anshum Gupta as Lucene/Solr Committer!
Welcome Anshum! On Mon, Feb 17, 2014 at 4:03 AM, Mark Miller markrmil...@gmail.com wrote: Hey everybody! The Lucene PMC is happy to welcome Anshum Gupta as a committer on the Lucene / Solr project. Anshum has contributed to a number of issues for the project, especially around SolrCloud. Welcome Anshum! It's tradition to introduce yourself with a short bio :) -- - Mark http://about.me/markrmiller -- Regards, Shalin Shekhar Mangar.
[GitHub] lucene-solr pull request: LUCENE-5092, 2nd try
Github user mkhludnev commented on the pull request: https://github.com/apache/lucene-solr/pull/33#issuecomment-35233474 I like it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. To do so, please top-post your response. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
[jira] [Commented] (LUCENE-5092) join: don't expect all filters to be FixedBitSet instances
[ https://issues.apache.org/jira/browse/LUCENE-5092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13903011#comment-13903011 ] ASF GitHub Bot commented on LUCENE-5092: Github user mkhludnev commented on the pull request: https://github.com/apache/lucene-solr/pull/33#issuecomment-35233474 I like it. join: don't expect all filters to be FixedBitSet instances -- Key: LUCENE-5092 URL: https://issues.apache.org/jira/browse/LUCENE-5092 Project: Lucene - Core Issue Type: Improvement Components: modules/join Reporter: Adrien Grand Assignee: Adrien Grand Priority: Minor Attachments: LUCENE-5092.patch The join module throws exceptions when the parents filter isn't a FixedBitSet. The reason is that the join module relies on prevSetBit to find the first child document given a parent ID. As suggested by Uwe and Paul Elschot on LUCENE-5081, we could fix this by exposing methods in the iterators to iterate backwards. When the join module gets an iterator that isn't able to iterate backwards, it would just need to dump its content into another DocIdSet that supports backward iteration, FixedBitSet for example. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
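The prevSetBit dependence described in the issue is easy to see in a small standalone model. The sketch below uses java.util.BitSet (which offers previousSetBit) as a stand-in for Lucene's FixedBitSet; BlockJoinModel and firstChild are hypothetical names for illustration. In the block-join document layout, a block's child documents are indexed immediately before their parent, so finding a parent's first child requires stepping backwards to the previous parent bit.

```java
import java.util.BitSet;

// Standalone model of why block-join needs backward iteration on the parent
// filter: children are indexed immediately before their parent document, so
// the first child of parent p is previousSetBit(p - 1) + 1.
// java.util.BitSet stands in for Lucene's FixedBitSet; names are illustrative.
public class BlockJoinModel {
    static int firstChild(BitSet parents, int parentDoc) {
        // Previous parent (or -1 if none); this parent's children start right after it.
        int prevParent = parentDoc == 0 ? -1 : parents.previousSetBit(parentDoc - 1);
        return prevParent + 1;
    }

    public static void main(String[] args) {
        // Doc IDs: 0,1 are children of parent 2; 3,4,5 are children of parent 6.
        BitSet parents = new BitSet();
        parents.set(2);
        parents.set(6);

        System.out.println(firstChild(parents, 2)); // 0
        System.out.println(firstChild(parents, 6)); // 3
    }
}
```

A plain forward iterator cannot answer "previous set bit" without a rescan from the start, which is why the patch either exposes backward iteration or dumps the filter into a set that supports it, such as FixedBitSet.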
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #591: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/591/

1 tests failed.

REGRESSION: org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.testDistribSearch

Error Message:
document count mismatch. control=269 sum(shards)=268 cloudClient=268

Stack Trace:
java.lang.AssertionError: document count mismatch. control=269 sum(shards)=268 cloudClient=268
at __randomizedtesting.SeedInfo.seed([6815C01AF496ADA6:E9F34E0283C9CD9A]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.apache.solr.cloud.AbstractFullDistribZkTestBase.checkShardConsistency(AbstractFullDistribZkTestBase.java:1230)
at org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest.doTest(ChaosMonkeyNothingIsSafeTest.java:208)

Build Log:
[...truncated 52398 lines...]
BUILD FAILED
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:482: The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/build.xml:176: The following error occurred while executing this line:
/usr/home/hudson/hudson-slave/workspace/Lucene-Solr-Maven-4.x/extra-targets.xml:77: Java returned: 1

Total time: 139 minutes 24 seconds
Build step 'Invoke Ant' marked build as failure
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure