[jira] [Commented] (SOLR-11488) Do not allow collections and aliases to have the same name

2017-10-13 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204491#comment-16204491
 ] 

Varun Thacker commented on SOLR-11488:
--

I'd hate to tell users that the alias re-index strategy will no longer work.

bq. This is kind of a pain, but much better than following an alias and 
deleting "new"

Do you mean someone wants to delete data or add data and it ends up in the 
wrong collection?


Here's an idea Shalin and I discussed a long time ago:

Whenever someone creates a collection {{foo}}, Solr should do two things under 
the hood:
- Internally store the collection as {{_foo}} 
- Create an alias "foo"

Never allow users to create aliases with an underscore prefix.

All routing logic in Solr (CloudSolrClient / HttpSolrCall) would then only 
check whether the {{foo}} alias exists, fetch the underlying collection 
details, and process the request.

Today the routing logic first checks if an alias exists. If it doesn't, it 
checks if an actual collection with that name exists. I believe we are trying 
to solve this ambiguity problem to get more consistent routing logic, and the 
approach above probably addresses the concerns?
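
A minimal sketch of the alias-only routing idea described in this comment, assuming an in-memory map stands in for cluster state. The class {{AliasOnlyRouter}} and all of its methods are hypothetical names for illustration, not Solr APIs, and the proposal itself was only an idea under discussion.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names are real Solr classes or methods. */
final class AliasOnlyRouter {
    private static final String INTERNAL_PREFIX = "_";

    // Every user-visible name is an alias; values are internal collection names.
    private final Map<String, String> aliasToInternal = new HashMap<>();

    /** Creating "foo" stores the collection as "_foo" and registers alias "foo". */
    String createCollection(String name) {
        rejectReservedPrefix(name);
        if (aliasToInternal.containsKey(name)) {
            throw new IllegalArgumentException("Name already in use: " + name);
        }
        String internal = INTERNAL_PREFIX + name;
        aliasToInternal.put(name, internal);
        return internal;
    }

    /** Points an alias at whatever collection another visible name resolves to. */
    void pointAlias(String alias, String target) {
        rejectReservedPrefix(alias);
        aliasToInternal.put(alias, resolve(target));
    }

    /** Routing has a single lookup path: the alias map. */
    String resolve(String requested) {
        String internal = aliasToInternal.get(requested);
        if (internal == null) {
            throw new IllegalArgumentException("No such collection or alias: " + requested);
        }
        return internal;
    }

    // Users may never create names that collide with internal collection names.
    private static void rejectReservedPrefix(String name) {
        if (name.startsWith(INTERNAL_PREFIX)) {
            throw new IllegalArgumentException("Names may not start with '_': " + name);
        }
    }
}
```

Under these assumptions, re-pointing "old" at "new" never collides with a collection name, because user-visible names and internal names live in separate namespaces.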

> Do not allow collections and aliases to have the same name
> --
>
> Key: SOLR-11488
> URL: https://issues.apache.org/jira/browse/SOLR-11488
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-11488.patch
>
>
> Currently you can define an alias with the same name as a collection and 
> (perhaps) vice-versa. The more I think about this the worse idea it seems. 
> See the discussion at the linked JIRAs.
> Proposal: We should fail to create a collection if an alias already exists 
> with the same name and vice-versa.
> This should depend on SOLR-11444 and supersede SOLR-11218, this JIRA will 
> include tests that define the intended behavior making SOLR-11218 obsolete. 
> We'll close SOLR-11218 as "contained by" this JIRA.
> This _will_ take away the ability to
> 1> create a collection, call it "old" and index to it.
> 2> decide you want to change the schema
> 3> create a collection call it "new" and index to it.
> 4> create an alias old->new THIS WILL FAIL.
> 5> delete the "old" collection
> People will have to create an alias pointing to "old" and change their 
> clients to use it, then they can do step 4 above
> This is kind of a pain, but much better than following an alias and deleting 
> "new". I'd also argue that it's a maintenance problem to have collections and 
> aliases with the same name.
> What do people think? I'll try to work up a preliminary patch. If we do this, 
> we should probably coordinate committing this and SOLR-11444 and I'll also 
> change the docs to reflect this and upgrade notes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11488) Do not allow collections and aliases to have the same name

2017-10-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204474#comment-16204474
 ] 

David Smiley commented on SOLR-11488:
-

+1 very nice Erick.




[jira] [Updated] (SOLR-11488) Do not allow collections and aliases to have the same name

2017-10-13 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated SOLR-11488:
--
Attachment: SOLR-11488.patch

Preliminary patch. The actual code changes are pretty minimal. I did add a 
method to Aliases.java, not sure I like the name or if it's in the right place, 
but it's late Friday.

I have not run precommit or all tests on this yet, and the patch doesn't 
contain documentation fixes. I'll do these later if we move forward with this.

Let me know what people think, particularly [~dsmiley].




[jira] [Assigned] (SOLR-11464) Unused code in DistributedUpdateProcessor

2017-10-13 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley reassigned SOLR-11464:
---

Assignee: David Smiley

> Unused code in DistributedUpdateProcessor
> -
>
> Key: SOLR-11464
> URL: https://issues.apache.org/jira/browse/SOLR-11464
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Affects Versions: master (8.0)
>Reporter: Gus Heck
>Assignee: David Smiley
>Priority: Minor
> Attachments: SOLR-11464.patch, unused.png
>
>
> While reading code I ran across a couple of suspicious unused 
> values/variables. Thought I would raise this so that folks can consider if 
> something was lost here, or if perhaps we can eliminate an unnecessary call 
> to zookeeper and tidy things up a bit. Screenshot and patch to eliminate 
> shortly...






[jira] [Commented] (SOLR-11299) Time partitioned collections (umbrella issue)

2017-10-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204466#comment-16204466
 ] 

David Smiley commented on SOLR-11299:
-

The timezone bit is for two things:
* the interpretation of the partition time size.  A timezone is useful, and in 
fact necessary, for the same reasons it is for facet.range.gap with dates, which 
supports it.  See SOLR-2690 for context as to why {{TZ}} exists.
* allowing for shorter friendly collection names like mycollection_2017-10-13 
instead of needing to get to the hour.  This isn't a big deal, granted.  I 
don't really like millisecond collection names, sorry.  Hey [~hossman] I recall 
we both attended an LSR presentation (Rocana?) that described a time 
partitioning strategy with the dubious choice of milliseconds in the name and 
you were like, oh yeah, ol collection 1507953042461 -- there's some great data 
in there :-)

RE alias metadata for storing partition ranges... yeah I suppose that's 
possible but I admit I like the lean sufficiency of the names themselves in 
series being adequate.  The only problem I can think of with using the names 
alone is that you must have a complete contiguous series, with no gaps where 
collections haven't been created.  That doesn't seem like a serious 
limitation, I think?  If we wanted metadata on each partition like the start 
and end range, I'm not inclined to think the alias is where it goes -- more 
likely it's metadata on the collection.
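
To illustrate why the timezone matters for day-granularity names, here is a small sketch that derives a collection name from a timestamp. The class {{PartitionNames}} and the {{prefix_date}} naming scheme are assumptions for illustration, not the design adopted in SOLR-11299.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper; the naming scheme is an assumption, not Solr's. */
final class PartitionNames {
    private PartitionNames() {}

    /**
     * Builds a day-granularity name such as "mycollection_2017-10-13".
     * The zone decides which calendar day a given instant falls on.
     */
    static String dailyCollectionName(String prefix, Instant timestamp, ZoneId zone) {
        String day = DateTimeFormatter.ISO_LOCAL_DATE.format(timestamp.atZone(zone));
        return prefix + "_" + day;
    }
}
```

For the millisecond value mentioned above, 1507953042461, this yields {{mycollection_2017-10-14}} in UTC but {{mycollection_2017-10-13}} in America/New_York, so the same document would land in different partitions depending on the zone.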

> Time partitioned collections (umbrella issue)
> -
>
> Key: SOLR-11299
> URL: https://issues.apache.org/jira/browse/SOLR-11299
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
>
> Solr ought to have the ability to manage large-scale time-series data (think 
> logs or sensor data / IOT) itself without a lot of manual/external work.  The 
> most naive and painless approach today is to create a collection with a high 
> numShards with hash routing but this isn't as good as partitioning the 
> underlying indexes by time for these reasons:
> * Easy to scale up/down horizontally as data/requirements change.  (No need 
> to over-provision, use shard splitting, or re-index with different config)
> * Faster queries: 
> ** can search fewer shards, reducing overall load
> ** realtime search is more tractable (since most shards are stable -- 
> good caches)
> ** "recent" shards (that might be queried more) can be allocated to 
> faster hardware
> ** aged out data is simply removed, not marked as deleted.  Deleted docs 
> still have search overhead.
> * Outages of a shard result in a degraded but sometimes still useful system 
> (compare to a random subset missing)
> Ideally you could set this up once and then simply work with a collection 
> (potentially actually an alias) in a normal way (search or update), letting 
> Solr handle the addition of new partitions, removing of old ones, and 
> appropriate routing of requests depending on their nature.
> This issue is an umbrella issue for the particular tasks that will make it 
> all happen -- either subtasks or issue linking.






[JENKINS] Lucene-Solr-NightlyTests-7.x - Build # 61 - Still Unstable

2017-10-13 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-7.x/61/

10 tests failed.
FAILED:  
org.apache.lucene.codecs.memory.TestMemoryDocValuesFormat.testBinaryFixedLengthVsStoredFields

Error Message:
Test abandoned because suite timeout was reached.

Stack Trace:
java.lang.Exception: Test abandoned because suite timeout was reached.
at __randomizedtesting.SeedInfo.seed([6CB5EC62AC1F849B]:0)


FAILED:  
junit.framework.TestSuite.org.apache.lucene.codecs.memory.TestMemoryDocValuesFormat

Error Message:
Suite timeout exceeded (>= 720 msec).

Stack Trace:
java.lang.Exception: Suite timeout exceeded (>= 720 msec).
at __randomizedtesting.SeedInfo.seed([6CB5EC62AC1F849B]:0)


FAILED:  org.apache.lucene.index.TestIndexWriterOnVMError.testCheckpoint

Error Message:
Test abandoned because suite timeout was reached.

Stack Trace:
java.lang.Exception: Test abandoned because suite timeout was reached.
at __randomizedtesting.SeedInfo.seed([B214C1465D2D4E3]:0)


FAILED:  
junit.framework.TestSuite.org.apache.lucene.index.TestIndexWriterOnVMError

Error Message:
Suite timeout exceeded (>= 720 msec).

Stack Trace:
java.lang.Exception: Suite timeout exceeded (>= 720 msec).
at __randomizedtesting.SeedInfo.seed([B214C1465D2D4E3]:0)


FAILED:  org.apache.solr.cloud.RecoveryZkTest.test

Error Message:


Stack Trace:
java.util.concurrent.TimeoutException
at 
__randomizedtesting.SeedInfo.seed([CEDB9D92D513606A:468FA2487BEF0D92]:0)
at 
org.apache.solr.common.cloud.ZkStateReader.waitForState(ZkStateReader.java:1323)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.waitForState(CloudSolrClient.java:438)
at org.apache.solr.cloud.RecoveryZkTest.test(RecoveryZkTest.java:122)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204455#comment-16204455
 ] 

David Smiley commented on SOLR-11444:
-

bq. In looking at code for aliases a great deal lately, aliases point to 
collections only, not to other aliases.

To clarify, I mean that in the code that _uses_ aliases (much of which is 
affected by SOLR-11444 here), it was clear that aliases are not recursively 
resolved, and thus you can't get yourself into an infinite loop.  That said, 
now that I look carefully at the alias creation validation code, it does not 
forbid creating aliases pointing to aliases, and there is even a test that 
shows it's not forbidden!  But no test (in AliasIntegrationTest) actually 
_uses_ (searches with) this alias pointing to an alias.  I added a simple 
attempt to do so to validate my understanding of the code, and it indeed failed.

All this conversation about SOLR-11218 really ought to be on that issue, alas.  
Too late?   Hey [~markrmil...@gmail.com] you added aliases originally and have 
some insight I'm sure.  IMO, we shouldn't let someone create an alias to an 
alias.  As I mentioned above, Solr will let you do it but it doesn't actually 
work, and may be tricky to support with safety.
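
The behavior described in this comment can be sketched as follows, assuming a plain map in place of the real alias state. {{SingleLevelAliases}} and its methods are hypothetical and are not the actual Aliases.java; the strict variant only illustrates the validation suggested above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; this is not the real Aliases.java. */
final class SingleLevelAliases {
    private final Map<String, List<String>> aliasToTargets = new HashMap<>();

    /** Mirrors the behavior described above: any target name is accepted. */
    void createAlias(String alias, List<String> targets) {
        aliasToTargets.put(alias, Collections.unmodifiableList(new ArrayList<>(targets)));
    }

    /** Stricter variant suggested in the comment: refuse targets that are aliases. */
    void createAliasStrict(String alias, List<String> targets) {
        for (String target : targets) {
            if (aliasToTargets.containsKey(target)) {
                throw new IllegalArgumentException("'" + target + "' is an alias, not a collection");
            }
        }
        createAlias(alias, targets);
    }

    /** One lookup and no recursion, so A -> B -> A can never loop. */
    List<String> resolve(String name) {
        List<String> targets = aliasToTargets.get(name);
        return targets != null ? targets : Collections.singletonList(name);
    }
}
```

With the lenient method, an alias "b" pointing at alias "a" resolves to the name "a" rather than to the collection behind it, which is why a search through such an alias fails; the strict method rejects the alias at creation time instead.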

> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch
>
>
> While starting to look at SOLR-11299 I noticed some brittleness in 
> assumptions about Strings that refer to a collection.  Sometimes they are in 
> fact references to comma separated lists, which appears was added with the 
> introduction of collection aliases (an alias can refer to a comma delimited 
> list).  So Java's type system kind of goes out the window when we do this.  
> In one case this leads to a bug -- CloudSolrClient will throw an NPE if you 
> try to write to such an alias.  Sending an update via HTTP will allow it and 
> send it to the first in the list.
> So this issue is about refactoring and some little improvements pertaining to 
> Aliases.java plus certain key spots that deal with collection references.  I 
> don't think I want to go as far as changing the public SolrJ API except to 
> adding documentation on what's possible.
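
One way to make the "String that may be a comma delimited list" assumption explicit is to parse it into a typed list at the boundary. This is a sketch only: {{CollectionRefs}} and its methods are hypothetical, and failing fast on multi-collection write targets is an illustrative choice, not what the SOLR-11444 patch does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for collection references that may be comma delimited. */
final class CollectionRefs {
    private CollectionRefs() {}

    /** Splits "a, b,c" into [a, b, c], ignoring blank entries. */
    static List<String> parse(String ref) {
        List<String> names = new ArrayList<>();
        for (String part : ref.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    /** For updates: insist on exactly one collection instead of failing later. */
    static String requireSingle(String ref) {
        List<String> names = parse(ref);
        if (names.size() != 1) {
            throw new IllegalArgumentException(
                "Expected exactly one collection but got " + names.size() + ": " + ref);
        }
        return names.get(0);
    }
}
```

A caller that writes documents would use {{requireSingle}} and get a descriptive error for a multi-collection reference, while a search path could iterate over the result of {{parse}}.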






[jira] [Commented] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204446#comment-16204446
 ] 

David Smiley commented on SOLR-11444:
-

I applied the patch in SOLR-11218 over my working copy, which has the changes 
here (there were some straightforward merge conflicts in the test).  I ran 
{{testDeleteAliasWithExistingCollectionName}}.  It failed only because of one 
assertEquals that I think is incorrect:
{code:java}
// Now we should still transitively see collection_new
res = cluster.getSolrClient().query("collection_old_reserve", new SolrQuery("*:*"));
assertEquals(1, res.getResults().getNumFound());
{code}
This doesn't make sense to me -- the _alias_ {{collection_old_reserve}} points 
to the _collection_ {{collection_old}} and thus it should find three documents.  
No?  In looking at code for aliases a great deal lately, aliases point to 
collections only, not to other aliases.  Thus you _cannot_ get yourself into an 
infinite loop A -> B -> A (not possible).




[jira] [Commented] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1620#comment-1620
 ] 

ASF GitHub Bot commented on SOLR-11443:
---

Github user CaoManhDat commented on the issue:

https://github.com/apache/lucene-solr/pull/262
  
:+1: 


> Remove the usage of workqueue for Overseer
> --
>
> Key: SOLR-11443
> URL: https://issues.apache.org/jira/browse/SOLR-11443
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
> Attachments: SOLR-11443.patch, SOLR-11443.patch, SOLR-11443.patch
>
>
> If we can remove the usage of workqueue, we can save a lot of IO blocking in 
> Overseer and hence boost performance a lot.






[GitHub] lucene-solr issue #262: SOLR-11443: Remove the usage of workqueue for Overse...

2017-10-13 Thread CaoManhDat
Github user CaoManhDat commented on the issue:

https://github.com/apache/lucene-solr/pull/262
  
:+1: 





[jira] [Commented] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-13 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204443#comment-16204443
 ] 

Scott Blum commented on SOLR-11443:
---

LGTM




[jira] [Created] (SOLR-11488) Do not allow collections and aliases to have the same name

2017-10-13 Thread Erick Erickson (JIRA)
Erick Erickson created SOLR-11488:
-

 Summary: Do not allow collections and aliases to have the same name
 Key: SOLR-11488
 URL: https://issues.apache.org/jira/browse/SOLR-11488
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Erick Erickson
Assignee: Erick Erickson


Currently you can define an alias with the same name as a collection and 
(perhaps) vice-versa. The more I think about this the worse idea it seems. See 
the discussion at the linked JIRAs.

Proposal: We should fail to create a collection if an alias already exists with 
the same name and vice-versa.

This should depend on SOLR-11444 and supersede SOLR-11218, this JIRA will 
include tests that define the intended behavior making SOLR-11218 obsolete. 
We'll close SOLR-11218 as "contained by" this JIRA.

This _will_ take away the ability to
1> create a collection, call it "old" and index to it.
2> decide you want to change the schema
3> create a collection call it "new" and index to it.
4> create an alias old->new THIS WILL FAIL.
5> delete the "old" collection

People will have to create an alias pointing to "old" and change their clients 
to use it, then they can do step 4 above

This is kind of a pain, but much better than following an alias and deleting 
"new". I'd also argue that it's a maintenance problem to have collections and 
aliases with the same name.

What do people think? I'll try to work up a preliminary patch. If we do this, 
we should probably coordinate committing this and SOLR-11444 and I'll also 
change the docs to reflect this and upgrade notes.
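
A minimal sketch of the proposed rule and of the re-index workflow above, assuming in-memory state. {{UniqueNameRegistry}} is a hypothetical class and is not the code in SOLR-11488.patch; the alias name "current" is likewise only an example of the separate alias that clients would switch to.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the proposed rule; not the code in the attached patch. */
final class UniqueNameRegistry {
    private final Set<String> collections = new HashSet<>();
    private final Map<String, String> aliases = new HashMap<>(); // alias -> collection

    void createCollection(String name) {
        if (aliases.containsKey(name)) {
            throw new IllegalStateException("An alias named '" + name + "' already exists");
        }
        collections.add(name);
    }

    void createAlias(String alias, String collection) {
        if (collections.contains(alias)) {
            throw new IllegalStateException("A collection named '" + alias + "' already exists");
        }
        aliases.put(alias, collection); // creating an existing alias re-points it
    }

    void deleteCollection(String name) {
        collections.remove(name);
    }

    public static void main(String[] args) {
        UniqueNameRegistry registry = new UniqueNameRegistry();
        registry.createCollection("old");
        registry.createCollection("new");
        try {
            registry.createAlias("old", "new"); // step 4: rejected, "old" is a collection
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        // Workaround: clients use a separate alias that is re-pointed later.
        registry.createAlias("current", "old");
        registry.createAlias("current", "new");
        registry.deleteCollection("old");
    }
}
```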






[jira] [Comment Edited] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-13 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204325#comment-16204325
 ] 

Cao Manh Dat edited comment on SOLR-11443 at 10/14/17 2:35 AM:
---

[~dragonsinth] We should not unconditionally clear and set dirty, because this 
will trigger fetching all zk node names (which is very expensive). Why should 
we do that if the cache is still valid? Yeah, knowChildren.size() == 0 should 
be addressed too.


was (Author: caomanhdat):
[~dragonsinth] We should not unconditionally clear and set dirty, because this 
will trigger fetching zk node names (which is very expensive). Why should we 
do that if the cache is still valid? Yeah, knowChildren.size() == 0 should be 
addressed too.




[jira] [Commented] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204436#comment-16204436
 ] 

ASF GitHub Bot commented on SOLR-11443:
---

GitHub user CaoManhDat opened a pull request:

https://github.com/apache/lucene-solr/pull/262

SOLR-11443: Remove the usage of workqueue for Overseer

SOLR-11443: Remove the usage of workqueue for Overseer

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CaoManhDat/lucene-solr jira/SOLR-11443

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #262


commit 9543e85460b6d1264857c42b568d4a7f59c06007
Author: Cao Manh Dat 
Date:   2017-10-14T02:33:17Z

SOLR-11443: Remove the usage of workqueue for Overseer







[GitHub] lucene-solr pull request #262: SOLR-11443: Remove the usage of workqueue for...

2017-10-13 Thread CaoManhDat
GitHub user CaoManhDat opened a pull request:

https://github.com/apache/lucene-solr/pull/262

SOLR-11443: Remove the usage of workqueue for Overseer

SOLR-11443: Remove the usage of workqueue for Overseer

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/CaoManhDat/lucene-solr jira/SOLR-11443

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #262


commit 9543e85460b6d1264857c42b568d4a7f59c06007
Author: Cao Manh Dat 
Date:   2017-10-14T02:33:17Z

SOLR-11443: Remove the usage of workqueue for Overseer







Re: 6.6.2 Release

2017-10-13 Thread Erick Erickson
Done both for 6.6 and 6x

On Fri, Oct 13, 2017 at 5:16 PM, Ishan Chattopadhyaya <
ichattopadhy...@gmail.com> wrote:

> Sure Erick, please go ahead.
> I'll start the release later today.
> Thanks,
> Ishan
>
> On Sat, Oct 14, 2017 at 5:44 AM, Erick Erickson 
> wrote:
>
>> Ishan:
>>
>> I have 11297 ready to rock-n-roll, it's just a matter of pushing it. Give
>> me a few.
>>
>> The thing I'm not clear on is what to do with CHANGES.txt. Currently it's
>> in 7.0.1 and 7.1.
>>
>> I propose adding a 6.6.2 section to 6x and including it there and leaving
>> it in the 7.0.1 and 7.1 sections of master.
>>
>> I'll do it that way, you can change it if you want unless I hear back
>> from you sooner.
>>
>> Erick
>>
>> On Fri, Oct 13, 2017 at 4:59 PM, Allison, Timothy B. 
>> wrote:
>>
>>> Sounds good.  Thank you!
>>>
>>>
>>>
>>> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
>>> *Sent:* Friday, October 13, 2017 5:25 PM
>>> *To:* dev@lucene.apache.org
>>> *Subject:* Re: 6.6.2 Release
>>>
>>>
>>>
>>> > Any chance we could get SOLR-11450 in?  I understand if the answer is
>>> no. 
>>>
>>> Currently, I want to have this release out as soon as possible so as to
>>> mitigate the risk exposure of the security vulnerability. Since this is not
>>> committed yet, I'd vote for leaving this out and possibly having it
>>> included in a later release, if needed.
>>>
>>> +1 to SOLR-11297.
>>>
>>>
>>>
>>>
>>>
>>> On Sat, Oct 14, 2017 at 2:32 AM, David Smiley 
>>> wrote:
>>>
>>> Suggested criteria for bug-fix release issues:
>>>
>>> * fixes a bug :-) and doesn't harm backwards-compatibility in the
>>> process
>>>
>>> * helps users upgrade to later versions
>>>
>>> * documentation
>>>
>>>
>>>
>>> +1 to SOLR-11297
>>>
>>>
>>>
>>> I'm not sure on SOLR-11450.  Seems it might introduce a back-compat
>>> issue?
>>>
>>>
>>>
>>> On Fri, Oct 13, 2017 at 4:40 PM Erick Erickson 
>>> wrote:
>>>
>>> I'd also like to get SOLR-11297 in if there are no objections. Ditto if
>>> the answer is no
>>>
>>>
>>>
>>> It's quite a safe fix though.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. 
>>> wrote:
>>>
>>> Any chance we could get SOLR-11450 in?  I understand if the answer is
>>> no. 
>>>
>>>
>>>
>>> Thank you!
>>>
>>>
>>>
>>> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
>>> *Sent:* Friday, October 13, 2017 4:23 PM
>>> *To:* dev@lucene.apache.org
>>> *Subject:* 6.6.2 Release
>>>
>>>
>>>
>>> Hi,
>>>
>>> In light of [0], we need a 6.6.2 release as soon as possible.
>>>
>>> I'd like to volunteer to RM for this release, unless someone else wants
>>> to do so or has an objection.
>>>
>>> Regards,
>>>
>>> Ishan
>>>
>>>
>>>
>>> [0] - https://lucene.apache.org/solr/news.html#12-october-2017-ple
>>> ase-secure-your-apache-solr-servers-since-a-zero-day-explo
>>> it-has-been-reported-on-a-public-mailing-list
>>>
>>>
>>>
>>> --
>>>
>>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>>>
>>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>>> http://www.solrenterprisesearchserver.com
>>>
>>>
>>>
>>
>>
>


[jira] [Commented] (SOLR-11297) Message "Lock held by this virtual machine" during startup. Solr is trying to start some cores twice

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204397#comment-16204397
 ] 

ASF subversion and git services commented on SOLR-11297:


Commit d8e587e227a414d2c991f6fd740073112b9a1cf5 in lucene-solr's branch 
refs/heads/branch_6x from Erick
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d8e587e ]

SOLR-11297: Message 'Lock held by this virtual machine' during startup. Solr is 
trying to start some cores twice


> Message "Lock held by this virtual machine" during startup.  Solr is trying 
> to start some cores twice
> -
>
> Key: SOLR-11297
> URL: https://issues.apache.org/jira/browse/SOLR-11297
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6
>Reporter: Shawn Heisey
>Assignee: Erick Erickson
> Fix For: 7.0.1, 7.1
>
> Attachments: SOLR-11297.patch, SOLR-11297.patch, SOLR-11297.patch, 
> SOLR-11297.sh, solr6_6-startup.log
>
>
> Sometimes when Solr is restarted, I get some "lock held by this virtual 
> machine" messages in the log, and the admin UI has messages about a failure 
> to open a new searcher.  It doesn't happen on all cores, and the list of 
> cores that have the problem changes on subsequent restarts.  The cores that 
> exhibit the problems are working just fine -- the first core load is 
> successful, the failure to open a new searcher is on a second core load 
> attempt, which fails.
> None of the cores in the system are sharing an instanceDir or dataDir.  This 
> has been verified several times.
> The index is sharded manually, and the servers are not running in cloud mode.
> One critical detail to this issue: The cores are all perfectly functional.  
> If somebody is seeing an error message that results in a core not working at 
> all, then it is likely a different issue.






Re: 6.6.2 Release

2017-10-13 Thread Ishan Chattopadhyaya
Sure Erick, please go ahead.
I'll start the release later today.
Thanks,
Ishan

On Sat, Oct 14, 2017 at 5:44 AM, Erick Erickson 
wrote:

> Ishan:
>
> I have 11297 ready to rock-n-roll, it's just a matter of pushing it. Give
> me a few.
>
> The thing I'm not clear on is what to do with CHANGES.txt. Currently it's
> in 7.0.1 and 7.1.
>
> I propose adding a 6.6.2 section to 6x and including it there and leaving
> it in the 7.0.1 and 7.1 sections of master.
>
> I'll do it that way, you can change it if you want unless I hear back from
> you sooner.
>
> Erick
>
> On Fri, Oct 13, 2017 at 4:59 PM, Allison, Timothy B. 
> wrote:
>
>> Sounds good.  Thank you!
>>
>>
>>
>> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
>> *Sent:* Friday, October 13, 2017 5:25 PM
>> *To:* dev@lucene.apache.org
>> *Subject:* Re: 6.6.2 Release
>>
>>
>>
>> > Any chance we could get SOLR-11450 in?  I understand if the answer is
>> no. 
>>
>> Currently, I want to have this release out as soon as possible so as to
>> mitigate the risk exposure of the security vulnerability. Since this is not
>> committed yet, I'd vote for leaving this out and possibly having it
>> included in a later release, if needed.
>>
>> +1 to SOLR-11297.
>>
>>
>>
>>
>>
>> On Sat, Oct 14, 2017 at 2:32 AM, David Smiley 
>> wrote:
>>
>> Suggested criteria for bug-fix release issues:
>>
>> * fixes a bug :-) and doesn't harm backwards-compatibility in the
>> process
>>
>> * helps users upgrade to later versions
>>
>> * documentation
>>
>>
>>
>> +1 to SOLR-11297
>>
>>
>>
>> I'm not sure on SOLR-11450.  Seems it might introduce a back-compat issue?
>>
>>
>>
>> On Fri, Oct 13, 2017 at 4:40 PM Erick Erickson 
>> wrote:
>>
>> I'd also like to get SOLR-11297 in if there are no objections. Ditto if
>> the answer is no
>>
>>
>>
>> It's quite a safe fix though.
>>
>>
>>
>>
>>
>>
>>
>> On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. 
>> wrote:
>>
>> Any chance we could get SOLR-11450 in?  I understand if the answer is no.
>> 
>>
>>
>>
>> Thank you!
>>
>>
>>
>> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
>> *Sent:* Friday, October 13, 2017 4:23 PM
>> *To:* dev@lucene.apache.org
>> *Subject:* 6.6.2 Release
>>
>>
>>
>> Hi,
>>
>> In light of [0], we need a 6.6.2 release as soon as possible.
>>
>> I'd like to volunteer to RM for this release, unless someone else wants
>> to do so or has an objection.
>>
>> Regards,
>>
>> Ishan
>>
>>
>>
>> [0] - https://lucene.apache.org/solr/news.html#12-october-2017-
>> please-secure-your-apache-solr-servers-since-a-zero-day-
>> exploit-has-been-reported-on-a-public-mailing-list
>>
>>
>>
>> --
>>
>> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>>
>> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
>> http://www.solrenterprisesearchserver.com
>>
>>
>>
>
>


Re: 6.6.2 Release

2017-10-13 Thread Erick Erickson
Ishan:

I have 11297 ready to rock-n-roll, it's just a matter of pushing it. Give
me a few.

The thing I'm not clear on is what to do with CHANGES.txt. Currently it's
in 7.0.1 and 7.1.

I propose adding a 6.6.2 section to 6x and including it there and leaving
it in the 7.0.1 and 7.1 sections of master.

I'll do it that way, you can change it if you want unless I hear back from
you sooner.

Erick

On Fri, Oct 13, 2017 at 4:59 PM, Allison, Timothy B. 
wrote:

> Sounds good.  Thank you!
>
>
>
> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
> *Sent:* Friday, October 13, 2017 5:25 PM
> *To:* dev@lucene.apache.org
> *Subject:* Re: 6.6.2 Release
>
>
>
> > Any chance we could get SOLR-11450 in?  I understand if the answer is
> no. 
>
> Currently, I want to have this release out as soon as possible so as to
> mitigate the risk exposure of the security vulnerability. Since this is not
> committed yet, I'd vote for leaving this out and possibly having it
> included in a later release, if needed.
>
> +1 to SOLR-11297.
>
>
>
>
>
> On Sat, Oct 14, 2017 at 2:32 AM, David Smiley 
> wrote:
>
> Suggested criteria for bug-fix release issues:
>
> * fixes a bug :-) and doesn't harm backwards-compatibility in the
> process
>
> * helps users upgrade to later versions
>
> * documentation
>
>
>
> +1 to SOLR-11297
>
>
>
> I'm not sure on SOLR-11450.  Seems it might introduce a back-compat issue?
>
>
>
> On Fri, Oct 13, 2017 at 4:40 PM Erick Erickson 
> wrote:
>
> I'd also like to get SOLR-11297 in if there are no objections. Ditto if
> the answer is no
>
>
>
> It's quite a safe fix though.
>
>
>
>
>
>
>
> On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. 
> wrote:
>
> Any chance we could get SOLR-11450 in?  I understand if the answer is no.
> 
>
>
>
> Thank you!
>
>
>
> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
> *Sent:* Friday, October 13, 2017 4:23 PM
> *To:* dev@lucene.apache.org
> *Subject:* 6.6.2 Release
>
>
>
> Hi,
>
> In light of [0], we need a 6.6.2 release as soon as possible.
>
> I'd like to volunteer to RM for this release, unless someone else wants to
> do so or has an objection.
>
> Regards,
>
> Ishan
>
>
>
> [0] - https://lucene.apache.org/solr/news.html#12-october-
> 2017-please-secure-your-apache-solr-servers-since-a-
> zero-day-exploit-has-been-reported-on-a-public-mailing-list
>
>
>
> --
>
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.
> solrenterprisesearchserver.com
>
>
>


[jira] [Commented] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-13 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204379#comment-16204379
 ] 

Erick Erickson commented on SOLR-11444:
---

Yeah, when I realized that we were recommending that and didn't have a test for 
it, it kind of scared me.

Several things though:
1> I'm interested in preventing this "delete the wrong collection" issue, not 
_necessarily_ keeping this behavior.
2> The test could very well not be doing what I think
3> I'm also interested in codifying the intended behavior with tests.

If it gets too hairy (and I'm thinking of all the persist pain and agony 
for solr.xml historically) we could consider alternatives like preventing 
having an alias with the same name as a collection. People could get by if they:

* index to a new collection
* have a maintenance window in which they
** delete the old collection
** create an alias with the old name, pointing to the new collection

Once that was done the first time, they wouldn't have the problem again as they 
are now using an alias not a collection even though it has the same name as the 
old (deleted) collection. This pre-supposes they can't/won't change whatever 
the app is to use an alias in the first place.

Hmmm, I'm starting to think that preventing an alias from being created with 
the same name as an existing collection is the way to go. Supporting the 
current behavior would be for people who do it that way now and can't/won't 
change client(s) to use an alias. And, as described above, there is a way to 
get by without changing the client; it'd be a bit of a pain, but not much frankly.

And it's not even an all-or-none thing. Say I have collectionA. I create an 
alias to it (aliasA->collectionA). Now I can switch over my client(s) to use 
aliasA on whatever schedule I want, one at a time or all at once. When they're 
all done and tested I can reindex to collectionB and then switch 
aliasA->collectionB.
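
The staged cut-over described above can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the `AliasRegistry` class and its methods are hypothetical and are not Solr's API; the sketch assumes that alias lookup takes precedence over collection lookup and that creation fails on a name clash, as proposed in this thread.

```python
class AliasRegistry:
    """Toy model of collections and aliases (hypothetical, not Solr's API)."""

    def __init__(self):
        self.collections = set()
        self.aliases = {}  # alias name -> target collection name

    def create_collection(self, name):
        # Proposed rule: a collection may not share a name with an alias.
        if name in self.aliases:
            raise ValueError(f"alias named {name!r} already exists")
        self.collections.add(name)

    def set_alias(self, alias, target):
        # Proposed rule: an alias may not share a name with a collection.
        if alias in self.collections:
            raise ValueError(f"collection named {alias!r} already exists")
        if target not in self.collections:
            raise KeyError(f"no such collection: {target!r}")
        self.aliases[alias] = target  # creating and re-pointing are the same step

    def resolve(self, name):
        # Aliases are checked first, then real collections.
        if name in self.aliases:
            return self.aliases[name]
        if name in self.collections:
            return name
        raise KeyError(f"unknown collection or alias: {name!r}")


registry = AliasRegistry()
registry.create_collection("collectionA")
registry.set_alias("aliasA", "collectionA")  # clients migrate to aliasA over time
registry.create_collection("collectionB")    # reindex into collectionB
registry.set_alias("aliasA", "collectionB")  # one switch covers every migrated client
```

With the name clash rejected at creation time, `resolve` never has to choose between an alias and a collection of the same name, which is the ambiguity being discussed.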

WDYT? I can claim that having an alias and a collection with the same name is 
inherently unsafe/confusing without any qualms at all.

That would leave the question of what we should do at initialization time if we 
find a collection and alias with the same name from something previous though. 
Big fat ERROR THIS HAS UNDEFINED BEHAVIOR message or something?

> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch
>
>
> While starting to look at SOLR-11299 I noticed some brittleness in 
> assumptions about Strings that refer to a collection.  Sometimes they are in 
> fact references to comma-separated lists, which appear to have been added with the 
> introduction of collection aliases (an alias can refer to a comma delimited 
> list).  So Java's type system kind of goes out the window when we do this.  
> In one case this leads to a bug -- CloudSolrClient will throw an NPE if you 
> try to write to such an alias.  Sending an update via HTTP will allow it and 
> send it to the first in the list.
> So this issue is about refactoring and some little improvements pertaining to 
> Aliases.java plus certain key spots that deal with collection references.  I 
> don't think I want to go as far as changing the public SolrJ API except to 
> add documentation on what's possible.






RE: 6.6.2 Release

2017-10-13 Thread Allison, Timothy B.
Sounds good.  Thank you!

From: Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
Sent: Friday, October 13, 2017 5:25 PM
To: dev@lucene.apache.org
Subject: Re: 6.6.2 Release

> Any chance we could get SOLR-11450 in?  I understand if the answer is no. 
Currently, I want to have this release out as soon as possible so as to 
mitigate the risk exposure of the security vulnerability. Since this is not 
committed yet, I'd vote for leaving this out and possibly having it included in 
a later release, if needed.
+1 to SOLR-11297.


On Sat, Oct 14, 2017 at 2:32 AM, David Smiley 
> wrote:
Suggested criteria for bug-fix release issues:
* fixes a bug :-) and doesn't harm backwards-compatibility in the process
* helps users upgrade to later versions
* documentation

+1 to SOLR-11297

I'm not sure on SOLR-11450.  Seems it might introduce a back-compat issue?

On Fri, Oct 13, 2017 at 4:40 PM Erick Erickson 
> wrote:
I'd also like to get SOLR-11297 in if there are no objections. Ditto if the 
answer is no

It's quite a safe fix though.



On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. 
> wrote:
Any chance we could get SOLR-11450 in?  I understand if the answer is no. 

Thank you!

From: Ishan Chattopadhyaya 
[mailto:ichattopadhy...@gmail.com]
Sent: Friday, October 13, 2017 4:23 PM
To: dev@lucene.apache.org
Subject: 6.6.2 Release

Hi,
In light of [0], we need a 6.6.2 release as soon as possible.
I'd like to volunteer to RM for this release, unless someone else wants to do 
so or has an objection.
Regards,
Ishan


[0] - 
https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list

--
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book: 
http://www.solrenterprisesearchserver.com



[jira] [Commented] (SOLR-11299) Time partitioned collections (umbrella issue)

2017-10-13 Thread Gus Heck (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204365#comment-16204365
 ] 

Gus Heck commented on SOLR-11299:
-

I was thinking about the timezone bit. Just as applications normally store 
data as UTC and convert when needed, it seems to me we should dodge the 
timezone metadata read-only issue and always name our partitions in terms of 
UTC. Conversions can be done based on the timezone portion of the key date 
field in cases where we are not receiving UTC; if there is no timezone 
specifier, assume UTC.

I've seen an implementation of this sort of thing where routing was based on 
DateFormat parsing the partition names on each request, but I could also 
imagine that we might simplify things by naming the partitions based on epoch 
milliseconds, which could also be kept in alias metadata as a sorted list of 
partition boundaries with partitions named for their (inclusive) lower bound. 
Allowing pretty, human friendly collection names that are formatted versions of 
the lower bounds and mapping the collection start time values to those names 
could be a follow on enhancement just adding a layer of indirection... 
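
A minimal sketch of the epoch-millisecond idea, assuming a sorted list of inclusive lower bounds kept somewhere such as alias metadata. The function names `to_epoch_millis` and `route` are hypothetical and nothing here is Solr code; timestamps without an offset are treated as UTC, as suggested above.

```python
import bisect
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(iso_timestamp):
    """Parse an ISO-8601 timestamp; a missing offset is assumed to mean UTC."""
    dt = datetime.fromisoformat(iso_timestamp)
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def route(boundaries, epoch_millis):
    """Return the partition name (its inclusive lower bound) for a timestamp.

    `boundaries` must be sorted ascending.
    """
    i = bisect.bisect_right(boundaries, epoch_millis) - 1
    if i < 0:
        raise ValueError("timestamp precedes the oldest partition")
    return str(boundaries[i])


# Two monthly partitions, named by their lower bound in epoch milliseconds.
boundaries = [
    to_epoch_millis("2017-10-01T00:00:00"),
    to_epoch_millis("2017-11-01T00:00:00"),
]
partition = route(boundaries, to_epoch_millis("2017-11-01T02:00:00+05:30"))
```

Because the offset is applied before routing, a document stamped early on 1 November at +05:30 still lands in the October partition. Human-friendly names could later be layered on top as a mapping from lower bound to display name.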



> Time partitioned collections (umbrella issue)
> -
>
> Key: SOLR-11299
> URL: https://issues.apache.org/jira/browse/SOLR-11299
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
>
> Solr ought to have the ability to manage large-scale time-series data (think 
> logs or sensor data / IOT) itself without a lot of manual/external work.  The 
> most naive and painless approach today is to create a collection with a high 
> numShards with hash routing but this isn't as good as partitioning the 
> underlying indexes by time for these reasons:
> * Easy to scale up/down horizontally as data/requirements change.  (No need 
> to over-provision, use shard splitting, or re-index with different config)
> * Faster queries: 
> ** can search fewer shards, reducing overall load
> ** realtime search is more tractable (since most shards are stable -- 
> good caches)
> ** "recent" shards (that might be queried more) can be allocated to 
> faster hardware
> ** aged out data is simply removed, not marked as deleted.  Deleted docs 
> still have search overhead.
> * Outages of a shard result in a degraded but sometimes useful system 
> nonetheless (compare to random subset missing)
> Ideally you could set this up once and then simply work with a collection 
> (potentially actually an alias) in a normal way (search or update), letting 
> Solr handle the addition of new partitions, removing of old ones, and 
> appropriate routing of requests depending on their nature.
> This issue is an umbrella issue for the particular tasks that will make it 
> all happen -- either subtasks or issue linking.






Re: [VOTE] Release Lucene/Solr 7.1.0 RC2

2017-10-13 Thread Steve Rowe
+1

Changes, docs and javadocs look good.

Smoke tester says: SUCCESS! [0:47:53.532695]

--
Steve
www.lucidworks.com

> On Oct 13, 2017, at 4:34 PM, Shalin Shekhar Mangar  wrote:
> 
> Answering myself, I think we should go ahead with this RC. I've added
> this entry to CHANGES.txt in all branches and it will be picked up in
> case there needs to be a re-spin due to other reasons.
> 
> On Fri, Oct 13, 2017 at 8:16 PM, Shalin Shekhar Mangar
>  wrote:
>> I just noticed that in the hurry to create this RC, I forgot to add
>> SOLR-10335 to Solr's CHANGES.txt. Is that worth a re-spin?
>> 
>> On Fri, Oct 13, 2017 at 7:25 PM, Shalin Shekhar Mangar
>>  wrote:
>>> Please vote for release candidate 2 for Lucene/Solr 7.1.0
>>> 
>>> The artifacts can be downloaded from:
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC2-rev84c90ad2c0218156c840e19a64d72b8a38550659
>>> 
>>> You can run the smoke tester directly with this command:
>>> 
>>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC2-rev84c90ad2c0218156c840e19a64d72b8a38550659
>>> 
>>> Smoke tester passed for me.
>>> SUCCESS! [0:40:53.908967]
>>> 
>>> Here's my +1 to release.
>>> 
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
>> 
>> 
>> 
>> --
>> Regards,
>> Shalin Shekhar Mangar.
> 
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.



[jira] [Resolved] (LUCENE-5753) Refresh UAX29URLEmailTokenizer's TLD list

2017-10-13 Thread Steve Rowe (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-5753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved LUCENE-5753.

   Resolution: Fixed
     Assignee: Steve Rowe
Fix Version/s: master (8.0), 7.1

> Refresh UAX29URLEmailTokenizer's TLD list
> -
>
> Key: LUCENE-5753
> URL: https://issues.apache.org/jira/browse/LUCENE-5753
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/analysis
>Reporter: Steve Merritt
>Assignee: Steve Rowe
> Fix For: 7.1, master (8.0)
>
> Attachments: LUCENE-5753.patch
>
>
> uax_url_email analyzer appears unable to recognize the ".local" TLD among 
> others. Bug can be reproduced by
> curl -XGET 
> "ADDRESS/INDEX/_analyze?text=First%20Last%20lname@section.mycorp.local=uax_url_email"
> will parse "ln...@section.my" and "corp.local" as separate tokens, as opposed 
> to
> curl -XGET 
> "ADDRESS/INDEX/_analyze?text=first%20last%20ln...@section.mycorp.org=uax_url_email"
> which will recognize "ln...@section.mycorp.org".
> Can this be fixed by updating to a newer version? I am running ElasticSearch 
> 0.90.5 and whatever Lucene version sits underneath that. My suspicion is that 
> the TLD list the analyzer relies on (http://www.internic.net/zones/root.zone, 
> I think?) is incomplete and needs updating. 






[JENKINS] Lucene-Solr-Tests-master - Build # 2120 - Still Unstable

2017-10-13 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Tests-master/2120/

5 tests failed.
FAILED:  org.apache.solr.cloud.autoscaling.ExecutePlanActionTest.testIntegration

Error Message:
expected:<2> but was:<1>

Stack Trace:
java.lang.AssertionError: expected:<2> but was:<1>
at __randomizedtesting.SeedInfo.seed([F06683C6E6B2D1AB:40078DEAC38D708E]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at org.junit.Assert.assertEquals(Assert.java:456)
at org.apache.solr.cloud.autoscaling.ExecutePlanActionTest.testIntegration(ExecutePlanActionTest.java:219)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  org.apache.solr.client.solrj.io.stream.StreamExpressionTest.testParallelExecutorStream

Error Message:


Stack Trace:
java.lang.AssertionError
at 

[jira] [Commented] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-13 Thread Cao Manh Dat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204325#comment-16204325
 ] 

Cao Manh Dat commented on SOLR-11443:
-

[~dragonsinth] We should not unconditionally clear and set dirty, because this 
will trigger fetching the ZK node names (which is very expensive). Why should 
we do that if the cache is still valid? Yeah, knowChildren.size() == 0 should 
be addressed too.
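
The concern about invalidating unconditionally can be illustrated with a toy cache. This is a sketch under assumptions, not the Overseer queue code: the `CachedChildren` class, its methods, and the `fetch_children` callback are hypothetical stand-ins for an expensive child listing, and the empty-cache case mentioned above is deliberately left out because the thread does not say how it should be handled.

```python
class CachedChildren:
    """Toy cache of a queue's child node names (hypothetical, not Solr code)."""

    def __init__(self, fetch_children):
        self._fetch_children = fetch_children  # stands in for an expensive listing
        self._known = []
        self._dirty = True  # nothing has been cached yet

    def mark_dirty(self):
        # Called only when something signals that the cached view is stale.
        self._dirty = True

    def first(self):
        # Refetch only when the cache is known to be stale.
        if self._dirty:
            self._known = sorted(self._fetch_children())
            self._dirty = False
        return self._known[0] if self._known else None


fetch_calls = []


def fetch():
    fetch_calls.append(1)
    return ["qn-0000000002", "qn-0000000001"]


cache = CachedChildren(fetch)
head = cache.first()   # first call pays for the listing
again = cache.first()  # served from the cache, no second listing
```

Marking the cache dirty on every operation would turn each `first()` call into a full listing, which is the cost the comment above is warning about.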

> Remove the usage of workqueue for Overseer
> --
>
> Key: SOLR-11443
> URL: https://issues.apache.org/jira/browse/SOLR-11443
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
> Attachments: SOLR-11443.patch, SOLR-11443.patch, SOLR-11443.patch
>
>
> If we can remove the usage of workqueue, we can save a lot of IO blocking in 
> the Overseer and hence boost performance considerably.






Re: 6.6.2 Release

2017-10-13 Thread Shalin Shekhar Mangar
Hi Ishan,

I've backported SOLR-10335 to branch_6_6 so this is ready to go.
Thanks for volunteering for the release. I would have volunteered
after releasing 7.1 but you beat me to it.

On Fri, Oct 13, 2017 at 8:23 PM, Ishan Chattopadhyaya
 wrote:
> Hi,
> In light of [0], we need a 6.6.2 release as soon as possible.
>
> I'd like to volunteer to RM for this release, unless someone else wants to
> do so or has an objection.
>
> Regards,
> Ishan
>
>
> [0] -
> https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list



-- 
Regards,
Shalin Shekhar Mangar.




[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204264#comment-16204264
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 04e225980b2def9ba34ca0f07b45d8e5aa01784f in lucene-solr's branch 
refs/heads/branch_6_6 from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=04e2259 ]

SOLR-10335: Merge branch 'SOLR-10335' of 
https://github.com/tballison/lucene-solr

(cherry picked from commit 3a098ec)


> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 7.1, 6.6.2
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204263#comment-16204263
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 04e225980b2def9ba34ca0f07b45d8e5aa01784f in lucene-solr's branch 
refs/heads/branch_6_6 from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=04e2259 ]

SOLR-10335: Merge branch 'SOLR-10335' of 
https://github.com/tballison/lucene-solr

(cherry picked from commit 3a098ec)


> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 7.1, 6.6.2
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






Re: 6.6.2 Release

2017-10-13 Thread Ishan Chattopadhyaya
Thanks Uwe, you beat me to it (I was trying to do that myself).
I'll backport SOLR-11297 and start the release process in about 3 hours
from now.

Regards,
Ishan

On Sat, Oct 14, 2017 at 3:04 AM, Uwe Schindler  wrote:

> You can start the release, the second part of the security incident
> (SOLR-11482) is also in.
>
>
>
> Uwe
>
>
>
> -
>
> Uwe Schindler
>
> Achterdiek 19, D-28357 Bremen
>
> http://www.thetaphi.de
>
> eMail: u...@thetaphi.de
>
>
>
> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
> *Sent:* Friday, October 13, 2017 11:25 PM
> *To:* dev@lucene.apache.org
> *Subject:* Re: 6.6.2 Release
>
>
>
> > Any chance we could get SOLR-11450 in?  I understand if the answer is
> no. 
>
> Currently, I want to have this release out as soon as possible so as to
> mitigate the risk exposure of the security vulnerability. Since this is not
> committed yet, I'd vote for leaving this out and possibly having it
> included in a later release, if needed.
>
> +1 to SOLR-11297.
>
>
>
>
>
> On Sat, Oct 14, 2017 at 2:32 AM, David Smiley 
> wrote:
>
> Suggested criteria for bug-fix release issues:
>
> * fixes a bug :-) and doesn't harm backwards-compatibility in the
> process
>
> * helps users upgrade to later versions
>
> * documentation
>
>
>
> +1 to SOLR-11297
>
>
>
> I'm not sure on SOLR-11450.  Seems it might introduce a back-compat issue?
>
>
>
> On Fri, Oct 13, 2017 at 4:40 PM Erick Erickson 
> wrote:
>
> I'd also like to get SOLR-11297 in if there are no objections. Ditto if
> the answer is no
>
>
>
> It's quite a safe fix though.
>
>
>
>
>
>
>
> On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. 
> wrote:
>
> Any chance we could get SOLR-11450 in?  I understand if the answer is no.
> 
>
>
>
> Thank you!
>
>
>
> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
> *Sent:* Friday, October 13, 2017 4:23 PM
> *To:* dev@lucene.apache.org
> *Subject:* 6.6.2 Release
>
>
>
> Hi,
>
> In light of [0], we need a 6.6.2 release as soon as possible.
>
> I'd like to volunteer to RM for this release, unless someone else wants to
> do so or has an objection.
>
> Regards,
>
> Ishan
>
>
>
> [0] - https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list
>
>
>
> --
>
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
>
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
>
>
>


[jira] [Commented] (SOLR-11450) ComplexPhraseQParserPlugin not running charfilter for some multiterm queries in 6.x

2017-10-13 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204233#comment-16204233
 ] 

Jan Høydahl commented on SOLR-11450:


I'll pass on this I'm afraid.

> ComplexPhraseQParserPlugin not running charfilter for some multiterm queries 
> in 6.x 
> 
>
> Key: SOLR-11450
> URL: https://issues.apache.org/jira/browse/SOLR-11450
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.1
>Reporter: Tim Allison
>Priority: Minor
>  Labels: patch-with-test
> Attachments: SOLR-11450-unit-test.patch, SOLR-11450.patch
>
>
> On the user list, [~bjarkebm] reported that the charfilter is not being 
> applied in PrefixQueries in the ComplexPhraseQParserPlugin in 6.x.  Bjarke 
> fixed my proposed unit tests to prove this failure. All appears to work in 
> 7.x and trunk. If there are plans to release a 6.6.2, let's fold this in.






RE: 6.6.2 Release

2017-10-13 Thread Uwe Schindler
You can start the release, the second part of the security incident 
(SOLR-11482) is also in.

 

Uwe

 

-

Uwe Schindler

Achterdiek 19, D-28357 Bremen

http://www.thetaphi.de  

eMail: u...@thetaphi.de

 

From: Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com] 
Sent: Friday, October 13, 2017 11:25 PM
To: dev@lucene.apache.org
Subject: Re: 6.6.2 Release

 

> Any chance we could get SOLR-11450 in?  I understand if the answer is no. 

Currently, I want to have this release out as soon as possible so as to 
mitigate the risk exposure of the security vulnerability. Since this is not 
committed yet, I'd vote for leaving this out and possibly having it included in 
a later release, if needed.

+1 to SOLR-11297. 

 

 

On Sat, Oct 14, 2017 at 2:32 AM, David Smiley wrote:

Suggested criteria for bug-fix release issues:

* fixes a bug :-) and doesn't harm backwards-compatibility in the process

* helps users upgrade to later versions

* documentation

 

+1 to SOLR-11297

 

I'm not sure on SOLR-11450.  Seems it might introduce a back-compat issue?

 

On Fri, Oct 13, 2017 at 4:40 PM Erick Erickson wrote:

I'd also like to get SOLR-11297 in if there are no objections. Ditto if the 
answer is no

 

It's quite a safe fix though.

 

 

 

On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. wrote:

Any chance we could get SOLR-11450 in?  I understand if the answer is no. 

 

Thank you!

 

From: Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
Sent: Friday, October 13, 2017 4:23 PM
To: dev@lucene.apache.org  
Subject: 6.6.2 Release

 

Hi,

In light of [0], we need a 6.6.2 release as soon as possible.

I'd like to volunteer to RM for this release, unless someone else wants to do 
so or has an objection.

Regards,

Ishan



[0] - 
https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list

 

-- 

Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker

LinkedIn: http://linkedin.com/in/davidwsmiley | Book: 
http://www.solrenterprisesearchserver.com

 



[jira] [Commented] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204226#comment-16204226
 ] 

David Smiley commented on SOLR-11444:
-

No prob Erick; I'll investigate and see what's going wrong.  BTW I didn't know 
aliases could be named after existing collections!  Seems like something 
possibly hard to get right.

> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch
>
>
> While starting to look at SOLR-11299 I noticed some brittleness in 
> assumptions about Strings that refer to a collection.  Sometimes they are in 
> fact references to comma separated lists, which appears was added with the 
> introduction of collection aliases (an alias can refer to a comma delimited 
> list).  So Java's type system kind of goes out the window when we do this.  
> In one case this leads to a bug -- CloudSolrClient will throw an NPE if you 
> try to write to such an alias.  Sending an update via HTTP will allow it and 
> send it to the first in the list.
> So this issue is about refactoring and some little improvements pertaining to 
> Aliases.java plus certain key spots that deal with collection references.  I 
> don't think I want to go as far as changing the public SolrJ API except to 
> adding documentation on what's possible.






Re: 6.6.2 Release

2017-10-13 Thread Ishan Chattopadhyaya
> Any chance we could get SOLR-11450 in?  I understand if the answer is no.

Currently, I want to have this release out as soon as possible so as to
mitigate the risk exposure of the security vulnerability. Since this is not
committed yet, I'd vote for leaving this out and possibly having it
included in a later release, if needed.

+1 to SOLR-11297.


On Sat, Oct 14, 2017 at 2:32 AM, David Smiley 
wrote:

> Suggested criteria for bug-fix release issues:
> * fixes a bug :-) and doesn't harm backwards-compatibility in the
> process
> * helps users upgrade to later versions
> * documentation
>
> +1 to SOLR-11297
>
> I'm not sure on SOLR-11450.  Seems it might introduce a back-compat issue?
>
> On Fri, Oct 13, 2017 at 4:40 PM Erick Erickson 
> wrote:
>
>> I'd also like to get SOLR-11297 in if there are no objections. Ditto if
>> the answer is no
>>
>> It's quite a safe fix though.
>>
>>
>>
>> On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. 
>> wrote:
>>
>>> Any chance we could get SOLR-11450 in?  I understand if the answer is
>>> no. 
>>>
>>>
>>>
>>> Thank you!
>>>
>>>
>>>
>>> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
>>> *Sent:* Friday, October 13, 2017 4:23 PM
>>> *To:* dev@lucene.apache.org
>>> *Subject:* 6.6.2 Release
>>>
>>>
>>>
>>> Hi,
>>>
>>> In light of [0], we need a 6.6.2 release as soon as possible.
>>>
>>> I'd like to volunteer to RM for this release, unless someone else wants
>>> to do so or has an objection.
>>>
>>> Regards,
>>>
>>> Ishan
>>>
>>>
>>>
>>> [0] - https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list
>>>
>>
>> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book: http://www.solrenterprisesearchserver.com
>


[jira] [Commented] (SOLR-11292) Querying against an alias can lead to incorrect routing

2017-10-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204203#comment-16204203
 ] 

David Smiley commented on SOLR-11292:
-

I think it's bizarre that you can create an alias with a name that is also that 
of a collection.  It seems ripe for problems.

I'm not sure how this particular issue is happening.  Maybe SOLR-11444 fixes 
it?  CloudSolrClient.sendRequest should resolve the collection list and 
aliases to a list of target collections.  Then it should loop over the slices 
across all of them to build a list of URLs to the nodes it will communicate 
with.  SOLR-11444 improves the clarity of this logic substantially IMO; I'm not 
sure if there is a change in behavior with respect to the issue here.  
[~varunthacker] might you apply SOLR-11444 and see if there is an impact?
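
As a rough illustration of the two-step resolution described above (expand 
names and aliases to target collections, then gather node URLs across all of 
them), here is a toy Java model.  It is not SolrJ code: the class 
AliasRoutingSketch, its methods, and the plain maps standing in for the alias 
table and cluster state are all assumptions made for the example, and slices 
are flattened to one list of node URLs per collection.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are SolrJ APIs.
final class AliasRoutingSketch {

    // Expand a comma-delimited request target into concrete collection names.
    // Assumption: an alias, if present, wins over a collection of the same name.
    static List<String> resolveTargets(String requested, Map<String, String> aliases) {
        Set<String> targets = new LinkedHashSet<>();
        for (String name : requested.split(",")) {
            String trimmed = name.trim();
            String aliased = aliases.get(trimmed);
            String expansion = (aliased != null) ? aliased : trimmed;
            // An alias may itself point at a comma-delimited list.
            for (String collection : expansion.split(",")) {
                targets.add(collection.trim());
            }
        }
        return new ArrayList<>(targets);
    }

    // Collect the node URLs hosting any part of the resolved collections.
    static Set<String> nodeUrls(List<String> collections,
                                Map<String, List<String>> collectionToNodes) {
        Set<String> urls = new LinkedHashSet<>();
        for (String collection : collections) {
            List<String> nodes = collectionToNodes.get(collection);
            urls.addAll(nodes != null ? nodes : Collections.<String>emptyList());
        }
        return urls;
    }

    public static void main(String[] args) {
        Map<String, String> aliases = new HashMap<>();
        aliases.put("collection1", "collection2"); // alias shadows a real collection
        System.out.println(resolveTargets("collection1", aliases));
    }
}
```

In this model the node list is built from the resolved targets, never from the 
name the caller passed in, which is the property the issue is asking for.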

> Querying against an alias can lead to incorrect routing
> ---
>
> Key: SOLR-11292
> URL: https://issues.apache.org/jira/browse/SOLR-11292
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Varun Thacker
>Assignee: Varun Thacker
>
> collection1 has 2 shards and 1 replica
> collection2 has 8 shards and 1 replica
> I have 8 nodes so collection2 is spread across all 8 , while collection1 is 
> hosted by two nodes
> If we create an alias called "collection1" and point it to "collection2".
> Querying against the alias "collection1" works as expected but what I noticed 
> was the top level queries would only hit 2 out of the 8 JVMs when querying 
> using SolrJ
> It turns out that SolrJ is using the state.json of collection1 ( the actual 
> collection ) and routing queries to only those nodes.
> There are two negatives to this:
>  - If those two nodes are down all queries fail.
>  - Top level queries are only routed to those two nodes thus causing a skew 
> in the top level requests
> The obvious solution would be to use the state.json file of the underlying 
> collection that the alias is pointing to  . But if we have the alias pointing 
> to multiple collections then this might get tricky?






Re: 6.6.2 Release

2017-10-13 Thread David Smiley
Suggested criteria for bug-fix release issues:
* fixes a bug :-) and doesn't harm backwards-compatibility in the
process
* helps users upgrade to later versions
* documentation

+1 to SOLR-11297

I'm not sure on SOLR-11450.  Seems it might introduce a back-compat issue?

On Fri, Oct 13, 2017 at 4:40 PM Erick Erickson 
wrote:

> I'd also like to get SOLR-11297 in if there are no objections. Ditto if
> the answer is no
>
> It's quite a safe fix though.
>
>
>
> On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. 
> wrote:
>
>> Any chance we could get SOLR-11450 in?  I understand if the answer is no.
>> 
>>
>>
>>
>> Thank you!
>>
>>
>>
>> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
>> *Sent:* Friday, October 13, 2017 4:23 PM
>> *To:* dev@lucene.apache.org
>> *Subject:* 6.6.2 Release
>>
>>
>>
>> Hi,
>>
>> In light of [0], we need a 6.6.2 release as soon as possible.
>>
>> I'd like to volunteer to RM for this release, unless someone else wants
>> to do so or has an objection.
>>
>> Regards,
>>
>> Ishan
>>
>>
>>
>> [0] -
>> https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list
>>
>
> --
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


[jira] [Commented] (SOLR-11299) Time partitioned collections (umbrella issue)

2017-10-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204175#comment-16204175
 ] 

David Smiley commented on SOLR-11299:
-

Nice article on some of these ideas by [~markrmil...@gmail.com]:
https://blog.cloudera.com/blog/2013/10/collection-aliasing-near-real-time-search-for-really-big-data/
It's 4 years old but still relevant.

> Time partitioned collections (umbrella issue)
> -
>
> Key: SOLR-11299
> URL: https://issues.apache.org/jira/browse/SOLR-11299
> Project: Solr
>  Issue Type: New Feature
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
>
> Solr ought to have the ability to manage large-scale time-series data (think 
> logs or sensor data / IOT) itself without a lot of manual/external work.  The 
> most naive and painless approach today is to create a collection with a high 
> numShards with hash routing but this isn't as good as partitioning the 
> underlying indexes by time for these reasons:
> * Easy to scale up/down horizontally as data/requirements change.  (No need 
> to over-provision, use shard splitting, or re-index with different config)
> * Faster queries: 
> ** can search fewer shards, reducing overall load
> ** realtime search is more tractable (since most shards are stable -- 
> good caches)
> ** "recent" shards (that might be queried more) can be allocated to 
> faster hardware
> ** aged out data is simply removed, not marked as deleted.  Deleted docs 
> still have search overhead.
> * Outages of a shard result in a degraded but sometimes a useful system 
> nonetheless (compare to random subset missing)
> Ideally you could set this up once and then simply work with a collection 
> (potentially actually an alias) in a normal way (search or update), letting 
> Solr handle the addition of new partitions, removing of old ones, and 
> appropriate routing of requests depending on their nature.
> This issue is an umbrella issue for the particular tasks that will make it 
> all happen -- either subtasks or issue linking.






[JENKINS] Lucene-Solr-master-Linux (32bit/jdk1.8.0_144) - Build # 20672 - Unstable!

2017-10-13 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/20672/
Java: 32bit/jdk1.8.0_144 -server -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.cloud.ShardSplitTest.testSplitWithChaosMonkey

Error Message:
There are still nodes recoverying - waited for 330 seconds

Stack Trace:
java.lang.AssertionError: There are still nodes recoverying - waited for 330 seconds
at __randomizedtesting.SeedInfo.seed([8B26241523C4B78D:1F7C462C21C09]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:185)
at org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:140)
at org.apache.solr.cloud.AbstractDistribZkTestBase.waitForRecoveriesToFinish(AbstractDistribZkTestBase.java:135)
at org.apache.solr.cloud.AbstractFullDistribZkTestBase.waitForRecoveriesToFinish(AbstractFullDistribZkTestBase.java:908)
at org.apache.solr.cloud.ShardSplitTest.testSplitWithChaosMonkey(ShardSplitTest.java:436)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:993)
at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:968)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
   

[jira] [Commented] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-13 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204174#comment-16204174
 ] 

Erick Erickson commented on SOLR-11444:
---

-1 until we resolve the additional test issue in SOLR-11218. I hope to have 
some time this weekend to dig further.

The issue for me is that testDeleteAliasWithExistingCollectionName fails with 
the patch on this JIRA but did not before. The reason I'm persnickety about 
this one is that this scenario doing the wrong thing is very painful.

A common strategy for reindexing from scratch is to create a new collection, 
index to that and then define an alias to switch atomically. Problem is that 
it's often the case that the user won't already have an alias set up, so you 
wind up with an alias and collection name that are identical, but the alias 
does _not_ point to the collection named the same. So far, so good.

But now I want to delete the old collection. If the alias is followed it 
deletes the wrong collection. Here's an abbreviated set of steps.

> create collection old
> create collection new
> alias old->new
> delete old

If that sequence deletes the new collection it's pretty bad.
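
To make the hazard concrete, here is a small self-contained Java model of the 
four steps above.  It is only a sketch: DeleteAliasScenario and its methods 
are invented for illustration and say nothing about how Solr's Collections API 
is actually implemented.  The followAlias flag switches between the two 
possible behaviors of "delete old".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for cluster state; not Solr code.
final class DeleteAliasScenario {
    final Set<String> collections = new TreeSet<>();
    final Map<String, String> aliases = new HashMap<>();

    void createCollection(String name) {
        collections.add(name);
    }

    void createAlias(String alias, String target) {
        aliases.put(alias, target);
    }

    // followAlias == true models the painful behavior (the alias is resolved
    // before deleting); false deletes the collection with the literal name.
    // Returns the name of the collection that was actually removed.
    String delete(String name, boolean followAlias) {
        String victim = name;
        if (followAlias && aliases.containsKey(name)) {
            victim = aliases.get(name);
        }
        collections.remove(victim);
        return victim;
    }

    public static void main(String[] args) {
        DeleteAliasScenario cluster = new DeleteAliasScenario();
        cluster.createCollection("old");
        cluster.createCollection("new");
        cluster.createAlias("old", "new");
        System.out.println("deleted: " + cluster.delete("old", true));
        System.out.println("remaining: " + cluster.collections);
    }
}
```

With followAlias set to true the freshly indexed collection is the one that 
disappears, which is exactly the outcome the test is meant to guard against.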

Now, it may just be that the test isn't doing what I think; I really haven't 
had time to look yet. I'll see what I can get through this weekend.

The second test in SOLR-11218 with the nocommit is a separate concern, "it does 
what it does" may be fine there.

> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch
>
>
> While starting to look at SOLR-11299 I noticed some brittleness in 
> assumptions about Strings that refer to a collection.  Sometimes they are in 
> fact references to comma separated lists, which appears was added with the 
> introduction of collection aliases (an alias can refer to a comma delimited 
> list).  So Java's type system kind of goes out the window when we do this.  
> In one case this leads to a bug -- CloudSolrClient will throw an NPE if you 
> try to write to such an alias.  Sending an update via HTTP will allow it and 
> send it to the first in the list.
> So this issue is about refactoring and some little improvements pertaining to 
> Aliases.java plus certain key spots that deal with collection references.  I 
> don't think I want to go as far as changing the public SolrJ API except to 
> adding documentation on what's possible.






[jira] [Created] (SOLR-11487) Collection Alias metadata for time partitioned collections

2017-10-13 Thread David Smiley (JIRA)
David Smiley created SOLR-11487:
---

 Summary: Collection Alias metadata for time partitioned collections
 Key: SOLR-11487
 URL: https://issues.apache.org/jira/browse/SOLR-11487
 Project: Solr
  Issue Type: Sub-task
  Security Level: Public (Default Security Level. Issues are Public)
  Components: SolrCloud
Reporter: David Smiley


SOLR-11299 outlines an approach to using a collection Alias to refer to a 
series of collections of a time series. We'll need to store some metadata about 
these time series collections, such as which field of the document contains the 
timestamp to route on.

The current {{/aliases.json}} is a Map with a key {{collection}} which is in 
turn a Map of alias name strings to a comma delimited list of the collections.
_If we change the comma delimited list to be another Map to hold the existing 
list and more stuff, older CloudSolrClient (configured to talk to ZooKeeper) 
will break_.  Although if it's configured with an HTTP Solr URL then it would 
not break.  There's also some read/write hassle to worry about -- we may need 
to continue to read an aliases.json in the older format.

Alternatively, we could add a new map entry to aliases.json, say, 
{{collection_metadata}} keyed by alias name?
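
A minimal sketch of that second option, with aliases.json modeled as nested 
Java maps.  Everything here is an assumption for illustration: the class 
AliasesJsonSketch, the alias and collection names, and the routeField key are 
hypothetical, and legacyRead only imitates what an older reader might do 
(look under {{collection}} and split on commas).  It is not the actual 
CloudSolrClient code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of aliases.json as nested maps; not Solr's reader code.
final class AliasesJsonSketch {

    // Assumed behavior of an older reader: look only under "collection" and
    // treat each value as a comma-delimited String.
    static List<String> legacyRead(Map<String, Map<String, Object>> aliasesJson, String alias) {
        Map<String, Object> byAlias = aliasesJson.get("collection");
        if (byAlias == null || byAlias.get(alias) == null) {
            return Collections.emptyList();
        }
        // If the value were changed to a Map, this cast would fail; that
        // failure stands in for the breakage described above.
        String csv = (String) byAlias.get(alias);
        return Arrays.asList(csv.split(","));
    }

    // The alternative: leave "collection" untouched and add a sibling entry.
    // "collection_metadata" comes from the proposal; "routeField" is made up.
    static Map<String, Map<String, Object>> withSiblingMetadata() {
        Map<String, Object> collection = new HashMap<>();
        collection.put("logs", "logs_2017_09,logs_2017_10");

        Map<String, Object> logsMeta = new HashMap<>();
        logsMeta.put("routeField", "timestamp_dt");
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("logs", logsMeta);

        Map<String, Map<String, Object>> aliasesJson = new HashMap<>();
        aliasesJson.put("collection", collection);
        aliasesJson.put("collection_metadata", metadata);
        return aliasesJson;
    }

    public static void main(String[] args) {
        System.out.println(legacyRead(withSiblingMetadata(), "logs"));
    }
}
```

In this model an old-style reader keeps working because the value it reads is 
still a String, while newer code can look up the extra entry by alias name.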

Perhaps another very different approach is to attach metadata to the configset 
in use?






[JENKINS] Lucene-Solr-7.x-Windows (32bit/jdk1.8.0_144) - Build # 247 - Still Unstable!

2017-10-13 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.x-Windows/247/
Java: 32bit/jdk1.8.0_144 -server -XX:+UseG1GC

3 tests failed.
FAILED:  org.apache.solr.core.ExitableDirectoryReaderTest.testCacheAssumptions

Error Message:
Should have fewer docs than 100

Stack Trace:
java.lang.AssertionError: Should have fewer docs than 100
at __randomizedtesting.SeedInfo.seed([448F96DA3C7C4489:33F229FD528909AB]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.apache.solr.core.ExitableDirectoryReaderTest.testCacheAssumptions(ExitableDirectoryReaderTest.java:102)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  junit.framework.TestSuite.org.apache.solr.update.DataDrivenBlockJoinTest

Error Message:
Could not remove the following files (in the order of attempts):

Re: 6.6.2 Release

2017-10-13 Thread Erick Erickson
I'd also like to get SOLR-11297 in if there are no objections. Ditto if the
answer is no

It's quite a safe fix though.



On Fri, Oct 13, 2017 at 1:26 PM, Allison, Timothy B. 
wrote:

> Any chance we could get SOLR-11450 in?  I understand if the answer is no.
> 
>
>
>
> Thank you!
>
>
>
> *From:* Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
> *Sent:* Friday, October 13, 2017 4:23 PM
> *To:* dev@lucene.apache.org
> *Subject:* 6.6.2 Release
>
>
>
> Hi,
>
> In light of [0], we need a 6.6.2 release as soon as possible.
>
> I'd like to volunteer to RM for this release, unless someone else wants to
> do so or has an objection.
>
> Regards,
>
> Ishan
>
>
>
> [0] - https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list
>


Re: [VOTE] Release Lucene/Solr 7.1.0 RC2

2017-10-13 Thread Shalin Shekhar Mangar
Answering myself, I think we should go ahead with this RC. I've added
this entry to CHANGES.txt in all branches and it will be picked up in
case there needs to be a re-spin due to other reasons.

On Fri, Oct 13, 2017 at 8:16 PM, Shalin Shekhar Mangar
 wrote:
> I just noticed that in the hurry to create this RC, I forgot to add
> SOLR-10335 to Solr's CHANGES.txt. Is that worth a re-spin?
>
> On Fri, Oct 13, 2017 at 7:25 PM, Shalin Shekhar Mangar
>  wrote:
>> Please vote for release candidate 2 for Lucene/Solr 7.1.0
>>
>> The artifacts can be downloaded from:
>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC2-rev84c90ad2c0218156c840e19a64d72b8a38550659
>>
>> You can run the smoke tester directly with this command:
>>
>> python3 -u dev-tools/scripts/smokeTestRelease.py \
>> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC2-rev84c90ad2c0218156c840e19a64d72b8a38550659
>>
>> Smoke tester passed for me.
>> SUCCESS! [0:40:53.908967]
>>
>> Here's my +1 to release.
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.



-- 
Regards,
Shalin Shekhar Mangar.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: 6.6.2 Release

2017-10-13 Thread Allison, Timothy B.
Any chance we could get SOLR-11450 in?  I understand if the answer is no. 

Thank you!

From: Ishan Chattopadhyaya [mailto:ichattopadhy...@gmail.com]
Sent: Friday, October 13, 2017 4:23 PM
To: dev@lucene.apache.org
Subject: 6.6.2 Release

Hi,
In light of [0], we need a 6.6.2 release as soon as possible.
I'd like to volunteer to RM for this release, unless someone else wants to do 
so or has an objection.
Regards,
Ishan


[0] - 
https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list


[jira] [Commented] (LUCENE-4100) Maxscore - Efficient Scoring

2017-10-13 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204134#comment-16204134
 ] 

Robert Muir commented on LUCENE-4100:
-

Thanks for the benchmarking! It is unfortunate that we have to make the API more 
complicated / specialize disjunctions even more, but it seems like the right 
tradeoff, I suppose.

{quote}
Can you elaborate on what you find confusing? To me, this looks similar to how 
you should not call Scorer.score() if you passed needsScores=false.
{quote}

That's exactly it; I think we should try to avoid situations like that. It's 
basically the opposite of type-safety, and the more of these conditionals / 
"methods you should not call" that we add, the more confusing it gets. 
That's why I'm still mulling over what we can do to keep scorers simpler...

But for now, to move along, I think we have some basic idea of what to do to 
fix IndexSearcher (a boolean about whether exact total hits are needed, for 
various purposes), but yeah, let's keep it separate from what we do about 
createWeight. For the latter, maybe an explicit boolean for maxScore is the 
simplest for now, and we can see where it goes.

> Maxscore - Efficient Scoring
> 
>
> Key: LUCENE-4100
> URL: https://issues.apache.org/jira/browse/LUCENE-4100
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs, core/query/scoring, core/search
>Affects Versions: 4.0-ALPHA
>Reporter: Stefan Pohl
>  Labels: api-change, gsoc2014, patch, performance
> Fix For: 4.9, 6.0
>
> Attachments: LUCENE-4100.patch, LUCENE-4100.patch, 
> contrib_maxscore.tgz, maxscore.patch
>
>
> At Berlin Buzzwords 2012, I will be presenting 'maxscore', an efficient 
> algorithm first published in the IR domain in 1995 by H. Turtle & J. Flood, 
> that I find deserves more attention among Lucene users (and developers).
> I implemented a proof of concept and did some performance measurements with 
> example queries and lucenebench, the package of Mike McCandless, resulting in 
> very significant speedups.
> This ticket is to get the discussion started on including the implementation 
> into Lucene's codebase. Because the technique requires awareness about it 
> from the Lucene user/developer, it seems best to become a contrib/module 
> package so that it consciously can be chosen to be used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




6.6.2 Release

2017-10-13 Thread Ishan Chattopadhyaya
Hi,
In light of [0], we need a 6.6.2 release as soon as possible.

I'd like to volunteer to RM for this release, unless someone else wants to
do so or has an objection.

Regards,
Ishan


[0] -
https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list


[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204130#comment-16204130
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 42ef683208865f2d599df716e6013f3407261bf3 in lucene-solr's branch 
refs/heads/branch_7_1 from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=42ef683 ]

SOLR-10335: Adding entry to CHANGES.txt

(cherry picked from commit eef660e)

(cherry picked from commit 6ddf723)


> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 7.1, 6.6.2
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Commented] (SOLR-11218) Add a test that insures that you can delete the underlying collection if you have an alias of the same name pointing to a different collection

2017-10-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204129#comment-16204129
 ] 

David Smiley commented on SOLR-11218:
-

bq. Is this expected?

It's a gray area I think.  It does what it does :-)  Does adding 
"shards.tolerant" help?

> Add a test that insures that you can delete the underlying collection if you 
> have an alias of the same name pointing to a different collection
> --
>
> Key: SOLR-11218
> URL: https://issues.apache.org/jira/browse/SOLR-11218
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Erick Erickson
>Assignee: Erick Erickson
> Attachments: SOLR-11218.patch
>
>







[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204127#comment-16204127
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 6ddf723fdc97350e3b73eee713a63ad871a66116 in lucene-solr's branch 
refs/heads/branch_7x from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=6ddf723 ]

SOLR-10335: Adding entry to CHANGES.txt

(cherry picked from commit eef660e)


> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 7.1, 6.6.2
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204126#comment-16204126
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit eef660e77365a5e268c55961d5e6920d512f7e7f in lucene-solr's branch 
refs/heads/master from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=eef660e ]

SOLR-10335: Adding entry to CHANGES.txt


> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 7.1, 6.6.2
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Commented] (SOLR-11444) Improve Aliases.java and comma delimited collection list handling

2017-10-13 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204123#comment-16204123
 ] 

David Smiley commented on SOLR-11444:
-

[~erickerickson] This issue, SOLR-11444, should not change the semantics of 
either of those issues (SOLR-10181 or SOLR-11218).  However both issues touch 
the same files as this one and so there will be some merge conflicts for one or 
both of us to get through if our mutual work overlaps.

If there are no further comments, I'd like to commit this Monday and let the 
tests beat on it a bit.  In a subsequent commit I can adjust Solr's 
documentation to point out that collection references in the path can be a 
comma delimited list (new).  I should add some docs too.

> Improve Aliases.java and comma delimited collection list handling
> -
>
> Key: SOLR-11444
> URL: https://issues.apache.org/jira/browse/SOLR-11444
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: David Smiley
>Assignee: David Smiley
> Attachments: SOLR_11444_Aliases.patch, SOLR_11444_Aliases.patch
>
>
> While starting to look at SOLR-11299 I noticed some brittleness in 
> assumptions about Strings that refer to a collection.  Sometimes they are in 
> fact references to comma separated lists, which appear to have been added with the 
> introduction of collection aliases (an alias can refer to a comma delimited 
> list).  So Java's type system kind of goes out the window when we do this.  
> In one case this leads to a bug -- CloudSolrClient will throw an NPE if you 
> try to write to such an alias.  Sending an update via HTTP will allow it and 
> send it to the first in the list.
> So this issue is about refactoring and some little improvements pertaining to 
> Aliases.java plus certain key spots that deal with collection references.  I 
> don't think I want to go as far as changing the public SolrJ API except for 
> adding documentation on what's possible.
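
To illustrate the point in the description above, here is a minimal sketch (not the attached patch) of treating a collection reference as a list rather than a bare String. The class CollectionRefs, its methods, and the alias map are all hypothetical names invented for this example. It assumes an alias maps a name to a comma delimited string of collection names and that only one level of alias is followed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Solr or SolrJ.
final class CollectionRefs {

    // Splits a comma delimited reference into individual names,
    // trimming whitespace and dropping empty entries.
    static List<String> split(String ref) {
        List<String> names = new ArrayList<>();
        for (String part : ref.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Expands each name that is an alias into the collections it points to.
    // Assumption: aliases are followed one level deep only.
    static List<String> resolve(String ref, Map<String, String> aliases) {
        List<String> resolved = new ArrayList<>();
        for (String name : split(ref)) {
            String target = aliases.get(name);
            if (target == null) {
                resolved.add(name);              // a plain collection name
            } else {
                resolved.addAll(split(target));  // an alias, possibly to several collections
            }
        }
        return resolved;
    }
}
```

Returning a List makes the "one name or many" distinction visible in the type, so a caller that can only write to a single collection has to check the size explicitly instead of failing later.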






[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204121#comment-16204121
 ] 

Tim Allison commented on SOLR-10335:


Thank you, again!

> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 7.1, 6.6.2
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






Re: [VOTE] Release Lucene/Solr 7.1.0 RC2

2017-10-13 Thread Shalin Shekhar Mangar
I just noticed that in the hurry to create this RC, I forgot to add
SOLR-10335 to Solr's CHANGES.txt. Is that worth a re-spin?

On Fri, Oct 13, 2017 at 7:25 PM, Shalin Shekhar Mangar wrote:
> Please vote for release candidate 2 for Lucene/Solr 7.1.0
>
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC2-rev84c90ad2c0218156c840e19a64d72b8a38550659
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC2-rev84c90ad2c0218156c840e19a64d72b8a38550659
>
> Smoke tester passed for me.
> SUCCESS! [0:40:53.908967]
>
> Here's my +1 to release.
>
> --
> Regards,
> Shalin Shekhar Mangar.



-- 
Regards,
Shalin Shekhar Mangar.




[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204119#comment-16204119
 ] 

Shalin Shekhar Mangar commented on SOLR-10335:
--

No, I'll take care of the backports. Thanks!

> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 7.1, 6.6.2
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Updated] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread Shalin Shekhar Mangar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shalin Shekhar Mangar updated SOLR-10335:
-
Fix Version/s: 6.6.2
   7.1

> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
> Fix For: 7.1, 6.6.2
>
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Assigned] (SOLR-11146) Analytics Component 2.0 Bug Fixes

2017-10-13 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-11146:
--

Assignee: Dennis Gove

> Analytics Component 2.0 Bug Fixes
> -
>
> Key: SOLR-11146
> URL: https://issues.apache.org/jira/browse/SOLR-11146
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.1
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Blocker
> Fix For: 7.0
>
>
> The new Analytics Component has several small bugs in mapping functions and 
> other places. This ticket is a fix for a large number of them. This patch 
> should allow all unit tests created in SOLR-11145 to pass.






[jira] [Assigned] (SOLR-11145) Comprehensive Unit Tests for the Analytics Component

2017-10-13 Thread Dennis Gove (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-11145:
--

Assignee: Dennis Gove

> Comprehensive Unit Tests for the Analytics Component
> 
>
> Key: SOLR-11145
> URL: https://issues.apache.org/jira/browse/SOLR-11145
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 7.1
>Reporter: Houston Putman
>Assignee: Dennis Gove
>Priority: Critical
> Fix For: 7.0
>
>
> Adding comprehensive unit tests for the new Analytics Component.






[VOTE] Release Lucene/Solr 7.1.0 RC2

2017-10-13 Thread Shalin Shekhar Mangar
Please vote for release candidate 2 for Lucene/Solr 7.1.0

The artifacts can be downloaded from:
https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC2-rev84c90ad2c0218156c840e19a64d72b8a38550659

You can run the smoke tester directly with this command:

python3 -u dev-tools/scripts/smokeTestRelease.py \
https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC2-rev84c90ad2c0218156c840e19a64d72b8a38550659

Smoke tester passed for me.
SUCCESS! [0:40:53.908967]

Here's my +1 to release.

-- 
Regards,
Shalin Shekhar Mangar.




[JENKINS] Lucene-Solr-7.1-Windows (64bit/jdk1.8.0_144) - Build # 1 - Unstable!

2017-10-13 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.1-Windows/1/
Java: 64bit/jdk1.8.0_144 -XX:-UseCompressedOops -XX:+UseG1GC

2 tests failed.
FAILED:  junit.framework.TestSuite.org.apache.solr.search.TestAddFieldRealTimeGet

Error Message:
Could not remove the following files (in the order of attempts):
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001\init-core-data-001:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001\init-core-data-001

C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001\init-core-data-001\tlog:
 java.nio.file.AccessDeniedException: 
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001\init-core-data-001\tlog

C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001
 

Stack Trace:
java.io.IOException: Could not remove the following files (in the order of 
attempts):
   
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001\init-core-data-001:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001\init-core-data-001
   
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001\init-core-data-001\tlog:
 java.nio.file.AccessDeniedException: 
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001\init-core-data-001\tlog
   
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-7.1-Windows\solr\build\solr-core\test\J1\temp\solr.search.TestAddFieldRealTimeGet_6D1247707E1B41AA-001

at __randomizedtesting.SeedInfo.seed([6D1247707E1B41AA]:0)
at org.apache.lucene.util.IOUtils.rm(IOUtils.java:329)
at org.apache.lucene.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:216)
at com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)


FAILED:  org.apache.solr.update.AutoCommitTest.testMaxTime

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
at __randomizedtesting.SeedInfo.seed([6D1247707E1B41AA:F7E63A92E081DD96]:0)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:884)
at org.apache.solr.update.AutoCommitTest.testMaxTime(AutoCommitTest.java:270)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 

[jira] [Commented] (SOLR-11473) Make HDFSDirectoryFactory support other prefixes (besides hdfs:/)

2017-10-13 Thread Timothy Potter (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16204040#comment-16204040
 ] 

Timothy Potter commented on SOLR-11473:
---

I haven't tried the auto-add replica feature with Alluxio yet. For the update log, I 
worked around the issue by setting the fully qualified classname on the 
 element in the config, e.g.  
... but that doesn't address the dataDir issue.

> Make HDFSDirectoryFactory support other prefixes (besides hdfs:/)
> -
>
> Key: SOLR-11473
> URL: https://issues.apache.org/jira/browse/SOLR-11473
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: hdfs
>Affects Versions: 6.6.1
>Reporter: Radu Gheorghe
>Priority: Minor
>
> Not sure if it's a bug or a missing feature :) I'm trying to make Solr work 
> on Alluxio, as described by [~thelabdude] in 
> https://www.slideshare.net/thelabdude/running-solr-in-the-cloud-at-memory-speed-with-alluxio/1
> The problem I'm facing here is with autoAddReplicas. If I have 
> replicationFactor=1 and the node with that replica dies, the node taking over 
> incorrectly assigns the data directory. For example:
> before
> {code}"dataDir":"alluxio://localhost:19998/solr/test/",{code}
> after
> {code}"dataDir":"alluxio://localhost:19998/solr/test/core_node1/alluxio://localhost:19998/solr/test/",{code}
> The same happens for ulogDir. Apparently, this has to do with this bit from 
> HDFSDirectoryFactory:
> {code}  public boolean isAbsolute(String path) {
> return path.startsWith("hdfs:/");
>   }{code}
> If I add "alluxio:/" in there, the paths are correct and the index is 
> recovered.
> I see a few options here:
> * add "alluxio:/" to the list there
> * add a regular expression along the lines of \[a-z]*:/ (I hope that's not too 
> expensive; I'm not sure how often this method is called)
> * don't do anything and expect alluxio to work with an "hdfs:/" path? I 
> actually tried that and didn't manage to make it work
> * have a different DirectoryFactory or something else?
> What do you think?
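
As a rough sketch of the regular expression option listed above (not a patch for HDFSDirectoryFactory): the class SchemePathCheck and the exact pattern are assumptions made for this example. The pattern is compiled once into a static field, so each call only runs a short anchored match, which should keep the cost low even if the method is called often. It requires at least two characters before ":/", an assumption chosen here so that a Windows drive prefix such as "C:/" is not mistaken for a scheme.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the isAbsolute check quoted above; not Solr code.
final class SchemePathCheck {

    // Compiled once and reused. Assumption: a scheme starts with a letter and
    // is followed by one or more letters, digits, '+', '.' or '-', then ":/".
    private static final Pattern SCHEME_PREFIX =
            Pattern.compile("^[a-zA-Z][a-zA-Z0-9+.-]+:/");

    // Returns true when the path begins with any "scheme:/" prefix,
    // for example "hdfs:/" or "alluxio:/".
    static boolean isAbsolute(String path) {
        return path != null && SCHEME_PREFIX.matcher(path).find();
    }

    public static void main(String[] args) {
        System.out.println(isAbsolute("hdfs://namenode:8020/solr"));
        System.out.println(isAbsolute("alluxio://localhost:19998/solr/test/"));
        System.out.println(isAbsolute("core_node1/data"));
    }
}
```

Because the match is anchored at the start of the string, a relative path that merely contains a URI further along, like the doubled dataDir shown in the description, is still reported as relative.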






[JENKINS] Lucene-Solr-6.6-Linux (64bit/jdk-9) - Build # 157 - Unstable!

2017-10-13 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-6.6-Linux/157/
Java: 64bit/jdk-9 -XX:-UseCompressedOops -XX:+UseG1GC --illegal-access=deny

1 tests failed.
FAILED:  org.apache.solr.update.AutoCommitTest.testMaxDocs

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
at __randomizedtesting.SeedInfo.seed([511F3BD964009EA8:E89EED0648EA9A22]:0)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:895)
at org.apache.solr.update.AutoCommitTest.testMaxDocs(AutoCommitTest.java:225)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1713)
at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:907)
at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:943)
at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:957)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:916)
at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:802)
at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:852)
at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.base/java.lang.Thread.run(Thread.java:844)
Caused by: java.lang.RuntimeException: REQUEST FAILED: 
xpath=//result[@numFound=1]
xml response was: 

00


request was:q=id:14=standard=0=20=2.2
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:888)
... 39 more




Build Log:
[...truncated 12071 lines...]
   [junit4] Suite: org.apache.solr.update.AutoCommitTest
   [junit4]   2> 

[JENKINS] Lucene-Solr-SmokeRelease-master - Build # 866 - Still Failing

2017-10-13 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/866/

No tests ran.

Build Log:
[...truncated 28006 lines...]
prepare-release-no-sign:
[mkdir] Created dir: /home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist
 [copy] Copying 476 files to /home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist/lucene
 [copy] Copying 215 files to /home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist/solr
   [smoker] Java 1.8 JAVA_HOME=/home/jenkins/tools/java/latest1.8
   [smoker] NOTE: output encoding is UTF-8
   [smoker] 
   [smoker] Load release URL 
"file:/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist/"...
   [smoker] 
   [smoker] Test Lucene...
   [smoker]   test basics...
   [smoker]   get KEYS
   [smoker] 0.2 MB in 0.02 sec (12.7 MB/sec)
   [smoker]   check changes HTML...
   [smoker]   download lucene-8.0.0-src.tgz...
   [smoker] 29.6 MB in 0.10 sec (285.5 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   download lucene-8.0.0.tgz...
   [smoker] 69.4 MB in 0.21 sec (327.7 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   download lucene-8.0.0.zip...
   [smoker] 79.7 MB in 0.24 sec (331.9 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   unpack lucene-8.0.0.tgz...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] test demo with 1.8...
   [smoker]   got 6184 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] check Lucene's javadoc JAR
   [smoker]   unpack lucene-8.0.0.zip...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] test demo with 1.8...
   [smoker]   got 6184 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] check Lucene's javadoc JAR
   [smoker]   unpack lucene-8.0.0-src.tgz...
   [smoker] make sure no JARs/WARs in src dist...
   [smoker] run "ant validate"
   [smoker] run tests w/ Java 8 and testArgs='-Dtests.slow=false'...
   [smoker] test demo with 1.8...
   [smoker]   got 213 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] generate javadocs w/ Java 8...
   [smoker] 
   [smoker] Crawl/parse...
   [smoker] 
   [smoker] Verify...
   [smoker]   confirm all releases have coverage in TestBackwardsCompatibility
   [smoker] find all past Lucene releases...
   [smoker] run TestBackwardsCompatibility..
   [smoker] success!
   [smoker] 
   [smoker] Test Solr...
   [smoker]   test basics...
   [smoker]   get KEYS
   [smoker] 0.2 MB in 0.00 sec (259.6 MB/sec)
   [smoker]   check changes HTML...
   [smoker]   download solr-8.0.0-src.tgz...
   [smoker] 51.3 MB in 0.15 sec (338.9 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   download solr-8.0.0.tgz...
   [smoker] 143.8 MB in 0.48 sec (296.6 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   download solr-8.0.0.zip...
   [smoker] 144.8 MB in 0.58 sec (250.1 MB/sec)
   [smoker] verify md5/sha1 digests
   [smoker]   unpack solr-8.0.0.tgz...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] unpack lucene-8.0.0.tgz...
   [smoker]   **WARNING**: skipping check of 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/tmp/unpack/solr-8.0.0/contrib/dataimporthandler-extras/lib/javax.mail-1.5.1.jar:
 it has javax.* classes
   [smoker]   **WARNING**: skipping check of 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/tmp/unpack/solr-8.0.0/contrib/dataimporthandler-extras/lib/activation-1.1.1.jar:
 it has javax.* classes
   [smoker] copying unpacked distribution for Java 8 ...
   [smoker] test solr example w/ Java 8...
   [smoker]   start Solr instance 
(log=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/tmp/unpack/solr-8.0.0-java8/solr-example.log)...
   [smoker] No process found for Solr node running on port 8983
   [smoker]   Running techproducts example on port 8983 from 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/tmp/unpack/solr-8.0.0-java8
   [smoker] Creating Solr home directory 
/home/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/tmp/unpack/solr-8.0.0-java8/example/techproducts/solr
   [smoker] 
   [smoker] Starting up Solr on port 8983 using command:
   [smoker] "bin/solr" start -p 8983 -s "example/techproducts/solr"
   [smoker] 
   [smoker] Waiting up to 180 seconds to see Solr running on port 8983

Re: [VOTE] Release Lucene/Solr 7.1.0 RC1

2017-10-13 Thread Shalin Shekhar Mangar
This vote has been cancelled due to the recently reported
vulnerabilities. I'll be putting up another RC soon for voting.

On Thu, Oct 12, 2017 at 11:20 AM, Shalin Shekhar Mangar
 wrote:
> Please vote for release candidate 1 for Lucene/Solr 7.1.0
>
> The artifacts can be downloaded from:
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC1-reva2c54447f118a5dc70ab0e0ae14bd87b3545254b
>
> You can run the smoke tester directly with this command:
>
> python3 -u dev-tools/scripts/smokeTestRelease.py \
> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-7.1.0-RC1-reva2c54447f118a5dc70ab0e0ae14bd87b3545254b
>
> Smoke tester passed for me (but on the 2nd attempt due to a flaky test
> that's already being tracked in a Jira).
> SUCCESS! [0:55:14.107386]
>
> Here's my +1 to release.
>
>
> --
> Regards,
> Shalin Shekhar Mangar.



-- 
Regards,
Shalin Shekhar Mangar.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11423) Overseer queue needs a hard cap (maximum size) that clients respect

2017-10-13 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16204004#comment-16204004
 ] 

Scott Blum commented on SOLR-11423:
---

I'm happy to defer on this issue, but I just want to be clear that I actively 
dislike having a system property.  It feels like a piece of config that isn't 
useful, and worse, what happens if you don't set the same cap on every node?  
Remember this is enforced client side, not server side.  If you accidentally 
have a mix of nodes where half of them cap at 20k and half of them cap at 40k, 
then the moment you get above 20k, any badly behaving 40k nodes are going to 
starve out the 20k nodes.  It becomes unfair contention.
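
To make the fairness concern concrete, here is a toy Java sketch, not Solr 
code. The class and method names are invented for illustration, the queue is a 
plain in-memory deque, and nothing ever drains it; it only models two clients 
that each check the shared queue size against their own cap before enqueuing.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Toy model of a shared queue whose size cap is enforced by each client, not by the queue. */
final class ClientSideCapDemo {
  /** Stand-in for the shared overseer queue; the queue itself accepts anything. */
  static final Queue<String> sharedQueue = new ArrayDeque<>();

  /** Hypothetical client that refuses to enqueue once the shared queue reaches its own cap. */
  static final class CappedClient {
    final int cap;
    int accepted = 0;
    int rejected = 0;

    CappedClient(int cap) {
      this.cap = cap;
    }

    void offer(String item) {
      if (sharedQueue.size() >= cap) {
        rejected++; // a real client would throw an exception here
        return;
      }
      sharedQueue.add(item);
      accepted++;
    }
  }

  /** Alternates offers between a 20k-capped and a 40k-capped client; returns accepted counts. */
  static int[] run(int offersPerClient) {
    sharedQueue.clear();
    CappedClient small = new CappedClient(20_000);
    CappedClient large = new CappedClient(40_000);
    for (int i = 0; i < offersPerClient; i++) {
      small.offer("s" + i);
      large.offer("l" + i);
    }
    return new int[] {small.accepted, large.accepted};
  }
}
```

Once the shared size passes 20k in this model, only the 40k-capped client can 
still enqueue, which is the starvation described above.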

> Overseer queue needs a hard cap (maximum size) that clients respect
> ---
>
> Key: SOLR-11423
> URL: https://issues.apache.org/jira/browse/SOLR-11423
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrCloud
>Reporter: Scott Blum
>Assignee: Scott Blum
>
> When Solr gets into pathological GC thrashing states, it can fill the 
> overseer queue with literally thousands and thousands of queued state 
> changes.  Many of these end up being duplicated up/down state updates.  Our 
> production cluster has gotten to the 100k queued items level many times, and 
> there's nothing useful you can do at this point except manually purge the 
> queue in ZK.  Recently, it hit 3 million queued items, at which point our 
> entire ZK cluster exploded.
> I propose a hard cap.  Any client trying to enqueue an item when a queue is 
> full would throw an exception.  I was thinking maybe 10,000 items would be a 
> reasonable limit.  Thoughts?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)




[jira] [Commented] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-13 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16204001#comment-16204001
 ] 

Scott Blum commented on SOLR-11443:
---

SOLR-11447 looks interesting, might well address that comment.

{code}
int cacheSizeBefore = knownChildren.size();
knownChildren.removeAll(paths);
if (cacheSizeBefore - paths.size() == knownChildren.size()) {
  stats.setQueueLength(knownChildren.size());
} else {
  // Some of the deleted elements were not present in the cache,
  // so the cache no longer seems valid
  knownChildren.clear();
  isDirty = true;
}
{code}

I just kind of feel like you should unconditionally clear and set dirty, to 
catch any weird edge cases.  What if, post removal, knownChildren.size() == 0 
in the above code?  Having knownChildren empty and !isDirty seems to run the 
risk of reporting a false queue-empty status when in fact we just need to pull 
more nodes from ZK.
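
A minimal sketch of that edge case, using a toy stand-in for the cache rather 
than the actual patch. The class, the load() helper and the idea that the 
cache holds only a subset of the ZK children are assumptions made for 
illustration; removeConditional() mirrors the snippet above and 
removeUnconditional() is the alternative being suggested.

```java
import java.util.Collection;
import java.util.TreeSet;

/** Toy stand-in for the child cache discussed above; not the code from the patch. */
final class ChildCacheSketch {
  final TreeSet<String> knownChildren = new TreeSet<>();
  boolean isDirty = true;

  /** Simulates a fetch that cached only some of the children that exist in ZK (assumption). */
  void load(Collection<String> fetched) {
    knownChildren.clear();
    knownChildren.addAll(fetched);
    isDirty = false;
  }

  /** Mirrors the conditional logic quoted above (stats update omitted). */
  void removeConditional(Collection<String> paths) {
    int cacheSizeBefore = knownChildren.size();
    knownChildren.removeAll(paths);
    if (cacheSizeBefore - paths.size() != knownChildren.size()) {
      // Something was deleted that the cache did not know about.
      knownChildren.clear();
      isDirty = true;
    }
  }

  /** Suggested alternative: always invalidate, so the next read goes back to ZK. */
  void removeUnconditional(Collection<String> paths) {
    knownChildren.clear();
    isDirty = true;
  }

  /** True when the cache would report "queue empty" without re-reading ZK. */
  boolean looksEmptyWithoutRefetch() {
    return knownChildren.isEmpty() && !isDirty;
  }
}
```

With the conditional version, removing exactly the cached entries leaves the 
cache empty and clean, so a reader cannot tell "queue drained" apart from 
"more children are waiting in ZK"; the unconditional version always forces a 
refetch.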

> Remove the usage of workqueue for Overseer
> --
>
> Key: SOLR-11443
> URL: https://issues.apache.org/jira/browse/SOLR-11443
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
> Attachments: SOLR-11443.patch, SOLR-11443.patch, SOLR-11443.patch
>
>
> If we can remove the usage of workqueue, we can save a lot of IO blocking in 
> Overseer, hence boost performance a lot.






[jira] [Commented] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-13 Thread Scott Blum (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203992#comment-16203992
 ] 

Scott Blum commented on SOLR-11443:
---

BTW, have you tried out github PRs?  It would be so much easier to review in 
that tool. :)

> Remove the usage of workqueue for Overseer
> --
>
> Key: SOLR-11443
> URL: https://issues.apache.org/jira/browse/SOLR-11443
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
> Attachments: SOLR-11443.patch, SOLR-11443.patch, SOLR-11443.patch
>
>
> If we can remove the usage of workqueue, we can save a lot of IO blocking in 
> Overseer, hence boost performance a lot.






[JENKINS] Lucene-Solr-7.1-Linux (64bit/jdk1.8.0_144) - Build # 1 - Unstable!

2017-10-13 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-7.1-Linux/1/
Java: 64bit/jdk1.8.0_144 -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC

1 tests failed.
FAILED:  org.apache.solr.cloud.DocValuesNotIndexedTest.testGroupingSorting

Error Message:
Should have exactly 4 documents returned expected:<4> but was:<3>

Stack Trace:
java.lang.AssertionError: Should have exactly 4 documents returned expected:<4> 
but was:<3>
at 
__randomizedtesting.SeedInfo.seed([B2883AE70D2C128A:ACB032EF7187A80A]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.failNotEquals(Assert.java:647)
at org.junit.Assert.assertEquals(Assert.java:128)
at org.junit.Assert.assertEquals(Assert.java:472)
at 
org.apache.solr.cloud.DocValuesNotIndexedTest.checkSortOrder(DocValuesNotIndexedTest.java:260)
at 
org.apache.solr.cloud.DocValuesNotIndexedTest.testGroupingSorting(DocValuesNotIndexedTest.java:245)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 

[jira] [Commented] (SOLR-11386) Extracting learning to rank features fails when word ordering of EFI argument changed.

2017-10-13 Thread Michael A. Alcorn (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203861#comment-16203861
 ] 

Michael A. Alcorn commented on SOLR-11386:
--

-I just set up a local install of Solr 6.6.0 with a toy data set and tested 
multi-term EFI arguments using single quotes and it worked as expected. The 
issue seems to be isolated to older Solr versions. We'll upgrade our 
development version and see if that fixes it.-

I was incorrect. The issue persists in Solr 6.6.0; however, I believe I've 
discovered a workaround. If you use:

{code}
{
"store": "redhat_efi_feature_store",
"name": "case_description_issue_tfidf",
"class": "org.apache.solr.ltr.feature.SolrFeature",
"params": {
"q":"{!dismax qf=text_tfidf}${text}"
}
}
{code}

instead of:

{code}
{
"store": "redhat_efi_feature_store",
"name": "case_description_issue_tfidf",
"class": "org.apache.solr.ltr.feature.SolrFeature",
"params": {
"q": "{!field f=issue_tfidf}${case_description}"
}
}
{code}

you can then use single quotes to incorporate multi-term arguments as 
[~alessandro.benedetti] suggested.
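
For illustration, a small Java sketch that assembles an LTR rerank parameter 
with single-quoted EFI values. The helper class and method are made up for 
this example and are not part of Solr; only the {!ltr ...} local-params shape, 
the model name and the EFI name {{text}} (matching ${text} in the first 
feature above) are taken from this issue. The result would still need URL 
encoding before being sent as the {{rq}} parameter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that builds an {!ltr ...} rerank query string; not a Solr API. */
final class LtrRerankQuery {
  static String build(String model, int reRankDocs, Map<String, String> efi) {
    StringBuilder sb = new StringBuilder("{!ltr model=")
        .append(model)
        .append(" reRankDocs=")
        .append(reRankDocs);
    for (Map.Entry<String, String> e : efi.entrySet()) {
      String value = e.getValue();
      // Keep the sketch simple: values containing a single quote are rejected, not escaped.
      if (value.indexOf('\'') >= 0) {
        throw new IllegalArgumentException("single quote in EFI value: " + value);
      }
      // Single quotes keep a multi-term value together as one EFI argument.
      sb.append(" efi.").append(e.getKey()).append("='").append(value).append('\'');
    }
    return sb.append('}').toString();
  }

  public static void main(String[] args) {
    Map<String, String> efi = new LinkedHashMap<>();
    efi.put("text", "added couple of fiber channel");
    System.out.println(build("redhat_efi_model", 1, efi));
  }
}
```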

> Extracting learning to rank features fails when word ordering of EFI argument 
> changed.
> --
>
> Key: SOLR-11386
> URL: https://issues.apache.org/jira/browse/SOLR-11386
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: contrib - LTR
>Affects Versions: 6.5.1
>Reporter: Michael A. Alcorn
>
> I'm getting some extremely strange behavior when trying to extract features 
> for a learning to rank model. The following query incorrectly says all 
> features have zero values:
> {code}
> http://gss-test-fusion.usersys.redhat.com:8983/solr/access/query?q=added 
> couple of fiber channel&rq={!ltr model=redhat_efi_model reRankDocs=1 
> efi.case_summary=the efi.case_description=added couple of fiber channel 
> efi.case_issue=the efi.case_environment=the}&fl=id,score,[features]&rows=10
> {code}
> But this query, which simply moves the word "added" from the front of the 
> provided text to the back, properly fills in the feature values:
> {code}
> http://gss-test-fusion.usersys.redhat.com:8983/solr/access/query?q=couple of 
> fiber channel added&rq={!ltr model=redhat_efi_model reRankDocs=1 
> efi.case_summary=the efi.case_description=couple of fiber channel added 
> efi.case_issue=the efi.case_environment=the}&fl=id,score,[features]&rows=10
> {code}
> The explain output for the failing query can be found here:
> https://gist.github.com/manisnesan/18a8f1804f29b1b62ebfae1211f38cc4
> and the explain output for the properly functioning query can be found here:
> https://gist.github.com/manisnesan/47685a561605e2229434b38aed11cc65






[jira] [Commented] (SOLR-8389) Convert CDCR peer cluster and other configurations into collection properties modifiable via APIs

2017-10-13 Thread Peter Rusko (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203842#comment-16203842
 ] 

Peter Rusko commented on SOLR-8389:
---

The main concern was exactly the update frequency; that's why I separated this 
from state.json into clusterprops.json. It will have its own watcher and won't 
be affected by clusterstate changes. The intention was to put user-defined 
properties here, which can be changed without restarting Solr.

Varun, I'm not sure what you mean by that. Currently unrelated collection 
property changes still trigger all the listeners, but given that they are all 
supposed to be rare (though nothing prevents anyone from writing properties 
from the code I guess), I don't see this as a problem.

> Convert CDCR peer cluster and other configurations into collection properties 
> modifiable via APIs
> -
>
> Key: SOLR-8389
> URL: https://issues.apache.org/jira/browse/SOLR-8389
> Project: Solr
>  Issue Type: Improvement
>  Components: CDCR, SolrCloud
>Reporter: Shalin Shekhar Mangar
>
> CDCR configuration is kept inside solrconfig.xml which makes it difficult to 
> add or change peer cluster configuration.
> I propose to move all CDCR config to collection level properties in cluster 
> state so that they can be modified using the existing modify collection API.






[jira] [Commented] (SOLR-11450) ComplexPhraseQParserPlugin not running charfilter for some multiterm queries in 6.x

2017-10-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203840#comment-16203840
 ] 

Tim Allison commented on SOLR-11450:


bq. I'm not familiar enough with Solr query parsers

Yes, I've been away from this for too long and got the first couple of answers 
to [~bjarkebm] wrong on the user list because of the difference between Lucene 
and Solr.  It is good to be back.

Thank you!

> ComplexPhraseQParserPlugin not running charfilter for some multiterm queries 
> in 6.x 
> 
>
> Key: SOLR-11450
> URL: https://issues.apache.org/jira/browse/SOLR-11450
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.1
>Reporter: Tim Allison
>Priority: Minor
>  Labels: patch-with-test
> Attachments: SOLR-11450-unit-test.patch, SOLR-11450.patch
>
>
> On the user list, [~bjarkebm] reported that the charfilter is not being 
> applied in PrefixQueries in the ComplexPhraseQParserPlugin in 6.x.  Bjarke 
> fixed my proposed unit tests to prove this failure. All appears to work in 
> 7.x and trunk. If there are plans to release a 6.6.2, let's fold this in.






[jira] [Commented] (LUCENE-4100) Maxscore - Efficient Scoring

2017-10-13 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203829#comment-16203829
 ] 

Adrien Grand commented on LUCENE-4100:
--

bq. indexsearcher already knows if scores are needed (e.g. from the Sort), but 
there is no way to tell it that approximate total hit count is acceptable. If 
we can do that, then I think we can make the early termination case really easy 
for the sorted case, index order case, and also this maxscore case.

+1 We need to add a new {{boolean needsTotalHits}} to the {{search(Query, 
int)}} and {{search(Query, int, Sort)}} methods.

bq. we do? [...] I'm just asking because the new scorer here looks a hell of a 
lot like a disjunction scorer

Well it is a disjunction. :) Our regular disjunction scorer maintains a single 
heap and only looks at the 'top' element and calls {{updateTop}} after calls 
to {{nextDoc}} or {{advance}}. If you want to be able to call {{advance()}} 
sometimes on low-scoring clauses even if only {{nextDoc()}} was called on the 
disjunction, you need to give it the ability to leave some scorers behind, as 
long as the sum of the max scores of scorers that are behind and scorers that 
are positioned on the current candidate is not greater than the minimum 
competitive score (otherwise you might be missing matches). This means you need 
to move scorers between at least 2 data-structures. In practice, we actually 
use 3 of them so that we can also easily differentiate scorers that are on the 
current candidate from scorers that are too advanced. Moving scorers between 
those data-structures has some overhead. For instance here is what I get when I 
benchmark the MaxScoreScorer against master's DisjunctionSumScorer (BS1 is 
disabled in both cases):

{noformat}
Task                  QPS baseline   StdDev  QPS patch   StdDev  Pct diff
HighTermDayOfYearSort        33.34   (7.3%)      16.55   (2.6%)  -50.4% (-56% - -43%)
OrHighHigh                   21.91   (4.1%)      12.36   (1.7%)  -43.6% (-47% - -39%)
OrHighLow                    48.08   (4.0%)      29.27   (2.4%)  -39.1% (-43% - -34%)
OrHighMed                    60.66   (4.0%)      39.32   (2.5%)  -35.2% (-40% - -29%)
Fuzzy2                      117.30   (7.1%)     101.36   (7.5%)  -13.6% (-26% -   1%)
Fuzzy1                      238.83  (10.7%)     212.80   (7.3%)  -10.9% (-26% -   7%)
OrHighNotLow                 70.64   (3.4%)      66.82   (2.0%)   -5.4% (-10% -   0%)
OrHighNotMed                 65.59   (3.0%)      62.63   (1.6%)   -4.5% ( -8% -   0%)
OrNotHighHigh                34.55   (2.3%)      33.24   (1.1%)   -3.8% ( -6% -   0%)
OrHighNotHigh                48.98   (2.4%)      47.17   (1.4%)   -3.7% ( -7% -   0%)
OrNotHighMed                107.24   (2.1%)     103.40   (1.1%)   -3.6% ( -6% -   0%)
IntNRQ                       26.26  (11.5%)      25.64  (12.3%)   -2.3% (-23% -  24%)
AndHighLow                  879.33   (3.7%)     860.16   (3.3%)   -2.2% ( -8% -   4%)
AndHighMed                  168.90   (1.7%)     165.48   (1.3%)   -2.0% ( -4% -   1%)
HighPhrase                   20.08   (2.8%)      19.69   (2.7%)   -2.0% ( -7% -   3%)
MedSloppyPhrase              15.44   (1.7%)      15.15   (1.8%)   -1.9% ( -5% -   1%)
LowSpanNear                  44.70   (2.1%)      43.88   (1.9%)   -1.8% ( -5% -   2%)
MedPhrase                    52.07   (3.2%)      51.16   (3.1%)   -1.7% ( -7% -   4%)
LowSloppyPhrase             150.90   (1.5%)     148.62   (1.5%)   -1.5% ( -4% -   1%)
OrNotHighLow               1174.47   (3.6%)    1157.52   (4.1%)   -1.4% ( -8% -   6%)
HighSpanNear                 38.46   (3.0%)      37.92   (2.4%)   -1.4% ( -6% -   4%)
HighSloppyPhrase             28.13   (2.4%)      27.76   (2.5%)   -1.3% ( -6% -   3%)
LowPhrase                   131.67   (1.6%)     130.37   (2.0%)   -1.0% ( -4% -   2%)
MedSpanNear                  54.75   (3.4%)      54.22   (3.0%)   -1.0% ( -7% -   5%)
Wildcard                     41.58   (5.1%)      41.23   (5.2%)   -0.8% (-10% -   9%)
AndHighHigh                  71.08   (1.5%)      70.61   (1.3%)   -0.7% ( -3% -   2%)
Prefix3                      88.11   (6.4%)      87.57   (7.5%)   -0.6% (-13% -  14%)
MedTerm                     235.73   (1.4%)     235.35   (3.7%)   -0.2% ( -5% -   4%)
HighTerm                    144.06   (1.3%)     143.93   (4.2%)   -0.1% ( -5% -   5%)
LowTerm                     711.49   (4.2%)     718.15   (4.7%)    0.9% ( -7% -  10%)
Respell                     203.69   (8.6%)     206.72   (7.0%)    1.5% (-12% -  18%)
   

[jira] [Commented] (SOLR-11450) ComplexPhraseQParserPlugin not running charfilter for some multiterm queries in 6.x

2017-10-13 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203821#comment-16203821
 ] 

Adrien Grand commented on SOLR-11450:
-

I see. Yes so it looks like something that can get fixed in 6.6.2 in the end. 
Thanks for explaining. I'm not familiar enough with Solr query parsers to do a 
proper review but if we can find someone who understands these things better 
than me, I have no objection to getting it committed. [~janhoy] Maybe you can 
review it?

> ComplexPhraseQParserPlugin not running charfilter for some multiterm queries 
> in 6.x 
> 
>
> Key: SOLR-11450
> URL: https://issues.apache.org/jira/browse/SOLR-11450
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.1
>Reporter: Tim Allison
>Priority: Minor
>  Labels: patch-with-test
> Attachments: SOLR-11450-unit-test.patch, SOLR-11450.patch
>
>
> On the user list, [~bjarkebm] reported that the charfilter is not being 
> applied in PrefixQueries in the ComplexPhraseQParserPlugin in 6.x.  Bjarke 
> fixed my proposed unit tests to prove this failure. All appears to work in 
> 7.x and trunk. If there are plans to release a 6.6.2, let's fold this in.






[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203804#comment-16203804
 ] 

Tim Allison commented on SOLR-10335:


[~shalinmangar], should I submit another PR for the 6_x and 6.6.2 branch or 
will you take it from here?  THANK YOU!!!

> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Comment Edited] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203804#comment-16203804
 ] 

Tim Allison edited comment on SOLR-10335 at 10/13/17 4:32 PM:
--

[~shalinmangar], should I submit another PR for the 6_x and 6.6.2 branches or 
will you take it from here?  THANK YOU!!!


was (Author: talli...@mitre.org):
[~shalinmangar], should I submit another PR for the 6_x and 6.6.2 branch or 
will you take it from here?  THANK YOU!!!

> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Critical
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Comment Edited] (SOLR-11450) ComplexPhraseQParserPlugin not running charfilter for some multiterm queries in 6.x

2017-10-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203778#comment-16203778
 ] 

Tim Allison edited comment on SOLR-11450 at 10/13/17 4:30 PM:
--

Ha.  Right.  Solr does do its own thing.  {{FieldTypePluginLoader}} generates a 
multiterm analyzer in the TextField by subsetting the TokenizerChain's 
components that are MultitermAware and/or swapping in a KeywordAnalyzer 
--[here|https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/schema/FieldTypePluginLoader.java#L182]
 ...almost like {{Analyzer.normalize()}} in 7.x :)

Then {{SolrQueryParserBase}} has an {{analyzeIfMultiTermText}} 
[here|https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L883],
 which in turn calls {{TextField}}'s {{analyzeMultiTerm}} with {{TextField}}'s 
multitermanalyzer that was built back in the {{FieldTypePluginLoader}} above.

So, in Solr 6.x, the basic QueryParser relies on the SolrQueryParserBase and 
all is good.  However, the CPQP doesn't extend the SolrQueryParserBase.  

Two things make this feel like a bug and not a feature in Solr 6.x:

1) multiterm analysis works for the classic query parser but not fully for the 
CPQP in Solr 6.x
2) multiterm analysis works for CPQP for some multiterms (wildcard/reverse 
wildcard) and range, but not in the other multiterms: prefix, regex and fuzzy.



was (Author: talli...@mitre.org):
Ha.  Right.  Solr does do its own thing.  {{FieldTypePluginLoader}} generates a 
multiterm analyzer in the TextField by subsetting the TokenizerChain's 
components that are MultitermAware and/or swapping in a KeywordAnalyzer 
--[here|https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/schema/FieldTypePluginLoader.java#L182]
 ...just like {{Analyzer.normalize()}} in 7.x :)

Then {{SolrQueryParserBase}} has an {{analyzeIfMultiTermText}} 
[here|https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L883],
 which in turn calls {{TextField}}'s {{analyzeMultiTerm}} with {{TextField}}'s 
multitermanalyzer that was built back in the {{FieldTypePluginLoader}} above.

So, in Solr 6.x, the basic QueryParser relies on the SolrQueryParserBase and 
all is good.  However, the CPQP doesn't extend the SolrQueryParserBase.  

Two things make this feel like a bug and not a feature in Solr 6.x:

1) multiterm analysis works for the classic query parser but not fully for the 
CPQP in Solr 6.x
2) multiterm analysis works for CPQP for some multiterms (wildcard/reverse 
wildcard) and range, but not in the other multiterms: prefix, regex and fuzzy.


> ComplexPhraseQParserPlugin not running charfilter for some multiterm queries 
> in 6.x 
> 
>
> Key: SOLR-11450
> URL: https://issues.apache.org/jira/browse/SOLR-11450
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.1
>Reporter: Tim Allison
>Priority: Minor
>  Labels: patch-with-test
> Attachments: SOLR-11450-unit-test.patch, SOLR-11450.patch
>
>
> On the user list, [~bjarkebm] reported that the charfilter is not being 
> applied in PrefixQueries in the ComplexPhraseQParserPlugin in 6.x.  Bjarke 
> fixed my proposed unit tests to prove this failure. All appears to work in 
> 7.x and trunk. If there are plans to release a 6.6.2, let's fold this in.






[jira] [Comment Edited] (SOLR-11450) ComplexPhraseQParserPlugin not running charfilter for some multiterm queries in 6.x

2017-10-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203778#comment-16203778
 ] 

Tim Allison edited comment on SOLR-11450 at 10/13/17 4:20 PM:
--

Ha.  Right.  Solr does do its own thing.  {{FieldTypePluginLoader}} generates a 
multiterm analyzer in the TextField by subsetting the TokenizerChain's 
components that are MultitermAware and/or swapping in a KeywordAnalyzer 
--[here|https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/schema/FieldTypePluginLoader.java#L182]
 ...just like {{Analyzer.normalize()}} in 7.x :)

Then {{SolrQueryParserBase}} has an {{analyzeIfMultiTermText}} 
[here|https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L883],
 which in turn calls {{TextField}}'s {{analyzeMultiTerm}} with {{TextField}}'s 
multitermanalyzer that was built back in the {{FieldTypePluginLoader}} above.

So, in Solr 6.x, the basic QueryParser relies on the SolrQueryParserBase and 
all is good.  However, the CPQP doesn't extend the SolrQueryParserBase.  

Two things make this feel like a bug and not a feature in Solr 6.x:

1) multiterm analysis works for the classic query parser but not fully for the 
CPQP in Solr 6.x
2) multiterm analysis works for CPQP for some multiterms (wildcard/reverse 
wildcard) and range, but not in the other multiterms: prefix, regex and fuzzy.






[jira] [Comment Edited] (SOLR-11450) ComplexPhraseQParserPlugin not running charfilter for some multiterm queries in 6.x

2017-10-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203778#comment-16203778
 ] 

Tim Allison edited comment on SOLR-11450 at 10/13/17 4:19 PM:
--

Ha.  Right.  Solr does do its own thing.  {{FieldTypePluginLoader}} generates a 
multiterm analyzer in the TextField by subsetting the TokenizerChain's 
components that are MultitermAware and/or swapping in a KeywordAnalyzer 
--[here|https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/schema/FieldTypePluginLoader.java#L182]
 ...just like {{Analyzer.normalize()}} in 7.x :)

Then {{SolrQueryParserBase}} has an {{analyzeIfMultiTermText}} 
[here|https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L883],
 which in turn calls {{TextField}}'s {{analyzeMultiTerm}} with {{TextField}}'s 
multitermanalyzer that was built back in the {{FieldTypePluginLoader}} above.

So, in Solr 6.x, the basic QueryParser relies on the SolrQueryParserBase and 
all is good.  However, the CPQP doesn't extend the SolrQueryParserBase.  

Two things make this feel like a bug and not a feature in Solr 6.x:

1) multiterm analysis works for the classic query parser in Solr 6.x
2) multiterm analysis works for CPQP for some multiterms (wildcard/reverse 
wildcard) and range, but not in the other multiterms: prefix, regex and fuzzy.






[jira] [Commented] (SOLR-11450) ComplexPhraseQParserPlugin not running charfilter for some multiterm queries in 6.x

2017-10-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203778#comment-16203778
 ] 

Tim Allison commented on SOLR-11450:


Ha.  Right.  Solr does do its own thing.  {{FieldTypePluginLoader}} generates a 
multiterm analyzer in the TextField by subsetting the TokenizerChain's 
components that are MultitermAware and swapping in a KeywordTokenizer 
--[here|http://example.com] 
[https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/schema/FieldTypePluginLoader.java#L182]
 ...just CustomAnalyzer's {{normalize()}} in 7.x :)

Then {{SolrQueryParserBase}} has an {{analyzeIfMultiTermText}} 
[here|https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java#L883],
 which in turn calls {{TextField}}'s {{analyzeMultiTerm}} with {{TextField}}'s 
multitermanalyzer that was built back in the {{FieldTypePluginLoader}} above.

So, in Solr 6.x, the basic QueryParser relies on the SolrQueryParserBase and 
all is good.  However, the CPQP doesn't extend the SolrQueryParserBase.  

Two things make this feel like a bug and not a feature in Solr 6.x:

1) multiterm analysis works for the classic query parser in Solr 6.x
2) multiterm analysis works for CPQP for some multiterms (wildcard/reverse 
wildcard) and range, but not in the other multiterms: prefix, regex and fuzzy.





[jira] [Updated] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread Uwe Schindler (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated SOLR-10335:
-
Priority: Critical  (was: Minor)

> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Tim Allison
> Assignee: Shalin Shekhar Mangar
> Priority: Critical
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Commented] (SOLR-8389) Convert CDCR peer cluster and other configurations into collection properties modifiable via APIs

2017-10-13 Thread Amrit Sarkar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203745#comment-16203745
 ] 

Amrit Sarkar commented on SOLR-8389:


Peter, how often will the collection properties change? Do these properties 
consist of {{user-defined}} ones, or are they subject to machine / node level 
properties? If unrelated updates are too frequent, I think Varun is right; it 
won't be helpful in this case at least. Let us know.



> Convert CDCR peer cluster and other configurations into collection properties modifiable via APIs
> ---
>
> Key: SOLR-8389
> URL: https://issues.apache.org/jira/browse/SOLR-8389
> Project: Solr
> Issue Type: Improvement
> Components: CDCR, SolrCloud
> Reporter: Shalin Shekhar Mangar
>
> CDCR configuration is kept inside solrconfig.xml which makes it difficult to 
> add or change peer cluster configuration.
> I propose to move all CDCR config to collection level properties in cluster 
> state so that they can be modified using the existing modify collection API.






[jira] [Commented] (LUCENE-4100) Maxscore - Efficient Scoring

2017-10-13 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203693#comment-16203693
 ] 

Robert Muir commented on LUCENE-4100:
-

Yeah, I think I am looking at it from the top down (IndexSearcher) vs. bottom up 
(queries).

IndexSearcher already knows if scores are needed (e.g. from the Sort), but 
there is no way to tell it that an approximate total hit count is acceptable. If 
we can do that, then I think we can make the early termination case really easy 
for the sorted case, the index order case, and also this maxscore case.

{quote}
Ideally we would not need another parameter on Query.createWeight for MAXSCORE 
either, but the issue is that depending on whether you want to collect all hits 
or only the top-scoring ones, then we need different Scorer impls.
{quote}

We do? (genuine question). We added needsScores because previously scorers had 
to always be ready for you to "lazily" call score(), and this prevented scoring 
from doing much more interesting things up-front like caching whole bitsets, 
but is that really the case for maxScore? I'm just asking because the new scorer 
here looks a hell of a lot like a disjunction scorer :) If we truly need a 
different impl, we should maybe still think it through because of stuff like the 
setMinCompetitiveScore() method, which would make no sense except for that 
case. I do like that in your patch AssertingScorer checks all that stuff, but 
there it is a bit confusing.
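
As a rough illustration of why a "minimum competitive score" only makes sense when collecting top hits, here is a small self-contained Java sketch. It is not Lucene code and not the patch under discussion: {{search}}, {{Result}}, the per-document {{upperBound}} array and the {{useMinCompetitive}} flag are all hypothetical, and the upper bounds are simply assumed to be known in advance. Once the top-k heap is full, any document whose upper bound cannot beat the current k-th best score is skipped without being scored, which is also why the number of hits seen becomes approximate.

```java
import java.util.PriorityQueue;

public class MinCompetitiveScoreSketch {

    /** Outcome of one toy search: top-k scores (descending) and how many docs were actually scored. */
    static final class Result {
        final float[] topScores;
        final int scoredDocs;

        Result(float[] topScores, int scoredDocs) {
            this.topScores = topScores;
            this.scoredDocs = scoredDocs;
        }
    }

    /**
     * Collects the k best scores. upperBound[doc] is an assumed, precomputed bound with
     * actual[doc] <= upperBound[doc]. When useMinCompetitive is true, docs whose bound
     * cannot beat the current k-th best score are skipped without being scored.
     */
    static Result search(float[] upperBound, float[] actual, int k, boolean useMinCompetitive) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        PriorityQueue<Float> heap = new PriorityQueue<>(); // min-heap of the best k scores so far
        float minCompetitive = Float.NEGATIVE_INFINITY;
        int scored = 0;
        for (int doc = 0; doc < actual.length; doc++) {
            if (useMinCompetitive && upperBound[doc] <= minCompetitive) {
                continue; // cannot enter the top k, so it is never scored or counted
            }
            scored++;
            float score = actual[doc];
            if (heap.size() < k) {
                heap.add(score);
            } else if (score > heap.peek()) {
                heap.poll();
                heap.add(score);
            }
            if (heap.size() == k) {
                minCompetitive = heap.peek(); // k-th best score seen so far
            }
        }
        float[] top = new float[heap.size()];
        for (int i = top.length - 1; i >= 0; i--) {
            top[i] = heap.poll(); // smallest first, so fill from the back
        }
        return new Result(top, scored);
    }

    public static void main(String[] args) {
        float[] upperBound = {3.0f, 1.0f, 2.5f, 0.5f, 0.8f, 2.9f};
        float[] actual     = {2.0f, 0.9f, 2.4f, 0.4f, 0.7f, 1.0f};
        Result all = search(upperBound, actual, 2, false);
        Result top = search(upperBound, actual, 2, true);
        System.out.println("collect all hits: scored " + all.scoredDocs + " docs"); // 6
        System.out.println("top-2 only:       scored " + top.scoredDocs + " docs"); // 4
    }
}
```

Both runs return the same top two scores; only the second one gets to skip documents, and in exchange it no longer knows the exact number of matches.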


> Maxscore - Efficient Scoring
> ---
>
> Key: LUCENE-4100
> URL: https://issues.apache.org/jira/browse/LUCENE-4100
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/codecs, core/query/scoring, core/search
> Affects Versions: 4.0-ALPHA
> Reporter: Stefan Pohl
> Labels: api-change, gsoc2014, patch, performance
> Fix For: 4.9, 6.0
>
> Attachments: LUCENE-4100.patch, LUCENE-4100.patch, contrib_maxscore.tgz, maxscore.patch
>
>
> At Berlin Buzzwords 2012, I will be presenting 'maxscore', an efficient 
> algorithm first published in the IR domain in 1995 by H. Turtle & J. Flood, 
> that I find deserves more attention among Lucene users (and developers).
> I implemented a proof of concept and did some performance measurements with 
> example queries and lucenebench, the package of Mike McCandless, resulting in 
> very significant speedups.
> This ticket is to start the discussion on including the implementation 
> in Lucene's codebase. Because the technique requires awareness on the part 
> of the Lucene user/developer, it seems best for it to become a contrib/module 
> package so that it can be chosen consciously.






[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203677#comment-16203677
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 84c90ad2c0218156c840e19a64d72b8a38550659 in lucene-solr's branch 
refs/heads/branch_7_1 from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=84c90ad ]

SOLR-10335: Merge branch 'SOLR-10335' of 
https://github.com/tballison/lucene-solr

(cherry picked from commit 3a098ec)





[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203678#comment-16203678
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 84c90ad2c0218156c840e19a64d72b8a38550659 in lucene-solr's branch 
refs/heads/branch_7_1 from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=84c90ad ]

SOLR-10335: Merge branch 'SOLR-10335' of 
https://github.com/tballison/lucene-solr

(cherry picked from commit 3a098ec)





[jira] [Commented] (LUCENE-4100) Maxscore - Efficient Scoring

2017-10-13 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203676#comment-16203676
 ] 

Adrien Grand commented on LUCENE-4100:
--

Yes, but this 3rd case is fine since it can be handled solely from the 
collector side, so we don't need to make weights aware of it?

Ideally we would not need another parameter on {{Query.createWeight}} for 
{{MAXSCORE}} either, but the issue is that depending on whether you want to 
collect all hits or only the top-scoring ones, then we need different Scorer 
impls.

bq. There is the current sorted-index case today, but you have to be a rocket 
scientist (custom collector) to use it

+1 to fix it and bake support for early termination in TopFieldDocCollector.




[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203667#comment-16203667
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 3a098ecb8199b9a7e67655a1165660fd13d878b2 in lucene-solr's branch 
refs/heads/branch_7x from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3a098ec ]

SOLR-10335: Merge branch 'SOLR-10335' of 
https://github.com/tballison/lucene-solr





[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203668#comment-16203668
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 3a098ecb8199b9a7e67655a1165660fd13d878b2 in lucene-solr's branch 
refs/heads/branch_7x from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=3a098ec ]

SOLR-10335: Merge branch 'SOLR-10335' of 
https://github.com/tballison/lucene-solr





[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203665#comment-16203665
 ] 

ASF GitHub Bot commented on SOLR-10335:
---

Github user asfgit closed the pull request at:

https://github.com/apache/lucene-solr/pull/259





[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203659#comment-16203659
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit a5c4777314ca26ad7b6e33060c7a5132a80a1827 in lucene-solr's branch 
refs/heads/master from [~talli...@mitre.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=a5c4777 ]

SOLR-10335 -- Upgrade to Tika 1.16





[GitHub] lucene-solr pull request #259: SOLR-10335

2017-10-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/lucene-solr/pull/259





[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203662#comment-16203662
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 9d2f0cf2547263e8c7472b694e7ed6e700f1399b in lucene-solr's branch 
refs/heads/master from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9d2f0cf ]

SOLR-10335: Merge branch 'SOLR-10335' of 
https://github.com/tballison/lucene-solr





[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203660#comment-16203660
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit c501e8139c569e703c0c2de80173a89ab7fc1c8a in lucene-solr's branch 
refs/heads/master from [~talli...@mitre.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c501e81 ]

SOLR-10335 -- Upgrade to Tika 1.16 -- add collections4 sha1 and license/notice 
info





[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203663#comment-16203663
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 9d2f0cf2547263e8c7472b694e7ed6e700f1399b in lucene-solr's branch 
refs/heads/master from [~shalinmangar]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9d2f0cf ]

SOLR-10335: Merge branch 'SOLR-10335' of 
https://github.com/tballison/lucene-solr





[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203661#comment-16203661
 ] 

ASF subversion and git services commented on SOLR-10335:


Commit 4c7ff73c98169c837a8617fcfb8ea1789df29473 in lucene-solr's branch 
refs/heads/master from [~talli...@mitre.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4c7ff73 ]

Merge remote-tracking branch 'upstream/master' into SOLR-10335





[jira] [Comment Edited] (SOLR-11467) CdcrBootstrapTest::testBootstrapWithContinousIndexingOnSourceCluster Failure

2017-10-13 Thread Amrit Sarkar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203143#comment-16203143
 ] 

Amrit Sarkar edited comment on SOLR-11467 at 10/13/17 2:49 PM:
---

{code}
- System.out.println("Adding " + docs + " docs with commit=true, numDocs=" + numDocs);
+ System.out.println("Adding 100 docs with commit=true, numDocs=" + numDocs);
{code}

This will be a separate patch and a separate JIRA. We always add 100 docs per 
batch and commit, but the message still shows 10 when not running nightly and 
100 when running nightly. That's a wrong log line; I thought putting it up now 
would clear up the confusion.

Yes, I did put nocommit, so that you can verify. Thanks :)


> CdcrBootstrapTest::testBootstrapWithContinousIndexingOnSourceCluster Failure
> ---
>
> Key: SOLR-11467
> URL: https://issues.apache.org/jira/browse/SOLR-11467
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: CDCR
> Affects Versions: 6.6, 6.6.1, 7.0, 7.0.1, 7.1
> Reporter: Amrit Sarkar
> Attachments: SOLR-11467-debug-code.log
>
>
> CdcrBootstrapTest is still failing in master and other branches with:
> {code}
> [junit4] FAILURE  130s J1 | 
> CdcrBootstrapTest.testBootstrapWithContinousIndexingOnSourceCluster <<<
>[junit4]> Throwable #1: java.lang.AssertionError: Document mismatch on 
> target after sync expected:<2000> but was:<1901>
>[junit4]>  at 
> __randomizedtesting.SeedInfo.seed([41753A7BCCA7C778:953071222BF17483]:0)
>[junit4]>  at 
> org.apache.solr.cloud.CdcrBootstrapTest.testBootstrapWithContinousIndexingOnSourceCluster(CdcrBootstrapTest.java:309)
>[junit4]>  at java.lang.Thread.run(Thread.java:748)
> {code}
> ref: https://jenkins.thetaphi.de/job/Lucene-Solr-7.0-Linux/423/
> From one of the failed Solr Jenkins logs:
>[junit4]   2> 1143166 INFO  
> (cdcr-replicator-4421-thread-1-processing-n:127.0.0.1:62832_solr 
> x:cdcr-source_shard1_replica_n1 s:shard1 c:cdcr-source r:core_node2) 
> [n:127.0.0.1:62832_solr c:cdcr-source s:shard1 r:core_node2 
> x:cdcr-source_shard1_replica_n1] o.a.s.h.CdcrReplicator Forwarded 991 updates 
> to target cdcr-target
>[junit4]   2> 1144176 INFO  
> (cdcr-replicator-4421-thread-1-processing-n:127.0.0.1:62832_solr 
> x:cdcr-source_shard1_replica_n1 s:shard1 c:cdcr-source r:core_node2) 
> [n:127.0.0.1:62832_solr c:cdcr-source s:shard1 r:core_node2 
> x:cdcr-source_shard1_replica_n1] o.a.s.h.CdcrReplicator Forwarded 909 updates 
> to target cdcr-target
>[junit4]   2> 1145118 INFO  
> (cdcr-replicator-4421-thread-1-processing-n:127.0.0.1:62832_solr 
> x:cdcr-source_shard1_replica_n1 s:shard1 c:cdcr-source r:core_node2) 
> [n:127.0.0.1:62832_solr c:cdcr-source s:shard1 r:core_node2 
> x:cdcr-source_shard1_replica_n1] o.a.s.h.CdcrReplicator Forwarded 0 updates 
> to target cdcr-target
> In total, 1900 updates were sent instead of 2000. Ideally the bootstrap process 
> is responsible for 1000 and normal CDCR replication is responsible for 1000. 
> On closer inspection, the bootstrap completed successfully; we are 100% sure 
> that bootstrap worked without any issue. Yet 1900 updates were still sent via 
> the replicator instead of 1000.






[jira] [Updated] (SOLR-11443) Remove the usage of workqueue for Overseer

2017-10-13 Thread Cao Manh Dat (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cao Manh Dat updated SOLR-11443:

Attachment: SOLR-11443.patch

Updated patch for master:
1. Adds a fallbackQueue concept. At startup, workQueue is treated as the 
fallbackQueue. It contains messages that need to be processed one by one: if 
a message causes an exception when writing the new clusterstate to ZK, it is 
considered a bad message and is polled out of the fallbackQueue.
2. After that, stateUpdateQueue is used as the fallbackQueue. Because we write 
in batches, if an exception is thrown when writing the new clusterstate we 
don't know which message is bad, so we go back to the beginning of the loop 
and do step 1.
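
The two steps above can be sketched roughly as follows. This is not the code from the patch: the class, the {{StateWriter}} interface, and the {{drain}} method are hypothetical names, messages are plain strings, and the write to ZK is simulated by a callback that throws when it sees a bad message.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

class FallbackQueueSketch {
    /** Hypothetical stand-in for writing a new clusterstate to ZK; throws if any message is bad. */
    interface StateWriter {
        void write(List<String> messages) throws Exception;
    }

    /** Drains both queues and returns the messages that were written successfully, in order. */
    static List<String> drain(Deque<String> workQueue, Deque<String> stateUpdateQueue, StateWriter writer) {
        List<String> applied = new ArrayList<>();
        Deque<String> fallbackQueue = workQueue; // at startup, workQueue is the fallbackQueue
        while (true) {
            // Step 1: process the fallbackQueue one message at a time.
            String msg;
            while ((msg = fallbackQueue.peek()) != null) {
                try {
                    writer.write(Collections.singletonList(msg));
                    applied.add(msg);
                } catch (Exception e) {
                    // The message failed on its own, so it is the bad one: skip it.
                }
                fallbackQueue.poll(); // remove it whether it succeeded or was bad
            }
            // Step 2: from now on stateUpdateQueue is the fallbackQueue, written as one batch.
            fallbackQueue = stateUpdateQueue;
            if (stateUpdateQueue.isEmpty()) {
                return applied;
            }
            List<String> batch = new ArrayList<>(stateUpdateQueue);
            try {
                writer.write(batch);
                applied.addAll(batch);
                stateUpdateQueue.clear();
            } catch (Exception e) {
                // The batch failed and we cannot tell which message is bad:
                // go back to the top of the loop and redo step 1 on this queue.
            }
        }
    }

    public static void main(String[] args) {
        Deque<String> work = new ArrayDeque<>(Arrays.asList("w1", "bad", "w2"));
        Deque<String> updates = new ArrayDeque<>(Arrays.asList("u1", "bad", "u2"));
        StateWriter writer = msgs -> {
            if (msgs.contains("bad")) throw new IllegalStateException("cannot write clusterstate");
        };
        System.out.println(drain(work, updates, writer));
    }
}
```

In this sketch a failed batch costs one extra pass over the queue, but the common case (no bad message) still needs only a single batched write.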

> Remove the usage of workqueue for Overseer
> --
>
> Key: SOLR-11443
> URL: https://issues.apache.org/jira/browse/SOLR-11443
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Cao Manh Dat
>Assignee: Cao Manh Dat
> Attachments: SOLR-11443.patch, SOLR-11443.patch, SOLR-11443.patch
>
>
> If we can remove the usage of workqueue, we can save a lot of IO blocking in 
> Overseer and hence boost performance a lot.






[jira] [Commented] (SOLR-11450) ComplexPhraseQParserPlugin not running charfilter for some multiterm queries in 6.x

2017-10-13 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203615#comment-16203615
 ] 

Adrien Grand commented on SOLR-11450:
-

Maybe Solr's classic analyzer is doing more than Lucene's. When I run the 
following code:

{code}
Analyzer analyzer = CustomAnalyzer.builder()
    .withTokenizer(StandardTokenizerFactory.class)
    .addTokenFilter(LowerCaseFilterFactory.class)
    .addTokenFilter(ASCIIFoldingFilterFactory.class)
    .build();
QueryParser qp = new QueryParser("field", analyzer);
Query query = qp.parse("König*");
System.out.println(query);
{code}

I get {{field:könig*}}, so the ASCII folding filter is definitely not applied. 
Is Solr maybe using AnalyzingQueryParser?

> ComplexPhraseQParserPlugin not running charfilter for some multiterm queries 
> in 6.x 
> 
>
> Key: SOLR-11450
> URL: https://issues.apache.org/jira/browse/SOLR-11450
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>Affects Versions: 6.6.1
>Reporter: Tim Allison
>Priority: Minor
>  Labels: patch-with-test
> Attachments: SOLR-11450-unit-test.patch, SOLR-11450.patch
>
>
> On the user list, [~bjarkebm] reported that the charfilter is not being 
> applied in PrefixQueries in the ComplexPhraseQParserPlugin in 6.x.  Bjarke 
> fixed my proposed unit tests to prove this failure. All appears to work in 
> 7.x and trunk. If there are plans to release a 6.6.2, let's fold this in.






[jira] [Commented] (SOLR-10335) Upgrade to Tika 1.16 when available

2017-10-13 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203609#comment-16203609
 ] 

Shalin Shekhar Mangar commented on SOLR-10335:
--

[~talli...@mitre.org] - It is likely that there will be a 6.6.2 due to the 
other vulnerabilities, so yes, we should backport this to branch_6x and 
branch_6_6 too.

> Upgrade to Tika 1.16 when available
> ---
>
> Key: SOLR-10335
> URL: https://issues.apache.org/jira/browse/SOLR-10335
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Tim Allison
>Assignee: Shalin Shekhar Mangar
>Priority: Minor
>
> Once POI 3.16-beta3 is out (early/mid April?), we'll push for a release of 
> Tika 1.15.
> Please let us know if you have any requests.






[jira] [Commented] (LUCENE-4100) Maxscore - Efficient Scoring

2017-10-13 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203603#comment-16203603
 ] 

Robert Muir commented on LUCENE-4100:
-

{quote}
Sorry I wrote needsTotalHits because this is the option name I used on the 
TopScoreDocCollector factory method, but on the weight we'd need something 
different. Because there are two situations in which you would need scores but 
not visit all matches:
{quote}

I'm not sure that's true. There is another case, right? There is the current 
sorted-index case today, but you have to be a rocket scientist (custom 
collector) to use it. Surely the API should handle that case? (It should "just 
work" from IndexSearcher.)

> Maxscore - Efficient Scoring
> 
>
> Key: LUCENE-4100
> URL: https://issues.apache.org/jira/browse/LUCENE-4100
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/codecs, core/query/scoring, core/search
>Affects Versions: 4.0-ALPHA
>Reporter: Stefan Pohl
>  Labels: api-change, gsoc2014, patch, performance
> Fix For: 4.9, 6.0
>
> Attachments: LUCENE-4100.patch, LUCENE-4100.patch, 
> contrib_maxscore.tgz, maxscore.patch
>
>
> At Berlin Buzzwords 2012, I will be presenting 'maxscore', an efficient 
> algorithm first published in the IR domain in 1995 by H. Turtle & J. Flood, 
> that I find deserves more attention among Lucene users (and developers).
> I implemented a proof of concept and did some performance measurements with 
> example queries and lucenebench, the package of Mike McCandless, resulting in 
> very significant speedups.
> This ticket is to get the discussion started on including the implementation 
> in Lucene's codebase. Because the technique requires awareness of it 
> from the Lucene user/developer, it seems best for it to become a contrib/module 
> package so that it can be chosen consciously.






[jira] [Commented] (LUCENE-4100) Maxscore - Efficient Scoring

2017-10-13 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203596#comment-16203596
 ] 

Adrien Grand commented on LUCENE-4100:
--

bq. Why doesn't it make sense? If i do a query, sorting by reverse time 
(recency), and retrieve the top 20, then i don't need scores, why do i need an 
exact hit count too? I think an approximation would suffice.

Sorry I wrote {{needsTotalHits}} because this is the option name I used on the 
TopScoreDocCollector factory method, but on the weight we'd need something 
different. Because there are two situations in which you would need scores but 
not visit all matches:
 - sorting by score
 - sorting by a field (potentially with early-termination) then score

Yet these cases are different, since we can only apply MAXSCORE in the first 
case. So what we need is something more like {{canSkipNonCompetitiveScores}}, 
but this implies that {{needsScores}} is true, so two booleans would allow 
illegal combinations. That is why I went with the enum.
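
To illustrate why an enum avoids the illegal combination, here is a minimal sketch. It is not the API from the patch: the enum, its constant names, and the {{forSort}} helper are hypothetical. Two booleans would allow "can skip non-competitive scores" without "needs scores"; an enum with three constants simply has no value for that state.

```java
class ScoreModeSketch {
    /** Hypothetical enum; the constant names are illustrative only. */
    enum Mode {
        NO_SCORES(false, false),  // e.g. sorting by a field only
        ALL_SCORES(true, false),  // scores needed, but hits cannot be skipped based on score
        TOP_SCORES(true, true);   // sorting by score: non-competitive hits may be skipped

        private final boolean needsScores;
        private final boolean canSkipNonCompetitiveScores;

        Mode(boolean needsScores, boolean canSkipNonCompetitiveScores) {
            this.needsScores = needsScores;
            this.canSkipNonCompetitiveScores = canSkipNonCompetitiveScores;
        }

        boolean needsScores() { return needsScores; }

        boolean canSkipNonCompetitiveScores() { return canSkipNonCompetitiveScores; }
    }

    /** Picks a mode from how the score participates in the sort (hypothetical helper). */
    static Mode forSort(boolean scoreIsPrimarySort, boolean scoreIsSecondarySort) {
        if (scoreIsPrimarySort) return Mode.TOP_SCORES;    // sorting by score
        if (scoreIsSecondarySort) return Mode.ALL_SCORES;  // sorting by a field, then score
        return Mode.NO_SCORES;
    }

    public static void main(String[] args) {
        for (Mode m : Mode.values()) {
            System.out.println(m + " needsScores=" + m.needsScores()
                + " canSkipNonCompetitiveScores=" + m.canSkipNonCompetitiveScores());
        }
    }
}
```

The sketch only models score-based skipping; early termination on a sorted index is a separate concern that it does not try to capture.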

bq. I think naively we want to base it on where we early terminate (as opposed 
to maxdoc) but i get the idea with many clauses. still, i think this estimate 
may be "good enough" because as you paginate, the estimate would get better?

OK I'll give this approach a go.







[jira] [Commented] (SOLR-11450) ComplexPhraseQParserPlugin not running charfilter for some multiterm queries in 6.x

2017-10-13 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16203573#comment-16203573
 ] 

Tim Allison commented on SOLR-11450:


[~jpountz], thank you for your response!

Yes, the changes in 7.x are fantastic.

Am I misunderstanding 6.x, though?  This test passes, which suggests that 
normalization was working correctly for the classic queryparser in 6.x, but not 
the cpqp.  Or am I misunderstanding?

If your point is that this would be a breaking change for some users of cpqp 
and it therefore doesn't belong in a bugfix release, I'm willing to accept that.

{noformat}
  @Test
  public void testCharFilter() {
    assertU(adoc("iso-latin1", "craezy traen", "id", "1"));
    assertU(commit());
    assertU(optimize());

    assertQ(req("q", "iso-latin1:cr\u00E6zy")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "iso-latin1:tr\u00E6n")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "iso-latin1:c\u00E6zy~1")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "iso-latin1:cr\u00E6z*")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "iso-latin1:*\u00E6zy")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "iso-latin1:cr\u00E6*y")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "iso-latin1:/cr\u00E6[a-z]y/")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "iso-latin1:[cr\u00E6zx TO cr\u00E6zz]")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );
  }
{noformat}






