[jira] [Commented] (SOLR-12167) Child documents are ignored if unknown atomic operation specified in parent doc

2018-04-02 Thread Lucene/Solr QA (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423506#comment-16423506
 ] 

Lucene/Solr QA commented on SOLR-12167:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Release audit (RAT) {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Check forbidden APIs {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} Validate source patterns {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 51m 22s{color} | {color:red} core in the patch failed. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 54m 49s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | solr.cloud.TestLeaderElectionZkExpiry |
|   | solr.cloud.cdcr.CdcrBidirectionalTest |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-12167 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12917198/SOLR-12167.patch |
| Optional Tests | compile javac unit ratsources checkforbiddenapis validatesourcepatterns |
| uname | Linux lucene1-us-west 3.13.0-88-generic #135-Ubuntu SMP Wed Jun 8 21:10:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | ant |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-SOLR-Build/sourcedir/dev-tools/test-patch/lucene-solr-yetus-personality.sh |
| git revision | master / 2c1f110 |
| ant | version: Apache Ant(TM) version 1.9.3 compiled on April 8 2014 |
| Default Java | 1.8.0_152 |
| unit | https://builds.apache.org/job/PreCommit-SOLR-Build/34/artifact/out/patch-unit-solr_core.txt |
| Test Results | https://builds.apache.org/job/PreCommit-SOLR-Build/34/testReport/ |
| modules | C: solr/core U: solr/core |
| Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/34/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> Child documents are ignored if unknown atomic operation specified in parent 
> doc
> ---
>
> Key: SOLR-12167
> URL: https://issues.apache.org/jira/browse/SOLR-12167
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: update
>Reporter: Munendra S N
>Priority: Major
> Attachments: SOLR-12167.patch
>
>
> On trying to add this nested document,
> {code:java}
> {uniqueId : book6, type_s:book, title_t : "The Way of Kings", author_s : 
> "Brandon Sanderson",
>   cat_s:fantasy, pubyear_i:2010, publisher_s:Tor, parent_unbxd:true,
>   _childDocuments_ : [
> { uniqueId: book6_c1, type_s:review, 
> review_dt:"2015-01-03T14:30:00Z",parentId : book6,
>   stars_i:5, author_s:rahul,
>   comment_t:"A great start to what looks like an epic series!"
> }
> ,
> { uniqueId: book6_c2, type_s:review, 
> review_dt:"2014-03-15T12:00:00Z",parentId : book6,
>   stars_i:3, author_s:arpan,
>   comment_t:"This book was too long."
> }
>   ],labelinfo:{label_image:"",hotdeal_type:"",apply_hotdeal:""}
>  }
> {code}
> Only the parent document is getting indexed (without the labelinfo field) and 
> the child documents are being ignored.
> On checking the code,
> https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/update/processor/AtomicUpdateDocumentMerger.java#L94
>  
> I realized that since *labelinfo* is a Map, Solr attempts an atomic update, 
> and since label_image, hotdeal_type, and apply_hotdeal are invalid operations, 
> the field is ignored. Unfortunately, the child documents also do not get indexed.
> h4. Problem with current behavior:
> * the field is silently ignored when its value is a map, instead of failing 
> the document update (when present in the parent)
> * In the above 
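
The silent-drop behavior described above can be illustrated with a small standalone sketch. The operation set, class name, and method name here are simplified assumptions for illustration, not Solr's actual AtomicUpdateDocumentMerger code:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class AtomicOpCheck {
    // Illustrative subset of the atomic-update operations Solr recognizes.
    static final Set<String> VALID_OPS =
            new HashSet<>(Arrays.asList("set", "add", "remove", "removeregex", "inc"));

    // Simplified version of the heuristic: any Map-valued field is assumed to
    // be an atomic-update spec. Failing fast on an unknown operation key
    // (instead of silently dropping the field) surfaces the problem.
    static boolean isAtomicUpdateSpec(Object fieldValue) {
        if (!(fieldValue instanceof Map)) {
            return false;
        }
        for (Object op : ((Map<?, ?>) fieldValue).keySet()) {
            if (!VALID_OPS.contains(op)) {
                throw new IllegalArgumentException("Unknown atomic operation: " + op);
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> labelinfo = new HashMap<>();
        labelinfo.put("label_image", ""); // not a valid atomic op
        try {
            isAtomicUpdateSpec(labelinfo);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a fail-fast check of this shape, the labelinfo field above would produce a clear error instead of being silently dropped along with the child documents.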

[JENKINS] Lucene-Solr-master-Linux (64bit/jdk1.8.0_162) - Build # 21746 - Unstable!

2018-04-02 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/21746/
Java: 64bit/jdk1.8.0_162 -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC

4 tests failed.
FAILED:  
org.apache.solr.handler.admin.SegmentsInfoRequestHandlerTest.testSegmentInfosVersion

Error Message:
Exception during query

Stack Trace:
java.lang.RuntimeException: Exception during query
at 
__randomizedtesting.SeedInfo.seed([8E651BAF9CB40369:76BB8E47E4D8D23A]:0)
at org.apache.solr.SolrTestCaseJ4.assertQ(SolrTestCaseJ4.java:914)
at 
org.apache.solr.handler.admin.SegmentsInfoRequestHandlerTest.testSegmentInfosVersion(SegmentsInfoRequestHandlerTest.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: REQUEST FAILED: 
xpath=2=count(//lst[@name='segments']/lst/str[@name='version'][.='8.0.0'])
xml response was: 


[jira] [Commented] (SOLR-12172) Race condition in collection properties can cause invalid cache of properties

2018-04-02 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423500#comment-16423500
 ] 

Shalin Shekhar Mangar commented on SOLR-12172:
--

[~tomasflobbe] -- I don't think we should introduce another thread (pool) just 
for this feature. We can use a method similar to updateWatchedCollection, which 
checks whether the new znode version is greater than the old one. This ensures 
that we replace the old collection props only if the new one is actually newer.
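
A version-guarded replacement of the cached properties, in the spirit of the suggestion above, might look like this sketch (the class and method names are hypothetical, not Solr's actual code):

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

public class VersionedPropsCache {
    // A collection-properties snapshot tagged with the znode version it was read at.
    public static final class Snapshot {
        public final int znodeVersion;
        public final Map<String, String> props;
        public Snapshot(int znodeVersion, Map<String, String> props) {
            this.znodeVersion = znodeVersion;
            this.props = Collections.unmodifiableMap(props);
        }
    }

    private final AtomicReference<Snapshot> cache = new AtomicReference<>();

    // Install the incoming snapshot only if its znode version is strictly
    // newer than the cached one, so a racing stale read can never overwrite
    // a fresher value (the updateWatchedCollection-style guard).
    public boolean offer(Snapshot incoming) {
        while (true) {
            Snapshot current = cache.get();
            if (current != null && incoming.znodeVersion <= current.znodeVersion) {
                return false; // stale; keep the newer cached snapshot
            }
            if (cache.compareAndSet(current, incoming)) {
                return true;
            }
        }
    }

    public Snapshot current() {
        return cache.get();
    }
}
```

The compare-and-set loop makes the guard safe without a dedicated thread: whichever reader wins the race, the cache can only move forward in znode version.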

> Race condition in collection properties can cause invalid cache of properties
> -
>
> Key: SOLR-12172
> URL: https://issues.apache.org/jira/browse/SOLR-12172
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Tomás Fernández Löbbe
>Assignee: Tomás Fernández Löbbe
>Priority: Minor
> Fix For: 7.4, master (8.0)
>
> Attachments: SOLR-12172.patch
>
>
> From: https://builds.apache.org/job/Lucene-Solr-BadApples-Tests-master/24
> {noformat}
> java.lang.AssertionError: Could not see value change after setting collection 
> property. Name: property2, current value: value2, expected value: newValue
>   at 
> __randomizedtesting.SeedInfo.seed([1BCE6473A2A5E68A:FD89A9BD30939A79]:0)
>   at org.junit.Assert.fail(Assert.java:93)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.waitForValue(CollectionPropsTest.java:146)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.testReadWriteCached(CollectionPropsTest.java:115){noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8234) GeoStandardCircle can compute wrongly the spatial relationship when covering the whole world

2018-04-02 Thread Ignacio Vera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera updated LUCENE-8234:
-
Summary: GeoStandardCircle can compute wrongly the spatial relationship 
when covering the whole world  (was: GeoStandard circle can compute wrongly the 
spatial relationship when covering the whole world)

> GeoStandardCircle can compute wrongly the spatial relationship when covering 
> the whole world
> 
>
> Key: LUCENE-8234
> URL: https://issues.apache.org/jira/browse/LUCENE-8234
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Minor
> Attachments: LUCENE-8234.patch
>
>
> GeoStandardCircle computes the wrong spatial relationship with another shape 
> when both it and the provided shape cover the whole world.






[jira] [Commented] (LUCENE-8234) GeoStandard circle can compute wrongly the spatial relationship when covering the whole world

2018-04-02 Thread Ignacio Vera (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423448#comment-16423448
 ] 

Ignacio Vera commented on LUCENE-8234:
--

Attached the fix with a test.

> GeoStandard circle can compute wrongly the spatial relationship when covering 
> the whole world
> -
>
> Key: LUCENE-8234
> URL: https://issues.apache.org/jira/browse/LUCENE-8234
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Priority: Minor
> Attachments: LUCENE-8234.patch
>
>
> GeoStandardCircle computes the wrong spatial relationship with another shape 
> when both it and the provided shape cover the whole world.






[jira] [Assigned] (LUCENE-8234) GeoStandard circle can compute wrongly the spatial relationship when covering the whole world

2018-04-02 Thread Ignacio Vera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera reassigned LUCENE-8234:


Assignee: Ignacio Vera

> GeoStandard circle can compute wrongly the spatial relationship when covering 
> the whole world
> -
>
> Key: LUCENE-8234
> URL: https://issues.apache.org/jira/browse/LUCENE-8234
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Assignee: Ignacio Vera
>Priority: Minor
> Attachments: LUCENE-8234.patch
>
>
> GeoStandardCircle computes the wrong spatial relationship with another shape 
> when both it and the provided shape cover the whole world.






[jira] [Updated] (LUCENE-8234) GeoStandard circle can compute wrongly the spatial relationship when covering the whole world

2018-04-02 Thread Ignacio Vera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera updated LUCENE-8234:
-
Attachment: LUCENE-8234.patch

> GeoStandard circle can compute wrongly the spatial relationship when covering 
> the whole world
> -
>
> Key: LUCENE-8234
> URL: https://issues.apache.org/jira/browse/LUCENE-8234
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Priority: Minor
> Attachments: LUCENE-8234.patch
>
>
> GeoStandardCircle computes the wrong spatial relationship with another shape 
> when both it and the provided shape cover the whole world.






[jira] [Updated] (LUCENE-8234) GeoStandard circle can compute wrongly the spatial relationship when covering the whole world

2018-04-02 Thread Ignacio Vera (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ignacio Vera updated LUCENE-8234:
-
Component/s: modules/spatial3d

> GeoStandard circle can compute wrongly the spatial relationship when covering 
> the whole world
> -
>
> Key: LUCENE-8234
> URL: https://issues.apache.org/jira/browse/LUCENE-8234
> Project: Lucene - Core
>  Issue Type: Bug
>  Components: modules/spatial3d
>Reporter: Ignacio Vera
>Priority: Minor
> Attachments: LUCENE-8234.patch
>
>
> GeoStandardCircle computes the wrong spatial relationship with another shape 
> when both it and the provided shape cover the whole world.






[jira] [Created] (LUCENE-8234) GeoStandard circle can compute wrongly the spatial relationship when covering the whole world

2018-04-02 Thread Ignacio Vera (JIRA)
Ignacio Vera created LUCENE-8234:


 Summary: GeoStandard circle can compute wrongly the spatial 
relationship when covering the whole world
 Key: LUCENE-8234
 URL: https://issues.apache.org/jira/browse/LUCENE-8234
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Ignacio Vera


GeoStandardCircle computes the wrong spatial relationship with another shape 
when both it and the provided shape cover the whole world.
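
A standalone sketch of the degenerate case, with an illustrative enum and guard that are assumptions rather than spatial3d's actual GeoArea code: when both shapes span the entire sphere they describe the same point set, so the relationship should resolve to containment, never to DISJOINT.

```java
public class WorldRelation {
    enum Relation { CONTAINS, WITHIN, OVERLAPS, DISJOINT }

    // Hypothetical guard for the whole-world degenerate case: a shape that
    // covers the whole sphere contains every other shape, and if the other
    // shape also covers the whole sphere the two areas are identical.
    static Relation relate(boolean thisCoversWorld, boolean otherCoversWorld) {
        if (thisCoversWorld) {
            return Relation.CONTAINS; // identical areas when otherCoversWorld too
        }
        if (otherCoversWorld) {
            return Relation.WITHIN;
        }
        return Relation.OVERLAPS; // placeholder: real geometry checks go here
    }

    public static void main(String[] args) {
        System.out.println(relate(true, true));
    }
}
```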






Re: Welcome to the PMC

2018-04-02 Thread Đạt Cao Mạnh
Thanks, everyone!!


On Tue, Apr 3, 2018 at 10:17 AM David Smiley 
wrote:

> Welcome!
>
> On Mon, Apr 2, 2018 at 3:49 PM Adrien Grand  wrote:
>
>> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
>> invitation to join.
>>
>> Welcome Đạt!
>>
> --
> Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
> LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
> http://www.solrenterprisesearchserver.com
>


[JENKINS] Lucene-Solr-SmokeRelease-master - Build # 994 - Still Failing

2018-04-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/994/

No tests ran.

Build Log:
[...truncated 30106 lines...]
prepare-release-no-sign:
[mkdir] Created dir: 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist
 [copy] Copying 491 files to 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist/lucene
 [copy] Copying 230 files to 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist/solr
   [smoker] Java 1.8 JAVA_HOME=/home/jenkins/tools/java/latest1.8
   [smoker] Java 9 JAVA_HOME=/home/jenkins/tools/java/latest1.9
   [smoker] NOTE: output encoding is UTF-8
   [smoker] 
   [smoker] Load release URL 
"file:/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/dist/"...
   [smoker] 
   [smoker] Test Lucene...
   [smoker]   test basics...
   [smoker]   get KEYS
   [smoker] 0.2 MB in 0.01 sec (17.7 MB/sec)
   [smoker]   check changes HTML...
   [smoker]   download lucene-8.0.0-src.tgz...
   [smoker] 30.3 MB in 0.03 sec (1112.5 MB/sec)
   [smoker] verify sha1/sha512 digests
   [smoker]   download lucene-8.0.0.tgz...
   [smoker] 74.2 MB in 0.07 sec (1119.4 MB/sec)
   [smoker] verify sha1/sha512 digests
   [smoker]   download lucene-8.0.0.zip...
   [smoker] 84.7 MB in 0.07 sec (1144.9 MB/sec)
   [smoker] verify sha1/sha512 digests
   [smoker]   unpack lucene-8.0.0.tgz...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] test demo with 1.8...
   [smoker]   got 6261 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] test demo with 9...
   [smoker]   got 6261 hits for query "lucene"
   [smoker] checkindex with 9...
   [smoker] check Lucene's javadoc JAR
   [smoker]   unpack lucene-8.0.0.zip...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] test demo with 1.8...
   [smoker]   got 6261 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] test demo with 9...
   [smoker]   got 6261 hits for query "lucene"
   [smoker] checkindex with 9...
   [smoker] check Lucene's javadoc JAR
   [smoker]   unpack lucene-8.0.0-src.tgz...
   [smoker] make sure no JARs/WARs in src dist...
   [smoker] run "ant validate"
   [smoker] run tests w/ Java 8 and testArgs='-Dtests.badapples=false 
-Dtests.slow=false'...
   [smoker] test demo with 1.8...
   [smoker]   got 214 hits for query "lucene"
   [smoker] checkindex with 1.8...
   [smoker] generate javadocs w/ Java 8...
   [smoker] 
   [smoker] Crawl/parse...
   [smoker] 
   [smoker] Verify...
   [smoker] run tests w/ Java 9 and testArgs='-Dtests.badapples=false 
-Dtests.slow=false'...
   [smoker] test demo with 9...
   [smoker]   got 214 hits for query "lucene"
   [smoker] checkindex with 9...
   [smoker]   confirm all releases have coverage in TestBackwardsCompatibility
   [smoker] find all past Lucene releases...
   [smoker] run TestBackwardsCompatibility..
   [smoker] success!
   [smoker] 
   [smoker] Test Solr...
   [smoker]   test basics...
   [smoker]   get KEYS
   [smoker] 0.2 MB in 0.00 sec (287.6 MB/sec)
   [smoker]   check changes HTML...
   [smoker]   download solr-8.0.0-src.tgz...
   [smoker] 53.8 MB in 0.06 sec (939.5 MB/sec)
   [smoker] verify sha1/sha512 digests
   [smoker]   download solr-8.0.0.tgz...
   [smoker] 157.9 MB in 0.15 sec (1047.9 MB/sec)
   [smoker] verify sha1/sha512 digests
   [smoker]   download solr-8.0.0.zip...
   [smoker] 158.9 MB in 0.15 sec (1080.9 MB/sec)
   [smoker] verify sha1/sha512 digests
   [smoker]   unpack solr-8.0.0.tgz...
   [smoker] verify JAR metadata/identity/no javax.* or java.* classes...
   [smoker] unpack lucene-8.0.0.tgz...
   [smoker]   **WARNING**: skipping check of 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/tmp/unpack/solr-8.0.0/contrib/dataimporthandler-extras/lib/javax.mail-1.5.1.jar:
 it has javax.* classes
   [smoker]   **WARNING**: skipping check of 
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/tmp/unpack/solr-8.0.0/contrib/dataimporthandler-extras/lib/activation-1.1.1.jar:
 it has javax.* classes
   [smoker] copying unpacked distribution for Java 8 ...
   [smoker] test solr example w/ Java 8...
   [smoker]   start Solr instance 
(log=/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-SmokeRelease-master/lucene/build/smokeTestRelease/tmp/unpack/solr-8.0.0-java8/solr-example.log)...
   [smoker] No process found for Solr node running on port 8983
   [smoker]   Running techproducts example on port 8983 from 

Re: Welcome to the PMC

2018-04-02 Thread David Smiley
Welcome!

On Mon, Apr 2, 2018 at 3:49 PM Adrien Grand  wrote:

> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
> invitation to join.
>
> Welcome Đạt!
>
-- 
Lucene/Solr Search Committer, Consultant, Developer, Author, Speaker
LinkedIn: http://linkedin.com/in/davidwsmiley | Book:
http://www.solrenterprisesearchserver.com


Re: Welcome to the PMC

2018-04-02 Thread Shai Erera
Welcome!

On Tue, Apr 3, 2018, 01:22 Mark Miller  wrote:

> Welcome!
> On Mon, Apr 2, 2018 at 3:49 PM Adrien Grand  wrote:
>
>> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
>> invitation to join.
>>
>> Welcome Đạt!
>>
> --
> - Mark
> about.me/markrmiller
>


[JENKINS] Lucene-Solr-master-Windows (32bit/jdk1.8.0_144) - Build # 7251 - Still Unstable!

2018-04-02 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Windows/7251/
Java: 32bit/jdk1.8.0_144 -client -XX:+UseConcMarkSweepGC

17 tests failed.
FAILED:  
org.apache.solr.handler.dataimport.TestSolrEntityProcessorEndToEnd.testFullImportFqParam

Error Message:
Could not remove the following files (in the order of attempts):
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\contrib\solr-dataimporthandler\test\J1\temp\solr.handler.dataimport.TestSolrEntityProcessorEndToEnd_C5DA3F73870AE57C-001\tempDir-007:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\contrib\solr-dataimporthandler\test\J1\temp\solr.handler.dataimport.TestSolrEntityProcessorEndToEnd_C5DA3F73870AE57C-001\tempDir-007
 

Stack Trace:
java.io.IOException: Could not remove the following files (in the order of 
attempts):
   
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\contrib\solr-dataimporthandler\test\J1\temp\solr.handler.dataimport.TestSolrEntityProcessorEndToEnd_C5DA3F73870AE57C-001\tempDir-007:
 java.nio.file.DirectoryNotEmptyException: 
C:\Users\jenkins\workspace\Lucene-Solr-master-Windows\solr\build\contrib\solr-dataimporthandler\test\J1\temp\solr.handler.dataimport.TestSolrEntityProcessorEndToEnd_C5DA3F73870AE57C-001\tempDir-007

at 
__randomizedtesting.SeedInfo.seed([C5DA3F73870AE57C:34918180174A9249]:0)
at org.apache.lucene.util.IOUtils.rm(IOUtils.java:318)
at 
org.apache.solr.handler.dataimport.TestSolrEntityProcessorEndToEnd$SolrInstance.tearDown(TestSolrEntityProcessorEndToEnd.java:360)
at 
org.apache.solr.handler.dataimport.TestSolrEntityProcessorEndToEnd.tearDown(TestSolrEntityProcessorEndToEnd.java:142)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:992)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (SOLR-9852) Solr JDBC doesn't implement columns' metadata

2018-04-02 Thread Kevin Risden (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-9852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423399#comment-16423399
 ] 

Kevin Risden commented on SOLR-9852:


The ODBC bridge I got working is detailed here: 
[https://github.com/risdenk/solrj-jdbc-testing/tree/master/odbc]

 

Slide 20 of 
[https://www.slideshare.net/lucidworks/solr-jdbc-presented-by-kevin-risden-avalon-consulting]
 has some detail. The recording is here: 
https://www.youtube.com/watch?v=XpWomATSKzM

> Solr JDBC doesn't implement columns' metadata
> -
>
> Key: SOLR-9852
> URL: https://issues.apache.org/jira/browse/SOLR-9852
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Parallel SQL
>Affects Versions: 6.3
> Environment: N/A
>Reporter: Rani Y.
>Priority: Major
>  Labels: solrJ
>
> This is the error I get (from Squirrel SQL) while trying to get the objects, 
> meaning both table and column metadata:
> 2016-12-12 13:47:48,241 [Thread-2] ERROR 
> net.sourceforge.squirrel_sql.client.session.schemainfo.SchemaInfo  - Error 
> occurred creating data types collection
> java.lang.UnsupportedOperationException
>   at 
> org.apache.solr.client.solrj.io.sql.DatabaseMetaDataImpl.getTypeInfo(DatabaseMetaDataImpl.java:773)
>   at 
> net.sourceforge.squirrel_sql.fw.sql.SQLDatabaseMetaData.getDataTypesSimpleNames(SQLDatabaseMetaData.java:1978)
>   at 
> net.sourceforge.squirrel_sql.client.session.schemainfo.SchemaInfo.loadDataTypes(SchemaInfo.java:900)
>   at 
> net.sourceforge.squirrel_sql.client.session.schemainfo.SchemaInfo.privateLoadAll(SchemaInfo.java:315)
>   at 
> net.sourceforge.squirrel_sql.client.session.schemainfo.SchemaInfo.reloadAll(SchemaInfo.java:208)
>   at 
> net.sourceforge.squirrel_sql.client.session.schemainfo.SchemaInfo.reloadAll(SchemaInfo.java:198)
>   at 
> net.sourceforge.squirrel_sql.client.session.mainpanel.objecttree.ObjectTree$3.run(ObjectTree.java:315)
>   at 
> net.sourceforge.squirrel_sql.fw.util.TaskExecuter.run(TaskExecuter.java:82)
>   at java.lang.Thread.run(Thread.java:745)
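
Clients can defend against drivers that leave metadata calls unimplemented. Below is a sketch in which a dynamic proxy stands in for a driver, like Solr JDBC here, whose DatabaseMetaData throws UnsupportedOperationException; the helper names are illustrative, not Solr or SQuirreL code:

```java
import java.lang.reflect.Proxy;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;

public class TypeInfoProbe {
    // Treat "metadata not implemented" the same as "no type info": tools like
    // the SQL client above crash because they expect getTypeInfo() to work.
    static int countTypes(DatabaseMetaData md) {
        try (ResultSet rs = md.getTypeInfo()) {
            int n = 0;
            while (rs.next()) {
                n++;
            }
            return n;
        } catch (SQLException | UnsupportedOperationException e) {
            return -1; // driver does not expose type info
        }
    }

    // Stand-in for a driver whose DatabaseMetaData throws on every call.
    static DatabaseMetaData unimplementedMetaData() {
        return (DatabaseMetaData) Proxy.newProxyInstance(
                DatabaseMetaData.class.getClassLoader(),
                new Class<?>[] {DatabaseMetaData.class},
                (proxy, method, args) -> {
                    throw new UnsupportedOperationException(method.getName());
                });
    }

    public static void main(String[] args) {
        System.out.println(countTypes(unimplementedMetaData())); // prints -1
    }
}
```

On the driver side, the cleaner fix is for getTypeInfo() to return an empty ResultSet or throw SQLFeatureNotSupportedException, which JDBC tools generally handle, rather than an unchecked UnsupportedOperationException.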






Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Yonik Seeley
Congrats, Đạt!

-Yonik


On Mon, Apr 2, 2018 at 3:50 PM, Adrien Grand  wrote:
> Fixing the subject of the email.
>
> Le lun. 2 avr. 2018 à 21:48, Adrien Grand  a écrit :
>>
>> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
>> invitation to join.
>>
>> Welcome Đạt!




[JENKINS] Lucene-Solr-repro - Build # 415 - Still Unstable

2018-04-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-repro/415/

[...truncated 28 lines...]
[repro] Jenkins log URL: 
https://builds.apache.org/job/Lucene-Solr-NightlyTests-master/1518/consoleText

[repro] Revision: a4789db47788daeef0ba2ab426b4047d2fa47070

[repro] Ant options: -Dtests.multiplier=2 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
[repro] Repro line:  ant test  -Dtestcase=TestReplicationHandler 
-Dtests.method=doTestReplicateAfterCoreReload -Dtests.seed=A81891ADF121A048 
-Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
 -Dtests.locale=es-AR -Dtests.timezone=America/Argentina/Catamarca 
-Dtests.asserts=true -Dtests.file.encoding=UTF-8

[repro] Repro line:  ant test  -Dtestcase=FullSolrCloudDistribCmdsTest 
-Dtests.method=test -Dtests.seed=A81891ADF121A048 -Dtests.multiplier=2 
-Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
 -Dtests.locale=ar-AE -Dtests.timezone=Canada/Newfoundland -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8

[repro] Repro line:  ant test  -Dtestcase=FullSolrCloudDistribCmdsTest 
-Dtests.seed=A81891ADF121A048 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
 -Dtests.locale=ar-AE -Dtests.timezone=Canada/Newfoundland -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8

[repro] Repro line:  ant test  -Dtestcase=SuggesterWFSTTest 
-Dtests.seed=A81891ADF121A048 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
 -Dtests.locale=ar-AE -Dtests.timezone=Canada/Newfoundland -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8

[repro] git rev-parse --abbrev-ref HEAD
[repro] git rev-parse HEAD
[repro] Initial local git branch/revision: 
41a1cbe2c337d2415a0e52415f43c7aba1059fb8
[repro] git fetch
[repro] git checkout a4789db47788daeef0ba2ab426b4047d2fa47070

[...truncated 2 lines...]
[repro] git merge --ff-only

[...truncated 1 lines...]
[repro] ant clean

[...truncated 6 lines...]
[repro] Test suites by module:
[repro]solr/core
[repro]   SuggesterWFSTTest
[repro]   FullSolrCloudDistribCmdsTest
[repro]   TestReplicationHandler
[repro] ant compile-test

[...truncated 3296 lines...]
[repro] ant test-nocompile -Dtests.dups=5 -Dtests.maxfailures=15 
-Dtests.class="*.SuggesterWFSTTest|*.FullSolrCloudDistribCmdsTest|*.TestReplicationHandler"
 -Dtests.showOutput=onerror -Dtests.multiplier=2 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
 -Dtests.seed=A81891ADF121A048 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
 -Dtests.locale=ar-AE -Dtests.timezone=Canada/Newfoundland -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8

[...truncated 26094 lines...]
   [junit4] ERROR: JVM J0 ended with an exception, command line: 
/usr/local/asfpackages/java/jdk1.8.0_152/jre/bin/java -ea -esa 
-Dtests.prefix=tests -Dtests.seed=A81891ADF121A048 -Xmx512M -Dtests.iters= 
-Dtests.verbose=false -Dtests.infostream=false -Dtests.codec=random 
-Dtests.postingsformat=random -Dtests.docvaluesformat=random 
-Dtests.locale=ar-AE -Dtests.timezone=Canada/Newfoundland 
-Dtests.directory=random 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-master/test-data/enwiki.random.lines.txt
 -Dtests.luceneMatchVersion=8.0.0 -Dtests.cleanthreads=perClass 
-Djava.util.logging.config.file=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-repro/lucene/tools/junit4/logging.properties
 -Dtests.nightly=true -Dtests.weekly=false -Dtests.monster=false 
-Dtests.slow=true -Dtests.asserts=true -Dtests.multiplier=2 -DtempDir=./temp 
-Djava.io.tmpdir=./temp 
-Djunit4.tempDir=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-repro/solr/build/solr-core/test/temp
 -Dcommon.dir=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-repro/lucene 
-Dclover.db.dir=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-repro/lucene/build/clover/db
 
-Djava.security.policy=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-repro/lucene/tools/junit4/solr-tests.policy
 -Dtests.LUCENE_VERSION=8.0.0 -Djetty.testMode=1 -Djetty.insecurerandom=1 
-Dsolr.directoryFactory=org.apache.solr.core.MockDirectoryFactory 
-Djava.awt.headless=true -Djdk.map.althashing.threshold=0 
-Dtests.src.home=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-repro/solr/core
 -Djava.security.egd=file:/dev/./urandom 

Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Đạt Cao Mạnh
Thanks, everyone!!


On Tue, Apr 3, 2018 at 5:44 AM Joel Bernstein  wrote:

> Welcome Dat!
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Mon, Apr 2, 2018 at 6:42 PM, Tomas Fernandez Lobbe 
> wrote:
>
>> Welcome Đạt!
>>
>>
>> On Apr 2, 2018, at 2:59 PM, Christian Moen  wrote:
>>
>> Congrats, Đạt!
>>
>> On Mon, Apr 2, 2018 at 11:48 PM Varun Thacker  wrote:
>>
>>> Congratulations and welcome Dat!
>>>
>>> On Mon, Apr 2, 2018 at 12:50 PM, Adrien Grand  wrote:
>>>
 Fixing the subject of the email.

 On Mon, Apr 2, 2018 at 21:48, Adrien Grand  wrote:

> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
> invitation to join.
>
> Welcome Đạt!
>

>>>
>>
>


[jira] [Commented] (LUCENE-8233) Add support for soft deletes to IndexWriter delete accounting

2018-04-02 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423361#comment-16423361
 ] 

Robert Muir commented on LUCENE-8233:
-

I think the overall idea is interesting. So the user's "API" is to just 
indicate a field name to IndexWriter to be used for soft deletes, similar to 
using an active=Y/N field for a relational database or whatever. I think that's 
intuitive.

The main thing confusing me is the exact docs around that:

{quote}
Returns the field that should be used to find soft deletes. If soft deletes are 
used all documents that have a doc values value in this field are treated as 
deleted. The default is null.
{quote}

Can we expand the doc on this to explain it a bit more for a typical use case? 
e.g.:
* how to soft-delete a doc (fairly obvious)
* how to undelete (this part is not obvious to me at the moment)
* how to configure a reasonable merge policy, say with a 7-day retention of 
soft deletes, or some other reasonable example (there is a test for this case, 
but it's trying to really exercise the merge policy and is difficult to follow 
as an example)
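To make the semantics being discussed concrete: a minimal self-contained sketch of the accounting model, in plain Java. This is an illustration only, not the Lucene API; the field name, timestamps, and retention rule are invented for the example. A doc is soft-deleted when it carries a doc value in the designated field, an undelete just removes that value again, and a retention merge policy keeps soft-deleted docs the retention predicate still matches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.LongPredicate;

// Toy model of soft-delete accounting: a doc counts as soft-deleted when it
// has a doc value in the configured soft-deletes field; "undelete" removes
// that value again. Field name and retention rule are made up for the sketch.
public class SoftDeleteSketch {
    static final String SOFT_DELETES_FIELD = "__soft_deletes";
    // docId -> (fieldName -> docValue); the soft-deletes value is a timestamp here
    static final Map<Integer, Map<String, Long>> docs = new HashMap<>();

    static void addDoc(int id) { docs.put(id, new HashMap<String, Long>()); }
    static void softDelete(int id, long ts) { docs.get(id).put(SOFT_DELETES_FIELD, ts); }
    static void undelete(int id) { docs.get(id).remove(SOFT_DELETES_FIELD); }

    static long liveDocs() {
        return docs.values().stream().filter(v -> !v.containsKey(SOFT_DELETES_FIELD)).count();
    }

    // A retention merge policy keeps live docs plus soft-deleted docs that the
    // retention predicate (the "user provided query") still matches.
    static Set<Integer> afterMerge(LongPredicate retain) {
        Set<Integer> kept = new TreeSet<>();
        for (Map.Entry<Integer, Map<String, Long>> e : docs.entrySet()) {
            Long ts = e.getValue().get(SOFT_DELETES_FIELD);
            if (ts == null || retain.test(ts)) kept.add(e.getKey());
        }
        return kept;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) addDoc(i);
        long now = 1000;
        softDelete(1, now - 10);   // deleted recently: retained by the policy below
        softDelete(2, now - 500);  // deleted long ago: reclaimed on merge
        softDelete(3, now - 10);
        undelete(3);               // removing the doc value makes the doc live again
        System.out.println("live=" + liveDocs());
        System.out.println("afterMerge=" + afterMerge(ts -> now - ts < 100));
        // prints live=2 and afterMerge=[0, 1, 3]
    }
}
```

In this reading, "undelete" is just deleting the doc value, which is the part the javadoc quoted above leaves implicit.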



>  Add support for soft deletes to IndexWriter delete accounting
> --
>
> Key: LUCENE-8233
> URL: https://issues.apache.org/jira/browse/LUCENE-8233
> Project: Lucene - Core
>  Issue Type: Improvement
>Affects Versions: 7.4, master (8.0)
>Reporter: Simon Willnauer
>Priority: Major
> Fix For: 7.4, master (8.0)
>
> Attachments: LUCENE-8233.patch, LUCENE-8233.patch
>
>
> This change adds support for soft deletes as a fully supported feature of the 
> index writer. Soft deletes are accounted for inside the index writer and 
> therefore also by merge policies.
> 
> This change also adds a SoftDeletesRetentionMergePolicy that allows users to 
> selectively carry over soft-deleted documents across merges for retention 
> policies. The merge policy selects documents that should be kept around in 
> the merged segment based on a user-provided query.






[jira] [Commented] (LUCENE-8231) Nori, a Korean analyzer based on mecab-ko-dic

2018-04-02 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423345#comment-16423345
 ] 

Robert Muir commented on LUCENE-8231:
-

Hi Jim, the latest changes look great. Thanks for optimizing it more!

Do you think there is an easy way to preserve the original compound with the 
decompound filter? Purely from memory, I think Kuromoji may do this by 
default, and maybe even our German decompounders too. From what I remember of 
relevance experiments in other languages, it seems like a pretty practical 
investment, especially since Lucene's terms dict is good. 

Mainly the concern I have is to have some handling for errors in the 
decompounding process. For this analyzer the representation of the model with 
respect to compounds is a little awkward, and I worry about OOV cases. So at 
least preserving the original can assist with that a bit. Might be an easy win.
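For readers unfamiliar with the "preserve the original" idea: a tiny sketch of the token-stream behavior being requested, in plain Java. This is a simplified illustration, not the Nori or Kuromoji API; the terms and the split are invented. The compound is emitted at the same position as its first part (position increment 0), so both forms match at query time, and a failed decompounding still leaves the original term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy sketch of "keep the original compound" during decompounding.
// Tokens are rendered as "term/posInc"; posInc 0 means the token occupies
// the same position as the previous one (stacked, like a synonym).
public class KeepOriginalSketch {
    static List<String> decompound(String compound, List<String> parts) {
        List<String> out = new ArrayList<>();
        if (parts.isEmpty()) {            // OOV / failed decompounding: fall back to the compound
            out.add(compound + "/1");
            return out;
        }
        out.add(compound + "/1");         // preserve the original compound
        for (int i = 0; i < parts.size(); i++) {
            out.add(parts.get(i) + "/" + (i == 0 ? 0 : 1)); // first part stacks on it
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical German-style example; a real dictionary drives the split.
        System.out.println(decompound("fussballspiel", Arrays.asList("fussball", "spiel")));
        System.out.println(decompound("unknownword", Arrays.<String>asList()));
        // prints [fussballspiel/1, fussball/0, spiel/1] and [unknownword/1]
    }
}
```

The OOV fallback branch is the error handling the comment worries about: even when the model gets the split wrong or produces nothing, the original surface form survives.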

> Nori, a Korean analyzer based on mecab-ko-dic
> -
>
> Key: LUCENE-8231
> URL: https://issues.apache.org/jira/browse/LUCENE-8231
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Jim Ferenczi
>Priority: Major
> Attachments: LUCENE-8231-remap-hangul.patch, LUCENE-8231.patch, 
> LUCENE-8231.patch, LUCENE-8231.patch, LUCENE-8231.patch
>
>
> There is a dictionary similar to IPADIC but for Korean called mecab-ko-dic:
> It is available under an Apache license here:
> https://bitbucket.org/eunjeon/mecab-ko-dic
> This dictionary was built with MeCab, it defines a format for the features 
> adapted for the Korean language.
> Since the Kuromoji tokenizer uses the same format for the morphological 
> analysis (left cost + right cost + word cost) I tried to adapt the module to 
> handle Korean with the mecab-ko-dic. I've started with a POC that copies the 
> Kuromoji module and adapts it for the mecab-ko-dic.
> I used the same classes to build and read the dictionary but I had to make 
> some modifications to handle the differences with the IPADIC and Japanese. 
> The resulting binary dictionary takes 28MB on disk, it's bigger than the 
> IPADIC but mainly because the source is bigger and there are a lot of
> compound and inflect terms that define a group of terms and the segmentation 
> that can be applied. 
> I attached the patch that contains this new Korean module called -godori- 
> nori. It is an adaptation of the Kuromoji module so currently
> the two modules don't share any code. I wanted to validate the approach first 
> and check the relevancy of the results. I don't speak Korean so I used the 
> relevancy
> tests that were added for another Korean tokenizer 
> (https://issues.apache.org/jira/browse/LUCENE-4956) and tested the output 
> against mecab-ko which is the official fork of mecab to use the mecab-ko-dic.
> I had to simplify the JapaneseTokenizer, my version removes the nBest output 
> and the decomposition of too long tokens. I also
> modified the handling of whitespaces since they are important in Korean. 
> Whitespaces that appear before a term are attached to that term and this
> information is used to compute a penalty based on the Part of Speech of the 
> token. The penalty cost is a feature added to mecab-ko to handle 
> morphemes that should not appear after a morpheme and is described in the 
> mecab-ko page:
> https://bitbucket.org/eunjeon/mecab-ko
> Ignoring whitespaces is also more in line with the official MeCab library, 
> which attaches the whitespace to the term that follows.
> I also added a decompounder filter that expands the compounds and inflects 
> defined in the dictionary, and a part-of-speech filter, similar to the 
> Japanese one, that removes the morphemes that are not useful for relevance 
> (suffix, prefix, interjection, ...). These filters don't play well with the 
> tokenizer if it can output multiple paths (nBest output, for instance), so for 
> simplicity I removed this ability and the Korean tokenizer only outputs the 
> best path.
> I compared the result with mecab-ko to confirm that the analyzer is working 
> and ran the relevancy test that is defined in HantecRel.java included
> in the patch (written by Robert for another Korean analyzer). Here are the 
> results:
> ||Analyzer||Index Time||Index Size||MAP(CLASSIC)||MAP(BM25)||MAP(GL2)||
> |Standard|35s|131MB|.007|.1044|.1053|
> |CJK|36s|164MB|.1418|.1924|.1916|
> |Korean|212s|90MB|.1628|.2094|.2078|
> I find the results very promising so I plan to continue to work on this 
> project. I started to extract the part of the code that could be shared with 
> the
> Kuromoji module but I wanted to share the status and this POC first to 
> confirm that this approach is viable. The advantages of using the same model 
> than
> the Japanese analyzer are multiple: we don't have a Korean analyzer at the 
> moment ;), the resulting dictionary is 

[JENKINS] Lucene-Solr-BadApples-Tests-7.x - Build # 30 - Unstable

2018-04-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-BadApples-Tests-7.x/30/

1 tests failed.
FAILED:  org.apache.solr.cloud.TestAuthenticationFramework.testBasics

Error Message:
Error from server at 
http://127.0.0.1:42998/solr/testcollection_shard1_replica_n2: Expected mime 
type application/octet-stream but got text/html. Error 404 
Can not find: /solr/testcollection_shard1_replica_n2/update. 
HTTP ERROR 404. Problem accessing 
/solr/testcollection_shard1_replica_n2/update. Reason: Can not find: 
/solr/testcollection_shard1_replica_n2/update. (Powered by Jetty 9.4.8.v20171121)

Stack Trace:
org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from 
server at http://127.0.0.1:42998/solr/testcollection_shard1_replica_n2: 
Expected mime type application/octet-stream but got text/html. 


Error 404 Can not find: 
/solr/testcollection_shard1_replica_n2/update

HTTP ERROR 404
Problem accessing /solr/testcollection_shard1_replica_n2/update. Reason:
Can not find: 
/solr/testcollection_shard1_replica_n2/update (Powered by Jetty 9.4.8.v20171121)




at 
__randomizedtesting.SeedInfo.seed([D6868CC750358B28:EB5E22EB68DBD558]:0)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:551)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1015)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:886)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:948)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:948)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:948)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:948)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:948)
at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:819)
at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
at 
org.apache.solr.client.solrj.request.UpdateRequest.commit(UpdateRequest.java:233)
at 
org.apache.solr.cloud.TestAuthenticationFramework.collectionCreateSearchDeleteTwice(TestAuthenticationFramework.java:127)
at 
org.apache.solr.cloud.TestAuthenticationFramework.testBasics(TestAuthenticationFramework.java:75)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:934)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:970)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:984)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:943)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:829)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:879)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:890)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 

[jira] [Commented] (SOLR-11033) Move out multi language field and fieldType to a separate example

2018-04-02 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423269#comment-16423269
 ] 

Hoss Man commented on SOLR-11033:
-

This feels like a push in the opposite direction from what was ultimately done 
in SOLR-10574: instead of making {{basic_configs}} more _basic_, we just 
completely unified it with {{data_driven_schema_configs}} and got the worst of 
both worlds.

I'm in favor of {{_default}} being as small as possible, as long as there is a 
{{kitchen_sink_configs}} that has every possible feature on the planet. 
Ideally that's what {{bin/solr -e cloud}} should use by default.

> Move out multi language field and fieldType to a separate example 
> --
>
> Key: SOLR-11033
> URL: https://issues.apache.org/jira/browse/SOLR-11033
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: examples
>Reporter: Varun Thacker
>Priority: Major
>
> The bulk of the schema file in the default configset is fieldType and 
> dynamic field definitions for different languages. Based on the discussion 
> on SOLR-10967, if we move them to a separate configset and keep the default 
> configset English-only, the size will be dramatically reduced and the 
> schema file will be much more readable.






[jira] [Resolved] (SOLR-12174) Refactor Streaming Expression function registration

2018-04-02 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein resolved SOLR-12174.
---
Resolution: Resolved

> Refactor Streaming Expression function registration
> ---
>
> Key: SOLR-12174
> URL: https://issues.apache.org/jira/browse/SOLR-12174
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12174.patch
>
>
> This ticket adds a specific class that registers all the Streaming Expression 
> functions with the StreamFactory. It also adds a test case that ensures that 
> the expected list of functions is registered, and enforces that any new 
> functions that are registered are added to the expected list of functions in 
> the test case. This ensures that functions cannot be deregistered by accident 
> in the future.
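The registration-plus-guard pattern the ticket describes can be sketched in a few lines of plain Java. This is an illustration of the pattern, not the actual Solr classes; the function names are only a small invented subset.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// One class registers every function with the factory, and a guard test
// compares the registered names against a hard-coded expected list, so a
// function that silently disappears (or is added without updating the list)
// fails the build.
public class StreamFunctionRegistry {
    private final Map<String, Class<?>> functions = new HashMap<>();

    void register(String name, Class<?> impl) { functions.put(name, impl); }
    Set<String> registeredNames() { return new TreeSet<>(functions.keySet()); }

    public static void main(String[] args) {
        StreamFunctionRegistry factory = new StreamFunctionRegistry();
        factory.register("search", Object.class);
        factory.register("facet", Object.class);
        factory.register("topic", Object.class);

        // The guard test: any newly registered function must also be added here.
        Set<String> expected = new TreeSet<>(Arrays.asList("facet", "search", "topic"));
        if (!factory.registeredNames().equals(expected)) {
            throw new AssertionError("registered functions drifted: " + factory.registeredNames());
        }
        System.out.println("registered=" + factory.registeredNames());
        // prints registered=[facet, search, topic]
    }
}
```

Comparing whole sets rather than checking presence one by one is what makes accidental deregistration visible, not just missing additions.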






[jira] [Updated] (SOLR-12172) Race condition in collection properties can cause invalid cache of properties

2018-04-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-12172:
-
Summary: Race condition in collection properties can cause invalid cache of 
properties  (was: CollectionPropsTest.testReadWriteCached failure)

> Race condition in collection properties can cause invalid cache of properties
> -
>
> Key: SOLR-12172
> URL: https://issues.apache.org/jira/browse/SOLR-12172
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Tomás Fernández Löbbe
>Assignee: Tomás Fernández Löbbe
>Priority: Minor
> Fix For: 7.4, master (8.0)
>
> Attachments: SOLR-12172.patch
>
>
> From: https://builds.apache.org/job/Lucene-Solr-BadApples-Tests-master/24
> {noformat}
> java.lang.AssertionError: Could not see value change after setting collection 
> property. Name: property2, current value: value2, expected value: newValue
>   at 
> __randomizedtesting.SeedInfo.seed([1BCE6473A2A5E68A:FD89A9BD30939A79]:0)
>   at org.junit.Assert.fail(Assert.java:93)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.waitForValue(CollectionPropsTest.java:146)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.testReadWriteCached(CollectionPropsTest.java:115){noformat}






[jira] [Resolved] (SOLR-12172) CollectionPropsTest.testReadWriteCached failure

2018-04-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe resolved SOLR-12172.
--
   Resolution: Fixed
Fix Version/s: master (8.0)
   7.4

> CollectionPropsTest.testReadWriteCached failure
> ---
>
> Key: SOLR-12172
> URL: https://issues.apache.org/jira/browse/SOLR-12172
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Tomás Fernández Löbbe
>Assignee: Tomás Fernández Löbbe
>Priority: Major
> Fix For: 7.4, master (8.0)
>
> Attachments: SOLR-12172.patch
>
>
> From: https://builds.apache.org/job/Lucene-Solr-BadApples-Tests-master/24
> {noformat}
> java.lang.AssertionError: Could not see value change after setting collection 
> property. Name: property2, current value: value2, expected value: newValue
>   at 
> __randomizedtesting.SeedInfo.seed([1BCE6473A2A5E68A:FD89A9BD30939A79]:0)
>   at org.junit.Assert.fail(Assert.java:93)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.waitForValue(CollectionPropsTest.java:146)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.testReadWriteCached(CollectionPropsTest.java:115){noformat}






[jira] [Updated] (SOLR-12172) CollectionPropsTest.testReadWriteCached failure

2018-04-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomás Fernández Löbbe updated SOLR-12172:
-
Priority: Minor  (was: Major)

> CollectionPropsTest.testReadWriteCached failure
> ---
>
> Key: SOLR-12172
> URL: https://issues.apache.org/jira/browse/SOLR-12172
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Tomás Fernández Löbbe
>Assignee: Tomás Fernández Löbbe
>Priority: Minor
> Fix For: 7.4, master (8.0)
>
> Attachments: SOLR-12172.patch
>
>
> From: https://builds.apache.org/job/Lucene-Solr-BadApples-Tests-master/24
> {noformat}
> java.lang.AssertionError: Could not see value change after setting collection 
> property. Name: property2, current value: value2, expected value: newValue
>   at 
> __randomizedtesting.SeedInfo.seed([1BCE6473A2A5E68A:FD89A9BD30939A79]:0)
>   at org.junit.Assert.fail(Assert.java:93)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.waitForValue(CollectionPropsTest.java:146)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.testReadWriteCached(CollectionPropsTest.java:115){noformat}






[jira] [Commented] (SOLR-12172) CollectionPropsTest.testReadWriteCached failure

2018-04-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423260#comment-16423260
 ] 

ASF subversion and git services commented on SOLR-12172:


Commit ca7d72a0700e7fe37cfc8b47c448cc2ff4b103e1 in lucene-solr's branch 
refs/heads/branch_7x from [~tomasflobbe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ca7d72a ]

SOLR-12172: Fixed race condition in collection properties
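The class of race being fixed here can be shown deterministically in a few lines of plain Java: a reader fetches properties, a concurrent update lands, and the reader then caches its now-stale copy over the fresh one. Tagging each copy with a version and refusing to cache older data rejects the stale write. This is a sketch of the general stale-cache pattern, not the actual CollectionProps code; the ZooKeeper machinery is elided and the names and versions are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Deterministic replay of a stale-cache interleaving and the version guard
// that defuses it.
public class StaleCacheSketch {
    static Map<String, String> cached;
    static int cachedVersion = -1;

    // Guarded cache write: only accept data newer than what we already hold.
    static void cacheIfNewer(Map<String, String> props, int version) {
        if (version > cachedVersion) {
            cached = props;
            cachedVersion = version;
        }
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();  // stands in for ZooKeeper
        store.put("property2", "value2");
        int storeVersion = 1;

        // Reader fetches version 1 but is "paused" before caching it...
        Map<String, String> readerCopy = new HashMap<>(store);
        int readerVersion = storeVersion;

        // ...meanwhile an update bumps the store to version 2 and gets cached.
        store.put("property2", "newValue");
        storeVersion = 2;
        cacheIfNewer(new HashMap<>(store), storeVersion);

        // The delayed reader resumes; its stale version-1 write is rejected.
        cacheIfNewer(readerCopy, readerVersion);
        System.out.println("property2=" + cached.get("property2"));
        // prints property2=newValue
    }
}
```

Without the version comparison, the last line would print the stale `value2`, which is exactly the symptom the failing test reported ("current value: value2, expected value: newValue").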


> CollectionPropsTest.testReadWriteCached failure
> ---
>
> Key: SOLR-12172
> URL: https://issues.apache.org/jira/browse/SOLR-12172
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Tomás Fernández Löbbe
>Assignee: Tomás Fernández Löbbe
>Priority: Major
> Fix For: 7.4, master (8.0)
>
> Attachments: SOLR-12172.patch
>
>
> From: https://builds.apache.org/job/Lucene-Solr-BadApples-Tests-master/24
> {noformat}
> java.lang.AssertionError: Could not see value change after setting collection 
> property. Name: property2, current value: value2, expected value: newValue
>   at 
> __randomizedtesting.SeedInfo.seed([1BCE6473A2A5E68A:FD89A9BD30939A79]:0)
>   at org.junit.Assert.fail(Assert.java:93)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.waitForValue(CollectionPropsTest.java:146)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.testReadWriteCached(CollectionPropsTest.java:115){noformat}






[jira] [Commented] (SOLR-12172) CollectionPropsTest.testReadWriteCached failure

2018-04-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423257#comment-16423257
 ] 

ASF subversion and git services commented on SOLR-12172:


Commit 2c1f110b6bf0053cfa50608a70454d9102744511 in lucene-solr's branch 
refs/heads/master from [~tomasflobbe]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=2c1f110 ]

SOLR-12172: Fixed race condition in collection properties


> CollectionPropsTest.testReadWriteCached failure
> ---
>
> Key: SOLR-12172
> URL: https://issues.apache.org/jira/browse/SOLR-12172
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: Tests
>Reporter: Tomás Fernández Löbbe
>Assignee: Tomás Fernández Löbbe
>Priority: Major
> Attachments: SOLR-12172.patch
>
>
> From: https://builds.apache.org/job/Lucene-Solr-BadApples-Tests-master/24
> {noformat}
> java.lang.AssertionError: Could not see value change after setting collection 
> property. Name: property2, current value: value2, expected value: newValue
>   at 
> __randomizedtesting.SeedInfo.seed([1BCE6473A2A5E68A:FD89A9BD30939A79]:0)
>   at org.junit.Assert.fail(Assert.java:93)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.waitForValue(CollectionPropsTest.java:146)
>   at 
> org.apache.solr.cloud.CollectionPropsTest.testReadWriteCached(CollectionPropsTest.java:115){noformat}






[jira] [Comment Edited] (SOLR-11033) Move out multi language field and fieldType to a separate example

2018-04-02 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423244#comment-16423244
 ] 

Varun Thacker edited comment on SOLR-11033 at 4/2/18 10:48 PM:
---

I'm not sure I'm sold on implicit field types. How would one configure the 
synonyms / stopwords files for the implicit field types? It just makes things 
less transparent, from what I am thinking right now. 

Could we remove all the language-specific field types and dynamic fields 
from the default schema? Then create a page under 
[http://lucene.apache.org/solr/guide/7_2/documents-fields-and-schema-design.html]
 called "Language FieldTypes" where we add curl commands for each of the current 
field types that are in the default schema, and in the managed-schema 
just leave a comment pointing to the ref-guide page.

 

 


was (Author: varunthacker):
I'm not sure I'm sold on implicit field types. How would one configure the 
synonyms / stopwords files for the implicit field types? It just makes things 
less transparent, from what I am thinking right now. 

The other idea: could we remove all the language-specific field types and 
dynamic fields from the default schema? Then create a page under 
[http://lucene.apache.org/solr/guide/7_2/documents-fields-and-schema-design.html]
 called "Language FieldTypes" where we add curl commands for each of the current 
field types that are in the default schema, and in the managed-schema 
just leave a comment pointing to the ref-guide page.

 

 

> Move out multi language field and fieldType to a separate example 
> --
>
> Key: SOLR-11033
> URL: https://issues.apache.org/jira/browse/SOLR-11033
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: examples
>Reporter: Varun Thacker
>Priority: Major
>
> The bulk of the schema file in the default configset is fieldType and 
> dynamic field definitions for different languages. Based on the discussion 
> on SOLR-10967, if we move them to a separate configset and keep the default 
> configset English-only, the size will be dramatically reduced and the 
> schema file will be much more readable.






[jira] [Commented] (SOLR-11033) Move out multi language field and fieldType to a separate example

2018-04-02 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423244#comment-16423244
 ] 

Varun Thacker commented on SOLR-11033:
--

I'm not sure I'm sold on implicit field types. How would one configure the 
synonyms / stopwords files for the implicit field types? It just makes things 
less transparent, from what I am thinking right now. 

The other idea: could we remove all the language-specific field types and 
dynamic fields from the default schema? Then create a page under 
[http://lucene.apache.org/solr/guide/7_2/documents-fields-and-schema-design.html]
 called "Language FieldTypes" where we add curl commands for each of the current 
field types that are in the default schema, and in the managed-schema 
just leave a comment pointing to the ref-guide page.

 

 







Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Joel Bernstein
Welcome Dat!

Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, Apr 2, 2018 at 6:42 PM, Tomas Fernandez Lobbe 
wrote:

> Welcome Đạt!
>
>
> On Apr 2, 2018, at 2:59 PM, Christian Moen  wrote:
>
> Congrats, Đạt!
>
> On Mon, Apr 2, 2018 at 11:48 PM Varun Thacker  wrote:
>
>> Congratulations and welcome Dat!
>>
>> On Mon, Apr 2, 2018 at 12:50 PM, Adrien Grand  wrote:
>>
>>> Fixing the subject of the email.
>>>
>>> Le lun. 2 avr. 2018 à 21:48, Adrien Grand  a écrit :
>>>
 I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
 invitation to join.

 Welcome Đạt!

>>>
>>
>


Re: Welcome to the PMC

2018-04-02 Thread Mark Miller
Welcome!
On Mon, Apr 2, 2018 at 3:49 PM Adrien Grand  wrote:

> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
> invitation to join.
>
> Welcome Đạt!
>
-- 
- Mark
about.me/markrmiller


[JENKINS] Lucene-Solr-NightlyTests-7.3 - Build # 17 - Still Unstable

2018-04-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-NightlyTests-7.3/17/

3 tests failed.
FAILED:  
junit.framework.TestSuite.org.apache.solr.cloud.TestLeaderElectionZkExpiry

Error Message:
ObjectTracker found 1 object(s) that were not released!!! [Overseer] 
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.solr.cloud.Overseer  at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
  at org.apache.solr.cloud.Overseer.start(Overseer.java:545)  at 
org.apache.solr.cloud.OverseerElectionContext.runLeaderProcess(ElectionContext.java:850)
  at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:170) 
 at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:135)  
at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:307)  at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:216)  at 
org.apache.solr.cloud.ZkController$1.command(ZkController.java:355)  at 
org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:167)
  at 
org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:57)
  at 
org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:141)
  at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531)  
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)  

Stack Trace:
java.lang.AssertionError: ObjectTracker found 1 object(s) that were not 
released!!! [Overseer]
org.apache.solr.common.util.ObjectReleaseTracker$ObjectTrackerException: 
org.apache.solr.cloud.Overseer
at 
org.apache.solr.common.util.ObjectReleaseTracker.track(ObjectReleaseTracker.java:42)
at org.apache.solr.cloud.Overseer.start(Overseer.java:545)
at 
org.apache.solr.cloud.OverseerElectionContext.runLeaderProcess(ElectionContext.java:850)
at 
org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:170)
at 
org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:135)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:307)
at 
org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:216)
at org.apache.solr.cloud.ZkController$1.command(ZkController.java:355)
at 
org.apache.solr.common.cloud.ConnectionManager$1.update(ConnectionManager.java:167)
at 
org.apache.solr.common.cloud.DefaultConnectionStrategy.reconnect(DefaultConnectionStrategy.java:57)
at 
org.apache.solr.common.cloud.ConnectionManager.process(ConnectionManager.java:141)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)


at __randomizedtesting.SeedInfo.seed([3D69AF76A3E5BD47]:0)
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.junit.Assert.assertNull(Assert.java:551)
at 
org.apache.solr.SolrTestCaseJ4.teardownTestCases(SolrTestCaseJ4.java:301)
at sun.reflect.GeneratedMethodAccessor55.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1737)
at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:897)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:57)
at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
at 

[jira] [Updated] (SOLR-11948) Move the lang-configurations from managed-schema to its own xml file

2018-04-02 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-11948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-11948:
-
Affects Version/s: (was: 7.2.1)
   (was: 6.6.2)

> Move the lang-configurations from managed-schema to its own xml file
> 
>
> Key: SOLR-11948
> URL: https://issues.apache.org/jira/browse/SOLR-11948
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Sachin Goyal
>Assignee: Varun Thacker
>Priority: Major
>
> Half of the current 
> [managed-schema|https://github.com/apache/lucene-solr/blob/master/solr/server/solr/configsets/_default/conf/managed-schema#L516-L927]
>  includes a lot of configuration that is unused by many people and is 
> somewhat painful to remove.
> This mostly concerns the files present in the *lang* folder - around 500 
> lines of the 1000-line file configure many different languages and other 
> material in the lang folder that is never used.
> It might be good to consider splitting the managed-schema file into two:
> # managed-schema: everything but the lang folder config
> # dependency-schema: the lang folder config and other things that relate to 
> other files. If dependency-schema is absent, Solr will just assume it is not 
> required.
> Benefits:
> # It becomes easy to get rid of the extra config and the ~100 files that do 
> not need to be stored in ZooKeeper.
> # The managed-schema file becomes easier to read.
> *Performance*: This should also relieve a lot of pressure on ZooKeeper, 
> because with all those unnecessary files gone, no replica will ever download 
> them.






[JENKINS] Lucene-Solr-7.3-Linux (64bit/jdk-9.0.4) - Build # 111 - Unstable!

2018-04-02 Thread Policeman Jenkins Server
Error processing tokens: Error while parsing action 
'Text/ZeroOrMore/FirstOf/Token/DelimitedToken/DelimitedToken_Action3' at input 
position (line 79, pos 4):
)"}
   ^

java.lang.OutOfMemoryError: Java heap space


Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Christian Moen
Congrats, Đạt!

On Mon, Apr 2, 2018 at 11:48 PM Varun Thacker  wrote:

> Congratulations and welcome Dat!
>
> On Mon, Apr 2, 2018 at 12:50 PM, Adrien Grand  wrote:
>
>> Fixing the subject of the email.
>>
>> Le lun. 2 avr. 2018 à 21:48, Adrien Grand  a écrit :
>>
>>> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
>>> invitation to join.
>>>
>>> Welcome Đạt!
>>>
>>
>


Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Dennis Gove
Welcome Dat!

On Mon, Apr 2, 2018 at 5:30 PM, Steve Rowe  wrote:

> Congrats and welcome Đạt!
>
> --
> Steve
> www.lucidworks.com
>
> > On Apr 2, 2018, at 3:50 PM, Adrien Grand  wrote:
> >
> > Fixing the subject of the email.
> >
> > Le lun. 2 avr. 2018 à 21:48, Adrien Grand  a écrit :
> > I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
> invitation to join.
> >
> > Welcome Đạt!
>
>
>
>


Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Steve Rowe
Congrats and welcome Đạt!

--
Steve
www.lucidworks.com

> On Apr 2, 2018, at 3:50 PM, Adrien Grand  wrote:
> 
> Fixing the subject of the email.
> 
> Le lun. 2 avr. 2018 à 21:48, Adrien Grand  a écrit :
> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's invitation 
> to join.
> 
> Welcome Đạt!





[jira] [Commented] (SOLR-12154) Disallow Log4j2 explicit usage via forbidden APIs

2018-04-02 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423163#comment-16423163
 ] 

Varun Thacker commented on SOLR-12154:
--

{quote}Does that also cover log4j 1.2? 
{quote}
Yeah it does

I updated the patch, removing/fixing the nocommits and making sure precommit 
passes with log4j2 usage marked as forbidden.

> Disallow Log4j2 explicit usage via forbidden APIs
> -
>
> Key: SOLR-12154
> URL: https://issues.apache.org/jira/browse/SOLR-12154
> Project: Solr
>  Issue Type: Sub-task
>Reporter: Varun Thacker
>Assignee: Varun Thacker
>Priority: Blocker
> Fix For: 7.4
>
> Attachments: SOLR-12154.patch, SOLR-12154.patch
>
>
> We need to add org.apache.logging.log4j.** to forbidden APIs.
> From [Tomás|https://reviews.apache.org/users/tflobbe/] on the ReviewBoard 
> discussion ( [https://reviews.apache.org/r/65888/] ):
> {quote} We *don't* do log4j calls in the code in general, we have that 
> explicitly forbidden in forbidden APIs today, and code that does something 
> with log4j has to suppress that. Developers must instead use slf4j APIs. I 
> don't believe that's changing now with log4j2, or does it?
> {quote}
> We need to address this before 7.4 to make sure we don't break anything by 
> using Log4j2 directly.
> After SOLR-7887 the following classes explicitly import the 
> org.apache.logging.log4j.** package, so let's validate its usage:
> - Log4j2Watcher
> - SolrLogLayout
> - StartupLoggingUtils
> - RequestLoggingTest
> - LoggingHandlerTest
> - SolrTestCaseJ4
> - TestLogLevelAnnotations
> - LogLevel
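> Marking the package as forbidden would then amount to a short 
> forbidden-apis signatures entry, roughly like the sketch below (the message 
> text and the exact signatures file it would live in are assumptions, not 
> taken from the patch):
> {code}
> @defaultMessage Use slf4j APIs instead of calling log4j2 directly
> org.apache.logging.log4j.**
> {code}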






[jira] [Updated] (SOLR-12154) Disallow Log4j2 explicit usage via forbidden APIs

2018-04-02 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-12154:
-
Attachment: SOLR-12154.patch







[jira] [Commented] (SOLR-12144) Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2

2018-04-02 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423159#comment-16423159
 ] 

Varun Thacker commented on SOLR-12144:
--

+1 LGTM

> Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2 
> --
>
> Key: SOLR-12144
> URL: https://issues.apache.org/jira/browse/SOLR-12144
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Varun Thacker
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: 7.4, master (8.0)
>
> Attachments: SOLR-12144.patch, SOLR-12144.patch, SOLR-12144.patch
>
>
> With log4j2, rotating the file on restart is as simple as adding a policy - 
> OnStartupTriggeringPolicy - so we can remove the Solr logic which does the 
> same and exposes it via SOLR_LOG_PRESTART_ROTATION.
>  
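
In log4j2 terms, startup rotation is just one more triggering policy on the rolling appender; a rough sketch (not Solr's actual shipped log4j2.xml - the layout, sizes, and property names here are illustrative):

```xml
<RollingFile name="MainLogFile"
             fileName="${sys:solr.log.dir}/solr.log"
             filePattern="${sys:solr.log.dir}/solr.log.%i">
  <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) %c{1.} %m%n"/>
  <Policies>
    <!-- Roll the log on every JVM start, replacing the old bin/solr rotation logic -->
    <OnStartupTriggeringPolicy/>
    <!-- Keep the usual size-based rollover as well -->
    <SizeBasedTriggeringPolicy size="32 MB"/>
  </Policies>
  <DefaultRolloverStrategy max="10"/>
</RollingFile>
```

With this in place the start script no longer needs to rename log files itself before launching the JVM.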






[jira] [Commented] (SOLR-12160) Document Time Routed Aliases separate from API

2018-04-02 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423148#comment-16423148
 ] 

David Smiley commented on SOLR-12160:
-

Here's an update.  [~gus_heck] could you take a look please?  I renamed 
"non-routed aliases" to "standard aliases" as I find a negative word in a 
definition clumsy.

> Document Time Routed Aliases separate from API
> --
>
> Key: SOLR-12160
> URL: https://issues.apache.org/jira/browse/SOLR-12160
> Project: Solr
>  Issue Type: Sub-task
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: David Smiley
>Assignee: David Smiley
>Priority: Major
> Attachments: SOLR-12160.patch, time-routed-aliases.adoc
>
>
> Time Routed Aliases ought to have some documentation that is apart from the 
> API details which are already documented (thanks to Gus for that part).






[jira] [Updated] (SOLR-12160) Document Time Routed Aliases separate from API

2018-04-02 Thread David Smiley (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated SOLR-12160:

Attachment: SOLR-12160.patch







[jira] [Commented] (SOLR-12144) Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2

2018-04-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423142#comment-16423142
 ] 

Jan Høydahl commented on SOLR-12144:


Yes, no need to change this now. Do the refguide changes for 
SOLR_LOG_PRESTART_ROTATION look ok?







[jira] [Commented] (SOLR-12144) Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2

2018-04-02 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423127#comment-16423127
 ] 

Varun Thacker commented on SOLR-12144:
--

{quote} Guess the reason for {{solr.N.log}} being default in log4j2 is for 
better file type recognition. 
{quote}
Makes sense, but I felt we should stick to the old naming pattern for now, 
since changing it would break apps that collect log files. Hence I committed 
the change as part of SOLR-7887 to keep the old logging pattern convention.







[JENKINS] Lucene-Solr-master-Linux (64bit/jdk-10) - Build # 21744 - Failure!

2018-04-02 Thread Policeman Jenkins Server
Build: https://jenkins.thetaphi.de/job/Lucene-Solr-master-Linux/21744/
Java: 64bit/jdk-10 -XX:+UseCompressedOops -XX:+UseG1GC

All tests passed

Build Log:
[...truncated 1851 lines...]
   [junit4] JVM J0: stdout was not empty, see: 
/home/jenkins/workspace/Lucene-Solr-master-Linux/lucene/build/core/test/temp/junit4-J0-20180402_193125_28115743811880006282392.sysout
   [junit4] >>> JVM J0 emitted unexpected output (verbatim) 
   [junit4] codec: FastCompressingStoredFields, pf: BloomFilter, dvf: Memory
   [junit4] <<< JVM J0: EOF 

[...truncated 13090 lines...]
   [junit4] JVM J1: stdout was not empty, see: 
/home/jenkins/workspace/Lucene-Solr-master-Linux/solr/build/solr-core/test/temp/junit4-J1-20180402_200037_80711997800065116148335.sysout
   [junit4] >>> JVM J1 emitted unexpected output (verbatim) 
   [junit4] java.lang.OutOfMemoryError: Java heap space
   [junit4] Dumping heap to 
/home/jenkins/workspace/Lucene-Solr-master-Linux/heapdumps/java_pid331.hprof ...
   [junit4] Heap dump file created [367023673 bytes in 0.766 secs]
   [junit4] <<< JVM J1: EOF 

[...truncated 9280 lines...]
BUILD FAILED
/home/jenkins/workspace/Lucene-Solr-master-Linux/build.xml:633: The following 
error occurred while executing this line:
/home/jenkins/workspace/Lucene-Solr-master-Linux/build.xml:585: Some of the 
tests produced a heap dump, but did not fail. Maybe a suppressed 
OutOfMemoryError? Dumps created:
* java_pid331.hprof

Total time: 77 minutes 42 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Setting 
ANT_1_8_2_HOME=/var/lib/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
[WARNINGS] Skipping publisher since build result is FAILURE
Recording test results
Setting 
ANT_1_8_2_HOME=/var/lib/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting 
ANT_1_8_2_HOME=/var/lib/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Setting 
ANT_1_8_2_HOME=/var/lib/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Setting 
ANT_1_8_2_HOME=/var/lib/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2
Setting 
ANT_1_8_2_HOME=/var/lib/jenkins/tools/hudson.tasks.Ant_AntInstallation/ANT_1.8.2


[jira] [Commented] (SOLR-12139) Support "eq" function for string fields

2018-04-02 Thread Andrey Kudryavtsev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423110#comment-16423110
 ] 

Andrey Kudryavtsev commented on SOLR-12139:
---

We have to sacrifice the fancy comparator then (and a bit of performance):

{code:java} 

  @Override
  public boolean compare(int doc, FunctionValues lhs, FunctionValues rhs) 
throws IOException {
Object objL = lhs.objectVal(doc);
Object objR = rhs.objectVal(doc);
if (isInteger(objL) && isInteger(objR)) {
  return Long.compare(lhs.longVal(doc), rhs.longVal(doc)) == 0;
} else if (isNumeric(objL) && isNumeric(objR)) {
  return Double.compare(lhs.doubleVal(doc), rhs.doubleVal(doc)) == 0;
} else {
  return Objects.equals(objL, objR);
}
  }

  private static boolean isInteger(Object obj) {
return obj instanceof Integer || obj instanceof Long;
  }

  private static boolean isNumeric(Object obj) {
return obj instanceof Number;
  }

{code}
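
To make the fall-back behavior concrete, here is a minimal standalone mirror of that comparison logic (the {{EqSketch}} class and {{eq}} helper are hypothetical names for illustration, not part of the patch): integral values compare with full 64-bit precision as longs, mixed numerics widen to double, and everything else - including strings - falls through to {{Objects.equals}}.

```java
import java.util.Objects;

// Hypothetical standalone sketch of the comparator's three-way dispatch.
public class EqSketch {

    static boolean eq(Object l, Object r) {
        if (isInteger(l) && isInteger(r)) {
            // Both Integer/Long: compare as longs, no precision loss
            return ((Number) l).longValue() == ((Number) r).longValue();
        } else if (isNumeric(l) && isNumeric(r)) {
            // Mixed numeric types (e.g. Integer vs Double): widen to double
            return Double.compare(((Number) l).doubleValue(),
                                  ((Number) r).doubleValue()) == 0;
        } else {
            // Strings, nulls, and anything else: plain object equality
            return Objects.equals(l, r);
        }
    }

    static boolean isInteger(Object o) {
        return o instanceof Integer || o instanceof Long;
    }

    static boolean isNumeric(Object o) {
        return o instanceof Number;
    }

    public static void main(String[] args) {
        System.out.println(eq(1, 1L));    // true: both integral
        System.out.println(eq(1, 1.0));   // true: widened to double
        System.out.println(eq("a", "a")); // true: Objects.equals
        System.out.println(eq("a", 1));   // false: type mismatch
    }
}
```

The cost relative to the original typed comparator is the per-document {{objectVal}} boxing, which is the performance sacrifice mentioned above.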

> Support "eq" function for string fields
> ---
>
> Key: SOLR-12139
> URL: https://issues.apache.org/jira/browse/SOLR-12139
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Andrey Kudryavtsev
>Assignee: David Smiley
>Priority: Minor
> Attachments: SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch, 
> SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch
>
>
> I just discovered that the {{eq}} user function works for numeric fields 
> only. For string types it results in a 
> {{java.lang.UnsupportedOperationException}}.
> What do you think about extending it to support at least some of the string 
> types as well?






Re:

2018-04-02 Thread Mark Miller
Whoops, missed the subject. I’ll try and fix the bitballoon issue and
enable full automation this week.

Mark
On Mon, Apr 2, 2018 at 4:29 PM BeastIt BeastIt 
wrote:

> BeastIt Unit Test Beasting Summary Report for Apache Solr Master
>
> BeastIt gives unit tests a chance to duke it out in a fair but difficult
> environment. Each test is beasted and then judged. See a link to the full
> reports below.
>
> Number of Tests: 1108
> Number Passed: 1073
> % Passed: 96.84%
>
> Ran 30 iterations, 5 at a time
>
> Markers (Marked tests have one of these markers)
>  @AwaitsFix: 1
>  @BadApple: 13
>  @Ignore: 3
>
>  *** Worst Test - Not Marked (It's your Moby Dick!)
>
>  - TestTriggerIntegration 33% screwy ''
>
>   If you catch that whale, next try
>- TestReplicationHandler 30% unreliable ''
>
>  *** New Test(s) Failing?! Sound the alarm, we may have a cowboy here.
>
>  - LeaderVoteWaitTimeoutTest 3% flakey
>  - ScheduledMaintenanceTriggerTest 23% unreliable
>
>  *** New Failures in Test(s)?! Everything looked so good!
>
>  - HdfsRecoveryZkTest 3% flakey was 0,0,0,0
>  - HdfsCollectionsAPIDistributedZkTest 3% flakey was 0,0,0,0
>  - HttpPartitionTest 3% flakey was 0,0,0,0
>  - TestRecovery 6% flakey was 0,0,0,0
>  - TestBulkSchemaAPI 3% flakey was 0,0,0,0
>  - SolrShardReporterTest 3% flakey was 0,0,0,0
>  - AutoScalingHandlerTest 3% flakey was 0,0,0,0
>
>  *** Slowest Test
>
>  - CdcrReplicationDistributedZkTest 1728.84 unreliable '@BadApple
> @Nightly  ' Can you speed me up?
>
>  *** Worst Test - Marked (How long must it suffer the mark?)
>
>  - TestLargeCluster 33% screwy '@BadApple '
>
>  *** Non Running Tests - Coverage Holes?!
>
>  - ChaosMonkeyShardSplitTest '@Ignore '
>  - TestRankQueryPlugin '@Ignore '
>  - TestMailEntityProcessor '@Ignore '
>
> Full Reports http://apache-solr.bitballoon.com
>
> (Report maintained by Mark Miller)
>
-- 
- Mark
about.me/markrmiller


Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Jan Høydahl
Welcome Đạt!

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> 2. apr. 2018 kl. 21:50 skrev Adrien Grand :
> 
> Fixing the subject of the email.
> 
> Le lun. 2 avr. 2018 à 21:48, Adrien Grand  > a écrit :
> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's invitation 
> to join.
> 
> Welcome Đạt!



[jira] [Commented] (SOLR-12144) Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2

2018-04-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423093#comment-16423093
 ] 

Jan Høydahl commented on SOLR-12144:


New patch with a change to the Ref-Guide:
{code:java}
-Java Garbage Collection logs are rotated by the JVM when size hits 20M, for a 
max of 9 generations. Old GC logs are moved to `SOLR_LOGS_DIR/archived`. These 
settings can only be changed by editing the start scripts.
+Java Garbage Collection logs are rotated by the JVM when size hits 20M, for a 
max of 9 generations.
 
-On every startup of Solr, the start script will clean up old logs and rotate 
the main `solr.log` file. If you changed the `` setting in `log4j2.xml`, you also need to change the corresponding 
setting `-rotate_solr_logs 10` in the start script.
-
-You can disable the automatic log rotation at startup by changing the setting 
`SOLR_LOG_PRESTART_ROTATION` found in `bin/solr.in.sh` or `bin/solr.in.cmd` to 
false.
+On every startup or restart of Solr, log4j2 performs log rotation. If you 
choose to use another log framework that does not support rotation on startup, 
you may enable `SOLR_LOG_PRESTART_ROTATION` in `solr.in.sh` or `solr.in.cmd` to 
let the start script rotate the logs on startup.
{code}







[jira] [Updated] (SOLR-12144) Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2

2018-04-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-12144:
---
Attachment: SOLR-12144.patch









[jira] [Commented] (LUCENE-8004) IndexUpgraderTool should rewrite segments rather than forceMerge

2018-04-02 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423075#comment-16423075
 ] 

Erick Erickson commented on LUCENE-8004:


I'm about to tear into 7976, and this may come along "for free" so I'll take 
it, at least temporarily so as not to lose track of it.

> IndexUpgraderTool should rewrite segments rather than forceMerge
> 
>
> Key: LUCENE-8004
> URL: https://issues.apache.org/jira/browse/LUCENE-8004
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> Spinoff from LUCENE-7976. We help users get themselves into a corner by using 
> forceMerge on an index to rewrite all segments in the current Lucene format. 
> We should rewrite each individual segment instead. This would also help with 
> upgrading X-2->X-1, then X-1->X.
> Of course the preferred method is to re-index from scratch.






[jira] [Assigned] (LUCENE-8004) IndexUpgraderTool should rewrite segments rather than forceMerge

2018-04-02 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned LUCENE-8004:
--

Assignee: Erick Erickson

> IndexUpgraderTool should rewrite segments rather than forceMerge
> 
>
> Key: LUCENE-8004
> URL: https://issues.apache.org/jira/browse/LUCENE-8004
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
>
> Spinoff from LUCENE-7976. We help users get themselves into a corner by using 
> forceMerge on an index to rewrite all segments in the current Lucene format. 
> We should rewrite each individual segment instead. This would also help with 
> upgrading X-2->X-1, then X-1->X.
> Of course the preferred method is to re-index from scratch.






[jira] [Assigned] (LUCENE-7976) Add a parameter to TieredMergePolicy to merge segments that have more than X percent deleted documents

2018-04-02 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned LUCENE-7976:
--

Assignee: Erick Erickson

> Add a parameter to TieredMergePolicy to merge segments that have more than X 
> percent deleted documents
> --
>
> Key: LUCENE-7976
> URL: https://issues.apache.org/jira/browse/LUCENE-7976
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Major
> Attachments: LUCENE-7976.patch
>
>
> We're seeing situations "in the wild" where there are very large indexes (on 
> disk) handled quite easily in a single Lucene index. This is particularly 
> true as features like docValues move data into MMapDirectory space. The 
> current TMP algorithm allows on the order of 50% deleted documents as per a 
> dev list conversation with Mike McCandless (and his blog here:  
> https://www.elastic.co/blog/lucenes-handling-of-deleted-documents).
> Especially in the current era of very large indexes in aggregate, (think many 
> TB) solutions like "you need to distribute your collection over more shards" 
> become very costly. Additionally, the tempting "optimize" button exacerbates 
> the issue since once you form, say, a 100G segment (by 
> optimizing/forceMerging) it is not eligible for merging until 97.5G of the 
> docs in it are deleted (current default 5G max segment size).
> The proposal here would be to add a new parameter to TMP, something like 
>  (no, that's not a serious name; suggestions 
> welcome) which would default to 100 (or the same behavior we have now).
> So if I set this parameter to, say, 20%, and the max segment size stays at 
> 5G, the following would happen when segments were selected for merging:
> > any segment with > 20% deleted documents would be merged or rewritten NO 
> > MATTER HOW LARGE. There are two cases,
> >> the segment has < 5G "live" docs. In that case it would be merged with 
> >> smaller segments to bring the resulting segment up to 5G. If no smaller 
> >> segments exist, it would just be rewritten
> >> The segment has > 5G "live" docs (the result of a forceMerge or optimize). 
> >> It would be rewritten into a single segment removing all deleted docs no 
> >> matter how big it is to start. The 100G example above would be rewritten 
> >> to an 80G segment for instance.
> Of course this would lead to potentially much more I/O which is why the 
> default would be the same behavior we see now. As it stands now, though, 
> there's no way to recover from an optimize/forceMerge except to re-index from 
> scratch. We routinely see 200G-300G Lucene indexes at this point "in the 
> wild" with 10s of  shards replicated 3 or more times. And that doesn't even 
> include having these over HDFS.
> Alternatives welcome! Something like the above seems minimally invasive. A 
> new merge policy is certainly an alternative.
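The eligibility rule proposed above can be sketched in a few lines. This is a hedged illustration with hypothetical names, not the actual TieredMergePolicy code: a segment whose deleted-doc percentage exceeds the threshold becomes merge-eligible no matter how large it is, and a threshold of 100 reproduces today's behavior.

```java
// Sketch of the proposed knob (names hypothetical, not the real TMP API).
public class DeletesPctSketch {

    /**
     * Returns true if the segment should be merged/rewritten regardless of
     * size. maxDeletedPct = 100 means no segment is ever forced eligible,
     * matching current behavior.
     */
    static boolean mergeEligible(int maxDoc, int delCount, double maxDeletedPct) {
        double deletedPct = 100.0 * delCount / maxDoc;
        return deletedPct > maxDeletedPct;
    }
}
```

With a 20% threshold, the 100G/80%-deleted segment from the example above would qualify for a rewrite; with the default of 100 it never would.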






Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Erick Erickson
Congrats and welcome!

On Mon, Apr 2, 2018 at 1:03 PM, Dawid Weiss  wrote:
> Congratulations and welcome, Đạt!
>
> Dawid
>
> On Mon, Apr 2, 2018 at 10:01 PM, Karl Wright  wrote:
>> Welcome!
>> Karl
>>
>> On Mon, Apr 2, 2018 at 3:56 PM, Anshum Gupta  wrote:
>>>
>>> Welcome Đạt!
>>>
>>>  Anshum
>>>
>>>
>>>
>>>
>>> On Apr 2, 2018, at 12:50 PM, Adrien Grand  wrote:
>>>
>>> Fixing the subject of the email.
>>>
>>> Le lun. 2 avr. 2018 à 21:48, Adrien Grand  a écrit :

 I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
 invitation to join.

 Welcome Đạt!
>>>
>>>
>>
>




Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Dawid Weiss
Congratulations and welcome, Đạt!

Dawid

On Mon, Apr 2, 2018 at 10:01 PM, Karl Wright  wrote:
> Welcome!
> Karl
>
> On Mon, Apr 2, 2018 at 3:56 PM, Anshum Gupta  wrote:
>>
>> Welcome Đạt!
>>
>>  Anshum
>>
>>
>>
>>
>> On Apr 2, 2018, at 12:50 PM, Adrien Grand  wrote:
>>
>> Fixing the subject of the email.
>>
>> Le lun. 2 avr. 2018 à 21:48, Adrien Grand  a écrit :
>>>
>>> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
>>> invitation to join.
>>>
>>> Welcome Đạt!
>>
>>
>




Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Karl Wright
Welcome!
Karl

On Mon, Apr 2, 2018 at 3:56 PM, Anshum Gupta  wrote:

> Welcome Đạt!
>
> Anshum
>
>
>
>
> On Apr 2, 2018, at 12:50 PM, Adrien Grand  wrote:
>
> Fixing the subject of the email.
>
> Le lun. 2 avr. 2018 à 21:48, Adrien Grand  a écrit :
>
>> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
>> invitation to join.
>>
>> Welcome Đạt!
>>
>
>


[jira] [Commented] (SOLR-12144) Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2

2018-04-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423052#comment-16423052
 ] 

Jan Høydahl commented on SOLR-12144:


Ok, I was not suggesting to change the actual log pattern, but to adjust the 
{{rotate_solr_logs}} code to work with the new pattern. I guess the reason for 
{{solr.N.log}} being the default in log4j2 is better file type recognition. 
Anyway, if log file naming is the same as before we don't need a change in this 
patch.

I'll commit soon.

> Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2 
> --
>
> Key: SOLR-12144
> URL: https://issues.apache.org/jira/browse/SOLR-12144
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Varun Thacker
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: 7.4, master (8.0)
>
> Attachments: SOLR-12144.patch, SOLR-12144.patch
>
>
> With log4j2 rotating the file on restart is as simple as adding a policy - 
> OnStartupTriggeringPolicy
> So we can remove Solr logic which does the same and exposes it via 
> SOLR_LOG_PRESTART_ROTATION .
>  
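For reference, a minimal log4j2 appender fragment showing the OnStartupTriggeringPolicy described above. This is an illustrative sketch, not Solr's shipped log4j2.xml; file names, sizes, and the rollover max are example values:

```xml
<Appenders>
  <RollingFile name="RollingFile"
               fileName="${sys:solr.log.dir}/solr.log"
               filePattern="${sys:solr.log.dir}/solr.log.%i">
    <Policies>
      <!-- Rotate the previous run's log as soon as the JVM starts -->
      <OnStartupTriggeringPolicy/>
      <!-- Plus ordinary size-based rotation while running -->
      <SizeBasedTriggeringPolicy size="32 MB"/>
    </Policies>
    <DefaultRolloverStrategy max="10"/>
  </RollingFile>
</Appenders>
```

With the startup policy in place, the shell-side pre-start rotation logic becomes redundant.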






Re: Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Anshum Gupta
Welcome Đạt!

 Anshum




> On Apr 2, 2018, at 12:50 PM, Adrien Grand  wrote:
> 
> Fixing the subject of the email.
> 
> Le lun. 2 avr. 2018 à 21:48, Adrien Grand  > a écrit :
> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's invitation 
> to join.
> 
> Welcome Đạt!





Welcome Cao Mạnh Đạt to the PMC

2018-04-02 Thread Adrien Grand
Fixing the subject of the email.

Le lun. 2 avr. 2018 à 21:48, Adrien Grand  a écrit :

> I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
> invitation to join.
>
> Welcome Đạt!
>


Welcome to the PMC

2018-04-02 Thread Adrien Grand
I am pleased to announce that Cao Mạnh Đạt has accepted the PMC's
invitation to join.

Welcome Đạt!


[jira] [Resolved] (LUCENE-7580) Spans tree scoring

2018-04-02 Thread Paul Elschot (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot resolved LUCENE-7580.
--
Resolution: Won't Fix

Resolved: not enough interest. I'll keep the github branches available for now.

> Spans tree scoring
> --
>
> Key: LUCENE-7580
> URL: https://issues.apache.org/jira/browse/LUCENE-7580
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 7.0
>Reporter: Paul Elschot
>Priority: Minor
> Attachments: Elschot20170326Counting.pdf, LUCENE-7580.patch, 
> LUCENE-7580.patch, LUCENE-7580.patch, LUCENE-7580.patch
>
>
> Recurse the spans tree to compose a score based on the type of subqueries and 
> what matched






[jira] [Resolved] (LUCENE-7613) Update Surround query language

2018-04-02 Thread Paul Elschot (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot resolved LUCENE-7613.
--
Resolution: Won't Fix

Resolved: not enough interest.

> Update Surround query language
> --
>
> Key: LUCENE-7613
> URL: https://issues.apache.org/jira/browse/LUCENE-7613
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/queryparser
>Reporter: Paul Elschot
>Priority: Minor
> Attachments: LUCENE-7613-spanstree.patch, LUCENE-7613.patch, 
> LUCENE-7613.patch
>
>







[jira] [Resolved] (LUCENE-7615) SpanSynonymQuery

2018-04-02 Thread Paul Elschot (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-7615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Elschot resolved LUCENE-7615.
--
Resolution: Won't Fix

Resolved: not enough interest.

> SpanSynonymQuery
> 
>
> Key: LUCENE-7615
> URL: https://issues.apache.org/jira/browse/LUCENE-7615
> Project: Lucene - Core
>  Issue Type: Improvement
>  Components: core/search
>Affects Versions: 7.0
>Reporter: Paul Elschot
>Priority: Minor
> Attachments: LUCENE-7615.patch, LUCENE-7615.patch
>
>
> A SpanQuery that tries to score as SynonymQuery.






[jira] [Updated] (SOLR-12175) Add random field type and dynamic field to the default managed-schema

2018-04-02 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-12175:
--
Summary: Add random field type and dynamic field to the default 
managed-schema  (was: Add random field field type and dynamic field to the 
default managed-schema)

> Add random field type and dynamic field to the default managed-schema
> -
>
> Key: SOLR-12175
> URL: https://issues.apache.org/jira/browse/SOLR-12175
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
>
> Currently the default managed-schema file doesn't have the random field 
> configured. Both the techproducts and example managed-schema files have it 
> configured. This ticket will add the random dynamic field and field type to 
> the default managed-schema so this functionality is available out of the box 
> when using the default schema.






[jira] [Created] (SOLR-12175) Add random field field type and dynamic field to the default managed-schema

2018-04-02 Thread Joel Bernstein (JIRA)
Joel Bernstein created SOLR-12175:
-

 Summary: Add random field field type and dynamic field to the 
default managed-schema
 Key: SOLR-12175
 URL: https://issues.apache.org/jira/browse/SOLR-12175
 Project: Solr
  Issue Type: Improvement
  Security Level: Public (Default Security Level. Issues are Public)
Reporter: Joel Bernstein


Currently the default managed-schema file doesn't have the random field 
configured. Both the techproducts and example managed-schema files have it 
configured. This ticket will add the random dynamic field and field type to the 
default managed-schema so this functionality is available out of the box when 
using the default schema.
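For reference, the usual form of that wiring as it appears in the techproducts/example schemas (verify the exact attributes against the shipped files):

```xml
<dynamicField name="random_*" type="random"/>
<fieldType name="random" class="solr.RandomSortField" indexed="true"/>
```

Any field matching random_* (e.g. sorting on random_1234) then yields a stable pseudo-random ordering seeded by the field name.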






[jira] [Updated] (SOLR-9399) Delete requests do not send credentials & fails for Basic Authentication

2018-04-02 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/SOLR-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jan Høydahl updated SOLR-9399:
--
Fix Version/s: master (8.0)

> Delete requests do not send credentials & fails for Basic Authentication
> 
>
> Key: SOLR-9399
> URL: https://issues.apache.org/jira/browse/SOLR-9399
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0, 6.0.1
>Reporter: Susheel Kumar
>Assignee: Erick Erickson
>Priority: Major
>  Labels: security
> Fix For: 7.4, master (8.0)
>
>
> The getRoutes(..) method of UpdateRequest does not pass credentials to 
> LBHttpSolrClient when deleteById is set, while for updates it passes the 
> credentials. See the code snippet below:
> if (deleteById != null) {
>   Iterator<Map.Entry<String,Map<String,Object>>> entries = deleteById.entrySet()
>       .iterator();
>   while (entries.hasNext()) {
>     Map.Entry<String,Map<String,Object>> entry = entries.next();
>     String deleteId = entry.getKey();
>     Map<String,Object> map = entry.getValue();
>     Long version = null;
>     if (map != null) {
>       version = (Long) map.get(VER);
>     }
>     Slice slice = router.getTargetSlice(deleteId, null, null, null, col);
>     if (slice == null) {
>       return null;
>     }
>     List<String> urls = urlMap.get(slice.getName());
>     if (urls == null) {
>       return null;
>     }
>     String leaderUrl = urls.get(0);
>     LBHttpSolrClient.Req request = routes.get(leaderUrl);
>     if (request != null) {
>       UpdateRequest urequest = (UpdateRequest) request.getRequest();
>       urequest.deleteById(deleteId, version);
>     } else {
>       UpdateRequest urequest = new UpdateRequest();
>       urequest.setParams(params);
>       urequest.deleteById(deleteId, version);
>       urequest.setCommitWithin(getCommitWithin());
>       request = new LBHttpSolrClient.Req(urequest, urls);
>       routes.put(leaderUrl, request);
>     }
>   }
> }






Re: [JENKINS-MAVEN] Lucene-Solr-Maven-7.x #171: POMs out of sync

2018-04-02 Thread Michael Braun
This is the cause in the Jenkins log:

   [mvn] [ERROR] COMPILATION ERROR :
  [mvn] [INFO] -
  [mvn] [ERROR] cannot access org.apache.yetus.audience.InterfaceAudience
  [mvn]   class file for
org.apache.yetus.audience.InterfaceAudience not found



On Mon, Apr 2, 2018 at 2:45 PM, Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> Build: https://builds.apache.org/job/Lucene-Solr-Maven-7.x/171/
>
> No tests ran.
>
> Build Log:
> [...truncated 31624 lines...]
>   [mvn] [INFO] --
> ---
>   [mvn] [INFO] --
> ---
>   [mvn] [ERROR] COMPILATION ERROR :
>   [mvn] [INFO] --
> ---
>
> [...truncated 204 lines...]
> BUILD FAILED
> /x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-7.x/build.xml:679:
> The following error occurred while executing this line:
> : Java returned: 1
>
> Total time: 13 minutes 14 seconds
> Build step 'Invoke Ant' marked build as failure
> Email was triggered for: Failure - Any
> Sending email for trigger: Failure - Any
>
>


[jira] [Commented] (SOLR-12139) Support "eq" function for string fields

2018-04-02 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422982#comment-16422982
 ] 

David Smiley commented on SOLR-12139:
-

bq. Ok, I see. Any concerns about checking objectVal(...) to get the correct 
"type" of valueSource?

Concerns indeed! Functions like def() or if() (and others) may have a varying 
response to objectVal(). And if it doesn't "exist", then it'll be null.

> Support "eq" function for string fields
> ---
>
> Key: SOLR-12139
> URL: https://issues.apache.org/jira/browse/SOLR-12139
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Andrey Kudryavtsev
>Assignee: David Smiley
>Priority: Minor
> Attachments: SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch, 
> SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch
>
>
> I just discovered that the {{eq}} user function works for numeric fields only.
> For string types it results in {{java.lang.UnsupportedOperationException}}.
> What do you think about extending it to support at least some string 
> types as well?






[JENKINS-MAVEN] Lucene-Solr-Maven-7.x #171: POMs out of sync

2018-04-02 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/Lucene-Solr-Maven-7.x/171/

No tests ran.

Build Log:
[...truncated 31624 lines...]
  [mvn] [INFO] -
  [mvn] [INFO] -
  [mvn] [ERROR] COMPILATION ERROR : 
  [mvn] [INFO] -

[...truncated 204 lines...]
BUILD FAILED
/x1/jenkins/jenkins-slave/workspace/Lucene-Solr-Maven-7.x/build.xml:679: The 
following error occurred while executing this line:
: Java returned: 1

Total time: 13 minutes 14 seconds
Build step 'Invoke Ant' marked build as failure
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any


Re: [JENKINS] Lucene-Solr-7.x-MacOSX (64bit/jdk-9) - Build # 551 - Unstable!

2018-04-02 Thread Dawid Weiss
I see this in TestReplicationHandler:

  /**
   * character copy of file using UTF-8. If port is non-null, will be
substituted any time "TEST_PORT" is found.
   */
  private static void copyFile(File src, File dst, Integer port,
boolean internalCompression) throws IOException {
BufferedReader in = new BufferedReader(new InputStreamReader(new
FileInputStream(src), StandardCharsets.UTF_8));
Writer out = new OutputStreamWriter(new FileOutputStream(dst),
StandardCharsets.UTF_8);

for (String line = in.readLine(); null != line; line = in.readLine()) {

  if (null != port)
line = line.replace("TEST_PORT", port.toString());

So it seems port is allowed to be null and then won't be substituted.
This looks like a bug in the test scaffolding: this situation
shouldn't be allowed; if a port cannot be acquired the test should
fail much sooner?
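A minimal sketch of the fail-fast variant suggested above: refuse a null port up front and close both streams via try-with-resources. This is a hypothetical rewrite for illustration, not the committed test code, and it assumes the substitution is only requested when a port was actually acquired.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Objects;

final class CopyFileUtil {
    /** Character copy of src to dst (UTF-8), replacing "TEST_PORT" with port. */
    static void copyFile(File src, File dst, Integer port) throws IOException {
        // Fail fast instead of silently skipping the substitution.
        Objects.requireNonNull(port, "TEST_PORT substitution requested but no port was acquired");
        try (BufferedReader in = Files.newBufferedReader(src.toPath(), StandardCharsets.UTF_8);
             Writer out = Files.newBufferedWriter(dst.toPath(), StandardCharsets.UTF_8)) {
            for (String line = in.readLine(); line != null; line = in.readLine()) {
                out.write(line.replace("TEST_PORT", port.toString()));
                out.write('\n');
            }
        }
    }
}
```

If the test config legitimately needs a copy without substitution, a separate two-argument overload would make that intent explicit rather than overloading null.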

Dawid




[jira] [Commented] (SOLR-12163) Ref Guide: Improve Setting Up an External ZK Ensemble page

2018-04-02 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422912#comment-16422912
 ] 

Varun Thacker commented on SOLR-12163:
--

{quote}I can't find any reference to a {{zookeeper-env.sh}} in ZK docs - is 
there anything you didn't mention that users should do during setup to be sure 
this file is read?
{quote}
The start script for ZooKeeper actually looks for a 
zookeeper_home/conf/zookeeper-env.sh file and picks up variables from there. 
But ZK doesn't ship with an empty zookeeper-env.sh file in the conf directory, 
so one must create the file and then add those env variables. Here is where 
the variables are loaded: 
[https://github.com/apache/zookeeper/blob/master/bin/zkEnv.sh#L53]

I don't see the same mechanics in the windows script: 
[https://github.com/apache/zookeeper/blob/master/bin/zkEnv.cmd], so maybe we 
just tell Windows users to add those variables to bin/zkCli.cmd?
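A minimal sketch of such a hand-created conf/zookeeper-env.sh. The values are examples only; the variable names are ones the ZooKeeper shell scripts commonly honor, but verify them against the zkEnv.sh/zkServer.sh of the ZK version in use:

```shell
# conf/zookeeper-env.sh -- sourced by zkEnv.sh if present.
# ZK does not ship this file; create it by hand.
ZOO_LOG_DIR=/var/log/zookeeper        # where zookeeper.out and rolled logs go
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"     # log level + appender selection
SERVER_JVMFLAGS="-Xms512m -Xmx1g"     # heap for the server JVM
```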

> Ref Guide: Improve Setting Up an External ZK Ensemble page
> --
>
> Key: SOLR-12163
> URL: https://issues.apache.org/jira/browse/SOLR-12163
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.4
>
> Attachments: setting-up-an-external-zookeeper-ensemble.adoc
>
>
> I had to set up a ZK ensemble the other day for the first time in a while, 
> and thought I'd test our docs on the subject while I was at it. I headed over 
> to 
> https://lucene.apache.org/solr/guide/setting-up-an-external-zookeeper-ensemble.html,
>  and...Well, I still haven't gotten back to what I was trying to do, but I 
> rewrote the entire page.
> The problem to me is that the page today is mostly a stripped down copy of 
> the ZK Getting Started docs: walking through setting up a single ZK instance 
> before introducing the idea of an ensemble and going back through the same 
> configs again to update them for the ensemble.
> IOW, despite the page being titled "setting up an ensemble", it's mostly 
> about not setting up an ensemble. That's at the end of the page, which itself 
> focuses a bit heavily on the use case of running an ensemble on a single 
> server (so, if you're counting...that's 3 use cases we don't want people to 
> use discussed in detail on a page that's supposedly about _not_ doing any of 
> those things).
> So, I took all of it and restructured the whole thing to focus primarily on 
> the use case we want people to use: running 3 ZK nodes on different machines. 
> Running 3 on one machine is still there, but noted in passing with the 
> appropriate caveats. I've also added information about choosing to use a 
> chroot, which AFAICT was only covered in the section on Taking Solr to 
> Production.






[jira] [Updated] (SOLR-12139) Support "eq" function for string fields

2018-04-02 Thread Andrey Kudryavtsev (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Kudryavtsev updated SOLR-12139:
--
Attachment: SOLR-12139.patch

> Support "eq" function for string fields
> ---
>
> Key: SOLR-12139
> URL: https://issues.apache.org/jira/browse/SOLR-12139
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Andrey Kudryavtsev
>Assignee: David Smiley
>Priority: Minor
> Attachments: SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch, 
> SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch
>
>
> I just discovered that the {{eq}} user function works for numeric fields only.
> For string types it results in {{java.lang.UnsupportedOperationException}}.
> What do you think about extending it to support at least some string 
> types as well?






[jira] [Updated] (SOLR-12139) Support "eq" function for string fields

2018-04-02 Thread Andrey Kudryavtsev (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Kudryavtsev updated SOLR-12139:
--
Attachment: (was: SOLR-12139.patch)

> Support "eq" function for string fields
> ---
>
> Key: SOLR-12139
> URL: https://issues.apache.org/jira/browse/SOLR-12139
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Andrey Kudryavtsev
>Assignee: David Smiley
>Priority: Minor
> Attachments: SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch, 
> SOLR-12139.patch, SOLR-12139.patch
>
>
> I just discovered that the {{eq}} user function works for numeric fields only.
> For string types it results in {{java.lang.UnsupportedOperationException}}.
> What do you think about extending it to support at least some string 
> types as well?






[jira] [Updated] (SOLR-12139) Support "eq" function for string fields

2018-04-02 Thread Andrey Kudryavtsev (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Kudryavtsev updated SOLR-12139:
--
Attachment: (was: SOLR-12139.patch)

> Support "eq" function for string fields
> ---
>
> Key: SOLR-12139
> URL: https://issues.apache.org/jira/browse/SOLR-12139
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Andrey Kudryavtsev
>Assignee: David Smiley
>Priority: Minor
> Attachments: SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch, 
> SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch
>
>
> I just discovered that the {{eq}} user function works for numeric fields only.
> For string types it results in {{java.lang.UnsupportedOperationException}}.
> What do you think about extending it to support at least some string 
> types as well?






[jira] [Updated] (SOLR-12139) Support "eq" function for string fields

2018-04-02 Thread Andrey Kudryavtsev (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Kudryavtsev updated SOLR-12139:
--
Attachment: SOLR-12139.patch

> Support "eq" function for string fields
> ---
>
> Key: SOLR-12139
> URL: https://issues.apache.org/jira/browse/SOLR-12139
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Andrey Kudryavtsev
>Assignee: David Smiley
>Priority: Minor
> Attachments: SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch, 
> SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch
>
>
> I just discovered that the {{eq}} user function works for numeric fields only.
> For string types it results in {{java.lang.UnsupportedOperationException}}.
> What do you think about extending it to support at least some string 
> types as well?






[jira] [Commented] (LUCENE-8229) Add a method to Weight to retrieve matches for a single document

2018-04-02 Thread Alan Woodward (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422822#comment-16422822
 ] 

Alan Woodward commented on LUCENE-8229:
---

I've pushed a few more changes - IndexOrDocValuesQuery should use the dvWeight 
to check if it matches, I've added a term() method so that the iterator can 
report which term it's currently positioned on, and I've removed the iteration 
for SpanQueries.  I want to think more about how we iterate over composite 
queries like Span or phrase (or interval, soon), as I can see situations where 
we'd want to iterate over the whole thing and situations where we'd want to 
iterate over the sub parts as well, and I'd like to leave that to a follow-up issue.

> Add a method to Weight to retrieve matches for a single document
> 
>
> Key: LUCENE-8229
> URL: https://issues.apache.org/jira/browse/LUCENE-8229
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Alan Woodward
>Assignee: Alan Woodward
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The ability to find out exactly what a query has matched on is a fairly 
> frequent feature request, and would also make highlighters much easier to 
> implement.  There have been a few attempts at doing this, including adding 
> positions to Scorers, or re-writing queries as Spans, but these all either 
> compromise general performance or involve up-front knowledge of all queries.
> Instead, I propose adding a method to Weight that exposes an iterator over 
> matches in a particular document and field.  It should be used in a similar 
> manner to explain() - ie, just for TopDocs, not as part of the scoring loop, 
> which relieves some of the pressure on performance.






[jira] [Commented] (SOLR-12133) TriggerIntegrationTest fails too easily.

2018-04-02 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422812#comment-16422812
 ] 

Joel Bernstein commented on SOLR-12133:
---

No problem!

> TriggerIntegrationTest fails too easily.
> 
>
> Key: SOLR-12133
> URL: https://issues.apache.org/jira/browse/SOLR-12133
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-12133-testNodeMarkersRegistration.patch, 
> SOLR-12133.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12139) Support "eq" function for string fields

2018-04-02 Thread Andrey Kudryavtsev (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422811#comment-16422811
 ] 

Andrey Kudryavtsev commented on SOLR-12139:
---

Ok, I see. Any concerns about checking {{objectVal(...)}} to get the correct 
"type" of the {{valueSource}}?

> Support "eq" function for string fields
> ---
>
> Key: SOLR-12139
> URL: https://issues.apache.org/jira/browse/SOLR-12139
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Andrey Kudryavtsev
>Assignee: David Smiley
>Priority: Minor
> Attachments: SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch, 
> SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch
>
>
> I just discovered that the {{eq}} user function works for numeric fields only.
> For string types it results in a {{java.lang.UnsupportedOperationException}}.
> What do you think about extending it to support at least some of the string 
> types as well?
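The idea floated later in the thread - comparing via {{objectVal(...)}} so that equality works regardless of the field's type - can be sketched as follows. This is a hypothetical, heavily simplified model: the DocValues interface and eq() function below loosely mimic Solr's ValueSource machinery but are not the real API.

```java
import java.util.*;

// Hypothetical sketch of an "eq" that compares via objectVal() so it works
// for strings as well as numbers. Not Solr's actual ValueSource API.
public class EqSketch {
    /** Stand-in for a per-document value source. */
    interface DocValues {
        Object objectVal(int doc);  // type-correct value for the doc
    }

    /** eq(a, b): true when both sides yield equal objects for the doc. */
    static boolean eq(DocValues a, DocValues b, int doc) {
        return Objects.equals(a.objectVal(doc), b.objectVal(doc));
    }

    public static void main(String[] args) {
        DocValues field = doc -> "red";         // a string field value
        DocValues constant = doc -> "red";      // a constant string
        System.out.println(eq(field, constant, 0));   // true
        System.out.println(eq(field, doc -> 42, 0));  // false: type mismatch
    }
}
```

The point of the sketch: dispatching on the object value sidesteps the UnsupportedOperationException that a numeric-only comparison (e.g. via doubleVal) raises for string fields.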



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Resolved] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life

2018-04-02 Thread Varun Thacker (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker resolved SOLR-7887.
-
Resolution: Fixed

> Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
> 
>
> Key: SOLR-7887
> URL: https://issues.apache.org/jira/browse/SOLR-7887
> Project: Solr
>  Issue Type: Task
>Reporter: Shawn Heisey
>Assignee: Erick Erickson
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-7887-WIP.patch, SOLR-7887-eoe-review.patch, 
> SOLR-7887-eoe-review.patch, SOLR-7887-followup_1.patch, SOLR-7887.patch, 
> SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, 
> SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, 
> SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, 
> SOLR-7887_followup_2.patch, SOLR-7887_followup_2.patch
>
>
> The logging services project has officially announced the EOL of log4j 1:
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
> In the official binary Jetty deployment, we use log4j 1.2 as our final 
> logging destination, and the admin UI has a log watcher that actually uses 
> log4j and java.util.logging classes.  That will need to be extended to add 
> log4j2.  I think that might be the largest pain point in this upgrade.
> There is some crossover between log4j2 and slf4j.  Figuring out exactly which 
> jars need to be in the lib/ext directory will take some research.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (SOLR-12139) Support "eq" function for string fields

2018-04-02 Thread Andrey Kudryavtsev (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Kudryavtsev updated SOLR-12139:
--
Attachment: SOLR-12139.patch

> Support "eq" function for string fields
> ---
>
> Key: SOLR-12139
> URL: https://issues.apache.org/jira/browse/SOLR-12139
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: search
>Reporter: Andrey Kudryavtsev
>Assignee: David Smiley
>Priority: Minor
> Attachments: SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch, 
> SOLR-12139.patch, SOLR-12139.patch, SOLR-12139.patch
>
>
> I just discovered that the {{eq}} user function works for numeric fields only.
> For string types it results in a {{java.lang.UnsupportedOperationException}}.
> What do you think about extending it to support at least some of the string 
> types as well?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-7887) Upgrade Solr to use log4j2 -- log4j 1 now officially end of life

2018-04-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-7887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422798#comment-16422798
 ] 

ASF subversion and git services commented on SOLR-7887:
---

Commit 41a1cbe2c337d2415a0e52415f43c7aba1059fb8 in lucene-solr's branch 
refs/heads/master from [~varun_saxena]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=41a1cbe ]

SOLR-7887: Fix logging filePattern to use solr.log.X format


> Upgrade Solr to use log4j2 -- log4j 1 now officially end of life
> 
>
> Key: SOLR-7887
> URL: https://issues.apache.org/jira/browse/SOLR-7887
> Project: Solr
>  Issue Type: Task
>Reporter: Shawn Heisey
>Assignee: Erick Erickson
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-7887-WIP.patch, SOLR-7887-eoe-review.patch, 
> SOLR-7887-eoe-review.patch, SOLR-7887-followup_1.patch, SOLR-7887.patch, 
> SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, 
> SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, 
> SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, SOLR-7887.patch, 
> SOLR-7887_followup_2.patch, SOLR-7887_followup_2.patch
>
>
> The logging services project has officially announced the EOL of log4j 1:
> https://blogs.apache.org/foundation/entry/apache_logging_services_project_announces
> In the official binary Jetty deployment, we use log4j 1.2 as our final 
> logging destination, and the admin UI has a log watcher that actually uses 
> log4j and java.util.logging classes.  That will need to be extended to add 
> log4j2.  I think that might be the largest pain point in this upgrade.
> There is some crossover between log4j2 and slf4j.  Figuring out exactly which 
> jars need to be in the lib/ext directory will take some research.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8231) Nori, a Korean analyzer based on mecab-ko-dic

2018-04-02 Thread Jim Ferenczi (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422791#comment-16422791
 ] 

Jim Ferenczi commented on LUCENE-8231:
--

I attached a new patch with lots of cleanups and fixes. I ran HantecRel again, 
here are the results:

||Analyzer||Index Time||Index Size||MAP(CLASSIC)||MAP(BM25)||MAP(GL2)||
|Korean|178s|90MB|.1638|.2101|.2081|

I am not sure why it got faster; it could be the new compressed format. I'll dig.

> Nori, a Korean analyzer based on mecab-ko-dic
> -
>
> Key: LUCENE-8231
> URL: https://issues.apache.org/jira/browse/LUCENE-8231
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Jim Ferenczi
>Priority: Major
> Attachments: LUCENE-8231-remap-hangul.patch, LUCENE-8231.patch, 
> LUCENE-8231.patch, LUCENE-8231.patch, LUCENE-8231.patch
>
>
> There is a dictionary similar to the IPADIC but for Korean, called mecab-ko-dic.
> It is available under an Apache license here:
> https://bitbucket.org/eunjeon/mecab-ko-dic
> This dictionary was built with MeCab; it defines a format for the features 
> adapted to the Korean language.
> Since the Kuromoji tokenizer uses the same format for the morphological 
> analysis (left cost + right cost + word cost), I tried to adapt the module to 
> handle Korean with the mecab-ko-dic. I've started with a POC that copies the 
> Kuromoji module and adapts it for the mecab-ko-dic.
> I used the same classes to build and read the dictionary, but I had to make 
> some modifications to handle the differences from the IPADIC and Japanese. 
> The resulting binary dictionary takes 28MB on disk. It's bigger than the 
> IPADIC, but mainly because the source is bigger and there are a lot of
> compound and inflect terms that define a group of terms and the segmentation 
> that can be applied. 
> I attached the patch that contains this new Korean module called -godori- 
> nori. It is an adaptation of the Kuromoji module, so currently
> the two modules don't share any code. I wanted to validate the approach first 
> and check the relevancy of the results. I don't speak Korean, so I used the 
> relevancy tests that were added for another Korean tokenizer 
> (https://issues.apache.org/jira/browse/LUCENE-4956) and tested the output 
> against mecab-ko, which is the official fork of MeCab that uses the mecab-ko-dic.
> I had to simplify the JapaneseTokenizer: my version removes the nBest output 
> and the decomposition of too-long tokens. I also
> modified the handling of whitespace, since it is important in Korean. 
> Whitespace that appears before a term is attached to that term, and this
> information is used to compute a penalty based on the part of speech of the 
> token. The penalty cost is a feature added to mecab-ko to handle 
> morphemes that should not appear after another morpheme, and is described on 
> the mecab-ko page:
> https://bitbucket.org/eunjeon/mecab-ko
> Ignoring whitespace is also more in line with the official MeCab library, 
> which attaches whitespace to the term that follows.
> I also added a decompounder filter that expands the compounds and inflects 
> defined in the dictionary, and a part-of-speech filter, similar to the 
> Japanese one, that removes the morphemes that are not useful for relevance 
> (suffix, prefix, interjection, ...). These filters don't play well with the 
> tokenizer if it can 
> output multiple paths (nBest output, for instance), so for simplicity I removed 
> this ability and the Korean tokenizer only outputs the best path.
> I compared the result with mecab-ko to confirm that the analyzer is working, 
> and ran the relevancy test defined in HantecRel.java, included
> in the patch (written by Robert for another Korean analyzer). Here are the 
> results:
> ||Analyzer||Index Time||Index Size||MAP(CLASSIC)||MAP(BM25)||MAP(GL2)||
> |Standard|35s|131MB|.007|.1044|.1053|
> |CJK|36s|164MB|.1418|.1924|.1916|
> |Korean|212s|90MB|.1628|.2094|.2078|
> I find the results very promising, so I plan to continue to work on this 
> project. I started to extract the part of the code that could be shared with 
> the Kuromoji module, but I wanted to share the status and this POC first to 
> confirm that this approach is viable. The advantages of using the same model 
> as the Japanese analyzer are multiple: we don't have a Korean analyzer at the 
> moment ;), the resulting dictionary is small compared to other libraries that
> use the mecab-ko-dic (the FST takes only 5.4MB), and the Tokenizer prunes the 
> lattice on the fly to select the best path efficiently.
> The dictionary can be built directly from the godori module with the 
> following command:
> ant regenerate (you need to create the resource directory (mkdir 
> lucene/analysis/godori/src/resources/org/apache/lucene/analysis/ko/dict) 
> first, since the dictionary is not included in the patch).
> I've also added some minimal tests in the module to play with the analysis.

[jira] [Commented] (SOLR-12174) Refactor Streaming Expression function registration

2018-04-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422789#comment-16422789
 ] 

ASF subversion and git services commented on SOLR-12174:


Commit 8cb52a272cc8c80421783065670b680e5b0de3d6 in lucene-solr's branch 
refs/heads/branch_7x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=8cb52a2 ]

SOLR-12174: Refactor Streaming Expression function registration


> Refactor Streaming Expression function registration
> ---
>
> Key: SOLR-12174
> URL: https://issues.apache.org/jira/browse/SOLR-12174
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12174.patch
>
>
> This ticket adds a specific class that registers all the Streaming Expression 
> functions with the StreamFactory. It also adds a test case that ensures that 
> a list of expected functions are registered, and enforces that any new 
> functions that are registered are added to the expected list of functions in 
> the test case. This ensures that functions cannot be deregistered by accident 
> in the future.
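The registration-pinning pattern described above can be sketched in a few lines. The names below (FACTORY, registerAll, matchesExpected) are illustrative stand-ins, not Solr's actual StreamFactory API: one method registers every function, and a check compares the registered names against a hand-maintained expected list so that an accidental removal fails loudly.

```java
import java.util.*;
import java.util.function.Supplier;

// Sketch of the pattern: one class registers all functions with a factory,
// and a test pins the registered names against an expected list so nothing
// can be dropped silently. Illustrative names, not Solr's StreamFactory.
public class RegistrySketch {
    static final Map<String, Supplier<Object>> FACTORY = new TreeMap<>();

    /** Central registration point for every streaming function. */
    static void registerAll() {
        FACTORY.put("search", Object::new);
        FACTORY.put("facet", Object::new);
        FACTORY.put("stats", Object::new);
    }

    /** The "expected functions" check: fails if registration drifts. */
    static boolean matchesExpected(Set<String> expected) {
        return FACTORY.keySet().equals(expected);
    }

    public static void main(String[] args) {
        registerAll();
        Set<String> expected = new TreeSet<>(Arrays.asList("facet", "search", "stats"));
        System.out.println(matchesExpected(expected));  // prints true
    }
}
```

Adding a new function then forces a two-line change (the registration and the expected list), which is exactly the friction that keeps deregistration from happening by accident.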



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12133) TriggerIntegrationTest fails too easily.

2018-04-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422790#comment-16422790
 ] 

ASF subversion and git services commented on SOLR-12133:


Commit 4d3149d4344efed263d852829273d90d9b0a1c12 in lucene-solr's branch 
refs/heads/branch_7x from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=4d3149d ]

SOLR-12133: Fix precommit


> TriggerIntegrationTest fails too easily.
> 
>
> Key: SOLR-12133
> URL: https://issues.apache.org/jira/browse/SOLR-12133
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-12133-testNodeMarkersRegistration.patch, 
> SOLR-12133.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12144) Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2

2018-04-02 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422784#comment-16422784
 ] 

Varun Thacker commented on SOLR-12144:
--

{quote}We should either document this so people who change log framework and 
want to use rotation can setup naming accordingly, or make the pattern 
configurable?
{quote}
I think we should fix this as part of SOLR-7887. Let me fix that right away.

> Remove SOLR_LOG_PRESTART_ROTATION and leverage log4j2 
> --
>
> Key: SOLR-12144
> URL: https://issues.apache.org/jira/browse/SOLR-12144
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: scripts and tools
>Reporter: Varun Thacker
>Assignee: Jan Høydahl
>Priority: Major
> Fix For: 7.4, master (8.0)
>
> Attachments: SOLR-12144.patch, SOLR-12144.patch
>
>
> With log4j2, rotating the file on restart is as simple as adding a policy: 
> OnStartupTriggeringPolicy.
> So we can remove the Solr logic that does the same thing and exposes it via 
> SOLR_LOG_PRESTART_ROTATION.
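For illustration, a minimal RollingFile appender using the policy mentioned above. Element and attribute names follow log4j2's documented RollingFileAppender configuration; the paths, pattern, and sizes are placeholders, and Solr's actual log4j2.xml differs in detail.

```xml
<RollingFile name="MainLogFile"
             fileName="${sys:solr.log.dir}/solr.log"
             filePattern="${sys:solr.log.dir}/solr.log.%i">
  <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} %-5p (%t) [%c{1.}] %m%n"/>
  <Policies>
    <!-- roll the previous log over every time the JVM starts -->
    <OnStartupTriggeringPolicy/>
    <!-- plus the usual size-based rotation -->
    <SizeBasedTriggeringPolicy size="32 MB"/>
  </Policies>
  <DefaultRolloverStrategy max="10"/>
</RollingFile>
```

With the %i counter in filePattern, rotated files come out as solr.log.1, solr.log.2, and so on, matching the solr.log.X naming the related SOLR-7887 commit refers to.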



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-8231) Nori, a Korean analyzer based on mecab-ko-dic

2018-04-02 Thread Jim Ferenczi (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Ferenczi updated LUCENE-8231:
-
Attachment: LUCENE-8231.patch

> Nori, a Korean analyzer based on mecab-ko-dic
> -
>
> Key: LUCENE-8231
> URL: https://issues.apache.org/jira/browse/LUCENE-8231
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Jim Ferenczi
>Priority: Major
> Attachments: LUCENE-8231-remap-hangul.patch, LUCENE-8231.patch, 
> LUCENE-8231.patch, LUCENE-8231.patch, LUCENE-8231.patch
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org

[jira] [Updated] (LUCENE-8231) Nori, a Korean analyzer based on mecab-ko-dic

2018-04-02 Thread Jim Ferenczi (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Ferenczi updated LUCENE-8231:
-
Attachment: (was: LUCENE-8231.patch)

> Nori, a Korean analyzer based on mecab-ko-dic
> -
>
> Key: LUCENE-8231
> URL: https://issues.apache.org/jira/browse/LUCENE-8231
> Project: Lucene - Core
>  Issue Type: New Feature
>Reporter: Jim Ferenczi
>Priority: Major
> Attachments: LUCENE-8231-remap-hangul.patch, LUCENE-8231.patch, 
> LUCENE-8231.patch, LUCENE-8231.patch, LUCENE-8231.patch
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (SOLR-12133) TriggerIntegrationTest fails too easily.

2018-04-02 Thread Shalin Shekhar Mangar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422771#comment-16422771
 ] 

Shalin Shekhar Mangar commented on SOLR-12133:
--

Thanks Joel!

> TriggerIntegrationTest fails too easily.
> 
>
> Key: SOLR-12133
> URL: https://issues.apache.org/jira/browse/SOLR-12133
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-12133-testNodeMarkersRegistration.patch, 
> SOLR-12133.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12133) TriggerIntegrationTest fails too easily.

2018-04-02 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422760#comment-16422760
 ] 

Joel Bernstein commented on SOLR-12133:
---

Just pushed a fix for a precommit error related to this ticket. I'll backport 
as well.

> TriggerIntegrationTest fails too easily.
> 
>
> Key: SOLR-12133
> URL: https://issues.apache.org/jira/browse/SOLR-12133
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-12133-testNodeMarkersRegistration.patch, 
> SOLR-12133.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12174) Refactor Streaming Expression function registration

2018-04-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422758#comment-16422758
 ] 

ASF subversion and git services commented on SOLR-12174:


Commit d89a90067b2a3ef4e6b059aed8f3f518b46820e1 in lucene-solr's branch 
refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=d89a900 ]

SOLR-12174: Refactor Streaming Expression function registration


> Refactor Streaming Expression function registration
> ---
>
> Key: SOLR-12174
> URL: https://issues.apache.org/jira/browse/SOLR-12174
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12174.patch
>
>
> This ticket adds a specific class that registers all the Streaming Expression 
> functions with the StreamFactory. It also adds a test case that ensures that 
> a list of expected functions are registered, and enforces that any new 
> functions that are registered are added to the expected list of functions in 
> the test case. This ensures that functions cannot be deregistered by accident 
> in the future.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12133) TriggerIntegrationTest fails too easily.

2018-04-02 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422759#comment-16422759
 ] 

ASF subversion and git services commented on SOLR-12133:


Commit 269a67694058a6e6b86d5e97b0dc6d579783ceda in lucene-solr's branch 
refs/heads/master from [~joel.bernstein]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=269a676 ]

SOLR-12133: Fix precommit


> TriggerIntegrationTest fails too easily.
> 
>
> Key: SOLR-12133
> URL: https://issues.apache.org/jira/browse/SOLR-12133
> Project: Solr
>  Issue Type: Test
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Mark Miller
>Assignee: Shalin Shekhar Mangar
>Priority: Major
> Attachments: SOLR-12133-testNodeMarkersRegistration.patch, 
> SOLR-12133.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11879) prevent EOFException in FastinputStream

2018-04-02 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422730#comment-16422730
 ] 

Erick Erickson commented on SOLR-11879:
---

[~noble.paul] [~ysee...@gmail.com] Can we close this?


> prevent EOFException in FastinputStream
> ---
>
> Key: SOLR-11879
> URL: https://issues.apache.org/jira/browse/SOLR-11879
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
> Environment: FastI
>Reporter: Noble Paul
>Assignee: Noble Paul
>Priority: Trivial
> Attachments: SOLR-11879.patch, SOLR-11879.patch, SOLR-11879.patch, 
> Screen Shot 2018-01-24 at 7.26.16 PM.png
>
>
> FastInputStream creates and throws a new EOFException every time an end of 
> stream is encountered. This is wasteful, as we never use the stack trace 
> anywhere.
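The standard remedy is to pre-allocate a single exception whose fillInStackTrace() is a no-op, so hitting end-of-stream costs nothing beyond the throw itself. The sketch below illustrates the technique; it is not Solr's actual code, and readOrThrow is a hypothetical helper.

```java
import java.io.EOFException;

// Sketch of the optimization: pre-allocate one EOFException and suppress its
// stack trace, so end-of-stream does not pay for a trace nobody reads.
// Illustrative only; not Solr's FastInputStream code.
public class EofSketch {
    static final EOFException EOF = new EOFException("end of stream") {
        @Override
        public synchronized Throwable fillInStackTrace() {
            return this;  // skip the expensive stack walk
        }
    };

    /** Hypothetical read helper: -1 signals end of stream. */
    static int readOrThrow(int byteRead) throws EOFException {
        if (byteRead == -1) {
            throw EOF;  // reused instance: cheap, but its trace is meaningless
        }
        return byteRead;
    }

    public static void main(String[] args) {
        try {
            readOrThrow(-1);
        } catch (EOFException e) {
            // no trace was ever captured for the shared instance
            System.out.println(e.getStackTrace().length);  // prints 0
        }
    }
}
```

The trade-off is exactly the one the description implies: the shared exception's (empty) stack trace can never tell you where the stream actually ended, so this only suits cases where the exception is pure control flow.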



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-12163) Ref Guide: Improve Setting Up an External ZK Ensemble page

2018-04-02 Thread Cassandra Targett (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-12163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422725#comment-16422725
 ] 

Cassandra Targett commented on SOLR-12163:
--

Thanks [~varunthacker].

I can't find any reference to a {{zookeeper-env.sh}} in ZK docs - is there 
anything you didn't mention that users should do during setup to be sure this 
file is read?

Quasi-related: everything in the example {{zoo.cfg}} assumes a *nix-based OS 
(paths in particular), and I note that ZK 3.4.11 docs now say Windows is 
supported for production deployments (earlier versions said it was not...). 
Besides providing a Windows-based example of {{zoo.cfg}}, how do we provide a 
Windows-based {{zookeeper-env}}?

We can probably remove the "all 3 ZK on one server" example - I left/adapted 
it because I thought maybe it was commonly used in pre-prod scenarios, since it 
was the only one really discussed on the page. But if it's confusing, it should 
go.

I'll mix in the other suggestions, thank you.
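On the {{zookeeper-env.sh}} question above: AFAICT no extra wiring is needed on *nix, because {{zkEnv.sh}} sources the file from the conf directory if it exists. A hedged sketch (all values illustrative):

```shell
# conf/zookeeper-env.sh -- picked up automatically by zkEnv.sh on *nix
# when the file exists in the conf directory; no extra setup required.
# All values below are illustrative, not recommendations.
ZOO_LOG_DIR=/var/log/zookeeper        # where ZK writes its logs
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"     # log4j root logger / appender
SERVER_JVMFLAGS="-Xmx1g"              # heap for the ZK server JVM
```

On Windows the equivalent would presumably be a {{zookeeper-env.cmd}} read by {{zkEnv.cmd}}, but that should be verified against the ZK scripts before documenting it.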

> Ref Guide: Improve Setting Up an External ZK Ensemble page
> --
>
> Key: SOLR-12163
> URL: https://issues.apache.org/jira/browse/SOLR-12163
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: documentation
>Reporter: Cassandra Targett
>Assignee: Cassandra Targett
>Priority: Major
> Fix For: 7.4
>
> Attachments: setting-up-an-external-zookeeper-ensemble.adoc
>
>
> I had to set up a ZK ensemble the other day for the first time in a while, 
> and thought I'd test our docs on the subject while I was at it. I headed over 
> to 
> https://lucene.apache.org/solr/guide/setting-up-an-external-zookeeper-ensemble.html,
>  and...Well, I still haven't gotten back to what I was trying to do, but I 
> rewrote the entire page.
> The problem to me is that the page today is mostly a stripped down copy of 
> the ZK Getting Started docs: walking through setting up a single ZK instance 
> before introducing the idea of an ensemble and going back through the same 
> configs again to update them for the ensemble.
> IOW, despite the page being titled "setting up an ensemble", it's mostly 
> about not setting up an ensemble. That's at the end of the page, which itself 
> focuses a bit heavily on the use case of running an ensemble on a single 
> server (so, if you're counting...that's 3 use cases we don't want people to 
> use discussed in detail on a page that's supposedly about _not_ doing any of 
> those things).
> So, I took all of it and restructured the whole thing to focus primarily on 
> the use case we want people to use: running 3 ZK nodes on different machines. 
> Running 3 on one machine is still there, but noted in passing with the 
> appropriate caveats. I've also added information about choosing to use a 
> chroot, which AFAICT was only covered in the section on Taking Solr to 
> Production.
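On the chroot point: a hedged sketch of the usual two steps, using the zkcli script that ships with Solr (host names and the {{/solr}} path are illustrative):

```shell
# 1. Create the chroot znode once, with Solr's bundled zkcli
#    (server/scripts/cloud-scripts); hosts and path are illustrative.
server/scripts/cloud-scripts/zkcli.sh \
    -zkhost zk1:2181,zk2:2181,zk3:2181 \
    -cmd makepath /solr

# 2. Point Solr at the ensemble (e.g. in solr.in.sh), appending the
#    chroot ONCE at the end of the connection string, not per host:
ZK_HOST="zk1:2181,zk2:2181,zk3:2181/solr"
```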






[JENKINS-EA] Lucene-Solr-7.3-Windows (64bit/jdk-11-ea+5) - Build # 27 - Unstable!

2018-04-02 Thread Policeman Jenkins Server

java.lang.OutOfMemoryError: Java heap space


[jira] [Updated] (SOLR-12174) Refactor Streaming Expression function registration

2018-04-02 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-12174:
--
Fix Version/s: 7.4

> Refactor Streaming Expression function registration
> ---
>
> Key: SOLR-12174
> URL: https://issues.apache.org/jira/browse/SOLR-12174
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12174.patch
>
>
> This ticket adds a specific class that registers all the Streaming Expression 
> functions with the StreamFactory. It also adds a test case that ensures that 
> a list of expected functions are registered, and enforces that any new 
> functions that are registered are added to the expected list of functions in 
> the test case. This ensures that functions cannot be deregistered by accident 
> in the future.
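The guard described above amounts to comparing the registered function names against a hard-coded expected list and failing on any drift in either direction. A minimal sketch (names illustrative, not the actual Solr test):

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Sketch of a registration-guard check: report any function name that
 *  is registered but not expected, or expected but not registered. */
public class FunctionRegistrationCheck {
  /** Symmetric difference of the two name sets. */
  public static Set<String> diff(Set<String> registered, Set<String> expected) {
    Set<String> union = new TreeSet<>(registered);
    union.addAll(expected);
    Set<String> both = new TreeSet<>(registered);
    both.retainAll(expected);
    union.removeAll(both); // names present in exactly one of the sets
    return union;
  }

  public static void main(String[] args) {
    Set<String> expected =
        new TreeSet<>(Arrays.asList("search", "facet", "merge"));
    Set<String> registered =
        new TreeSet<>(Arrays.asList("search", "facet", "merge", "shuffle"));
    Set<String> drift = diff(registered, expected);
    if (!drift.isEmpty()) {
      // prints: unexpected registration drift: [shuffle]
      System.out.println("unexpected registration drift: " + drift);
    }
  }
}
```

In a real test the `registered` set would come from the StreamFactory, and a non-empty diff would fail the build, forcing the expected list to be updated deliberately.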






[jira] [Assigned] (SOLR-12174) Refactor Streaming Expression function registration

2018-04-02 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein reassigned SOLR-12174:
-

Assignee: Joel Bernstein

> Refactor Streaming Expression function registration
> ---
>
> Key: SOLR-12174
> URL: https://issues.apache.org/jira/browse/SOLR-12174
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Assignee: Joel Bernstein
>Priority: Major
> Fix For: 7.4
>
> Attachments: SOLR-12174.patch
>
>
> This ticket adds a specific class that registers all the Streaming Expression 
> functions with the StreamFactory. It also adds a test case that ensures that 
> a list of expected functions are registered, and enforces that any new 
> functions that are registered are added to the expected list of functions in 
> the test case. This ensures that functions cannot be deregistered by accident 
> in the future.






[jira] [Updated] (SOLR-12174) Refactor Streaming Expression function registration

2018-04-02 Thread Joel Bernstein (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-12174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joel Bernstein updated SOLR-12174:
--
Summary: Refactor Streaming Expression function registration  (was: 
Refactor how Streaming Expression functions are registered)

> Refactor Streaming Expression function registration
> ---
>
> Key: SOLR-12174
> URL: https://issues.apache.org/jira/browse/SOLR-12174
> Project: Solr
>  Issue Type: Improvement
>  Security Level: Public(Default Security Level. Issues are Public) 
>Reporter: Joel Bernstein
>Priority: Major
> Attachments: SOLR-12174.patch
>
>
> This ticket adds a specific class that registers all the Streaming Expression 
> functions with the StreamFactory. It also adds a test case that ensures that 
> a list of expected functions are registered, and enforces that any new 
> functions that are registered are added to the expected list of functions in 
> the test case. This ensures that functions cannot be deregistered by accident 
> in the future.






[jira] [Resolved] (SOLR-9399) Delete requests do not send credentials & fails for Basic Authentication

2018-04-02 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SOLR-9399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson resolved SOLR-9399.
--
   Resolution: Fixed
Fix Version/s: 7.4

> Delete requests do not send credentials & fails for Basic Authentication
> 
>
> Key: SOLR-9399
> URL: https://issues.apache.org/jira/browse/SOLR-9399
> Project: Solr
>  Issue Type: Bug
>  Security Level: Public(Default Security Level. Issues are Public) 
>  Components: SolrJ
>Affects Versions: 6.0, 6.0.1
>Reporter: Susheel Kumar
>Assignee: Erick Erickson
>Priority: Major
>  Labels: security
> Fix For: 7.4
>
>
> The getRoutes(..) method of UpdateRequest does not pass credentials to 
> LBHttpSolrClient when deleteById is set, while for updates it does pass the 
> credentials. See the code snippet below:
> if (deleteById != null) {
>   Iterator<Map.Entry<String,Map<String,Object>>> entries = deleteById.entrySet()
>       .iterator();
>   while (entries.hasNext()) {
>     Map.Entry<String,Map<String,Object>> entry = entries.next();
>     String deleteId = entry.getKey();
>     Map<String,Object> map = entry.getValue();
>     Long version = null;
>     if (map != null) {
>       version = (Long) map.get(VER);
>     }
>     Slice slice = router.getTargetSlice(deleteId, null, null, null, col);
>     if (slice == null) {
>       return null;
>     }
>     List<String> urls = urlMap.get(slice.getName());
>     if (urls == null) {
>       return null;
>     }
>     String leaderUrl = urls.get(0);
>     LBHttpSolrClient.Req request = routes.get(leaderUrl);
>     if (request != null) {
>       UpdateRequest urequest = (UpdateRequest) request.getRequest();
>       urequest.deleteById(deleteId, version);
>     } else {
>       UpdateRequest urequest = new UpdateRequest();
>       urequest.setParams(params);
>       urequest.deleteById(deleteId, version);
>       urequest.setCommitWithin(getCommitWithin());
>       request = new LBHttpSolrClient.Req(urequest, urls);
>       routes.put(leaderUrl, request);
>     }
>   }
> }





