date:20140423

Robert Muir created LUCENE-5626:
---

 Summary: SimpleFSLockFactory access denied on windows.
 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir


This happened twice in jenkins:

{noformat}
[lockStressTest2] Exception in thread main java.io.IOException:
Access is denied
[lockStressTest2] at
java.io.WinNTFileSystem.createFileExclusively(Native Method)
[lockStressTest2] at java.io.File.createNewFile(File.java:1012)
[lockStressTest2] at
org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
{noformat}

My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978084#comment-13978084
 ] 

Uwe Schindler commented on LUCENE-5626:
---

Hi,
this is almost impossible to reproduce here locally. It is also a concurrency 
issue on windows. The problem is: if there is already something else trying to 
create the file, Windows generally responds with

I think we should catch IOException in the obtain() method and handle this in a 
similar way like in NIOFSDir (return false, so lock was not aquired).

I will provide a patch after trying to reproduce this on windows (by making 
filesystem and cpu busy).

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir

 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978084#comment-13978084
 ] 

Uwe Schindler edited comment on LUCENE-5626 at 4/23/14 11:41 AM:
-

Hi,
this is almost impossible to reproduce here locally. It is also a concurrency 
issue on windows. The problem is: if there is already something else trying to 
create the file, Windows generally responds with Access is denied.

I think we should catch IOException in the obtain() method and handle this in a 
similar way like in NIOFSDir (return false, so lock was not aquired 
successfully).

I will provide a patch after trying to reproduce this on windows (by making 
filesystem and cpu busy).


was (Author: thetaphi):
Hi,
this is almost impossible to reproduce here locally. It is also a concurrency 
issue on windows. The problem is: if there is already something else trying to 
create the file, Windows generally responds with

I think we should catch IOException in the obtain() method and handle this in a 
similar way like in NIOFSDir (return false, so lock was not aquired).

I will provide a patch after trying to reproduce this on windows (by making 
filesystem and cpu busy).

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir

 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978086#comment-13978086
 ] 

Robert Muir commented on LUCENE-5626:
-

Is it easy with the macro to increase the number of lock clients? Perhaps that 
is enough to trigger the issue.

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir

 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978093#comment-13978093
 ] 

Dawid Weiss commented on LUCENE-5626:
-

See this post, interesting. 
http://stackoverflow.com/questions/10516472/file-createnewfile-randomly-fails

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir

 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978096#comment-13978096
 ] 

Robert Muir commented on LUCENE-5626:
-

Yes the NativeFSLockFactory already has a catch block for this. 

Interestingly enough, its code comment refers to MacOS X...



 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


 [ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5626:
--

Attachment: LUCENE-5626.patch

Simple patch, doing the same Exception handling in obtain() like with 
NativeFSLockFactory.

This bug is not new, it exists as long as SimpleFSLockFactory exists. But it 
only happens on Windows and SimpleFSLF is no longer the default, so no need to 
hold 4.8.

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
 Attachments: LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


 [ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5626:
--

  Component/s: core/store
Fix Version/s: 5.0
   4.9
 Assignee: Uwe Schindler

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978099#comment-13978099
 ] 

Uwe Schindler commented on LUCENE-5626:
---

bq. See this post, interesting. 
http://stackoverflow.com/questions/10516472/file-createnewfile-randomly-fails

Thanks. Virus scanner or Indexer are not a problem on Jenkins, as they are 
disabled there. Otherwise the tests never pass :-) The same on my local 
machine, I excluded all directories that are development from virus and indexer.

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978100#comment-13978100
 ] 

Uwe Schindler commented on LUCENE-5626:
---

bq. Interestingly enough, its code comment refers to MacOS X...

But it applies to Windows, too! If I disable it, it fails on my machine, too!

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978101#comment-13978101
 ] 

Robert Muir commented on LUCENE-5626:
-

Patch looks good. Thanks Uwe.

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978102#comment-13978102
 ] 

Dawid Weiss commented on LUCENE-5626:
-

It would be interesting to see what GetLastError shows, but for this you'd have 
to access winapi (via jna or something alike). I can't reproduce this locally, 
unfortunately.

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


 [ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler updated LUCENE-5626:
--

Attachment: LUCENE-5626.patch

This patch adds the failureReason like NativeFSLF does. This allows 
LockFactory throw the real failure after the locking failed. I don't like this 
code (looks like a hack), but this would make it consistent.

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch, LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978110#comment-13978110
 ] 

Uwe Schindler commented on LUCENE-5626:
---

bq. It would be interesting to see what GetLastError shows, but for this you'd 
have to access winapi (via jna or something alike). I can't reproduce this 
locally, unfortunately.

Very simple: Access denied. The JNI code transforms GetLastError to a string 
and throws it as IOException.

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch, LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978115#comment-13978115
 ] 

Dawid Weiss commented on LUCENE-5626:
-

Access denied... why? :)

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch, LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5622) Fail tests if they print, and tests.verbose is not set


[ 
https://issues.apache.org/jira/browse/LUCENE-5622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978120#comment-13978120
 ] 

Dawid Weiss commented on LUCENE-5622:
-

The problem is with Java loggers (default console handler) which grabs the 
System.out reference once and for all. This causes  tests to be order-dependent 
and the DelegateStream to be propagated outside a given test's scope.

I'll think of a way to fix this.

 Fail tests if they print, and tests.verbose is not set
 --

 Key: LUCENE-5622
 URL: https://issues.apache.org/jira/browse/LUCENE-5622
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Robert Muir
Assignee: Dawid Weiss
 Attachments: LUCENE-5622.patch, LUCENE-5622.patch, LUCENE-5622.patch, 
 LUCENE-5622.patch


 Some tests print so much stuff they are now undebuggable (see LUCENE-5612).
 I think its bad that the testrunner hides this stuff, we used to stay on top 
 of it. Instead, whne tests.verbose is false, we should install a printstreams 
 (system.out/err) that fail the test instantly because they are noisy. 
 This will ensure that our tests don't go out of control.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978125#comment-13978125
 ] 

Uwe Schindler commented on LUCENE-5626:
---

Windows responds with this error code whenever a file is opened by another or 
the same process and you want to change the directory entry/inode/whatever of a 
file. Quite common, nothing special.

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch, LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978134#comment-13978134
 ] 

Uwe Schindler commented on LUCENE-5626:
---

Some comment from StackOverflow: 
http://stackoverflow.com/questions/4312568/what-causes-writefile-to-return-error-access-denied

{quote}
There is about a dozen different situations that might result in 
ERROR_ACCESS_DENIED. Internally, all WriteFile does is call NtWriteFile and map 
its (somewhat meaningful) NTSTATUS error code into a less meaningful HRESULT.

Among other things, ERROR_ACCESS_DENIED could indicate that the file is on a 
network volume and something went wrong with write permissions, or that the 
file is really not a file but a directory.
{quote}

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch, LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.

2014-04-23 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978137#comment-13978137
 ] 

Uwe Schindler commented on LUCENE-5626:
---

In fact this is a major pain and goes back to MS DOS times... If you at some 
time used the Win32 API, you know that whenever windows does not know how to 
handle a file it returns ERROR_ACCESS_DENIED on GetLastError (0x5). This is 
just the error code for, everything DOS never supported.

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch, LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978151#comment-13978151
 ] 

ASF subversion and git services commented on LUCENE-5626:
-

Commit 1589394 from [~thetaphi] in branch 'dev/trunk'
[ https://svn.apache.org/r1589394 ]

LUCENE-5626: Fix bug in SimpleFSLockFactory's obtain() that sometimes throwed 
IOException (ERROR_ACESS_DENIED) on Windows if the lock file was created 
concurrently

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch, LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5626) SimpleFSLockFactory access denied on windows.

2014-04-23 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978153#comment-13978153
 ] 

ASF subversion and git services commented on LUCENE-5626:
-

Commit 1589397 from [~thetaphi] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1589397 ]

Merged revision(s) 1589394 from lucene/dev/trunk:
LUCENE-5626: Fix bug in SimpleFSLockFactory's obtain() that sometimes throwed 
IOException (ERROR_ACESS_DENIED) on Windows if the lock file was created 
concurrently

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch, LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Resolved] (LUCENE-5626) SimpleFSLockFactory access denied on windows.

2014-04-23 Thread ASF subversion and git services (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe Schindler resolved LUCENE-5626.
---

Resolution: Fixed

 SimpleFSLockFactory access denied on windows.
 ---

 Key: LUCENE-5626
 URL: https://issues.apache.org/jira/browse/LUCENE-5626
 Project: Lucene - Core
  Issue Type: Bug
  Components: core/store
Reporter: Robert Muir
Assignee: Uwe Schindler
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5626.patch, LUCENE-5626.patch


 This happened twice in jenkins:
 {noformat}
 [lockStressTest2] Exception in thread main java.io.IOException:
 Access is denied
 [lockStressTest2] at
 java.io.WinNTFileSystem.createFileExclusively(Native Method)
 [lockStressTest2] at java.io.File.createNewFile(File.java:1012)
 [lockStressTest2] at
 org.apache.lucene.store.SimpleFSLock.obtain(SimpleFSLockFactory.java:135)
 {noformat}
 My windows machine got struck by lightning, so I cannot fix this easily. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-5340) Add support for named snapshots

2014-04-23 Thread Varun Thacker (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Thacker updated SOLR-5340:


Attachment: SOLR-5340.patch

New patch with changes suggested by Noble.

 Add support for named snapshots
 ---

 Key: SOLR-5340
 URL: https://issues.apache.org/jira/browse/SOLR-5340
 Project: Solr
  Issue Type: Improvement
  Components: SolrCloud
Affects Versions: 4.5
Reporter: Mike Schrag
Assignee: Noble Paul
 Attachments: SOLR-5340.patch, SOLR-5340.patch


 It would be really nice if Solr supported named snapshots. Right now if you 
 snapshot a SolrCloud cluster, every node potentially records a slightly 
 different timestamp. Correlating those back together to effectively restore 
 the entire cluster to a consistent snapshot is pretty tedious.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5514) atomic update throws exception if the schema contains uuid fields: Invalid UUID String: 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670'

2014-04-23 Thread Elran Dvir (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978197#comment-13978197
 ] 

Elran Dvir commented on SOLR-5514:
--

I have encountered the exact same problem as Dirk.
I managed to fix it.
The value of the uuid field is combination of class name, colon and (uuid) 
actual value.
I searched in the code where this form of string is created.
The only place I found it is in the method writeVal(Object val) in JavaBinCodec 
class.
So I added a special handling for uuid values.
please see patch attached. The patch is based on Solr 4.4, but takes into 
considerations changes made in 4.7.
I will be happy to get feedback from you guys.

Thanks.

 atomic update throws exception if the schema contains uuid fields: Invalid 
 UUID String: 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670'
 -

 Key: SOLR-5514
 URL: https://issues.apache.org/jira/browse/SOLR-5514
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.5.1
 Environment: unix and windows
Reporter: Dirk Reuss 
Assignee: Shalin Shekhar Mangar

 I am updating an exiting document with the statement 
 adddocfield name='name' update='set'newvalue/field
 All fields are stored and I have several UUID fields. About 10-20% of the 
 update commands will fail with the message: (example)
 Invalid UUID String: 'java.util.UUID:532c9353-d391-4a04-8618-dc2fa1ef8b35'
 the point is that java.util.UUID seems to be prepended to the original uuid 
 stored in the field and when the value is written this error occours.
 I tried to check if this specific uuid field was the problem and
 added the uuid field in the update xml with(field name='id1' 
 update='set'...). But the error simply moved to an other uuid field.
 here is the original exception:
 lst name=responseHeaderint name=status500/intint 
 name=QTime34/int/lstlst name=errorstr name=msgError while 
 creating field 
 'MyUUIDField{type=uuid,properties=indexed,stored,omitTermFreqAndPositions,required,
  required=true}' from value 
 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670'/strstr 
 name=traceorg.apache.solr.common.SolrException: Error while creating field 
 'MyUUIDField{type=uuid,properties=indexed,stored,omitTermFreqAndPositions,required,
  required=true}' from value 
 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670'
   at org.apache.solr.schema.FieldType.createField(FieldType.java:259)
   at org.apache.solr.schema.StrField.createFields(StrField.java:56)
   at 
 org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:47)
   at 
 org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:118)
   at 
 org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:77)
   at 
 org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:215)
   at 
 org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
   at 
 org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
   at 
 org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:556)
   at 
 org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:692)
   at 
 org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:435)
   at 
 org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
   at 
 org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247)
   at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
   at 
 org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
   at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
   at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
   at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at

[jira] [Updated] (SOLR-5514) atomic update throws exception if the schema contains uuid fields: Invalid UUID String: 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670'

2014-04-23 Thread Elran Dvir (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elran Dvir updated SOLR-5514:
-

Attachment: SOLR-5514.patch

 atomic update throws exception if the schema contains uuid fields: Invalid 
 UUID String: 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670'
 -

 Key: SOLR-5514
 URL: https://issues.apache.org/jira/browse/SOLR-5514
 Project: Solr
  Issue Type: Bug
Affects Versions: 4.5.1
 Environment: unix and windows
Reporter: Dirk Reuss 
Assignee: Shalin Shekhar Mangar
 Attachments: SOLR-5514.patch


 I am updating an exiting document with the statement 
 adddocfield name='name' update='set'newvalue/field
 All fields are stored and I have several UUID fields. About 10-20% of the 
 update commands will fail with the message: (example)
 Invalid UUID String: 'java.util.UUID:532c9353-d391-4a04-8618-dc2fa1ef8b35'
 the point is that java.util.UUID seems to be prepended to the original uuid 
 stored in the field and when the value is written this error occours.
 I tried to check if this specific uuid field was the problem and
 added the uuid field in the update xml with(field name='id1' 
 update='set'...). But the error simply moved to an other uuid field.
 here is the original exception:
 lst name=responseHeaderint name=status500/intint 
 name=QTime34/int/lstlst name=errorstr name=msgError while 
 creating field 
 'MyUUIDField{type=uuid,properties=indexed,stored,omitTermFreqAndPositions,required,
  required=true}' from value 
 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670'/strstr 
 name=traceorg.apache.solr.common.SolrException: Error while creating field 
 'MyUUIDField{type=uuid,properties=indexed,stored,omitTermFreqAndPositions,required,
  required=true}' from value 
 'java.util.UUID:e26c4d56-e98d-41de-9b7f-f63192089670'
   at org.apache.solr.schema.FieldType.createField(FieldType.java:259)
   at org.apache.solr.schema.StrField.createFields(StrField.java:56)
   at 
 org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:47)
   at 
 org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:118)
   at 
 org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:77)
   at 
 org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:215)
   at 
 org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
   at 
 org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
   at 
 org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:556)
   at 
 org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:692)
   at 
 org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:435)
   at 
 org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
   at 
 org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:247)
   at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:174)
   at 
 org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
   at 
 org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
   at 
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
   at 
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
   at 
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
   at 
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
   at 
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
   at 
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
   at 
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
   at 
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
   at 
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
   at 
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
   at 
 org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:852)
   at

[jira] [Commented] (LUCENE-5487) Can we separate top scorer from sub scorer?


[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978221#comment-13978221
 ] 

ASF subversion and git services commented on LUCENE-5487:
-

Commit 1589416 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1589416 ]

LUCENE-5487: add comment explaining hotspot voodoo

 Can we separate top scorer from sub scorer?
 ---

 Key: LUCENE-5487
 URL: https://issues.apache.org/jira/browse/LUCENE-5487
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.8, 5.0

 Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch


 This is just an exploratory patch ... still many nocommits, but I
 think it may be promising.
 I find the two booleans we pass to Weight.scorer confusing, because
 they really only apply to whoever will call score(Collector) (just
 IndexSearcher and BooleanScorer).
 The params are pointless for the vast majority of scorers, because
 very, very few query scorers really need to change how top-scoring is
 done, and those scorers can *only* score top-level (throw throw UOE
 from nextDoc/advance).  It seems like these two types of scorers
 should be separately typed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5487) Can we separate top scorer from sub scorer?

2014-04-23 Thread Michael McCandless (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978222#comment-13978222
 ] 

Michael McCandless commented on LUCENE-5487:


bq. Michael McCandless could save me from starring at broke out separate 
Weight.scoreRange/scoreAll methods for a few mins, if it was clued by a 
comment mentions #hotspot

I committed a fix ...

 Can we separate top scorer from sub scorer?
 ---

 Key: LUCENE-5487
 URL: https://issues.apache.org/jira/browse/LUCENE-5487
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.8, 5.0

 Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch


 This is just an exploratory patch ... still many nocommits, but I
 think it may be promising.
 I find the two booleans we pass to Weight.scorer confusing, because
 they really only apply to whoever will call score(Collector) (just
 IndexSearcher and BooleanScorer).
 The params are pointless for the vast majority of scorers, because
 very, very few query scorers really need to change how top-scoring is
 done, and those scorers can *only* score top-level (throw throw UOE
 from nextDoc/advance).  It seems like these two types of scorers
 should be separately typed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5487) Can we separate top scorer from sub scorer?

2014-04-23 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978227#comment-13978227
 ] 

ASF subversion and git services commented on LUCENE-5487:
-

Commit 1589422 from [~mikemccand] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1589422 ]

LUCENE-5487: add comment explaining hotspot voodoo

 Can we separate top scorer from sub scorer?
 ---

 Key: LUCENE-5487
 URL: https://issues.apache.org/jira/browse/LUCENE-5487
 Project: Lucene - Core
  Issue Type: Improvement
  Components: core/search
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.8, 5.0

 Attachments: LUCENE-5487.patch, LUCENE-5487.patch, LUCENE-5487.patch


 This is just an exploratory patch ... still many nocommits, but I
 think it may be promising.
 I find the two booleans we pass to Weight.scorer confusing, because
 they really only apply to whoever will call score(Collector) (just
 IndexSearcher and BooleanScorer).
 The params are pointless for the vast majority of scorers, because
 very, very few query scorers really need to change how top-scoring is
 done, and those scorers can *only* score top-level (throw throw UOE
 from nextDoc/advance).  It seems like these two types of scorers
 should be separately typed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (SOLR-6008) HDFS tests are using local filesystem because solr.hdfs.home is set to a local filesystem path.

Mark Miller created SOLR-6008:
-

 Summary: HDFS tests are using local filesystem because 
solr.hdfs.home is set to a local filesystem path.
 Key: SOLR-6008
 URL: https://issues.apache.org/jira/browse/SOLR-6008
 Project: Solr
  Issue Type: Test
  Components: Tests
Reporter: Mark Miller
Assignee: Mark Miller
 Fix For: 4.9, 5.0






--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-6006) SolrJ maven pom file specifying log4j as a dependency

2014-04-23 Thread Steve Rowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-6006:
-

Attachment: SOLR-6006.patch

Patch, retrieves Solrj test deps to {{solr/solrj/test-lib/}}.  Solr-core 
{{ivy.xml}} needed to changes too, since the solr-core classpath includes 
{{solr/solrj/lib/}}.  Tests pass under Ant and, after {{ant get-maven-poms}}, 
under Maven.

The only compile-phase logging dependency Solrj retains is slf4j-api - turns 
out Solrj doesn't need the log4j dependency, even for tests.

I'll see if it's useful to make the same change in the Solr contribs other than 
dataimporthandler, which already has the {{test-lib/}} setup.

 SolrJ maven pom file specifying log4j as a dependency
 -

 Key: SOLR-6006
 URL: https://issues.apache.org/jira/browse/SOLR-6006
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.7.2
Reporter: Steven Scott
Priority: Minor
 Attachments: SOLR-6006.patch


 I'm not sure what version this first appeared in, as we just bumped from 4.5 
 to 4.7, but log4j is specified as a dependency in the solr-solrj pom.xml, and 
 without the optional flag. I checked out the source to verify that there 
 isn't actually a dependency on log4j (doesn't seem to be), but I wasn't able 
 to decipher the ant build (looks like there's a pom.xml.template that 
 generates the pom with dependencies coming from Ivy?)
 Anyway, this is an issue since now we have to manually exclude log4j from 
 every project that depends on SolrJ.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5609) Should we revisit the default numeric precision step?

2014-04-23 Thread David Smiley (JIRA)

[
https://issues.apache.org/jira/browse/LUCENE-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978392#comment-13978392
]

David Smiley commented on LUCENE-5609:
--

bq. I think testing on randomly distributed longs is too synthetic? Most
real-world data is much more restricted in practice, and those exception cases
can re-tune precisionStep to meet their cases.

Agreed -- real-world data is definitely much more restricted in practice.

I wish the precisionStep was variable. If it were, I'd usually configure the
precisionIncrement step to be 16,8,8,8,8,16 for doubles longs. Variable
prefix-tree precision is definitely a goal of LUCENE-4922 in the spatial
module. At the very high level, it's extremely rare to do gigantic
continent-spanning queries, so at that level I'd like many cells (corresponds
to a high precision step in trie numeric fields). And at the bottom levels,
it's fastest to scan() instead of seek() because there is a limited amount of
data once you get down low enough. So preferably fewer intermediate aggregate
cells down there.

Should we revisit the default numeric precision step?
-

Key: LUCENE-5609
URL: https://issues.apache.org/jira/browse/LUCENE-5609
Project: Lucene - Core
Issue Type: Improvement
Components: core/search
Reporter: Michael McCandless
Fix For: 4.9, 5.0

Attachments: LUCENE-5609.patch

Right now it's 4, for both 8 (long/double) and 4 byte (int/float)
numeric fields, but this is a pretty big hit on indexing speed and
disk usage, especially for tiny documents, because it creates many (8
or 16) terms for each value.
Since we originally set these defaults, a lot has changed... e.g. we
now rewrite MTQs per-segment, we have a faster (BlockTree) terms dict,
a faster postings format, etc.
Index size is important because it limits how much of the index will
be hot (fit in the OS's IO cache). And more apps are using Lucene for
tiny docs where the overhead of individual fields is sizable.
I used the Geonames corpus to run a simple benchmark (all sources are
committed to luceneutil). It has 8.6 M tiny docs, each with 23 fields,
with these numeric fields:
* lat/lng (double)
* modified time, elevation, population (long)
* dem (int)
I tested 4, 8 and 16 precision steps:
{noformat}
indexing:
PrecStepSizeIndexTime
4 1812.7 MB651.4 sec
8 1203.0 MB443.2 sec
16894.3 MB361.6 sec
searching:
Field PrecStep QueryTime TermCount
geoNameID 4 2872.5 ms 20306
geoNameID 8 2903.3 ms 104856
geoNameID16 3371.9 ms 5871427
latitude 4 2160.1 ms 36805
latitude 8 2249.0 ms 240655
latitude16 2725.9 ms 4649273
modified 4 2038.3 ms 13311
modified 8 2029.6 ms 58344
modified16 2060.5 ms 77763
longitude 4 3468.5 ms 33818
longitude 8 3629.9 ms 214863
longitude16 4060.9 ms 4532032
{noformat}
Index time is with 1 thread (for identical index structure).
The query time is time to run 100 random ranges for that field,
averaged over 20 iterations. TermCount is the total number of terms
the MTQ rewrote to across all 100 queries / segments, and it gets
higher as expected as precStep gets higher, but the search time is not
that heavily impacted ... negligible going from 4 to 8, and then some
impact from 8 to 16.
Maybe we should increase the int/float default precision step to 8 and
long/double to 16? Or both to 16?

--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-6006) Separate test and compile scope dependencies in ivy.xml files, so that maven dependencies' scope can be specified appropriately

2014-04-23 Thread Steve Rowe (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-6006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe updated SOLR-6006:
-

Summary: Separate test and compile scope dependencies in ivy.xml files, so 
that maven dependencies' scope can be specified appropriately(was: SolrJ 
maven pom file specifying log4j as a dependency)

 Separate test and compile scope dependencies in ivy.xml files, so that maven 
 dependencies' scope can be specified appropriately  
 -

 Key: SOLR-6006
 URL: https://issues.apache.org/jira/browse/SOLR-6006
 Project: Solr
  Issue Type: Bug
  Components: Build
Affects Versions: 4.7.2
Reporter: Steven Scott
Priority: Minor
 Attachments: SOLR-6006.patch


 I'm not sure what version this first appeared in, as we just bumped from 4.5 
 to 4.7, but log4j is specified as a dependency in the solr-solrj pom.xml, and 
 without the optional flag. I checked out the source to verify that there 
 isn't actually a dependency on log4j (doesn't seem to be), but I wasn't able 
 to decipher the ant build (looks like there's a pom.xml.template that 
 generates the pom with dependencies coming from Ivy?)
 Anyway, this is an issue since now we have to manually exclude log4j from 
 every project that depends on SolrJ.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting

2014-04-23 Thread Brett Lucey (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978442#comment-13978442
 ] 

Brett Lucey commented on SOLR-2894:
---

Hi Elran,

Having a mincount of -1 for the shards is correct.  The reason is that while a 
given shard may have a count lower than mincount for a given term, the 
aggregate total count for that value when combined with the other shards could 
exceed the mincount, so we do need to know about it.  For example, consider a 
mincount of 10.  If we have 3 shards with a count of 5 for a term of Boston, 
we would still need to know about these because the total count would be 15, 
and would be higher than the mincount.

If you were to set a facet.limit of 10 for all levels of the pivot, what is the 
memory usage like?

-Brett

 Implement distributed pivot faceting
 

 Key: SOLR-2894
 URL: https://issues.apache.org/jira/browse/SOLR-2894
 Project: Solr
  Issue Type: Improvement
Reporter: Erik Hatcher
 Fix For: 4.9, 5.0

 Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, dateToObject.patch


 Following up on SOLR-792, pivot faceting currently only supports 
 undistributed mode.  Distributed pivot faceting needs to be implemented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (SOLR-2894) Implement distributed pivot faceting

2014-04-23 Thread Brett Lucey (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978442#comment-13978442
 ] 

Brett Lucey edited comment on SOLR-2894 at 4/23/14 4:54 PM:


Hi Elran,

Having a mincount of -1 for the shards is correct.  The reason is that while a 
given shard may have a count lower than mincount for a given term, the 
aggregate total count for that value when combined with the other shards could 
exceed the mincount, so we do need to know about it.  For example, consider a 
mincount of 10.  If we have 3 shards with a count of 5 for a term of Boston, 
we would still need to know about these because the total count would be 15, 
and would be higher than the mincount.

I would expect the skipRefinementAtThisLevel to be false for the top level 
pivot facet, and true for each other level.  Are you seeing otherwise?

If you were to set a facet.limit of 10 for all levels of the pivot, what is the 
memory usage like?

-Brett


was (Author: brett.lucey):
Hi Elran,

Having a mincount of -1 for the shards is correct.  The reason is that while a 
given shard may have a count lower than mincount for a given term, the 
aggregate total count for that value when combined with the other shards could 
exceed the mincount, so we do need to know about it.  For example, consider a 
mincount of 10.  If we have 3 shards with a count of 5 for a term of Boston, 
we would still need to know about these because the total count would be 15, 
and would be higher than the mincount.

If you were to set a facet.limit of 10 for all levels of the pivot, what is the 
memory usage like?

-Brett

 Implement distributed pivot faceting
 

 Key: SOLR-2894
 URL: https://issues.apache.org/jira/browse/SOLR-2894
 Project: Solr
  Issue Type: Improvement
Reporter: Erik Hatcher
 Fix For: 4.9, 5.0

 Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, dateToObject.patch


 Following up on SOLR-792, pivot faceting currently only supports 
 undistributed mode.  Distributed pivot faceting needs to be implemented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978454#comment-13978454
 ] 

Mark Miller commented on SOLR-5473:
---

I'll take look asap.

I was just looking at the zk dump for a stateFormat=2 setup and had a couple 
comments:

{code}
/solr/collections/collection1/state (0)
DATA:
{collection1:{
stateFormat:2,
{code}

* I think it should be state.json just like clusterstate.json.
* Why is stateFormat in the collection? It seems we would already have to know 
its stateFormat=2 to even been reading the info, so does this really provide 
for anything?

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: 4.8 Solr Ref Guide Release Plan

2014-04-23 Thread Chris Hostetter


: 2) I'll review the TODO list arround 24 hours after the first Lucene/Solr 4.8
: RC VOTE is called -- if it doesn't look like anyone is in hte middle of
: working on stuff, I'll go ahead and cut a ref-guide RC.  If it looks like

FYI: Tim Potter reached out to me that he's working on docing up the REST 
Manager stuff today -- so i'll plan on doing the RC arround 34 hours from 
now.

If you see any low hanging fruit, jump on it today.


-Hoss
http://www.lucidworks.com/

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978545#comment-13978545
 ] 

Noble Paul commented on SOLR-5473:
--

bq.Why is stateFormat in the collection
 
We are using stateFormat for differentiating two different versions . If we 
have a version 3 later how do we differentiate between versions ? It is just a 
good practice to explicitly state the versions 

bq.I think it should be state.json just like clusterstate.json.

We could do that . was thinking that we are planning to keep almost everything 
in json format , so is the suffix redundant?

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-5707) Lucene Expressions in Solr

2014-04-23 Thread Hoss Man (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated SOLR-5707:
---

Attachment: SOLR-5707_vsp.patch

The new SOLR-5707_vsp.patch I just posted heads down the route I mentioned 
before of implementing exprssions using a ValueSourceParser.  From what i can 
tell, this new patch provides the same level of functionality as Ryan's 
previous ComputedField based patch (i was able to re-use most of his tests 
with just some minor tweaks to the syntax) but is (in my option) a cleaner 
approach, and leaves us more room to move forward with uesful things like:
* easily using request params in bindings
* dynamicly specified expressions (which can be compiled  cached if they are 
reused)

At present there are a handful of nocommit's in the patch along 3 general lines:
* more javadocs
* more tests - in particular, ryan's patch had more tests of bad-config  error 
cases - my patch definitely needs beefed up in that area
* forcing wantsScore so expressions used in fl can depend on score. 

This last item is problem ryan noted in his patch as well -- it's no worse with 
the new ValueSource approach, but it's also no better... 

Solr currently decides if/when it needs to compte scores based on the 
ReturnField class - which decides based on whether score is in the fl.  So 
at present, if you wnat to put an expression in the fl that depends on score, 
it won't work unless you also put score in the fl.  I have some ideas on 
how on a generalized improvement to make this better, but it's non trivial, and 
i'm not sure if it should block this issue or if folks think the current 
situation is an acceptible workarround and we can tackle simplifying the score 
case later?

In any case - i wnated to get this patch up for folks to look at before 
investing a lot more time in polishing it.

[~rjernst]: what do you think of this approach?


 Lucene Expressions in Solr
 --

 Key: SOLR-5707
 URL: https://issues.apache.org/jira/browse/SOLR-5707
 Project: Solr
  Issue Type: New Feature
Reporter: Ryan Ernst
 Attachments: SOLR-5707.patch, SOLR-5707_vsp.patch


 Expressions should be available for use in Solr.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978564#comment-13978564
 ] 

Mark Miller commented on SOLR-5473:
---

bq.  It is just a good practice to explicitly state the versions

Can you give a concrete example of how having that info there will be useful? 
It still makes no sense to me.

bq. keep almost everything in json format , so is the suffix redundant?

The 'almost' makes it not redundant. So far, we have taken the convention of 
naming it clusterstate.json - it used to be clusterstate.xml. Just like when 
reading a file from a filesystem, an extension is useful when reading a file 
from ZooKeeper (you might think of it like a distributed file system).

In any case, it's how we currently do things and it should take an argument to 
change rather than to keep the same I think.

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-5244) Full Search Result Export

2014-04-23 Thread Lianyi Han (JIRA)


 [ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianyi Han updated SOLR-5244:
-

Attachment: 0001-SOLR_5244.patch

This plugin works great in our project and we have make two small changes in 
this patch

1 add omitHeader option
2 allow the cost parameter with a default value of 200, which might helps to 
order the post filters if you have more than one of them.

  

 Full Search Result Export
 -

 Key: SOLR-5244
 URL: https://issues.apache.org/jira/browse/SOLR-5244
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 5.0

 Attachments: 0001-SOLR_5244.patch, SOLR-5244.patch


 It would be great if Solr could efficiently export entire search result sets 
 without scoring or ranking documents. This would allow external systems to 
 perform rapid bulk imports from Solr. It also provides a possible platform 
 for exporting results to support distributed join scenarios within Solr.
 This ticket provides a patch that has two pluggable components:
 1) ExportQParserPlugin: which is a post filter that gathers a BitSet with 
 document results and does not delegate to ranking collectors. Instead it puts 
 the BitSet on the request context.
 2) BinaryExportWriter: Is a output writer that iterates the BitSet and prints 
 the entire result as a binary stream. A header is provided at the beginning 
 of the stream so external clients can self configure.
 Note:
 These two components will be sufficient for a non-distributed environment. 
 For distributed export a new Request handler will need to be developed.
 After applying the patch and building the dist or example, you can register 
 the components through the following changes to solrconfig.xml
 Register export contrib libraries:
 lib dir=../../../dist/ regex=solr-export-\d.*\.jar /
  
 Register the export queryParser with the following line:
  
 queryParser name=export 
 class=org.apache.solr.export.ExportQParserPlugin/
  
 Register the xbin writer:
  
 queryResponseWriter name=xbin 
 class=org.apache.solr.export.BinaryExportWriter/
  
 The following query will perform the export:
 {code}
 http://localhost:8983/solr/collection1/select?q=*:*fq={!export}wt=xbinfl=join_i
 {code}
 Initial patch supports export of four data-types:
 1) Single value trie int, long and float
 2) Binary doc values.
 The numerics are currently exported from the FieldCache and the Binary doc 
 values can be in memory or on disk.
 Since this is designed to export very large result sets efficiently, stored 
 fields are not used for the export.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (SOLR-5244) Full Search Result Export

2014-04-23 Thread Lianyi Han (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978579#comment-13978579
 ] 

Lianyi Han edited comment on SOLR-5244 at 4/23/14 6:36 PM:
---

This plugin works great in our project and we have made two small changes in 
this patch

1 add omitHeader option
2 allow the cost parameter with a default value of 200, which might helps to 
order the post filters if you have more than one of them.

  


was (Author: lianyi):
This plugin works great in our project and we have make two small changes in 
this patch

1 add omitHeader option
2 allow the cost parameter with a default value of 200, which might helps to 
order the post filters if you have more than one of them.

  

 Full Search Result Export
 -

 Key: SOLR-5244
 URL: https://issues.apache.org/jira/browse/SOLR-5244
 Project: Solr
  Issue Type: New Feature
  Components: search
Affects Versions: 5.0
Reporter: Joel Bernstein
Priority: Minor
 Fix For: 5.0

 Attachments: 0001-SOLR_5244.patch, SOLR-5244.patch


 It would be great if Solr could efficiently export entire search result sets 
 without scoring or ranking documents. This would allow external systems to 
 perform rapid bulk imports from Solr. It also provides a possible platform 
 for exporting results to support distributed join scenarios within Solr.
 This ticket provides a patch that has two pluggable components:
 1) ExportQParserPlugin: which is a post filter that gathers a BitSet with 
 document results and does not delegate to ranking collectors. Instead it puts 
 the BitSet on the request context.
 2) BinaryExportWriter: Is a output writer that iterates the BitSet and prints 
 the entire result as a binary stream. A header is provided at the beginning 
 of the stream so external clients can self configure.
 Note:
 These two components will be sufficient for a non-distributed environment. 
 For distributed export a new Request handler will need to be developed.
 After applying the patch and building the dist or example, you can register 
 the components through the following changes to solrconfig.xml
 Register export contrib libraries:
 lib dir=../../../dist/ regex=solr-export-\d.*\.jar /
  
 Register the export queryParser with the following line:
  
 queryParser name=export 
 class=org.apache.solr.export.ExportQParserPlugin/
  
 Register the xbin writer:
  
 queryResponseWriter name=xbin 
 class=org.apache.solr.export.BinaryExportWriter/
  
 The following query will perform the export:
 {code}
 http://localhost:8983/solr/collection1/select?q=*:*fq={!export}wt=xbinfl=join_i
 {code}
 Initial patch supports export of four data-types:
 1) Single value trie int, long and float
 2) Binary doc values.
 The numerics are currently exported from the FieldCache and the Binary doc 
 values can be in memory or on disk.
 Since this is designed to export very large result sets efficiently, stored 
 fields are not used for the export.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[GitHub] lucene-solr pull request: Squashed commit of efbytesref, 20140422

GitHub user PaulElschot opened a pull request:

https://github.com/apache/lucene-solr/pull/45

Squashed commit of efbytesref, 20140422

LUCENE-5524

This PR adds encoding/decoding an Elias-Fano sequence in/from a BytesRef.
This PR adds three classes:
EliasFanoLongs and EliasFanoBytes both extending EliasFanoSequence,
and the long[] encoding is moved from EliasFanoEncoder into EliasFanoLongs.
The EliasFanoDecoder is changed to use these classes.
(There are also some improved variable names, this makes the changes 
somewhat less easy to read...)

The recent fix for the number of index entry bits is included.

This PR also adds methods readVlong and writeVLong to BytesRef. I 
considered keeping them local in EliasFanoBytes, but these fit better in 
BytesRef I think.

This PR also changes EliasFanoDocIdSet to use EliasFanoLongs, and to fall 
back to a FixedBitSet when too many bits are set. This fall back could be a 
separate issue, but that would be more work now.

I hope I got the generics and diamonds right...
This is a squashed commit against trunk, 12 March 2014

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PaulElschot/lucene-solr efbytesref-201404a

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/45.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #45


commit a214913ac2143277d9539a3e9e3d1cd1662b754a
Author: Paul Elschot paul.j.elsc...@gmail.com
Date:   2014-03-12T21:21:13Z

Squashed commit of efbytesref, 20140312




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5524) Elias-Fano sequence also on BytesRef

2014-04-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978590#comment-13978590
 ] 

ASF GitHub Bot commented on LUCENE-5524:


GitHub user PaulElschot opened a pull request:

https://github.com/apache/lucene-solr/pull/45

Squashed commit of efbytesref, 20140422

LUCENE-5524

This PR adds encoding/decoding an Elias-Fano sequence in/from a BytesRef.
This PR adds three classes:
EliasFanoLongs and EliasFanoBytes both extending EliasFanoSequence,
and the long[] encoding is moved from EliasFanoEncoder into EliasFanoLongs.
The EliasFanoDecoder is changed to use these classes.
(There are also some improved variable names, this makes the changes 
somewhat less easy to read...)

The recent fix for the number of index entry bits is included.

This PR also adds methods readVlong and writeVLong to BytesRef. I 
considered keeping them local in EliasFanoBytes, but these fit better in 
BytesRef I think.

This PR also changes EliasFanoDocIdSet to use EliasFanoLongs, and to fall 
back to a FixedBitSet when too many bits are set. This fall back could be a 
separate issue, but that would be more work now.

I hope I got the generics and diamonds right...
This is a squashed commit against trunk, 12 March 2014

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PaulElschot/lucene-solr efbytesref-201404a

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/45.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #45


commit a214913ac2143277d9539a3e9e3d1cd1662b754a
Author: Paul Elschot paul.j.elsc...@gmail.com
Date:   2014-03-12T21:21:13Z

Squashed commit of efbytesref, 20140312




 Elias-Fano sequence also on BytesRef
 

 Key: LUCENE-5524
 URL: https://issues.apache.org/jira/browse/LUCENE-5524
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/other
Reporter: Paul Elschot
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[GitHub] lucene-solr pull request: Elias-Fano sequence also on BytesRef

Github user PaulElschot closed the pull request at:

https://github.com/apache/lucene-solr/pull/41


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[GitHub] lucene-solr pull request: Elias-Fano sequence also on BytesRef

2014-04-23 Thread ASF subversion and git services (JIRA)

Github user PaulElschot commented on the pull request:

https://github.com/apache/lucene-solr/pull/41#issuecomment-41198963
  
Closed, replaced by pull request #45


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5524) Elias-Fano sequence also on BytesRef

2014-04-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978596#comment-13978596
 ] 

ASF GitHub Bot commented on LUCENE-5524:


Github user PaulElschot closed the pull request at:

https://github.com/apache/lucene-solr/pull/41


 Elias-Fano sequence also on BytesRef
 

 Key: LUCENE-5524
 URL: https://issues.apache.org/jira/browse/LUCENE-5524
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/other
Reporter: Paul Elschot
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5735) ChaosMonkey test timeouts.


[ 
https://issues.apache.org/jira/browse/SOLR-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978609#comment-13978609
 ] 

ASF subversion and git services commented on SOLR-5735:
---

Commit 1589489 from [~markrmil...@gmail.com] in branch 'dev/trunk'
[ https://svn.apache.org/r1589489 ]

SOLR-5735: Try to enable this test again.

 ChaosMonkey test timeouts.
 --

 Key: SOLR-5735
 URL: https://issues.apache.org/jira/browse/SOLR-5735
 Project: Solr
  Issue Type: Task
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Critical
 Fix For: 4.9, 5.0


 This started showing up in jenkins runs a while back.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5735) ChaosMonkey test timeouts.

2014-04-23 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978610#comment-13978610
 ] 

ASF subversion and git services commented on SOLR-5735:
---

Commit 1589490 from [~markrmil...@gmail.com] in branch 'dev/branches/branch_4x'
[ https://svn.apache.org/r1589490 ]

SOLR-5735: Try to enable this test again.

 ChaosMonkey test timeouts.
 --

 Key: SOLR-5735
 URL: https://issues.apache.org/jira/browse/SOLR-5735
 Project: Solr
  Issue Type: Task
Reporter: Mark Miller
Assignee: Mark Miller
Priority: Critical
 Fix For: 4.9, 5.0


 This started showing up in jenkins runs a while back.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5524) Elias-Fano sequence also on BytesRef


[ 
https://issues.apache.org/jira/browse/LUCENE-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978611#comment-13978611
 ] 

Paul Elschot commented on LUCENE-5524:
--

PR 45 is the same as PR 41, rebased to trunk of 22 April 2014.


 Elias-Fano sequence also on BytesRef
 

 Key: LUCENE-5524
 URL: https://issues.apache.org/jira/browse/LUCENE-5524
 Project: Lucene - Core
  Issue Type: New Feature
  Components: core/other
Reporter: Paul Elschot
Priority: Minor





--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (LUCENE-5627) Positional joins

Paul Elschot created LUCENE-5627:


 Summary: Positional joins
 Key: LUCENE-5627
 URL: https://issues.apache.org/jira/browse/LUCENE-5627
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Paul Elschot
Priority: Minor


Prototype of analysis and search for labeled fragments



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[GitHub] lucene-solr pull request: Labeledfragments 201404a

GitHub user PaulElschot opened a pull request:

https://github.com/apache/lucene-solr/pull/46

Labeledfragments 201404a

LUCENE-5627

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PaulElschot/lucene-solr 
labeledfragments-201404a

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/46.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #46


commit a214913ac2143277d9539a3e9e3d1cd1662b754a
Author: Paul Elschot paul.j.elsc...@gmail.com
Date:   2014-03-12T21:21:13Z

Squashed commit of efbytesref, 20140312

commit 4c3db731b634365fb50df35f3eea562c9b51015a
Author: Paul Elschot paul.j.elsc...@gmail.com
Date:   2014-04-22T23:06:06Z

Squashed commit of labeled fragments code.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5627) Positional joins

2014-04-23 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/LUCENE-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978752#comment-13978752
 ] 

ASF GitHub Bot commented on LUCENE-5627:


GitHub user PaulElschot opened a pull request:

https://github.com/apache/lucene-solr/pull/46

Labeledfragments 201404a

LUCENE-5627

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/PaulElschot/lucene-solr 
labeledfragments-201404a

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/lucene-solr/pull/46.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #46


commit a214913ac2143277d9539a3e9e3d1cd1662b754a
Author: Paul Elschot paul.j.elsc...@gmail.com
Date:   2014-03-12T21:21:13Z

Squashed commit of efbytesref, 20140312

commit 4c3db731b634365fb50df35f3eea562c9b51015a
Author: Paul Elschot paul.j.elsc...@gmail.com
Date:   2014-04-22T23:06:06Z

Squashed commit of labeled fragments code.




 Positional joins
 

 Key: LUCENE-5627
 URL: https://issues.apache.org/jira/browse/LUCENE-5627
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Paul Elschot
Priority: Minor

 Prototype of analysis and search for labeled fragments



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5627) Positional joins


[ 
https://issues.apache.org/jira/browse/LUCENE-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978757#comment-13978757
 ] 

Paul Elschot commented on LUCENE-5627:
--

This adds a module called label as a prototype for index-time positional 
joins by labeled text fragments.

This provides a 1 : 0..n positional join.
It is a generalization of FieldMaskingSpanQuery that provides a 1 : 1 
positional join. 

At indexing time labeled text fragments for a document are analysed from a 
TokenStream.

In package org.apache.lucene.analysis.label such a labeled fragments stream is 
split into
a label stream, and into pairs of streams for fragments and fragment positions.
A fragment is series of tokens, possibly empty.
The fragments in each fragment stream will be contiguous,
the labels and the other fragment streams have no influence on their positions.

The output streams can be used to provide documents with different fields per 
stream.
It is up to the user to associate the output streams with fields in documents 
to be indexed for search.

Labels and fragments are represented at query time by Spans.
Querying labeled fragments with positional joins is supported in package 
org.apache.lucene.search.spans.label.

This implementation uses EliasFanoBytes (LUCENE-5524) to compress a payload 
with start/end positions.
These have a value index, which allows for fast fragment to label associations.
Currently these have no position index, so label to fragment associations will 
be somewhat slower.
Since payloads need to be loaded completely during searches, this will not have 
high performance for larger payloads.

This is a prototype because I don't expect high performance for larger payloads.
All code javadocs are marked experimental.


 Positional joins
 

 Key: LUCENE-5627
 URL: https://issues.apache.org/jira/browse/LUCENE-5627
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Paul Elschot
Priority: Minor

 Prototype of analysis and search for labeled fragments



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5627) Positional joins


[ 
https://issues.apache.org/jira/browse/LUCENE-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978769#comment-13978769
 ] 

Paul Elschot commented on LUCENE-5627:
--

The name label is already used in the facet module in Lucene, e.g. 
FacetLabel.java.
I don't think this is problematic, but in case this causes confusion another 
name could be used here.


 Positional joins
 

 Key: LUCENE-5627
 URL: https://issues.apache.org/jira/browse/LUCENE-5627
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Paul Elschot
Priority: Minor

 Prototype of analysis and search for labeled fragments



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5627) Positional joins


[ 
https://issues.apache.org/jira/browse/LUCENE-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978777#comment-13978777
 ] 

Paul Elschot commented on LUCENE-5627:
--

The pull request also changes the javadocs of the join module to use document 
join instead of just join.

 Positional joins
 

 Key: LUCENE-5627
 URL: https://issues.apache.org/jira/browse/LUCENE-5627
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Paul Elschot
Priority: Minor

 Prototype of analysis and search for labeled fragments



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (LUCENE-5627) Positional joins


[ 
https://issues.apache.org/jira/browse/LUCENE-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978824#comment-13978824
 ] 

Paul Elschot commented on LUCENE-5627:
--

The commit for the PR also contains LeafFragmentsQuery.java, which is actually 
not needed here now. It slipped in from an extension that puts the labels in a 
tree. I'll open an issue for that later...

 Positional joins
 

 Key: LUCENE-5627
 URL: https://issues.apache.org/jira/browse/LUCENE-5627
 Project: Lucene - Core
  Issue Type: New Feature
Reporter: Paul Elschot
Priority: Minor

 Prototype of analysis and search for labeled fragments



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Re: [CONF] Apache Solr Reference Guide Collections API

2014-04-23 Thread Chris Hostetter


Is ADDREPLICA something that supports the async=true option?  If so it 
should be noted in the input parameters.



: Date: Wed, 23 Apr 2014 18:46:00 + (UTC)
: From: Noble Paul (Confluence) conflue...@apache.org
: Reply-To: dev@lucene.apache.org
: To: comm...@lucene.apache.org
: Subject: [CONF] Apache Solr Reference Guide  Collections API
: 
: [IMAGE]
: Noble Paul edited the page:
: 
: [IMAGE] COLLECTIONS API
: 
: Comment: ADDREPLICA documentation
: 
: View Online · Like · View Changes · Add Comment
: Stop watching space · Manage Notifications
: This message was sent by Atlassian Confluence 5.0.3, Team Collaboration 
Software
: 

-Hoss
http://www.lucidworks.com/

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Created] (LUCENE-5628) SpecialOperations.getFiniteStrings should not recurse

2014-04-23 Thread Michael McCandless (JIRA)

Michael McCandless created LUCENE-5628:
--

 Summary: SpecialOperations.getFiniteStrings should not recurse
 Key: LUCENE-5628
 URL: https://issues.apache.org/jira/browse/LUCENE-5628
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0


Today it consumes one Java stack frame per transition, which when used by 
AnalyzingSuggester is per character in each token.  This can lead to stack 
overflows if you have a long suggestion.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (LUCENE-5628) SpecialOperations.getFiniteStrings should not recurse

2014-04-23 Thread Michael McCandless (JIRA)


 [ 
https://issues.apache.org/jira/browse/LUCENE-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael McCandless updated LUCENE-5628:
---

Attachment: LUCENE-5628.patch

Patch.  The method is more hairy than before ... but I think it's working.  I 
added some more tests for getFiniteStrings.

 SpecialOperations.getFiniteStrings should not recurse
 -

 Key: LUCENE-5628
 URL: https://issues.apache.org/jira/browse/LUCENE-5628
 Project: Lucene - Core
  Issue Type: Bug
Reporter: Michael McCandless
Assignee: Michael McCandless
 Fix For: 4.9, 5.0

 Attachments: LUCENE-5628.patch


 Today it consumes one Java stack frame per transition, which when used by 
 AnalyzingSuggester is per character in each token.  This can lead to stack 
 overflows if you have a long suggestion.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5926) Add ComplexPhraseQParserPlugin to Ref Guide (cwiki)

2014-04-23 Thread Hoss Man (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979095#comment-13979095
 ] 

Hoss Man commented on SOLR-5926:


I added some basics, but it could certainly be beefed up by someone that 
understands it well...

https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=32604257selectedPageVersions=22selectedPageVersions=21

 Add ComplexPhraseQParserPlugin to Ref Guide (cwiki)
 ---

 Key: SOLR-5926
 URL: https://issues.apache.org/jira/browse/SOLR-5926
 Project: Solr
  Issue Type: Improvement
  Components: documentation
Affects Versions: 4.7
Reporter: Ahmet Arslan
Assignee: Erick Erickson
Priority: Minor
  Labels: documentation
 Fix For: 4.9, 5.0


 Documentation of http://wiki.apache.org/solr/ComplexPhraseQueryParser
 in the ref guide, Other Parsers section. 
 https://cwiki.apache.org/confluence/display/solr/Other+Parsers 



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting


[ 
https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979128#comment-13979128
 ] 

Mark Miller commented on SOLR-2894:
---

We should get this in to get more feedback. Wish I had some time to tackle it, 
but I won't in the near term.


 Implement distributed pivot faceting
 

 Key: SOLR-2894
 URL: https://issues.apache.org/jira/browse/SOLR-2894
 Project: Solr
  Issue Type: Improvement
Reporter: Erik Hatcher
 Fix For: 4.9, 5.0

 Attachments: SOLR-2894-reworked.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, 
 SOLR-2894.patch, SOLR-2894.patch, dateToObject.patch


 Following up on SOLR-792, pivot faceting currently only supports 
 undistributed mode.  Distributed pivot faceting needs to be implemented.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979135#comment-13979135
 ] 

Mark Miller commented on SOLR-5473:
---

I've started looking at this patch. It looks like this is not just make one 
state.json per collection?

When you make one state.sjon per collection, does this patch also stop using 
watchers on those collections? That is, if I'm using stateFormat=2, cluster 
state will no longer be updated by watchers?

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979144#comment-13979144
 ] 

Mark Miller commented on SOLR-5473:
---

{code}
  if(collection !=null  collection.isExternal()  ){
log.info(Registering watch for external collection 
{},cd.getCloudDescriptor().getCollectionName());

zkStateReader.addCollectionWatch(cd.getCloudDescriptor().getCollectionName());
  }
{code}

That is the part I'm curious about - I don't know what internal, external means.

Here is what what makes sense to me for a JIRA issue named Make one state.json 
per collection and the previous discussions we have had.

Storing all collections in cluster.json will be considered clusterstate 
stateFormat=1.
Storing each collection in it's own cluster.json will be considered 
clusterstate stateFormat=2.

I looked at your patch and it still leaves tons of internal, external stuff. 
It's all very, very confusing. If I had reviewed this before it was committed, 
I would say it's not ready.

In which cases do we makes watches? In which cases don't we? I would oppose 
changing the default behavior to stop using watches, but without spending a lot 
of time with the code, it's not very clear how the new stuff works vs the old.

Some notes on your new patch:

In your patch you have made the change:
{code}
-if (coll.isExternal()) {
+if (coll.getStateVersion()1) {
{code}

Why do we have getStateVersion and stateFormat? Why  1 and not =2?

bq. log.info(Creating collection with stateFormat=3:  + collectionName);

stateFormat=3?

Back to this issue overall:

This adds a lot of confusing methods and uses a lot of confusing terminology, 
non of which is defined. Most of the new methods are also not documented and 
there are few comments to help explain what is going on in the new stuff at a 
high level. I think this really muddles a bunch of user API's.

I'd like to understand more about the watcher situation, but I think there is 
still a lot of clean up to do here and I'll look at making a patch next week.


 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979177#comment-13979177
 ] 

Mark Miller commented on SOLR-5473:
---

bq. externalWatchedCollections

What are these? Could we get a comment to describe?

bq. getExternCollectionFresh

What is this? Again, I think we need something better than Extern(sp?) and 
Fresh, but the method itself also has no documentation.

bq.   * bAdvance usage/b

The project tends to use Expert or Expert method, and a lot of these new 
methods should probably have been marked lucene.experimental. Also, appears to 
be a typo.

bq. private MapString , DocCollection
bq.  if(zkStateReader.ephemeralCollectionData !=null ){
bq.  return  cs.getCommonCollection(coll);

A lot of your code that gets committed has odd formatting - would be great to 
use one of the eclipse or intellij code profiles we have for formatting.

{noformat}
   * This method can be used to fetch a collection object and control whether 
it hits
   * the cache only or if information can be looked up from ZooKeeper.
{noformat}

I brought this up before and it didn't seem to be address in your patch? I 
don't know how a user can understand this - clusterState

{code}
} catch (InterruptedException e) {
  throw new SolrException(ErrorCode.BAD_REQUEST, Could not load collection 
from ZK: + coll, e);
}
{code}

When you catch an InterruptedException you should do 
Thread.currenThread.interrupt() to reset the flag.

{code}
  /**This is not a public API. Only used by ZkController */
  public void removeZKWatch(final String coll){
{code}

Should be marked internal then. Not a great design that has public internal 
methods on public objects though.

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979177#comment-13979177
 ] 

Mark Miller edited comment on SOLR-5473 at 4/24/14 1:31 AM:


bq. externalWatchedCollections

What are these? Could we get a comment to describe?

bq. getExternCollectionFresh

What is this? Again, I think we need something better than Extern(sp?) and 
Fresh, but the method itself also has no documentation.

bq.   * bAdvance usage/b

The project tends to use Expert or Expert method, and a lot of these new 
methods should probably have been marked lucene.experimental. Also, appears to 
be a typo.

bq. private MapString , DocCollection
bq.  if(zkStateReader.ephemeralCollectionData !=null ){
bq.  return  cs.getCommonCollection(coll);

A lot of your code that gets committed has odd formatting - would be great to 
use one of the eclipse or intellij code profiles we have for formatting.

{noformat}
   * This method can be used to fetch a collection object and control whether 
it hits
   * the cache only or if information can be looked up from ZooKeeper.
{noformat}

I brought this up before and it didn't seem to be address in your patch? I 
don't know how a user can understand this - you could always get the latest 
collection info by doing zkStateReader#updateState and then getCollecttion. 
What does this method offer that is different that is worth the API ugliness? 

{code}
} catch (InterruptedException e) {
  throw new SolrException(ErrorCode.BAD_REQUEST, Could not load collection 
from ZK: + coll, e);
}
{code}

When you catch an InterruptedException you should do 
Thread.currenThread.interrupt() to reset the flag.

{code}
  /**This is not a public API. Only used by ZkController */
  public void removeZKWatch(final String coll){
{code}

Should be marked internal then. Not a great design that has public internal 
methods on public objects though.


was (Author: markrmil...@gmail.com):
bq. externalWatchedCollections

What are these? Could we get a comment to describe?

bq. getExternCollectionFresh

What is this? Again, I think we need something better than Extern(sp?) and 
Fresh, but the method itself also has no documentation.

bq.   * bAdvance usage/b

The project tends to use Expert or Expert method, and a lot of these new 
methods should probably have been marked lucene.experimental. Also, appears to 
be a typo.

bq. private MapString , DocCollection
bq.  if(zkStateReader.ephemeralCollectionData !=null ){
bq.  return  cs.getCommonCollection(coll);

A lot of your code that gets committed has odd formatting - would be great to 
use one of the eclipse or intellij code profiles we have for formatting.

{noformat}
   * This method can be used to fetch a collection object and control whether 
it hits
   * the cache only or if information can be looked up from ZooKeeper.
{noformat}

I brought this up before and it didn't seem to be address in your patch? I 
don't know how a user can understand this - clusterState

{code}
} catch (InterruptedException e) {
  throw new SolrException(ErrorCode.BAD_REQUEST, Could not load collection 
from ZK: + coll, e);
}
{code}

When you catch an InterruptedException you should do 
Thread.currenThread.interrupt() to reset the flag.

{code}
  /**This is not a public API. Only used by ZkController */
  public void removeZKWatch(final String coll){
{code}

Should be marked internal then. Not a great design that has public internal 
methods on public objects though.

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was

[jira] [Created] (SOLR-6009) edismax mis-parsing RegexpQuery

Evan Sayer created SOLR-6009:


 Summary: edismax mis-parsing RegexpQuery
 Key: SOLR-6009
 URL: https://issues.apache.org/jira/browse/SOLR-6009
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.7.2
Reporter: Evan Sayer


edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries involving 
a RegexpQuery.  Steps to reproduce on 4.7.2:

1) remove the explicit field / definition for 'text'
2) add a catch-all '*' dynamic field of type text_general

dynamicField name=* type=text_general multiValued=true indexed=true 
stored=true /

3) index the exampledocs/ data
4) run a query like the following:

http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true

The debugQuery output will look like this:

lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
str name=parsedquery_toString+:/.*elec.*//str

If you copy/paste the parsed-query into a text editor or something, you can see 
that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends up in 
there.

I haven't been able to reproduce this behavior on 4.7.2 without getting rid of 
the explicit field definition for 'text' and using a dynamicField, which is how 
things are setup on the machine where this issue was discovered.  The query 
isn't quite right with the explicit field definition in place either, though:

lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+DisjunctionMaxQuery((text:elec)))/no_coord/str
str name=parsedquery_toString+(text:elec)/str

numFound=0 for both of these.  This site is useful for looking at the 
characters in the first variant:

http://rishida.net/tools/conversion/




--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-6009) edismax mis-parsing RegexpQuery


 [ 
https://issues.apache.org/jira/browse/SOLR-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evan Sayer updated SOLR-6009:
-

Description: 
edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries involving 
a RegexpQuery.  Steps to reproduce on 4.7.2:

1) remove the explicit field / definition for 'text'
2) add a catch-all '*' dynamic field of type text_general

dynamicField name=* type=text_general multiValued=true indexed=true 
stored=true /

3) index the exampledocs/ data
4) run a query like the following:

{code}
http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true
{code}

The debugQuery output will look like this:

lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
str name=parsedquery_toString+:/.*elec.*//str

If you copy/paste the parsed-query into a text editor or something, you can see 
that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends up in 
there.

I haven't been able to reproduce this behavior on 4.7.2 without getting rid of 
the explicit field definition for 'text' and using a dynamicField, which is how 
things are setup on the machine where this issue was discovered.  The query 
isn't quite right with the explicit field definition in place either, though:

lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+DisjunctionMaxQuery((text:elec)))/no_coord/str
str name=parsedquery_toString+(text:elec)/str

numFound=0 for both of these.  This site is useful for looking at the 
characters in the first variant:

http://rishida.net/tools/conversion/


  was:
edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries involving 
a RegexpQuery.  Steps to reproduce on 4.7.2:

1) remove the explicit field / definition for 'text'
2) add a catch-all '*' dynamic field of type text_general

dynamicField name=* type=text_general multiValued=true indexed=true 
stored=true /

3) index the exampledocs/ data
4) run a query like the following:

http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true

The debugQuery output will look like this:

lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
str name=parsedquery_toString+:/.*elec.*//str

If you copy/paste the parsed-query into a text editor or something, you can see 
that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends up in 
there.

I haven't been able to reproduce this behavior on 4.7.2 without getting rid of 
the explicit field definition for 'text' and using a dynamicField, which is how 
things are setup on the machine where this issue was discovered.  The query 
isn't quite right with the explicit field definition in place either, though:

lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+DisjunctionMaxQuery((text:elec)))/no_coord/str
str name=parsedquery_toString+(text:elec)/str

numFound=0 for both of these.  This site is useful for looking at the 
characters in the first variant:

http://rishida.net/tools/conversion/



 edismax mis-parsing RegexpQuery
 ---

 Key: SOLR-6009
 URL: https://issues.apache.org/jira/browse/SOLR-6009
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.7.2
Reporter: Evan Sayer

 edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries 
 involving a RegexpQuery.  Steps to reproduce on 4.7.2:
 1) remove the explicit field / definition for 'text'
 2) add a catch-all '*' dynamic field of type text_general
 dynamicField name=* type=text_general multiValued=true indexed=true 
 stored=true /
 3) index the exampledocs/ data
 4) run a query like the following:
 {code}
 http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true
 {code}
 The debugQuery output will look like this:
 lst name=debug
 str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
 str name=querystring{!edismax qf='text'} /.*elec.*//str
 str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
 str name=parsedquery_toString+:/.*elec.*//str
 If you copy/paste the parsed-query into a text editor or something, you can 
 see that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends 
 up in there.
 I haven't been able to reproduce this behavior on 4.7.2 without getting rid 
 of the explicit field definition for 'text' and using a dynamicField, which

[jira] [Updated] (SOLR-6009) edismax mis-parsing RegexpQuery


 [ 
https://issues.apache.org/jira/browse/SOLR-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evan Sayer updated SOLR-6009:
-

Description: 
edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries involving 
a RegexpQuery.  Steps to reproduce on 4.7.2:

1) remove the explicit field / definition for 'text'
2) add a catch-all '*' dynamic field of type text_general

dynamicField name=* type=text_general multiValued=true indexed=true 
stored=true /

3) index the exampledocs/ data
4) run a query like the following:

{code}
http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true
{code}

The debugQuery output will look like this:

{code}
lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
str name=parsedquery_toString+:/.*elec.*//str
{code}

If you copy/paste the parsed-query into a text editor or something, you can see 
that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends up in 
there.

I haven't been able to reproduce this behavior on 4.7.2 without getting rid of 
the explicit field definition for 'text' and using a dynamicField, which is how 
things are setup on the machine where this issue was discovered.  The query 
isn't quite right with the explicit field definition in place either, though:

{code}
lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+DisjunctionMaxQuery((text:elec)))/no_coord/str
str name=parsedquery_toString+(text:elec)/str
{code}

numFound=0 for both of these.  This site is useful for looking at the 
characters in the first variant:

http://rishida.net/tools/conversion/


  was:
edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries involving 
a RegexpQuery.  Steps to reproduce on 4.7.2:

1) remove the explicit field / definition for 'text'
2) add a catch-all '*' dynamic field of type text_general

dynamicField name=* type=text_general multiValued=true indexed=true 
stored=true /

3) index the exampledocs/ data
4) run a query like the following:

{code}
http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true
{code}

The debugQuery output will look like this:

lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
str name=parsedquery_toString+:/.*elec.*//str

If you copy/paste the parsed-query into a text editor or something, you can see 
that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends up in 
there.

I haven't been able to reproduce this behavior on 4.7.2 without getting rid of 
the explicit field definition for 'text' and using a dynamicField, which is how 
things are setup on the machine where this issue was discovered.  The query 
isn't quite right with the explicit field definition in place either, though:

lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+DisjunctionMaxQuery((text:elec)))/no_coord/str
str name=parsedquery_toString+(text:elec)/str

numFound=0 for both of these.  This site is useful for looking at the 
characters in the first variant:

http://rishida.net/tools/conversion/



 edismax mis-parsing RegexpQuery
 ---

 Key: SOLR-6009
 URL: https://issues.apache.org/jira/browse/SOLR-6009
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.7.2
Reporter: Evan Sayer

 edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries 
 involving a RegexpQuery.  Steps to reproduce on 4.7.2:
 1) remove the explicit field / definition for 'text'
 2) add a catch-all '*' dynamic field of type text_general
 dynamicField name=* type=text_general multiValued=true indexed=true 
 stored=true /
 3) index the exampledocs/ data
 4) run a query like the following:
 {code}
 http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true
 {code}
 The debugQuery output will look like this:
 {code}
 lst name=debug
 str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
 str name=querystring{!edismax qf='text'} /.*elec.*//str
 str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
 str name=parsedquery_toString+:/.*elec.*//str
 {code}
 If you copy/paste the parsed-query into a text editor or something, you can 
 see that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends 
 up in there.
 I haven't been able to reproduce this behavior on 4.7.2 without getting rid 
 of the explicit

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979195#comment-13979195
 ] 

Mark Miller commented on SOLR-5473:
---

bq. public Map ephemeralCollectionData;

This is not thread safe and ZkStateReader needs to be.

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Comment Edited] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979195#comment-13979195
 ] 

Mark Miller edited comment on SOLR-5473 at 4/24/14 2:02 AM:


bq. public Map ephemeralCollectionData;

This is not thread safe and ZkStateReader needs to be.

Also, -1 on this public variable on ZkStateReader. It's a bad design smell for 
good reason.


was (Author: markrmil...@gmail.com):
bq. public Map ephemeralCollectionData;

This is not thread safe and ZkStateReader needs to be.

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-6009) edismax mis-parsing RegexpQuery


 [ 
https://issues.apache.org/jira/browse/SOLR-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evan Sayer updated SOLR-6009:
-

Description: 
edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries involving 
a RegexpQuery.  Steps to reproduce on 4.7.2:

1) remove the explicit field / definition for 'text'
2) add a catch-all '*' dynamic field of type text_general

{code}
dynamicField name=* type=text_general multiValued=true indexed=true 
stored=true /
{code}

3) index the exampledocs/ data
4) run a query like the following:

{code}
http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true
{code}

The debugQuery output will look like this:

{code}
lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
str name=parsedquery_toString+:/.*elec.*//str
{code}

If you copy/paste the parsed-query into a text editor or something, you can see 
that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends up in 
there.

I haven't been able to reproduce this behavior on 4.7.2 without getting rid of 
the explicit field definition for 'text' and using a dynamicField, which is how 
things are setup on the machine where this issue was discovered.  The query 
isn't quite right with the explicit field definition in place either, though:

{code}
lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+DisjunctionMaxQuery((text:elec)))/no_coord/str
str name=parsedquery_toString+(text:elec)/str
{code}

numFound=0 for both of these.  This site is useful for looking at the 
characters in the first variant:

http://rishida.net/tools/conversion/


  was:
edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries involving 
a RegexpQuery.  Steps to reproduce on 4.7.2:

1) remove the explicit field / definition for 'text'
2) add a catch-all '*' dynamic field of type text_general

dynamicField name=* type=text_general multiValued=true indexed=true 
stored=true /

3) index the exampledocs/ data
4) run a query like the following:

{code}
http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true
{code}

The debugQuery output will look like this:

{code}
lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
str name=parsedquery_toString+:/.*elec.*//str
{code}

If you copy/paste the parsed-query into a text editor or something, you can see 
that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends up in 
there.

I haven't been able to reproduce this behavior on 4.7.2 without getting rid of 
the explicit field definition for 'text' and using a dynamicField, which is how 
things are setup on the machine where this issue was discovered.  The query 
isn't quite right with the explicit field definition in place either, though:

{code}
lst name=debug
str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
str name=querystring{!edismax qf='text'} /.*elec.*//str
str name=parsedquery(+DisjunctionMaxQuery((text:elec)))/no_coord/str
str name=parsedquery_toString+(text:elec)/str
{code}

numFound=0 for both of these.  This site is useful for looking at the 
characters in the first variant:

http://rishida.net/tools/conversion/



 edismax mis-parsing RegexpQuery
 ---

 Key: SOLR-6009
 URL: https://issues.apache.org/jira/browse/SOLR-6009
 Project: Solr
  Issue Type: Bug
  Components: query parsers
Affects Versions: 4.7.2
Reporter: Evan Sayer

 edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries 
 involving a RegexpQuery.  Steps to reproduce on 4.7.2:
 1) remove the explicit field / definition for 'text'
 2) add a catch-all '*' dynamic field of type text_general
 {code}
 dynamicField name=* type=text_general multiValued=true indexed=true 
 stored=true /
 {code}
 3) index the exampledocs/ data
 4) run a query like the following:
 {code}
 http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/debugQuery=true
 {code}
 The debugQuery output will look like this:
 {code}
 lst name=debug
 str name=rawquerystring{!edismax qf='text'} /.*elec.*//str
 str name=querystring{!edismax qf='text'} /.*elec.*//str
 str name=parsedquery(+RegexpQuery(:/.*elec.*/))/no_coord/str
 str name=parsedquery_toString+:/.*elec.*//str
 {code}
 If you copy/paste the parsed-query into a text editor or something, you can 
 see that the field-name isn't actually blank.  The IMPOSSIBLE_FIELD_NAME ends 
 up in there.
 I haven't been able to reproduce this

[jira] [Commented] (SOLR-5474) Have a new mode for SolrJ to support stateFormat=2


[ 
https://issues.apache.org/jira/browse/SOLR-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979200#comment-13979200
 ] 

Mark Miller commented on SOLR-5474:
---

Attached is a test fail for this.

{noformat}
org.apache.solr.client.solrj.SolrServerException: 
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: STATE 
STALE: collection1:15valid : false
at 
__randomizedtesting.SeedInfo.seed([170B85B5FF6722D6:96ED0BAD883842EA]:0)
at 
org.apache.solr.client.solrj.impl.CloudSolrServer.requestWithRetryOnStaleState(CloudSolrServer.java:683)
at 
org.apache.solr.client.solrj.impl.CloudSolrServer.requestWithRetryOnStaleState(CloudSolrServer.java:676)
at 
org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:556)
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
at 
org.apache.solr.cloud.AbstractFullDistribZkTestBase.queryServer(AbstractFullDistribZkTestBase.java:1292)
at 
org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:558)
at 
org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:540)
at 
org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:519)
at 
org.apache.solr.cloud.BasicDistributedZk2Test.brindDownShardIndexSomeDocsAndRecover(BasicDistributedZk2Test.java:291)
at 
org.apache.solr.cloud.BasicDistributedZk2Test.doTest(BasicDistributedZk2Test.java:109)
at 
org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:865)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
{noformat}

 Have a new mode for SolrJ to support stateFormat=2
 --

 Key: SOLR-5474
 URL: https://issues.apache.org/jira/browse/SOLR-5474
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5474.patch, SOLR-5474.patch, SOLR-5474.patch, 
 fail.logs


 In this mode SolrJ would not watch any ZK node
 It fetches the state  on demand and cache the most recently used n 
 collections in memory.
 SolrJ would not listen to any ZK node. When a request comes for a collection 
 ‘xcoll’
 it would first check if such a collection exists
 If yes it first looks up the details in the local cache for that collection
 If not found in cache , it fetches the node /collections/xcoll/state.json and 
 caches the information
 Any query/update will be sent with extra query param specifying the 
 collection name , version (example \_stateVer=xcoll:34) . A node would throw 
 an error (INVALID_NODE) if it does not have the right version
 If SolrJ gets INVALID_NODE error it would invalidate the cache and fetch 
 fresh state information for that collection (and caches it again)
 If there is a connection timeout, SolrJ assumes the node is down and re-fetch 
 the state for the collection and try again



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Updated] (SOLR-5474) Have a new mode for SolrJ to support stateFormat=2


 [ 
https://issues.apache.org/jira/browse/SOLR-5474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Miller updated SOLR-5474:
--

Attachment: fail.logs

 Have a new mode for SolrJ to support stateFormat=2
 --

 Key: SOLR-5474
 URL: https://issues.apache.org/jira/browse/SOLR-5474
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5474.patch, SOLR-5474.patch, SOLR-5474.patch, 
 fail.logs


 In this mode SolrJ would not watch any ZK node
 It fetches the state  on demand and cache the most recently used n 
 collections in memory.
 SolrJ would not listen to any ZK node. When a request comes for a collection 
 ‘xcoll’
 it would first check if such a collection exists
 If yes it first looks up the details in the local cache for that collection
 If not found in cache , it fetches the node /collections/xcoll/state.json and 
 caches the information
 Any query/update will be sent with extra query param specifying the 
 collection name , version (example \_stateVer=xcoll:34) . A node would throw 
 an error (INVALID_NODE) if it does not have the right version
 If SolrJ gets INVALID_NODE error it would invalidate the cache and fetch 
 fresh state information for that collection (and caches it again)
 If there is a connection timeout, SolrJ assumes the node is down and re-fetch 
 the state for the collection and try again



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979201#comment-13979201
 ] 

Mark Miller commented on SOLR-5473:
---

I also think this expanded usage of zkStateReader in ClusterState is really the 
wrong thing to do. If anything, we should pull the reader that was there 
already out.

ClusterState is an immutable object that is simply meant to hold the 
clusterstate - this zkStatereader calls in and methods like getCached* are a 
complete violation of the classes spirit and intent.

There is a lot to do on this patch IMO. I would prefer we take it out and get 
it in shape on a branch before it's too hard to take out. As it is, it's really 
not fit to go in, and the longer it hangs around in 5, the harder it will be to 
pull it out if it's not improved. 

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979227#comment-13979227
 ] 

Mark Miller commented on SOLR-5473:
---

I have to make an official -1 veto vote on this issue.

The issue is to break clusterstate.json into one per collection instead of a 
global one. This format change in ZooKeeper should not ripple in such an ugly 
manner through all of our cloud API's. We should also be dropping the global 
clusterstate.json in 5, so there is no way any of these crazy API changes 
should exist. If anything, the new API's should be clean and we should have 
uglier back compat stuff.

As it is, it seems like things were hacked in the shortest route possible - and 
while that's great for a prototype or straw man impl, it's a horrendous 
direction for all these API's.

There is a lot of general clean up to do, but more importantly, there are 
problems we need to solve without muddling up all the cloud API's.

The way things are done now, there is not even a reason for stateFormat=1 or 2 
- that was clearly just jammed on top of what was going on anyway, with little 
thought or integration.

I have to veto this, it's way to crazy still. The work needs to be done to keep 
this from bubbling through all these API's so terribly and trunk is not the 
place to do it. It's too far from ready and will cause too much pain with 4x 
backports.

 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, 
 SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, ec2-23-20-119-52_solr.log, 
 ec2-50-16-38-73_solr.log


 As defined in the parent issue, store the states of each collection under 
 /collections/collectionname/state.json node



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979259#comment-13979259
 ] 

Noble Paul commented on SOLR-5473:
--

Before I comment of the comments I would like to lay out the objective and 
design of this ticket. 

Splitting out the main clutserstate.json was not done just to make each state 
smaller, but also to make fewer nodes update the state changes. If you have a 
large no:of collections , most of the nodes will never need to know about most 
of the other collections. So , it is not a good idea to have an explosion on 
the no:of watchers because we split the monolithic clusterstate . I would say 
it actually would be bad for scalability. 

So the approach is , whenever a node has a core which is a member of a 
collection , register a watch for the state of that collection and when the 
last core that is a member of that collection is unloaded, remove the watch. 
For those who don't watch other collections they can still fetch the state 
directly on demand. 

The fact that the main clusterstate.json may not have all the collections means 
that it cannot hold all the states within it. Though the actual data inside the 
object does not change, the returned values can be different based on different 
situations. The system does not rely on the immutability of clusterstate 
anywhere, It just expects the method calls to provide the correct data all the 
time. In fact it is even plain wrong to hold on to the clusterstate object for 
later use because it could change the data at anytime. The only place where we 
reuse the object is inside Overseer where we know that the state is only 
modified by itself. ZkStateReader is the class that is responsible for keeping 
the states upto date and the best way to make the clusterstate APIs behave 
consistently is to have a reference of ZkStateReader


The latest patch is not a final one . The objective of that patch was to get 
the naming right . 

bq.Can you give a concrete example of how having that info(stateFormat) there 
will be useful? It still makes no sense to me.

 Though it is possible to deduce the format from where the data is fetched, It 
will have to be onus of the code that serialize/deserialize the object back and 
forth from json  to have format correctly. If it is stored within the object  
that will be be automatically be taken care of. If we introduce a new 
stateFormat=3 , the system will have already collections made with 
stateFormat=2 but that does not explicitly say so. So we will have to put some 
logic into the serialization/deserialization logic to differentiate 2 from 3 . 
So it is just safer and consistent to encode the version of the format with the 
payload itself. That is the way all the serialization/deserialization libraries 
work



getExternCollectionFresh, getCommonCollection etc are gone in the latest patch. 
The objective was to get rid of external/internal from the public APIs .The 
internal variables contain the external moniker for lack of better names . I 
hope it is not a problem. We can always add javadocs to make them clearer.


Actually I use the intellij project formatting after doing an 'ant idea' . I 
will have to investigate if/why it is different.


bq.Why do we have getStateVersion and stateFormat? Why  1 and not =2?

If I introduce a new format 3, will it not include 2 also? That is just future 
proofing .
 

bq.This is not thread safe and ZkStateReader needs to be.
Also, -1 on this public variable on ZkStateReader. It's a bad design smell for 
good reason.

This has to be threadsafe. Will fix

I'll then have to make it a public setter method. Other than that there is no 
saner way for Overseer to expose the temporary data . I spent some time on 
figuring out a cleaner way , I will be glad to implement a better suggestion


bq.When you catch an InterruptedException you should do 
Thread.currenThread.interrupt() to reset the flag.

will do

bq.removeZkWatch() Should be marked internal then. Not a great design that has 
public internal methods on public objects though.

I would have preferred them to be package local, If possible. Unfortunately 
java does not allow it. What other choice do we have ?




 Make one state.json per collection
 --

 Key: SOLR-5473
 URL: https://issues.apache.org/jira/browse/SOLR-5473
 Project: Solr
  Issue Type: Sub-task
  Components: SolrCloud
Reporter: Noble Paul
Assignee: Noble Paul
 Fix For: 5.0

 Attachments: SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, 
 SOLR-5473-74.patch,

[jira] [Comment Edited] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979259#comment-13979259
 ] 

Noble Paul edited comment on SOLR-5473 at 4/24/14 4:19 AM:
---

Before I comment further, I would like to lay out the objective and design of 
this ticket. 

Splitting out the main clutserstate.json was not done just to make each state 
smaller, but also to make fewer nodes update the state changes. If you have a 
large no:of collections , most of the nodes will never need to know about most 
of the other collections. So , it is not a good idea to have an explosion on 
the no:of watchers because we split the monolithic clusterstate . I would say 
it actually would be bad for scalability. 

So the approach is , whenever a node has a core which is a member of a 
collection , register a watch for the state of that collection and when the 
last core that is a member of that collection is unloaded, remove the watch. 
For those who don't watch other collections they can still fetch the state 
directly on demand. 

The fact that the main clusterstate.json may not have all the collections means 
that it cannot hold all the states within it. Though the actual data inside the 
object does not change, the returned values can be different based on different 
situations. The system does not rely on the immutability of clusterstate 
anywhere, It just expects the method calls to provide the correct data all the 
time. In fact it is even plain wrong to hold on to the clusterstate object for 
later use because it could change the data at anytime. The only place where we 
reuse the object is inside Overseer where we know that the state is only 
modified by itself. ZkStateReader is the class that is responsible for keeping 
the states upto date and the best way to make the clusterstate APIs behave 
consistently is to have a reference of ZkStateReader


The latest patch is not a final one . The objective of that patch was to get 
the naming right . 

bq.Can you give a concrete example of how having that info(stateFormat) there 
will be useful? It still makes no sense to me.

 Though it is possible to deduce the format from where the data is fetched, It 
will have to be onus of the code that serialize/deserialize the object back and 
forth from json  to have format correctly. If it is stored within the object  
that will be be automatically be taken care of. If we introduce a new 
stateFormat=3 , the system will have already collections made with 
stateFormat=2 but that does not explicitly say so. So we will have to put some 
logic into the serialization/deserialization logic to differentiate 2 from 3 . 
So it is just safer and consistent to encode the version of the format with the 
payload itself. That is the way all the serialization/deserialization libraries 
work



getExternCollectionFresh, getCommonCollection etc are gone in the latest patch. 
The objective was to get rid of external/internal from the public APIs .The 
internal variables contain the external moniker for lack of better names . I 
hope it is not a problem. We can always add javadocs to make them clearer.


Actually I use the intellij project formatting after doing an 'ant idea' . I 
will have to investigate if/why it is different.


bq.Why do we have getStateVersion and stateFormat? Why  1 and not =2?

If I introduce a new format 3, will it not include 2 also? That is just future 
proofing .
 

bq.This is not thread safe and ZkStateReader needs to be.
Also, -1 on this public variable on ZkStateReader. It's a bad design smell for 
good reason.

This has to be threadsafe. Will fix

I'll then have to make it a public setter method. Other than that there is no 
saner way for Overseer to expose the temporary data . I spent some time on 
figuring out a cleaner way , I will be glad to implement a better suggestion


bq.When you catch an InterruptedException you should do 
Thread.currenThread.interrupt() to reset the flag.

will do

bq.removeZkWatch() Should be marked internal then. Not a great design that has 
public internal methods on public objects though.

I would have preferred them to be package local, If possible. Unfortunately 
java does not allow it. What other choice do we have ?





was (Author: noble.paul):
Before I comment of the comments I would like to lay out the objective and 
design of this ticket. 

Splitting out the main clutserstate.json was not done just to make each state 
smaller, but also to make fewer nodes update the state changes. If you have a 
large no:of collections , most of the nodes will never need to know about most 
of the other collections. So , it is not a good idea to have an explosion on 
the no:of watchers because we split the monolithic clusterstate . I would say 
it actually would be bad for scalability. 

So the approach is , whenever a node has a core which is a member of a 
collection ,

[jira] [Comment Edited] (SOLR-5473) Make one state.json per collection


[ 
https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13979259#comment-13979259
 ] 

Noble Paul edited comment on SOLR-5473 at 4/24/14 4:45 AM:
---

Before I comment further, I would like to lay out the objective and design of 
this ticket. 

Splitting out the main clutserstate.json was not done just to make each state 
smaller, but also to make fewer nodes update the state changes. If you have a 
large no:of collections , most of the nodes will never need to know about most 
of the other collections. So , it is not a good idea to have an explosion on 
the no:of watchers because we split the monolithic clusterstate . I would say 
it actually would be bad for scalability. 

So the approach is , whenever a node has a core which is a member of a 
collection , register a watch for the state of that collection and when the 
last core that is a member of that collection is unloaded, remove the watch. 
For those who don't watch other collections they can still fetch the state 
directly on demand. 

The fact that the main clusterstate.json may not have all the collections means 
that it cannot hold all the states within it. Though the actual data inside the 
object does not change, the returned values can be different based on different 
situations. The system does not rely on the immutability of clusterstate 
anywhere, It just expects the method calls to provide the correct data all the 
time. In fact it is even plain wrong to hold on to the clusterstate object for 
later use because it could change the data at anytime. The only place where we 
reuse the object is inside Overseer where we know that the state is only 
modified by itself. ZkStateReader is the class that is responsible for keeping 
the states upto date and the best way to make the clusterstate APIs behave 
consistently is to have a reference of ZkStateReader


The latest patch is not a final one . The objective of that patch was to get 
the naming right . 

bq.Can you give a concrete example of how having that info(stateFormat) there 
will be useful? It still makes no sense to me.

 Though it is possible to deduce the format from where the data is fetched, It 
will have to be onus of the code that serialize/deserialize the object back and 
forth from json  to have format correctly. If it is stored within the object  
that will be be automatically be taken care of. If we introduce a new 
stateFormat=3 , the system will have already collections made with 
stateFormat=2 but that does not explicitly say so. So we will have to put some 
logic into the serialization/deserialization logic to differentiate 2 from 3 . 
So it is just safer and consistent to encode the version of the format with the 
payload itself. That is the convention all the serialization/deserialization 
libraries follow



getExternCollectionFresh, getCommonCollection etc are gone in the latest patch. 
The objective was to get rid of external/internal from the public APIs .The 
internal variables contain the external moniker for lack of better names . I 
hope it is not a problem. We can always add javadocs to make them clearer.


Actually I use the intellij project formatting after doing an 'ant idea' . I 
will have to investigate if/why it is different.


bq.Why do we have getStateVersion and stateFormat? Why  1 and not =2?

If I introduce a new format 3, will it not include 2 also? That is just future 
proofing .
 

bq.This is not thread safe and ZkStateReader needs to be.
Also, -1 on this public variable on ZkStateReader. It's a bad design smell for 
good reason.

This has to be threadsafe. Will fix

I'll then have to make it a public setter method. Other than that there is no 
saner way for Overseer to expose the temporary data . I spent some time on 
figuring out a cleaner way , I will be glad to implement a better suggestion


bq.When you catch an InterruptedException you should do 
Thread.currenThread.interrupt() to reset the flag.

will do

bq.removeZkWatch() Should be marked internal then. Not a great design that has 
public internal methods on public objects though.

I would have preferred them to be package local, If possible. Unfortunately 
java does not allow it. What other choice do we have ?





was (Author: noble.paul):
Before I comment further, I would like to lay out the objective and design of 
this ticket. 

Splitting out the main clutserstate.json was not done just to make each state 
smaller, but also to make fewer nodes update the state changes. If you have a 
large no:of collections , most of the nodes will never need to know about most 
of the other collections. So , it is not a good idea to have an explosion on 
the no:of watchers because we split the monolithic clusterstate . I would say 
it actually would be bad for scalability. 

So the approach is , whenever a node has a core which is a member of a 
collection ,

[jira] [Commented] (SOLR-5474) Have a new mode for SolrJ to support stateFormat=2