[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963100#comment-14963100
 ] 

Michael McCandless commented on LUCENE-6829:


bq. Michael McCandless, looks like your commit to RAMDirector.listAll() on this 
issue is the problem:

Sigh, sorry [~steve_rowe] and thanks for digging/fixing.  Tricky concurrency...

> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch, LUCENE-6829.patch, 
> LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-18 Thread Steve Rowe (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14962819#comment-14962819
 ] 

Steve Rowe commented on LUCENE-6829:


Another test failure my Jenkins found with the same root cause, only reproduces 
8/100 iters for me:

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=TestIndexWriterCommit -Dtests.method=testPrepareCommitRollback 
-Dtests.seed=98FCDFF6002C97E1 -Dtests.nightly=true -Dtests.slow=true 
-Dtests.linedocsfile=/home/jenkins/lucene-data/e
nwiki.random.lines.txt -Dtests.locale=is -Dtests.timezone=Australia/Sydney 
-Dtests.asserts=true -Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.32s | TestIndexWriterCommit.testPrepareCommitRollback <<<
   [junit4]> Throwable #1: java.lang.NullPointerException
   [junit4]>at __randomizedtesting.SeedInfo.seed([98FCDFF6002C97E1:9
4F93159522F894E]:0)
   [junit4]>at java.util.ComparableTimSort.binarySort(ComparableTimS
ort.java:258)
   [junit4]>at java.util.ComparableTimSort.sort(ComparableTimSort.ja
va:203)
   [junit4]>at java.util.Arrays.sort(Arrays.java:1246)
   [junit4]>at 
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:660)
   [junit4]>at 
org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:73)
   [junit4]>at 
org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
   [junit4]>at 
org.apache.lucene.index.TestIndexWriterCommit.testPrepareCommitRollback(TestIndexWriterCommit.java:599)
   [junit4]>at java.lang.Thread.run(Thread.java:745)
   [junit4]   2> NOTE: test params are: codec=Asserting(Lucene53): 
{content=Lucene50(blocksize=128)}, docValues:{}, 
sim=RandomSimilarityProvider(queryNorm=true,coord=crazy): {content=DFR 
I(ne)3(800.0)}, locale=is, timezone=Australia/Sydney
   [junit4]   2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 
1.8.0_45 (64-bit)/cpus=16,threads=1,free=401791056,total=514850816
   [junit4]   2> NOTE: All tests run in this JVM: [TestIndexWriterCommit]
{noformat}

> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch, LUCENE-6829.patch, 
> LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-15 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14958600#comment-14958600
 ] 

ASF subversion and git services commented on LUCENE-6829:
-

Commit 1708760 from [~mikemccand] in branch 'dev/trunk'
[ https://svn.apache.org/r1708760 ]

LUCENE-6829: OfflineSorter now uses Directory API; add 
Directory.createTempOutput and IndexOutput.getName

> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch, LUCENE-6829.patch, 
> LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-13 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954520#comment-14954520
 ] 

Dawid Weiss commented on LUCENE-6829:
-

This is a large patch and I only scanned it briefly. It looks good to me. I 
don't know how to avoid the virus checker special case (requiring odd hacks in 
the code to disable it).

Also, blocks like this one:
{code}
+  Path tempPath = 
Files.createTempDirectory(Dictionary.getDefaultTempDir(), "Hunspell");
+  boolean success = false;
+  try (Directory tempDir = FSDirectory.open(tempPath)) {
+this.dictionary = new Dictionary(tempDir, "hunspell", affix, 
dictionaries, ignoreCase);
+success = true;
+  } finally {
+// tempPath (directory) should be empty at this point:
+if (success) {
+  Files.delete(tempPath);
+} else {
+  IOUtils.deleteFilesIgnoringExceptions(tempPath);
+}
+  }
{code}

Is there any reason why we shouldn't just let the regular exception suppression 
be used here? I know it'd reverse the precedence, but at least you'd get the 
full picture (temp. file couldn't be deleted too). Isn't this a leftover 
pattern from before 1.7 days?


> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch, LUCENE-6829.patch, 
> LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-13 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954759#comment-14954759
 ] 

Michael McCandless commented on LUCENE-6829:


bq. Otherwise you'd have temp files piling up without any visible reason?

Exactly!

> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch, LUCENE-6829.patch, 
> LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-13 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954742#comment-14954742
 ] 

Dawid Weiss commented on LUCENE-6829:
-

There was something odd about it. :) I think IOUtils.rm sounds good -- we 
always attempt to remove the temp. stuff. If we can't do it for some reason, we 
should fail explicitly. Otherwise you'd have temp files piling up without any 
visible reason?

> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch, LUCENE-6829.patch, 
> LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-13 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14954738#comment-14954738
 ] 

Michael McCandless commented on LUCENE-6829:


bq. So, to be clear, why isn't the above just:

Woops: on exception I had wanted that code block to recursively remove the temp 
dir, and any files under it, because the temp dir is likely not empty on 
exception.

But the patch does not do that, because 
{{IOUtils.deleteFilesIgnoringExceptions}} will not recursively delete.  I think 
I need to change it to your new version, but use {{IOUtils.rm}} instead?

> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch, LUCENE-6829.patch, 
> LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-09 Thread Dawid Weiss (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949983#comment-14949983
 ] 

Dawid Weiss commented on LUCENE-6829:
-

bq. StandardOpenOption.CREATE_NEW

Oh, I like this!

> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-08 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949598#comment-14949598
 ] 

Robert Muir commented on LUCENE-6829:
-

{quote}
Second, this is an API break for things that secretly used
OfflineSorter (suggesters, Hunspell Dictionary, etc.), but I think
this is good because it makes it clear to the caller that disk space
is going to be used by this method and lets the call control where, vs
running the risk of e.g. filling up your tmp partition.

Some things still wanted access to "the default temp dir", so I moved
it from OfflineSorter to IOUtils, and kept the test infra initializing
that to the mock fs wrapped version. Hmm I think maybe it's just
Hunspell, maybe I can remove that from IOUtils and just put it
(privately) in Hunspell. I think ideally nothing in Lucene should be
secretly using /tmp anymore.
{quote}

Yeah, +1 for fixing that, I think the break is ok. It was a trap before.

{quote}
I also think somehow we should extend the "retry the file delete" that
IW/IFD provides "down" to things like OfflineSorter
{quote}

Its more than that right? (with the current patch) Its also inflateGens too? :)

Looks like something went horribly wrong with the formatting of 
TestRandomChains?

{quote}
I think to fix this more safely we need to add a new required
(abstract) method to Directory:
{quote}

So would FSDirectory implement this with the obvious Files.createTempFile() ? 
That would be my vote. I guess some could argue its no longer technically 
write-once, since methods like that rely on the atomicity of creating a 
(zero-byte) file, but we are just going to append to that file after, once. And 
its a separate method, solely as a temporary file facility, so I think this is 
how it should be implemented? Deserves some thought I guess either way.


> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (LUCENE-6829) OfflineSorter should use Directory API

2015-10-08 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-6829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14949608#comment-14949608
 ] 

Robert Muir commented on LUCENE-6829:
-

and we always have the option of our own simple "temp file logic" that just 
uses StandardOpenOption.CREATE_NEW to ensure its really unique. I think i like 
this better...I don't think we have to worry about crazy security concerns like 
someone guessing what the filename will be. And really we should be opening all 
files with that flag anyway, its just that we don't yet today because files 
aren't always write-once in exceptional cases.

> OfflineSorter should use Directory API
> --
>
> Key: LUCENE-6829
> URL: https://issues.apache.org/jira/browse/LUCENE-6829
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Michael McCandless
>Assignee: Michael McCandless
> Fix For: Trunk, 5.4
>
> Attachments: LUCENE-6829.patch, LUCENE-6829.patch
>
>
> I think this is a blocker for LUCENE-6825, because the block KD-tree makes 
> heavy use of OfflineSorter and we don't want to fill up tmp space ...
> This should be a straightforward cutover, but there are some challenges, e.g. 
> the test was failing because virus checker blocked deleting of files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org