[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-27 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703363#action_12703363
 ] 

Yonik Seeley commented on LUCENE-1618:
--

I can see how this would potentially be useful for realtime... but it seems 
like only IndexWriter could eventually fix the situation of having the docstore 
on disk and the rest of a segment in RAM.  Which means that this API shouldn't 
be public?

> Allow setting the IndexWriter docstore to be a different directory
> --
>
> Key: LUCENE-1618
> URL: https://issues.apache.org/jira/browse/LUCENE-1618
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4.1
> Reporter: Jason Rutherglen
> Priority: Minor
> Fix For: 2.9
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> Add an IndexWriter.setDocStoreDirectory method that allows doc
> stores to be placed in a different directory than the IW default
> dir.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-27 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703370#action_12703370
 ] 

Michael McCandless commented on LUCENE-1618:


Yeah I also think this should be an "under the hood" (done only by NRT) 
optimization inside IndexWriter.

The only possible non-NRT case I can think of is when users make temporary 
indices in RAM, it's possible one would want to write the docStore files to an 
FSDirectory (because they are so large) but keep postings, norms, deletes, etc 
in RAM.  But going down that road opens up a can of worms... eg does segments_N 
somehow have to keep track of which dir has which parts of a segment?  Suddenly 
IndexReader must also know to look in different dirs for different parts of a 
segment, etc.

It might be cleaner to make a Directory impl that dispatches certain files to a
RAMDir and others to an FSDir, so IndexWriter/IndexReader still see a single
Directory API.
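
A minimal sketch of the dispatching idea above, in plain Java with no Lucene dependency. `SimpleStore`, `MapStore`, and `SwitchingStore` are hypothetical names invented for illustration; they are not Lucene classes, and a real implementation would presumably extend Directory and route the creation of inputs and outputs rather than whole byte arrays. Files whose extension is in a configured set go to one store and everything else goes to the other, so callers still see a single API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a Directory: a store of named byte arrays. */
interface SimpleStore {
    void write(String name, byte[] data);
    byte[] read(String name);
    boolean exists(String name);
}

/** In-memory store; plays the role of either the RAMDir or the FSDir here. */
class MapStore implements SimpleStore {
    private final Map<String, byte[]> files = new HashMap<>();

    public void write(String name, byte[] data) { files.put(name, data.clone()); }
    public byte[] read(String name) { return files.get(name); }
    public boolean exists(String name) { return files.containsKey(name); }
}

/** Routes each file to one of two stores based on its extension. */
class SwitchingStore implements SimpleStore {
    private final Set<String> switchedExtensions;
    private final SimpleStore switched;   // receives files with a listed extension
    private final SimpleStore fallback;   // receives everything else

    SwitchingStore(Set<String> switchedExtensions, SimpleStore switched, SimpleStore fallback) {
        this.switchedExtensions = switchedExtensions;
        this.switched = switched;
        this.fallback = fallback;
    }

    /** Text after the last '.', or "" when the name has no extension. */
    static String extension(String name) {
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    private SimpleStore storeFor(String name) {
        return switchedExtensions.contains(extension(name)) ? switched : fallback;
    }

    public void write(String name, byte[] data) { storeFor(name).write(name, data); }
    public byte[] read(String name) { return storeFor(name).read(name); }
    public boolean exists(String name) { return storeFor(name).exists(name); }
}
```

With `fdt` and `fdx` in the set, a file named `_0.fdt` lands in the first store while `_0.frq` lands in the second. A name without an extension, such as `segments_2`, always takes the fallback route.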



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-27 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703375#action_12703375
 ] 

Jason Rutherglen commented on LUCENE-1618:
--

{quote}
non-NRT case I can think of is when users make temporary indices in RAM
{quote}

Yes, and there could be others we don't know about.  

{quote}
it might be cleaner to make a Directory impl that dispatches certain files to a 
RAMDir and others to an FSDir
{quote}

Good idea.  I'll try that method first.  If this one works out, then the API 
will be public?



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-27 Thread Tim Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703384#action_12703384
 ] 

Tim Smith commented on LUCENE-1618:
---

I would further suggest that this Directory implementation take one or more
directories to store documents, along with one or more directories to store
the index itself.

One of the directories should be explicitly marked for "reading" for each use.

This allows creating a Directory instance that will:
* store documents to disk (reading from disk during searches)
* write the index to disk and RAM (reading from RAM during searches)



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-27 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703388#action_12703388
 ] 

Michael McCandless commented on LUCENE-1618:


{quote}
> it might be cleaner to make a Directory impl that dispatches certain files to 
> a RAMDir and others to an FSDir

Good idea. I'll try that method first. If this one works out, then the API will 
be public?
{quote}

Which API would be public?

If this (call it "FileSwitchDirectory" for now ;) ) works then we would not add 
any API to IndexWriter (ie it's either or)?  But FileSwitchDirectory would be 
public & "expert".

One downside to this approach is it's brittle -- whenever we change file 
extensions you'd have to "know" to fix this Directory.  Or maybe we make the 
Directory specialized to only storing the doc stores in the FSDir, then 
whenever we change file formats we would fix this directory?  But in the 
future, with custom codecs, things could be named whatever... hmmm.  Lacking 
clarity.



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-27 Thread Eks Dev (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703406#action_12703406
 ] 

Eks Dev commented on LUCENE-1618:
-

Maybe FileSwitchDirectory should have the possibility to get a list of
files/extensions that should be loaded into RAM... making it maintenance free
and pushing this decision to the end user... if, and when, we decide to support
users in it, we could then maintain a static list in a separate place. Kind of
separating execution and configuration.

I *think* I saw something similar that Ning Lee made quite a while ago, from the
Hadoop camp (indexing on Hadoop, something...). But I cannot remember what it
was :(
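
To illustrate keeping the extension list as configuration rather than code, here is a small hedged sketch in plain Java. `ExtensionConfig` and its methods are hypothetical helpers, not part of Lucene. The default string lists the stored-field (`fdt`, `fdx`) and term-vector (`tvx`, `tvd`, `tvf`) extensions of the 2.x file format, which is an assumption about what a user would treat as "doc store" files:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper: turns a user-supplied string into an extension set. */
final class ExtensionConfig {
    /** Assumed default: stored fields and term vectors in the 2.x format. */
    static final String DEFAULT_DOC_STORE = "fdt,fdx,tvx,tvd,tvf";

    private ExtensionConfig() {}

    /** Parses "fdt, .fdx" style input; blanks and leading dots are tolerated. */
    static Set<String> parse(String commaSeparated) {
        Set<String> out = new LinkedHashSet<>();
        for (String raw : commaSeparated.split(",")) {
            String ext = raw.trim();
            if (ext.startsWith(".")) {
                ext = ext.substring(1);
            }
            if (!ext.isEmpty()) {
                out.add(ext);
            }
        }
        return Collections.unmodifiableSet(out);
    }

    /** True when the file name carries one of the configured extensions. */
    static boolean matches(Set<String> extensions, String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot >= 0 && extensions.contains(fileName.substring(dot + 1));
    }
}
```

The directory itself would then only consult the set, so changing which files go where means editing a configuration value rather than the Directory class.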


  



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-27 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703415#action_12703415
 ] 

Michael McCandless commented on LUCENE-1618:


bq. Would also further suggest that this Directory implementation would take 
one or more directories to store documents, along with one or more directories 
to store the index itself

You mean an opened IndexOutput would write its output to two (or more) 
different places?  So you could "write through" a RAMDir down to an FSDir?  
(This way both the RAMDir and FSDir have a copy of the index).



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-27 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703416#action_12703416
 ] 

Michael McCandless commented on LUCENE-1618:


{quote}
FileSwitchDirectory should have possibility to get file list/extensions that
should be loaded into RAM... making it maintenance free, pushing this decision
to end user... if, and when we decide to support users in it, we could then
maintain static list at separate place. Kind of separate execution and
configuration
{quote}

+1

With flexible indexing, presumably one could use their codec to ask it for the 
"doc store extensions" vs the "postings extensions", etc., and pass to this 
configurable FileSwitchDirectory.



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Tim Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703630#action_12703630
 ] 

Tim Smith commented on LUCENE-1618:
---

{quote}
You mean an opened IndexOutput would write its output to two (or more) 
different places? So you could "write through" a RAMDir down to an FSDir? (This 
way both the RAMDir and FSDir have a copy of the index).
{quote}

Yes, so if you register more than one directory for "index files", then the
IndexOutput for the directory would dispatch to an IndexOutput for both
sub-directories.
Then, the IndexInput would only be opened on the "primary" directory (for
instance, the RAM directory).

This will allow extremely fast searches, with the persistence of a backing
FSDirectory.

Coupled with a set of directories for the "Stored Documents", this then
allows:
* RAM directory search speed
* All changes persisted to disk
* Documents Stored (and retrieved from disk) (or optionally retrieved from RAM)
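
The "write through" behaviour described above can be sketched with plain `java.io` streams. `TeeOutput` is a hypothetical name, and `OutputStream` stands in for Lucene's IndexOutput purely for illustration; this is not how any committed Lucene class is implemented. Every write is forwarded to both targets, and a reader would then open only the primary one:

```java
import java.io.IOException;
import java.io.OutputStream;

/** Forwards every write to two underlying streams. */
final class TeeOutput extends OutputStream {
    private final OutputStream primary;
    private final OutputStream secondary;

    TeeOutput(OutputStream primary, OutputStream secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    @Override
    public void write(int b) throws IOException {
        primary.write(b);
        secondary.write(b);
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        primary.write(buf, off, len);
        secondary.write(buf, off, len);
    }

    @Override
    public void flush() throws IOException {
        primary.flush();
        secondary.flush();
    }

    @Override
    public void close() throws IOException {
        // Close both streams even if closing the first one fails.
        try {
            primary.close();
        } finally {
            secondary.close();
        }
    }
}
```

Note that this sketch does nothing to keep the two copies consistent if one of the writes throws part-way through; the caller would have to discard both copies in that case.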




[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703651#action_12703651
 ] 

Michael McCandless commented on LUCENE-1618:


Neat.  This is sounding like one cool Directory...



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703656#action_12703656
 ] 

Earwin Burrfoot commented on LUCENE-1618:
-

bq. You mean an opened IndexOutput would write its output to two (or more) 
different places?
Except the best way is to write directly to FSDir.IndexOutput, and when it is
closed, read back into memory.
That way, if FSDir.IO hits an exception while writing, you don't have to jump
through hoops to keep your RAMDir in a consistent state (we had real troubles
when some files were 'written' to RAMDir, but failed to persist in FSDir).
Also, when reading the file back you already know its exact size and can
allocate an appropriate buffer, saving on resizing (my draft impl) / chunking
(Lucene's current impl) overhead.
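
A hedged sketch of this disk-first ordering using `java.nio.file`. `DiskFirstCache` is a hypothetical class written for illustration and is not the attached implementation. The file is written and closed on disk first; only then is it read back into a buffer allocated at exactly the file's size, so the in-memory copy can never contain a file that failed to persist:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache: persist first, then load into memory. */
final class DiskFirstCache {
    private final Path dir;
    private final Map<String, byte[]> cache = new HashMap<>();

    DiskFirstCache(Path dir) {
        this.dir = dir;
    }

    /** Writes to disk; the cache is updated only after a successful close. */
    void writeFile(String name, byte[] data) throws IOException {
        Path file = dir.resolve(name);
        try (OutputStream out = Files.newOutputStream(file)) {
            out.write(data);
        }
        // Reached only if the write and close above did not throw.
        cache.put(name, readExact(file));
    }

    /** Reads the whole file into a buffer sized from the known file length. */
    private static byte[] readExact(Path file) throws IOException {
        byte[] buf = new byte[Math.toIntExact(Files.size(file))];
        try (InputStream in = Files.newInputStream(file)) {
            int off = 0;
            while (off < buf.length) {
                int n = in.read(buf, off, buf.length - off);
                if (n < 0) {
                    throw new EOFException("file shrank while reading: " + file);
                }
                off += n;
            }
        }
        return buf;
    }

    /** Returns the in-memory copy, or null if the file was never written. */
    byte[] cached(String name) {
        return cache.get(name);
    }
}
```

The trade-off is an extra read of each file after it is written, in exchange for a single allocation and a simpler failure story.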



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703658#action_12703658
 ] 

Yonik Seeley commented on LUCENE-1618:
--

As it relates to near real time, the search speed of the RAM directory in 
relation to FSDirectory seems unimportant (what is this diff anyway?) - the 
FSDirectory will be much larger and that is where the bulk of the search time 
will be.

It seems like the main benefit of RAMDirectory for NRT is faster creation time 
(no need to create on-disk files, write them, then sync them), right?  Actually 
the sync is only needed if a new segments file will be written... but there 
still may be synchronous metadata operations for open-write-close of a file, 
depending on the FS?


> Allow setting the IndexWriter docstore to be a different directory
> --
>
> Key: LUCENE-1618
> URL: https://issues.apache.org/jira/browse/LUCENE-1618
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4.1
> Reporter: Jason Rutherglen
> Priority: Minor
> Fix For: 2.9
>
> Attachments: MemoryCachedDirectory.java
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> Add an IndexWriter.setDocStoreDirectory method that allows doc
> stores to be placed in a different directory than the IW default
> dir.



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703666#action_12703666
 ] 

Earwin Burrfoot commented on LUCENE-1618:
-

bq. what is this diff anyway?
That's not a diff, I gave a sample of the write-through RAM directory Tim and
Mike were speaking about.



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703676#action_12703676
 ] 

Yonik Seeley commented on LUCENE-1618:
--

bq.  That's not a diff

Sorry, by "diff" I meant the difference in search performance on a RAMDirectory 
vs NIOFSDirectory where the files are all cached by the OS.



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703683#action_12703683
 ] 

Michael McCandless commented on LUCENE-1618:


bq. by "diff" I meant the difference in search performance on a RAMDirectory vs 
NIOFSDirectory where the files are all cached by the OS.

It's a good question -- I haven't tested it directly.  I'd love to know too...

For an NRT writer using RAMDir for recently flushed tiny segments 
(LUCENE-1313), the gains are more about the speed of reading/writing many tiny 
files.  Probably we should try [somehow] to test this case, to see if 
LUCENE-1313 is even a worthwhile optimization.



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Earwin Burrfoot (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703684#action_12703684
 ] 

Earwin Burrfoot commented on LUCENE-1618:
-

bq. Sorry, by "diff" I meant the difference in search performance on a 
RAMDirectory vs NIOFSDirectory where the files are all cached by the OS.
Ah! :) It exists. Ranked by speed, directories are FSDirectory (native/sys
calls), MMapDirectory (native), RAMDirectory (chunked), MemCachedDirectory (raw
array access). But for the purposes of searching a small amount of
freshly-indexed docs this difference is minuscule at best, methinks.



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703850#action_12703850
 ] 

Jason Rutherglen commented on LUCENE-1618:
--

{quote}For an NRT writer using RAMDir for recently flushed tiny
segments (LUCENE-1313), the gains are more about the speed of
reading/writing many tiny files. Probably we should try
[somehow] to test this case, to see if LUCENE-1313 is even a
worthwhile optimization.{quote}

True, a test would be good. How many files per second would it
produce?

When testing realtime with the .del files (which were created in
large numbers before LUCENE-1516), the slowdown was quite dramatic,
as the writes are not sequential, which means the disk head can
move each time. Couple that with ongoing merges, which completely
tie up the IO, and I think it's hard for small file writes not to
slow down with a rapidly updating index.

An index that is being updated rapidly presumably would be
performing merges more often to remove deletes.
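
As a rough starting point for the "files per second" question, here is a sketch of a micro-benchmark in plain Java. `TinyFileBench` is a hypothetical helper; the `.del` naming is only for flavour. It does not sync to disk and runs with no concurrent merges, so it would measure a best case rather than the contended scenario described above:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical micro-benchmark: how fast can many tiny files be created? */
final class TinyFileBench {
    private TinyFileBench() {}

    /** Writes count files of bytesEach bytes into dir; returns files per second. */
    static double filesPerSecond(Path dir, int count, int bytesEach) throws IOException {
        byte[] payload = new byte[bytesEach];
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            // Each call opens, writes, and closes one small file (no fsync).
            Files.write(dir.resolve("_" + i + ".del"), payload);
        }
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        return count * 1e9 / elapsedNanos;
    }
}
```

Running it once against a directory on a real disk and once against a RAM-backed file system would give a first, crude comparison; a fairer test would also sync each file and run merges in the background.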



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-28 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703855#action_12703855
 ] 

Jason Rutherglen commented on LUCENE-1618:
--

{quote}One downside to this approach is it's brittle - whenever
we change file extensions you'd have to "know" to fix this
Directory.{quote}

True, I don't think we can expect the user to pass in the
correct FileSwitchDirectory (with the attendant file
extensions). We can make the particular implementation of
Directory we use to solve this problem internal to IW, meaning
the writer can pass through the real directory calls to FSD and
handle the RAMDir calls on its own.



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-04-29 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704049#action_12704049
 ] 

Michael McCandless commented on LUCENE-1618:


Patch looks good Jason!

Can you add copyright header & CHANGES.txt entry, and remove some noise (eg 
TestIndexWriterReader.java)?

Also: I think you should allow any Directory instance as primary/secondary?  
(You're hardwiring to RAMDir/FSDir now).  I realize NRT's use of this will be a 
RAMDir/FSDir, but I think this dir can be generic.  Can you also implement 
listAll()?

Finally: maybe for the "tee" (IndexOutput "writes through" two Dirs, suggested 
above) functionality, we should create a different Directory impl?
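
For the listAll() request, one possible shape is sketched below in plain Java. `ListAllMerge` is a hypothetical helper and not the patch's code. It assumes each file lives in exactly one of the two directories, so the combined listing is a plain concatenation with no de-duplication:

```java
/** Hypothetical helper: combines the listings of two directories. */
final class ListAllMerge {
    private ListAllMerge() {}

    /** Concatenates both listings into one array allocated at its final size. */
    static String[] listAll(String[] primaryFiles, String[] secondaryFiles) {
        String[] all = new String[primaryFiles.length + secondaryFiles.length];
        System.arraycopy(primaryFiles, 0, all, 0, primaryFiles.length);
        System.arraycopy(secondaryFiles, 0, all, primaryFiles.length, secondaryFiles.length);
        return all;
    }
}
```

If the same name could appear in both directories, the result would need de-duplicating, which this sketch deliberately leaves out.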

> Allow setting the IndexWriter docstore to be a different directory
> --
>
> Key: LUCENE-1618
> URL: https://issues.apache.org/jira/browse/LUCENE-1618
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4.1
> Reporter: Jason Rutherglen
> Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1618.patch, MemoryCachedDirectory.java
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> Add an IndexWriter.setDocStoreDirectory method that allows doc
> stores to be placed in a different directory than the IW default
> dir.



[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-05-04 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705548#action_12705548
 ] 

Michael McCandless commented on LUCENE-1618:


OK thanks Jason, I just committed that (w/ small change to listAll to directly 
allocate the String[]).

> Allow setting the IndexWriter docstore to be a different directory
> --
>
> Key: LUCENE-1618
> URL: https://issues.apache.org/jira/browse/LUCENE-1618
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: 2.4.1
> Reporter: Jason Rutherglen
> Assignee: Michael McCandless
> Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1618.patch, LUCENE-1618.patch, LUCENE-1618.patch,
> LUCENE-1618.patch, MemoryCachedDirectory.java
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> Add an IndexWriter.setDocStoreDirectory method that allows doc
> stores to be placed in a different directory than the IW default
> dir.




[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-05-08 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707284#action_12707284
 ] 

Michael McCandless commented on LUCENE-1618:


bq. Added fileExists checking in getDirectory

Jason, why is this needed?  Why is the mapping based on extension insufficient?

> Allow setting the IndexWriter docstore to be a different directory
> --
>
> Key: LUCENE-1618
> URL: https://issues.apache.org/jira/browse/LUCENE-1618
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Affects Versions: 2.4.1
>Reporter: Jason Rutherglen
>Assignee: Michael McCandless
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1618.patch, LUCENE-1618.patch, LUCENE-1618.patch, 
> LUCENE-1618.patch, LUCENE-1618.patch, MemoryCachedDirectory.java
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Add an IndexWriter.setDocStoreDirectory method that allows doc
> stores to be placed in a different directory than the IW default
> dir.




[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-05-08 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707403#action_12707403
 ] 

Jason Rutherglen commented on LUCENE-1618:
--

One example of the use case is when IndexFileDeleter (IFD) needs to
access the directory's files as-is, without extension
interpretation. A .fdt file that was written directly to the
primary directory (not through FSD) fits this case. When
IFD tries to access the .fdt file (using the current code), FSD
says it's not there (because it thinks it's in the secondary
dir).

Maybe we need a different type of FSD for this case?
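
Editorial sketch of the situation described above, using a toy stand-in rather than Lucene's real classes. The class name `ExtensionSwitchSketch`, the in-memory sets standing in for directories, and the choice of "fdt"/"fdx" as the switched extensions are all assumptions for illustration. It shows only why pure extension-based routing reports a file as missing when that file was written to the primary directory without going through the router.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of extension-based routing; hypothetical, not Lucene's API.
class ExtensionSwitchSketch {
    // Each "directory" is modelled as a set of file names.
    final Set<String> primary = new HashSet<>();
    final Set<String> secondary = new HashSet<>();
    // Extensions whose files are routed to the secondary directory.
    final Set<String> secondaryExtensions = new HashSet<>();

    ExtensionSwitchSketch(String... extensions) {
        for (String ext : extensions) {
            secondaryExtensions.add(ext);
        }
    }

    static String extension(String name) {
        int dot = name.lastIndexOf('.');
        return dot == -1 ? "" : name.substring(dot + 1);
    }

    // Routing looks at the extension only, never at where the file really is.
    Set<String> directoryFor(String name) {
        return secondaryExtensions.contains(extension(name)) ? secondary : primary;
    }

    void createFile(String name) {
        directoryFor(name).add(name);
    }

    boolean fileExists(String name) {
        return directoryFor(name).contains(name);
    }

    public static void main(String[] args) {
        ExtensionSwitchSketch fsd = new ExtensionSwitchSketch("fdt", "fdx");
        fsd.createFile("_0.fdt");   // goes through the router, lands in secondary
        fsd.primary.add("_1.fdt");  // written directly to primary, bypassing the router
        System.out.println(fsd.fileExists("_0.fdt")); // true
        System.out.println(fsd.fileExists("_1.fdt")); // false: the router only checks secondary
    }
}
```

In this model the second lookup fails because the router never consults the primary set for a ".fdt" name, which is the mismatch the comment describes.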





[jira] Commented: (LUCENE-1618) Allow setting the IndexWriter docstore to be a different directory

2009-05-08 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707438#action_12707438
 ] 

Michael McCandless commented on LUCENE-1618:


I think if one is directly writing a file to the primary directory (not through 
FSD) then one should/could also delete directly from that directory?  I don't 
think we should be putting the magic inside FSD.

