[ https://issues.apache.org/jira/browse/LUCENENET-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039436#comment-13039436 ]
Christopher Currens commented on LUCENENET-417:
-----------------------------------------------

Good call. I think I was confusing storing the whole field with storing the term vectors, which Lucene.Net can do. I still think that, at the very least, being able to store binary values via a stream is a necessary addition to Lucene.Net. Making strings streamable is less of an issue, to me at least. However, I can see the benefit when indexing large items, which is really all this is attempting to solve. Being forced to load large quantities of data into memory to perform any sort of indexing operation on them creates speed and memory issues. This may not be a terribly large use case for some people, but anyone trying to write a multi-threaded indexing system would certainly enjoy the benefits of a lower memory footprint and a speed increase.

> implement streams as field values
> ---------------------------------
>
>                 Key: LUCENENET-417
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-417
>             Project: Lucene.Net
>          Issue Type: New Feature
>          Components: Lucene.Net Core
>            Reporter: Christopher Currens
>         Attachments: BinaryStream.patch
>
>
> Adding binary values to a field is an expensive operation, as the whole
> binary data must be loaded into memory and then written to the index. Adding
> the ability to use a stream instead of a byte array could not only speed up
> the indexing process, but also reduce the memory footprint.
> Java Lucene has the ability to use a TextReader to both analyze and store
> text in the index. .NET lacks the ability to store the data in the index,
> due to the fact that .NET TextReaders cannot seek or reset the position of
> the stream. This should be a feature added into Lucene.NET as well. My
> thoughts are to add another Field constructor, that is Field(string name,
> System.IO.Stream stream, System.Text.Encoding encoding), that will allow the
> text to be analyzed and stored into the index.
> Comments about this approach are greatly appreciated.