Alternatively, you could use a multivalued field: keep all the sentences
in one document, adding each sentence as a separate value of the same
field, so that they can be retrieved in order.
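
Whichever layout you pick, you should not need to write one file per
sentence first. Here is a rough sketch in plain Java of reading both sides
of the corpus in a single pass and numbering the sentences as you go. It
assumes one sentence per line, and that line N of one file pairs with line N
of the other, which may not match your data. SentenceStream,
SentenceHandler and forEachPair are names I made up for illustration; the
handler is where you would build and add your Lucene document(s), which I
have left out.

```java
import java.io.BufferedReader;
import java.io.IOException;

final class SentenceStream {

    /** Hypothetical callback: a real indexer would build its Lucene document(s) here. */
    interface SentenceHandler {
        void handle(int id, String source, String target) throws IOException;
    }

    /**
     * Reads two line-aligned readers in lockstep and passes each sentence
     * pair, with a zero-based id, to the handler. Stops at the end of the
     * shorter input. Returns the number of pairs handled.
     */
    static int forEachPair(BufferedReader source, BufferedReader target,
                           SentenceHandler handler) throws IOException {
        int id = 0;
        while (true) {
            String s = source.readLine();
            String t = target.readLine();
            if (s == null || t == null) {
                break; // one side ran out of sentences
            }
            handler.handle(id, s, t);
            id++;
        }
        return id;
    }
}
```

If the same id is stored with both sentences of a pair, a hit in one
language can be used to look up its counterpart in the other.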


On Fri, Jul 22, 2011 at 11:10 AM, Glen Newton <glen.new...@gmail.com> wrote:
> So to use Lucene-speak, each sentence is a document.
>
> I don't know how you are indexing or what code and hardware you are
> using, but if you are not already doing so, you should consider
> multi-threading the indexing, which should give you a significant
> performance boost.
>
> -Glen
>
>
> On Fri, Jul 22, 2011 at 11:04 AM, starz10de <farag_ah...@yahoo.com> wrote:
>> I am interested in searching at the sentence level.
>> It is a parallel corpus: each sentence in the first language is
>> equivalent to a sentence in the second language. I want to index each
>> sentence with an id so that, when I retrieve it, I can easily
>> retrieve its equivalent in the second language.
>>
>> I did this by splitting the file and treating each sentence as a separate
>> text file. However, this takes a very long time for many huge text files.
>>
>>
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Index-one-huge-text-file-tp3191605p3191628.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>
>>
