Otis,
Not sure about Solr, but with Lucene it was certainly doable. I
saw fields way bigger than 400 MB indexed, sometimes having a large set
of unique terms as well (think something like a log file with lots of
alphanumeric tokens, a couple of gigs in size). While indexing and
querying of such thi
From: Otis Gospodnetic [otis_gospodne...@yahoo.com]
Sent: Tuesday, June 07, 2011 6:59 PM
To: solr-user@lucene.apache.org
Subject: 400 MB Fields
Hello,
What are the biggest document fields that you've ever indexed in Solr or that
you've heard of? Ah, it must be Tom's Hathi t
>
>>Hi,
>>
>>
>>> I think the question is strange... May be you are wondering about
>>>possible
>>> OOM exceptions?
>>
>>No, that's an easier one. I was more wondering whether with 400 MB Fields
>>(indexed, not store
>> I think the question is strange... May be you are wondering about
>>possible
>> OOM exceptions?
>
>No, that's an easier one. I was more wondering whether with 400 MB Fields
>(indexed, not stored) it becomes incredibly slow to:
>* analyze
>* commit / write
Hi,
> I think the question is strange... May be you are wondering about possible
> OOM exceptions?
No, that's an easier one. I was more wondering whether with 400 MB Fields
(indexed, not stored) it becomes incredibly slow to:
* analyze
* commit / write to disk
* search
> I thi
I think the question is strange... Maybe you are wondering about possible
OOM exceptions? I think we can pass Lucene a single document containing a
comma-separated list of "term, term, ..." (a few billion times)... Except
for "stored" and "TermVectorComponent"...
I believe thousands of companies already in
From the older (2.4) Lucene days: I once indexed the 23-volume "Encyclopedia
of Michigan Civil War Volunteers" in a single document/field, so it's probably
within the realm of possibility at least ...
Erick
On Tue, Jun 7, 2011 at 6:59 PM, Otis Gospodnetic
wrote:
> Hello,
>
> What are the biggest do
Hello,
What are the biggest document fields that you've ever indexed in Solr or that
you've heard of? Ah, it must be Tom's Hathi trust. :)
I'm asking because I just heard of a case of an index where some documents
have a field that can be around 400 MB in size! I'm curious if anyone has
an