Copying files à la HDFS is trivial because it's sequential; Lucene
merging isn't. So spreading the merging over 20 machines rather than
4 Solr masters has clear advantages. Add on-demand expandability, and
being able to reindex 2 terabytes of data in half a day, versus weeks
or more with 4 Solr masters, is compelling.
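
For anyone following along, here is a rough sketch of the "index locally, zip, then upload one file" step discussed further down the thread. It is not the code from the patch. The function name `zip_index_dir`, the `shard0` directory, and the placeholder file names are made up for illustration, and the upload to HDFS is left out (a single `hadoop fs -put` of the resulting zip would be one way to do it).

```python
import tempfile
import zipfile
from pathlib import Path


def zip_index_dir(index_dir: Path, zip_path: Path) -> int:
    """Pack every file under index_dir into one zip; return the file count."""
    count = 0
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(index_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the index directory so the archive
                # unpacks cleanly on the receiving node.
                zf.write(path, arcname=path.relative_to(index_dir).as_posix())
                count += 1
    return count


def demo() -> list:
    """Build a fake index directory, zip it, and list the archive contents."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        index_dir = root / "shard0"  # hypothetical local index directory
        index_dir.mkdir()
        # Placeholder files standing in for real index component files.
        for name in ("segments_1", "_0.cfs", "_0.cfx"):
            (index_dir / name).write_bytes(b"placeholder")
        zip_path = root / "shard0.zip"  # written outside index_dir on purpose
        zip_index_dir(index_dir, zip_path)
        with zipfile.ZipFile(zip_path) as zf:
            return sorted(zf.namelist())


if __name__ == "__main__":
    print(demo())
```

The point is only that many small index component files become one file to move, which is where the file-open overhead saving mentioned below comes from.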

On Fri, Jan 15, 2010 at 12:09 PM, Grant Ingersoll <gsing...@apache.org> wrote:
> I can see why that is a win over the existing, but I still don't get why it 
> wouldn't be faster just to index to a suite of Solr master indexers and save 
> all this file slogging around.  But, I guess that is a separate patch
> altogether.
>
>
>
> On Jan 15, 2010, at 2:35 PM, Jason Rutherglen wrote:
>
>> Zipping cores/shards is in the latest patch...
>>
>> On Fri, Jan 15, 2010 at 11:22 AM, Andrzej Bialecki <a...@getopt.org> wrote:
>>> On 2010-01-15 20:13, Ted Dunning wrote:
>>>>
>>>> This can also be a big performance win.  Jason Venner reports significant
>>>> index and cluster start time improvements by indexing to local disk,
>>>> zipping and then uploading the resulting zip file.  Hadoop has significant
>>>> file open overhead so moving one zip file wins big over many index
>>>> component files.  There is a secondary bandwidth win as well.
>>>
>>> Indeed, this one should be easy to add to this patch. Unless Jason & Jason
>>> already cooked a patch for this? ;)
>>>
>>>>
>>>> On Fri, Jan 15, 2010 at 8:34 AM, Andrzej Bialecki (JIRA) <j...@apache.org> wrote:
>>>>
>>>>>
>>>>> HDFS doesn't support enough POSIX to support writing Lucene indexes
>>>>> directly to HDFS - for this reason indexes are always created on local
>>>>> storage of each node, and then after closing they are copied to HDFS.
>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Andrzej Bialecki     <><
>>>  ___. ___ ___ ___ _ _   __________________________________
>>> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
>>> ___|||__||  \|  ||  |  Embedded Unix, System Integration
>>> http://www.sigram.com  Contact: info at sigram dot com
>>>
>>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem using Solr/Lucene: 
> http://www.lucidimagination.com/search
>
>
