Copying files à la HDFS is trivial because it's sequential; Lucene merging isn't. So spreading the merge work over 20 machines instead of 4 Solr masters has clear advantages. Add on-demand expandability on top of that: being able to reindex 2 terabytes of data in half a day, versus weeks or more with 4 Solr masters, is compelling.
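
For what it's worth, the zip-then-upload step Ted describes in the quoted thread could be sketched roughly like this in Python. This is only an illustration under my own assumptions, not what the patch does: `zip_index_dir` and the paths are hypothetical names, and the upload itself (for example with `hadoop fs -put`) is left out.

```python
import zipfile
from pathlib import Path


def zip_index_dir(index_dir: Path, zip_path: Path) -> int:
    """Pack every file under index_dir into a single zip archive.

    Returns the number of files written. Hypothetical helper: the idea is
    that one archive means one file open on the HDFS side instead of one
    per index component file.
    """
    count = 0
    target = zip_path.resolve()
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(index_dir.rglob("*")):
            # Skip directories, and the archive itself if it happens to
            # live inside the index directory.
            if not path.is_file() or path.resolve() == target:
                continue
            # Store paths relative to the index root so the archive can be
            # unpacked into any directory on the receiving node.
            zf.write(path, arcname=path.relative_to(index_dir).as_posix())
            count += 1
    return count


# Assumed follow-up step, not executed here: copy the single archive to
# HDFS, e.g.  hadoop fs -put shard.zip <destination directory>
```

The index is still built and closed on local disk first; only the finished directory gets zipped and moved.
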

On Fri, Jan 15, 2010 at 12:09 PM, Grant Ingersoll <gsing...@apache.org> wrote:
> I can see why that is a win over the existing, but I still don't get why it
> wouldn't be faster just to index to a suite of Solr master indexers and save
> all this file slogging around. But, I guess that is a separate patch all
> together.
>
> On Jan 15, 2010, at 2:35 PM, Jason Rutherglen wrote:
>
>> Zipping cores/shards is in the latest patch...
>>
>> On Fri, Jan 15, 2010 at 11:22 AM, Andrzej Bialecki <a...@getopt.org> wrote:
>>> On 2010-01-15 20:13, Ted Dunning wrote:
>>>>
>>>> This can also be a big performance win. Jason Venner reports significant
>>>> index and cluster start time improvements by indexing to local disk,
>>>> zipping and then uploading the resulting zip file. Hadoop has significant
>>>> file open overhead so moving one zip file wins big over many index
>>>> component files. There is a secondary bandwidth win as well.
>>>
>>> Indeed, this one should be easy to add to this patch. Unless Jason & Jason
>>> already cooked a patch for this? ;)
>>>
>>>>
>>>> On Fri, Jan 15, 2010 at 8:34 AM, Andrzej Bialecki
>>>> (JIRA) <j...@apache.org> wrote:
>>>>
>>>>>
>>>>> HDFS doesn't support enough POSIX to support writing Lucene indexes
>>>>> directly to HDFS - for this reason indexes are always created on local
>>>>> storage of each node, and then after closing they are copied to HDFS.
>>>
>>> --
>>> Best regards,
>>> Andrzej Bialecki <><
>>> ___. ___ ___ ___ _ _ __________________________________
>>> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
>>> ___|||__|| \| || | Embedded Unix, System Integration
>>> http://www.sigram.com Contact: info at sigram dot com
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem using Solr/Lucene:
> http://www.lucidimagination.com/search