Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-11 Thread Tim Jones

I am guessing that the idea behind not putting the indexes in HDFS is (1) to maximize performance and (2) that they are relatively transient, meaning the data they are created from could be in HDFS, but the indexes themselves are just local. To avoid having to recreate them, a backup copy could be k

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-09 Thread Srikant Jakilinki

Hi Ning, In continuation with our offline conversation, here is a public expression of interest in your work and a description of our work. Sorry for the length in advance and I hope that the folk will be able to collaborate and/or share experiences and/or give us some pointers... 1) We are

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-07 Thread Andrzej Bialecki

Doug Cutting wrote: Ning, I am also interested in starting a new project in this area. The approach I have in mind is slightly different, but hopefully we can come to some agreement and collaborate. I'm interested in this too. My current thinking is that the Solr search API is the appropri

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-07 Thread Doug Cutting

Ning, I am also interested in starting a new project in this area. The approach I have in mind is slightly different, but hopefully we can come to some agreement and collaborate. My current thinking is that the Solr search API is the appropriate model. Solr's facets are an important featur

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-06 Thread J. Delgado

I'm pretty sure that what you describe is the case, especially taking into consideration that PageRank (what drives their search results) is a per-document value that is probably recomputed after some long time interval. I did see a MapReduce algorithm to compute PageRank as well. However I do think

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-06 Thread Andrzej Bialecki

(trimming excessive cc-s) Ning Li wrote: No. I'm curious too. :) On Feb 6, 2008 11:44 AM, J. Delgado <[EMAIL PROTECTED]> wrote: I assume that Google also has distributed index over their GFS/MapReduce implementation. Any idea how they achieve this? I'm pretty sure that MapReduce/GFS/BigTabl

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-06 Thread Ning Li

One main focus is to provide fault-tolerance in this distributed index system. Correct me if I'm wrong, I think SOLR-303 is focusing on merging results from multiple shards right now. We'd like to start an open source project for a fault-tolerant distributed index system (or join if one already exi

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-06 Thread Ning Li

No. I'm curious too. :)
On Feb 6, 2008 11:44 AM, J. Delgado <[EMAIL PROTECTED]> wrote:
> I assume that Google also has distributed index over their
> GFS/MapReduce implementation. Any idea how they achieve this?
>
> J.D.

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-06 Thread Ning Li

I work for IBM Research. I read the Rackspace article. Rackspace's Mailtrust has a similar design. Happy to see an existing application on such a system. Do they plan to open-source it? Is the AOL project an open source project? On Feb 6, 2008 11:33 AM, Clay Webster <[EMAIL PROTECTED]> wrote: > >

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-06 Thread Ian Holsman

Clay Webster wrote: There seem to be a few other players in this space too. Are you from Rackspace? (http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data) AOL also has a Hadoop/Solr project going on. CNET does not have much brewing there. Although Yo

Re: Lucene-based Distributed Index Leveraging Hadoop

2008-02-06 Thread J. Delgado

I assume that Google also has distributed index over their GFS/MapReduce implementation. Any idea how they achieve this? J.D. On Feb 6, 2008 11:33 AM, Clay Webster <[EMAIL PROTECTED]> wrote: > > There seem to be a few other players in this space too. > > Are you from Rackspace? > (http://highsc

11 matches