Re: SOLR on hdfs

Amit Nithian Wed, 06 Mar 2013 19:53:40 -0800

Joseph,

Doing what Otis said will do literally what you want, which is copying the
index to HDFS. It's no different from copying it to a different machine,
which, btw, is what Solr's master/slave replication scheme does.
Alternatively, I think people are starting to set up new Solr instances with
SolrCloud, which doesn't have the concept of master/slave but rather a
series of nodes with the option of having replicas (what I believe to be
backup nodes), so that you have the redundancy you want.


Honestly, HDFS used the way you are looking for is probably no different
from storing your Solr index on RAIDed storage, but I don't pretend to know
much about RAID arrays.

What exactly are you trying to achieve from a systems perspective? Why do
you want Hadoop in the mix here, and how does copying the index to HDFS help
you? If SolrCloud seems complicated, try just setting up a simple
master/slave replication scheme, since that's really easy.
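
For what it's worth, a minimal sketch of that master/slave setup might look
like the following. The first handler goes in the master's solrconfig.xml and
the second in each slave's. The host name, port, core name and poll interval
are placeholders I made up for illustration, so check them against the
replication docs for your Solr version.

```xml
<!-- Master's solrconfig.xml: publish a new index version after each commit. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- Slave's solrconfig.xml: poll the master and pull index changes.
     "master-host", port 8983 and "collection1" are hypothetical values. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/collection1</str>
    <!-- Poll interval in HH:mm:ss; 60 seconds here is an arbitrary choice. -->
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>
```

With something like this in place, each slave keeps its own full copy of the
index, which is the redundancy you were hoping to get from HDFS.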

Cheers
Amit


On Wed, Mar 6, 2013 at 9:55 PM, Joseph Lim <ysli...@gmail.com> wrote:

> Hi Amit,
>
> so you mean that if I just want to get redundancy for Solr in HDFS, the
> best way to do it is to do as Otis suggested, using the following
> command
>
> hadoop fs -copyFromLocal <localsrc> URI
>
> OK, let me try out SolrCloud, as I will need to make sure it works well with
> Nutch too.
>
> Thanks for the help..
>
>
> On Thu, Mar 7, 2013 at 5:47 AM, Amit Nithian <anith...@gmail.com> wrote:
>
> > Why wouldn't SolrCloud help you here? You can set up shards and replicas,
> > etc., to have redundancy, b/c HDFS isn't designed to serve real-time
> > queries as far as I understand. If you are using HDFS as a backup
> > mechanism, to me you'd be better served having multiple slaves tethered to
> > a master (in a non-cloud environment) or setting up SolrCloud. Either
> > option would give you more redundancy than copying an index to HDFS.
> >
> > - Amit
> >
> >
> > On Wed, Mar 6, 2013 at 12:23 PM, Joseph Lim <ysli...@gmail.com> wrote:
> >
> > > Hi Upayavira,
> > >
> > > sure, let me explain. I am setting up Nutch and Solr in a Hadoop
> > > environment. Since I am using HDFS, if the localhost (running Solr)
> > > crashes, I will still have the shards of data stored in HDFS.
> > >
> > > Thank you so much =)
> > >
> > > On Thu, Mar 7, 2013 at 1:19 AM, Upayavira <u...@odoko.co.uk> wrote:
> > >
> > > > What are you actually trying to achieve? If you can share what you are
> > > > trying to achieve, maybe folks can help you find the right way to do it.
> > > >
> > > > Upayavira
> > > >
> > > > On Wed, Mar 6, 2013, at 02:54 PM, Joseph Lim wrote:
> > > > > Hello Otis ,
> > > > >
> > > > > Is there any configuration where it will index into hdfs instead?
> > > > >
> > > > > > I tried Crawlzilla and Lily, but I want to be able to update a
> > > > > > specific package, such as Hadoop only or Nutch only, when there
> > > > > > are updates.
> > > > > >
> > > > > > That's why I would prefer to install them separately.
> > > > > >
> > > > > > Thanks so much. Looking forward to your reply.
> > > > >
> > > > > On Wednesday, March 6, 2013, Otis Gospodnetic wrote:
> > > > >
> > > > > > Hello Joseph,
> > > > > >
> > > > > > You can certainly put them there, as in:
> > > > > >   hadoop fs -copyFromLocal <localsrc> URI
> > > > > >
> > > > > > But searching such an index will be slow.
> > > > > > See also: http://katta.sourceforge.net/
> > > > > >
> > > > > > Otis
> > > > > > --
> > > > > > Solr & ElasticSearch Support
> > > > > > http://sematext.com/
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > On Wed, Mar 6, 2013 at 7:50 AM, Joseph Lim <ysli...@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > > > Would like to know how I can put the indexed Solr shards into HDFS?
> > > > > > >
> > > > > > > Thanks..
> > > > > > >
> > > > > > > Joseph
> > > > > > > > On Mar 6, 2013 7:28 PM, "Otis Gospodnetic" <otis.gospodne...@gmail.com>
> > > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Joseph,
> > > > > > > >
> > > > > > > > > What exactly are you looking to do?
> > > > > > > > See http://incubator.apache.org/blur/
> > > > > > > >
> > > > > > > > Otis
> > > > > > > > --
> > > > > > > > Solr & ElasticSearch Support
> > > > > > > > http://sematext.com/
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Wed, Mar 6, 2013 at 2:39 AM, Joseph Lim <ysli...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > > Hi, I am running the Hadoop Distributed File System. How do I
> > > > > > > > > > put the output of my Solr dir into HDFS automatically?
> > > > > > > > >
> > > > > > > > > Thanks so much..
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Best Regards,
> > > > > > > > > *Joseph*
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best Regards,
> > > > > *Joseph*
> > > >
> > >
> > >
> > >
> > > --
> > > Best Regards,
> > > *Joseph*
> > >
> >
>
>
>
> --
> Best Regards,
> *Joseph*
>
