Welcome to Blur and hadoop!

Aaron can fill in more, but I think you're on the right track.  I don't know
what the servers file is for as I'm running an older version; if I had to
guess, it would be the list of machines you will run shard servers on.  You
mention a cluster, and you mention having the namenode separated, which is
good.  To roll Blur out you usually run a controller (it could probably live
on the namenode if it's a small cluster), and you run a shard server on each
datanode/tasktracker node.  The hdfs:// path you entered should probably
have 8020 on it, unless you changed the port in core-site.xml.
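
For reference, the NameNode RPC port comes from fs.default.name (or
fs.defaultFS on newer Hadoop) in core-site.xml; on a stock CDH install the
property looks something like this (the host name here is just a
placeholder):

```xml
<!-- core-site.xml: the NameNode RPC address; CDH defaults to port 8020 -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode-host:8020</value>
</property>
```

Whatever host:port appears there is what the hdfs:// URI in your create
command should use.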

The last point is really a matter of preference.  I roll out jars with the
MR jobs, and I have added Blur to the HADOOP_CLASSPATH in hadoop-env.sh.
 There are other ways; this just worked really well for me.
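
In case it's useful, the hadoop-env.sh change looks roughly like this (the
install path is just an illustration, use wherever your Blur jars actually
live):

```shell
# In hadoop-env.sh: append Blur's lib directory to the classpath that
# Hadoop's child JVMs see.  /opt/blur/lib is a hypothetical location.
BLUR_LIB=/opt/blur/lib
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${BLUR_LIB}/*"
```

After that, MR jobs launched through hadoop can resolve the Blur classes
without bundling every jar into the job itself.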

I hope that helps, let us know how it works. :)

~Garrett




On Wed, Feb 20, 2013 at 1:41 PM, Paul O'Donoghue <[email protected]> wrote:

> Hi,
>
> First up I would like to say I’m really excited by the Blur project, it
> seems to fit the need of a potential project perfectly. I’m hoping that I
> can someday contribute back to this project in some way as it seems that it
> will be of enormous help to me.
>
> Now, on to the meat of the issue. I’m a complete search newbie. I am coming
> from a Spring/Application development background but have to get involved
> in the Search/Big data field for a current client. Since the new year I
> have been looking at Hadoop and have setup a small cluster using Cloudera’s
> excellent tools. I’ve been downloading datasets, running MR jobs, etc. and
> think I have gleaned a very basic level of knowledge which is enough for me
> to learn more when I need it. This week I have started looking at Blur, and
> at present I have cloned the src to the hadoop namenode where I have built
> and started the blur servers. But now I am stuck, and don’t know where to
> go. So I will ask the following:
>
> 1 - /apache-blur-0.2.0-SNAPSHOT/conf/servers. At present I just have my
> namenode defined in here. Do I need to add my datanodes as well?
>
> 2 - blur> create repl-table hdfs://localhost:9000/blur/repl-table 1
> java.net.ConnectException: Call to localhost/127.0.0.1:9000 failed on
> connection exception: java.net.ConnectException: Connection refused.
>
> I’m confused here. Is 9000 the correct port? Is there some sort of user
> auth issue?
>
> 3 - Assuming I create a table on the hdfs, when I want to import my data
> into it I use a MR job yes? What is the best way to package this job? Do I
> have to include all the Blur jars or do I install Blur on the datanodes and
> set a classpath? Is it possible to link to an example MR job in a maven
> project? Or am I on completely the wrong track?
>
> Thanks for your help,
>
> Paul.
>
