Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-20 Thread Hyunsik Choi
Although we proposed the system for RDF data, we are actually considering a more general system for the graph data model. Many kinds of real-world data can be represented in a graph data model. In particular, besides web data, some data domains (e.g., biological data, chemical data, social networks, and so

Re: How do I implement a Writable into another Writable?

2008-10-20 Thread Yih Sun Khoo
Thanks Chris. Last question (pinkie promise) for this thread at least: public Iterator getUrlListAsArrayList(){ return urlList.iterator(); } is a better alternative to the clone() method because you had mentioned it will simply fail "in this case". Does "in this case" refer to wh

Re: How do I implement a Writable into another Writable?

2008-10-20 Thread Chris Douglas
On Oct 20, 2008, at 8:08 PM, Yih Sun Khoo wrote: Awesome! One question about the example you gave me. When you say "clears the collection", the expected value should just be "B0, B1" right? Because the collection gets cleared of the old contained value A0...A2. Sorry, I meant for the co

Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-20 Thread Ted Dunning
At Veoh the recommendation data amounts to many billions of (roughly) these triples and this approach works very well indeed, even on tiny development clusters. On Mon, Oct 20, 2008 at 6:23 PM, Colin Evans <[EMAIL PROTECTED]> wrote: > Hi Edward, > At Metaweb, we're experimenting with storing raw

Re: Problems running the Hadoop Quickstart

2008-10-20 Thread Alex Loddengaard
Have you looked at your logs yet? You should look at your logs and post any errors or warnings. Alex On Mon, Oct 20, 2008 at 8:29 PM, Amareshwari Sriramadasu < [EMAIL PROTECTED]> wrote: > Has your task-tracker started? I mean, do you see non-zero nodes on your > job tracker UI? > > -Amareshwari

Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-20 Thread Edward J. Yoon
Oh, I remember freebase.com, which was mentioned by Barney Pell (Powerset CTO) at our company (NHN Corp.) meeting. Hmm, the two approaches seem slightly different. However, I hope we can work together in the near future if possible. /Edward On Tue, Oct 21, 2008 at 1:41 PM, Colin Evans <[EMAIL

Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-20 Thread Colin Evans
We've got a lot of open source projects related to Hadoop and to our graph data available at http://research.freebase.com, but we aren't planning on open sourcing our graph processing work around Hadoop yet. Hyunsik Choi wrote: Hi Colin, I'm a member of RDF proposal. I have one question as

Re: Problems running the Hadoop Quickstart

2008-10-20 Thread Amareshwari Sriramadasu
Has your task-tracker started? I mean, do you see non-zero nodes on your job tracker UI? -Amareshwari John Babilon wrote: Hello, I've been trying to get Hadoop up and running on a Windows Desktop running Windows XP. I've installed Cygwin and Hadoop. I run the start-all.sh script, it starts

Re: How do I implement a Writable into another Writable?

2008-10-20 Thread Yih Sun Khoo
Awesome! One question about the example you gave me. When you say "clears the collection", the expected value should just be "B0, B1" right? Because the collection gets cleared of the old contained value A0...A2. MyWritable foo = new MyWritable(); // foo contains "A0", "A1", "A2" in its namelis
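The "clears the collection" behaviour asked about above can be sketched as follows. This is only an illustration under stated assumptions: NameListWritable and copyInto are hypothetical names, and the class uses plain java.io so that it runs without Hadoop on the classpath. In a real job the class would implement org.apache.hadoop.io.Writable, which declares the same write and readFields methods.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical list-holding type; not taken from the thread.
final class NameListWritable {
    private final List<String> names = new ArrayList<>();

    NameListWritable(String... initial) {
        names.addAll(Arrays.asList(initial));
    }

    // Serialize the element count followed by each element.
    void write(DataOutput out) throws IOException {
        out.writeInt(names.size());
        for (String name : names) {
            out.writeUTF(name);
        }
    }

    // Instances are reused between records, so drop the old
    // contents first; otherwise A0..A2 would survive the read.
    void readFields(DataInput in) throws IOException {
        names.clear();
        int size = in.readInt();
        for (int i = 0; i < size; i++) {
            names.add(in.readUTF());
        }
    }

    // Return a copy so callers cannot mutate the internal list.
    List<String> getNames() {
        return new ArrayList<>(names);
    }

    // Helper for the demo: serialize src and read it into target.
    static void copyInto(NameListWritable src, NameListWritable target) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            src.write(new DataOutputStream(bytes));
            target.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        NameListWritable foo = new NameListWritable("A0", "A1", "A2");
        copyInto(new NameListWritable("B0", "B1"), foo);
        System.out.println(foo.getNames()); // [B0, B1]
    }
}
```

Without the names.clear() call, the reused instance would end up holding A0, A1, A2, B0, B1 after the read.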

Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-20 Thread Hyunsik Choi
Hi Colin, I'm a member of the RDF proposal. I have one question about Metaweb. Do you intend to make Metaweb open source? Hyunsik Choi On Mon, 2008-10-20 at 18:23 -0700, Colin Evans wrote: > Hi Edward, > At Metaweb, we're experimenting with storing raw triples in HDFS flat > files, and have written

Problems running the Hadoop Quickstart

2008-10-20 Thread John Babilon
Hello, I've been trying to get Hadoop up and running on a Windows Desktop running Windows XP. I've installed Cygwin and Hadoop. I run the start-all.sh script, it starts the namenode, but does not seem to start the datanode. I found that if I run hadoop datanode then, the datanode starts. Wh

Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-20 Thread Hyunsik Choi
Hi Colin, I'm a member of the RDF proposal. I have one question about Metaweb. Do you (or your company) plan to make Metaweb open source? Hyunsik Choi - Hyunsik Choi (Ph.D Student) Laboratory of Prof. Yon Dohn Chung Databa

Re: How do I implement a Writable into another Writable?

2008-10-20 Thread Chris Douglas
On Oct 20, 2008, at 6:43 PM, Yih Sun Khoo wrote: Thanks Chris and Joman for your detailed explanations. Would this be a good example of using a shallow copy? Also I'm trying to wrap my head around why the shallow copy is needed. You mentioned it is to eliminate any state from the values th

Re: How do I implement a Writable into another Writable?

2008-10-20 Thread Yih Sun Khoo
Thanks Chris and Joman for your detailed explanations. Would this be a good example of using a shallow copy? Also I'm trying to wrap my head around why the shallow copy is needed. You mentioned it is to eliminate any state from the values the list might have formerly contained. Could you give me a

Re: A Scale-Out RDF Store for Distributed Processing on Map/Reduce

2008-10-20 Thread Colin Evans
Hi Edward, At Metaweb, we're experimenting with storing raw triples in HDFS flat files, and have written a simple query language and planner that executes the queries with chained map-reduce jobs. This approach works well for warehousing triple data, and doesn't require HBase. Queries may ta

Re: adding more datanode

2008-10-20 Thread David Wei
This is quite easy. You can just configure your new datanodes like the others and format the filesystem before you start it. Remember to make it ssh-able for your master and run ./bin/start-all.sh on the master machine if you want to start all the daemons. This will start and add the new datanodes to the up-

adding more datanode

2008-10-20 Thread Ski Gh3
hi, I am wondering how to add more datanodes to an up-and-running hadoop instance? Couldn't find instructions on this from the wiki page. Thanks!

Re: Hadoop for real time

2008-10-20 Thread Stas Oskin
Hi Ted. Thanks for sharing some of the inner workings of Veoh, which btw I'm a frequent user of (or at least when time permits :) ). I indeed recall reading somewhere that Veoh used a heavily modified version of MogileFS, but has since switched as it wasn't ready enough for Veoh's needs. If not Hadoo

Re: Does anybody have tried to setup a cluster with multiple namenodes?

2008-10-20 Thread Chris Douglas
The secondary namenode is neither a backup service for the HDFS namespace nor a failover for requests: http://hadoop.apache.org/core/docs/current/hdfs_user_guide.html#Secondary+NameNode The secondary namenode periodically merges an image (FSImage) of the namesystem with recent changes (FSEdi

Re: How do I implement a Writable into another Writable?

2008-10-20 Thread Chris Douglas
TupleWritable is not a general-purpose type. It's used for map-side joins, where the arity of a tuple is fixed by construction. Its intent is a transient type with very, very specific applications in mind. It sounds like you don't need a general list type, as you don't need to worry about e

Re: mysql in hadoop

2008-10-20 Thread Gerardo Velez
Hi! Actually I had the same problem, and I've temporarily solved it by including the JDBC dependencies inside the main jar. Another solution I've found is to place all the jar dependencies inside the hadoop/lib directory. Hope it helps. -- Gerardo On Mon, Oct 20, 2008 at 9:43 AM, Deepak Diwakar <[EMAIL

Re: mysql in hadoop

2008-10-20 Thread Sandeep Dey
Hi Deepak, I can suggest a crude solution :) that might work. You can extract the JDBC connector jar and copy the classes from there to your class/build directory. This way the JDBC driver classes are included in the final jar. Hope that helps, sandeep On Mon, 20 Oct 2008, Deepak Diwakar wrot

mysql in hadoop

2008-10-20 Thread Deepak Diwakar
Hi all, I am sure someone must have tried a MySQL connection using Hadoop, but I am having a problem. Basically I don't understand how to include the JDBC connector jar on the classpath in the hadoop run command, or whether there is any other way to incorporate the JDBC connector jar into the main jar wh

Re: Does anybody have tried to setup a cluster with multiple namenodes?

2008-10-20 Thread Alex Loddengaard
I believe the common practice is to have a secondary namenode, which by default is enabled. Secondary namenodes serve the purpose of having a redundant backup. However, as far as I'm aware, they are not hot swappable. This means that if your namenode fails, then your cluster will go down until y

Re: Hadoop for real time

2008-10-20 Thread Ted Dunning
Hadoop may not be quite what you want for this. You could definitely use Hadoop for storage and streaming. You can also do various kinds of processing on Hadoop. But because Hadoop is primarily intended for batch style operations, there is a bit of an assumption that some administrative tasks wi

Re: Distributed cache Design

2008-10-20 Thread Ted Dunning
I was very surprised by this as well. I was doing variants on all-pairs shortest paths and found that the best representation really was triples containing from-node, to-node and distance. The nice side of this is that you get scaling like you wouldn't believe (subject to big-omega, of course) O
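The triple representation mentioned above (from-node, to-node, distance) can be sketched in memory as follows. This is not the code described in the message; it is a minimal illustration, assuming non-negative distances and node names without tab characters, and every name in it (TripleShortestPaths, Triple, relax, allPairs, key) is hypothetical. Each relax call plays the role of one join-then-reduce pass: known paths are joined to edges on the shared middle node, and the minimum distance per (from, to) pair is kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TripleShortestPaths {
    // One (from-node, to-node, distance) triple.
    static final class Triple {
        final String from;
        final String to;
        final long dist;

        Triple(String from, String to, long dist) {
            this.from = from;
            this.to = to;
            this.dist = dist;
        }
    }

    // Composite key for a (from, to) pair; assumes no tabs in node names.
    static String key(String from, String to) {
        return from + "\t" + to;
    }

    // One pass: extend every known path by one edge, keep the minimum.
    static Map<String, Long> relax(Map<String, Long> paths, List<Triple> edges) {
        // Group edges by source node (the join key).
        Map<String, List<Triple>> edgesByFrom = new HashMap<>();
        for (Triple e : edges) {
            edgesByFrom.computeIfAbsent(e.from, k -> new ArrayList<>()).add(e);
        }
        Map<String, Long> best = new HashMap<>(paths);
        for (Map.Entry<String, Long> p : paths.entrySet()) {
            String[] ends = p.getKey().split("\t");
            List<Triple> next =
                    edgesByFrom.getOrDefault(ends[1], Collections.emptyList());
            for (Triple e : next) {
                best.merge(key(ends[0], e.to), p.getValue() + e.dist, Long::min);
            }
        }
        return best;
    }

    // Repeat passes until no distance improves.
    static Map<String, Long> allPairs(List<Triple> edges) {
        Map<String, Long> paths = new HashMap<>();
        for (Triple e : edges) {
            paths.merge(key(e.from, e.to), e.dist, Long::min);
        }
        while (true) {
            Map<String, Long> next = relax(paths, edges);
            if (next.equals(paths)) {
                return paths;
            }
            paths = next;
        }
    }

    public static void main(String[] args) {
        List<Triple> edges = Arrays.asList(
                new Triple("a", "b", 1),
                new Triple("b", "c", 2),
                new Triple("a", "c", 5));
        System.out.println(allPairs(edges).get(key("a", "c"))); // 3
    }
}
```

On a cluster, the grouping by join key and the per-pair minimum would correspond to the shuffle and reduce phases of one job per pass.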

Re: Can jobs be configured to be sequential

2008-10-20 Thread Ted Dunning
Take a look at queuing systems like Amazon's simple queuing service. Also look at cascading: http://www.cascading.org/ On Sat, Oct 18, 2008 at 4:13 PM, Ravion <[EMAIL PROTECTED]> wrote: > Hi Paco, > > Thanks - This is exactly what I was looking for.. > > Regards, > Ravi > - Original Message

Re: If I use the third party jar file, where will I put the file?

2008-10-20 Thread 丁光华
$Hadoop_Home/lib 2008/10/20 imcaptor <[EMAIL PROTECTED]> -- Guanghua Ding My research interests include distributed computing, cloud computing, HPC, and data mining.

If I use the third party jar file, where will I put the file?

2008-10-20 Thread imcaptor

Does anybody have tried to setup a cluster with multiple namenodes?

2008-10-20 Thread David Wei
I have been trying to get more information about this kind of installation, but so far I have found nothing. :-(