Edward,
I think for now we'll start with modeling how to store triples such that we
can run real-time SPARQL queries on them, and then later look at the Pregel
model and how we can leverage it for bulk processing. The Bigtable data
model doesn't lend itself directly to storing triples such that fast
queries are possible.
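For concreteness, a common way to make such lookups fast is to write each
triple under permuted row keys (SPO, POS, OSP), so that a lookup with a bound
subject, predicate, or object becomes a row-prefix scan. A rough sketch
against the 0.20-era HBase client API; the table handle, the family "f", and
the key layout are all placeholder choices of mine:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TripleLoader {
      // Write one triple under three permuted row keys so that lookups by
      // subject, predicate, or object all turn into row-prefix scans.
      static void putTriple(HTable triples, String s, String p, String o)
          throws IOException {
        String[] keys = {
          "spo\u0001" + s + "\u0001" + p + "\u0001" + o,
          "pos\u0001" + p + "\u0001" + o + "\u0001" + s,
          "osp\u0001" + o + "\u0001" + s + "\u0001" + p,
        };
        for (String key : keys) {
          Put put = new Put(Bytes.toBytes(key));
          // All data lives in the row key; the cell is just a presence marker.
          put.add(Bytes.toBytes("f"), Bytes.toBytes("t"), new byte[0]);
          triples.put(put);
        }
      }
    }

The cost, of course, is 3x the writes and 3x the deletes per triple, which is
where something like MultiDelete would start to matter.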
Currently it is just something I expect to run into problems with, as I am
still some ways from load testing, though I hope to get started on it soon.
The MultiDelete implementation planned for 0.21 will certainly help a lot,
though.
Perhaps running an M/R job with a scan result as the input …
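Wiring a Scan in as job input is what TableMapReduceUtil.initTableMapperJob
is for. A rough driver sketch; the table name and tuning values are
illustrative, and IdentityTableMapper just re-emits each row unchanged:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.IdentityTableMapper;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

    public class ScanDriver {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = new Job(conf, "scan-as-input");
        job.setJarByClass(ScanDriver.class);

        Scan scan = new Scan();
        scan.setCaching(500);        // fetch rows in batches, not one per RPC
        scan.setCacheBlocks(false);  // a full scan shouldn't churn the cache

        // Every row the scan matches becomes one map() input record.
        TableMapReduceUtil.initTableMapperJob("triples", scan,
            IdentityTableMapper.class, ImmutableBytesWritable.class,
            Result.class, job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }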
Hi, I'm a proposer/sponsor of the Heart project.
I have no doubt that RDF can be stored in HBase, because Google also
stores linked data in their Bigtable.
However, if you want to focus on large-scale (distributed) processing,
I would recommend that you read about the Google Pregel project (Google's
graph computing …)
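For anyone unfamiliar: the heart of the Pregel model, per the Malewicz et
al. paper, is a user-supplied per-vertex compute step executed in
synchronized supersteps, with messages delivered between supersteps. A
purely illustrative skeleton of that model, not any real API:

    import java.util.List;

    // Illustrative Pregel-style vertex: in each superstep an active vertex
    // consumes the messages sent to it in the previous superstep, may update
    // its value and send new messages, and can vote to halt.
    public abstract class Vertex<V, M> {
      private V value;
      private boolean halted = false;

      // User code goes here; called once per superstep while active.
      public abstract void compute(List<M> messages, long superstep);

      protected V getValue() { return value; }
      protected void setValue(V v) { value = v; }

      // A real framework would queue this for delivery next superstep.
      protected void sendMessageTo(String targetVertexId, M message) { }

      // A halted vertex stays inactive until an incoming message wakes it.
      protected void voteToHalt() { halted = true; }
    }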
2010/4/4 Onur AKTAS:
>
> Thank you very much for your answers. I'm checking the document that you
> gave.
> In short, unless massive traffic, massive data size, and massive scale
> are needed, stick with regular RDBMSs; then, if we grow to terabytes
> of data to be queried, we can switch to NoSQL databases.
Hi,
I'm totally new to HBase and MapReduce and could really use some
pointers in the right direction for the following situation.
I managed to run a basic MapReduce example, analogous to Export.java
in the hbase.mapreduce package.
What I need to achieve is the following: do a map/reduce scan …
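For reference, the map side of such a job can stay very small. A sketch
modeled on the Exporter mapper nested inside Export.java:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;

    // Receives one (row key, Result) pair for every row the Scan matches,
    // just like the Exporter mapper inside Export.java.
    public class RowMapper extends TableMapper<ImmutableBytesWritable, Result> {
      @Override
      public void map(ImmutableBytesWritable row, Result values, Context context)
          throws IOException, InterruptedException {
        // Inspect or transform the row here; Export.java simply re-emits it.
        context.write(row, values);
      }
    }

The driver wires this up with TableMapReduceUtil.initTableMapperJob, the same
way Export.java's main method does.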
It's likely not the actual deserialization itself but rather the time
to read the entire row from HDFS. There are some optimizations that
can be made here (using the block index to get all blocks for a row with
a single HDFS read, TCP socket reuse, etc.).
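On the client side, one lever that already exists is simply asking for less
of the row. A small sketch; the family and qualifier names are placeholders:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class NarrowRead {
      // Restricting the Get to one family/qualifier keeps the read to the
      // store files of that family instead of materializing the whole row.
      static Result readOneColumn(HTable table, byte[] row) throws IOException {
        Get get = new Get(row);
        get.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"));
        return table.get(get);
      }
    }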
On Apr 3, 2010, at 11:35 AM, "Sammy Yu" wrote:
Thank you very much for your answers. I'm checking the document that you gave.
In short, unless massive traffic, massive data size, and massive scale are
needed, stick with regular RDBMSs; then, if we grow to terabytes of data to
be queried, we can switch to NoSQL databases.
Thanks.
Thanks, that seemed to help.
Our jobs have been running without failures for the last 48 hours.
On Thu, Apr 1, 2010 at 11:43 PM, Andrew Purtell wrote:
> First,
>
> "ulimit: 1024"
>
> That's fatal. You need to up file descriptors to something like 32K.
>
> See http://wiki.apache.org/hadoop/Hbase/Trou…