I guess the old adage is true.
If you only have a hammer, then every problem looks like a nail.
As an architect, it’s your role to find the right tools to solve the
problem in the most efficient and effective manner.
So the first question you need to ask is if HBase is the right tool
On Fri, Mar 13, 2015 at 2:41 PM, Michael Segel wrote:
Joseph,
In stand alone, you’re writing to local disk. You lose the disk, you lose the
data, unless of course you’ve RAIDed your drives.
Then when you lose the node, you lose the data because it’s not being
replicated. While this may not be a major issue or concern… you have to be
aware of its
Michael,
Thanks for your concern. Let me ask a few questions, since you’re implying
that HDFS is the only way to reduce risk and ensure security, which is not
the assumption under which I’ve been working.
A brief rundown of our problem’s characteristics, since I haven’t really
described what we’re
Guys,
More than just needing some love.
No HDFS… means data at risk.
No HDFS… means that stand alone will have security issues.
Patient Data? HINT: HIPAA.
Please think your design through, and if you go with HBase… you will want to
build out a small cluster.
> On Mar 10, 2015, at 6:16 PM, Nic
As Stack and Andrew said, just wanted to give you fair warning that this
mode may need some love. Likewise, there are probably alternatives that run
a bit lighter weight, though you flatter us with the reminder of the long
feature list.
I have no problem with helping to fix and committing fixes to
On:
- Future investment in a design that scales better
Indeed, designing against a key-value store is different from designing
against an RDBMS.
I wonder if you’ve explored the option of abstracting the storage layer and
using a "single node purposed" store until you grow enough to switch to
another one?
E.g
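One way to sketch that abstraction (all names here are hypothetical, not part of the HBase API): the application codes against a thin key-value interface, backed today by a single-node store and later by an HBase-backed implementation with the same contract.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage abstraction: callers depend only on this interface,
// so the backing store can change without touching application code.
interface KeyValueStore {
    void put(String rowKey, String column, String value);
    String get(String rowKey, String column); // returns null if absent
}

// Single-node implementation for the "small data" phase. At scale, a class
// backed by an HBase table could implement the same interface.
class InMemoryStore implements KeyValueStore {
    private final Map<String, String> data = new HashMap<>();

    @Override
    public void put(String rowKey, String column, String value) {
        // Join row key and column with a separator that won't appear in keys.
        data.put(rowKey + "\u0000" + column, value);
    }

    @Override
    public String get(String rowKey, String column) {
        return data.get(rowKey + "\u0000" + column);
    }
}
```

The point is that the switch-over cost is confined to one new `KeyValueStore` implementation, rather than spread across the codebase.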
Sorry, never answered your question about versions. I have the 1.0.0 version
of HBase, which has hadoop-common 2.5.1 in its lib folder.
-j
On 3/10/15, 11:36 AM, "Rose, Joseph" wrote:
I tried it and it does work now. It looks like the interface for
hadoop.fs.Syncable changed in March, 2012 to remove the deprecated sync()
method and define only hsync() instead. The same committer did the right
thing and removed sync() from FSDataOutputStream at the same time. The
remaining hsync(
You’re dealing with patient data, which is either very structured or
semi-structured, where you can use an RDBMS if you really think about your
schema.
If you want an RDBMS that can be used to hold objects, look at Informix’s IDS,
which is now IBM’s IDS. It contains the extensibility that you cou
On Fri, Mar 6, 2015 at 1:50 PM, Rose, Joseph <joseph.r...@childrens.harvard.edu> wrote:
... And if you have at most "small data" at this stage, you might be able
to cut the heap sizes of the HDFS daemons in half.
On Fri, Mar 6, 2015 at 2:18 PM, Andrew Purtell wrote:
> I think the final issue with hadoop-common (re: unimplemented sync for local
filesystems) is the one showstopper for us.
Although the unnecessary overhead would be significant, you could run a
stripped down HDFS stack on the VM. Give the NameNode, SecondaryNameNode,
and DataNode 1GB of heap only
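As a rough sketch of what that stripped-down setup might look like in `hadoop-env.sh` (the 1GB figure comes from the suggestion above; the exact values are an assumption to tune for your VM):

```shell
# hadoop-env.sh -- illustrative heap limits for a minimal single-node HDFS.
# -Xmx1g caps each daemon's heap at 1GB, per the suggestion above.
export HADOOP_NAMENODE_OPTS="-Xmx1g $HADOOP_NAMENODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Xmx1g $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Xmx1g $HADOOP_DATANODE_OPTS"
```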
So, I think Nick, St.Ack and Wilm have all made some excellent points, but
this last email more or less hit it on the head. Like I said, I’m working
with patient data and while the volume is small now, it’s not going to
stay that way. And the cell-level security is a *huge* win - I’m sure you
folks
Hi,
On 06.03.2015 at 19:18, Stack wrote:
> Why not use an RDBMS then?
When I first read the HBase documentation, I also stumbled over the
"only use for large datasets" and "standalone only in dev mode" advice. From
my point of view there are some arguments against RDBMSs and for, e.g.,
HBase, although w
On Tue, Mar 3, 2015 at 7:32 AM, Rose, Joseph <joseph.r...@childrens.harvard.edu> wrote:
Hi Joseph,
Generally speaking, we've thought of stand-alone mode as dev/testing because
the common use case for HBase is larger datasets. There's nothing
specifically non-production about stand-alone mode, though you obviously
won't have high availability, and there may be bugs in the code paths t
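For reference, standalone mode is roughly what you get with a minimal `hbase-site.xml` like the following; `hbase.rootdir` on a `file://` URI keeps everything on the local filesystem with no HDFS (the paths here are placeholders, not recommendations):

```xml
<!-- hbase-site.xml: minimal standalone configuration (example paths) -->
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <!-- Local filesystem, no HDFS; hence no replication of the data. -->
    <value>file:///var/hbase/data</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/var/hbase/zookeeper</value>
  </property>
</configuration>
```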
Folks,
I’m new to HBase (but not new to these sorts of data stores). I think HBase
would be a good fit for a project I’m working on, except for one thing: the
amount of data we’re talking about here is far smaller than what’s usually
recommended for HBase. As I read the docs, though, it seems