I observe about 50% iowait even before starting the clients, that is,
when there is no client load on the system at all. So only something
internal to HBase/HDFS can be causing this. HBase compaction? HDFS?
Regards, Per Steffensen
Per Steffensen wrote:
Hi
We have a system with, among other things, a HBase
Per Steffensen wrote:
I observe about 50% iowait even before starting the clients, that is,
when there is no client load on the system at all. So only something
internal to HBase/HDFS can be causing this. HBase compaction? HDFS?
Ahh ok, that was only for half a minute after restart. So basically down
deleting those regions entirely? And is
explicit deletion of entire regions possible at all?
The reason I want to do this is that I expect it to be much faster than
explicitly deleting 50 million+ records one by one every day.
Regards, Per Steffensen
...
Mike Segel
On Dec 8, 2011, at 7:13 AM, Per Steffensen st...@designware.dk wrote:
Hi
The system we are going to work on will receive 50 million+ new data records
every day. We need to keep a history of 2 years of data (that's 35+ billion
data records in storage all in all), and that basically
Ahhh, stupid me. I probably just want to use different tables for
different days/months. I believe tables can be deleted fairly quickly in
HBase?
Regards, Per Steffensen
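
A minimal sketch of the table-per-day idea above, assuming one table per day and a 2-year retention window. The `records_` prefix, the `YYYYMMDD` suffix and both function names are hypothetical and not from the thread. It only works out which table names have aged out; actually dropping them would be a separate step through the HBase admin tooling (a table has to be disabled before it can be dropped).

```python
from datetime import date, timedelta

TABLE_PREFIX = "records_"  # hypothetical naming scheme, not from the thread
RETENTION_DAYS = 730       # roughly 2 years of history


def table_for(day):
    """Return the (hypothetical) table name holding the records for one day."""
    return f"{TABLE_PREFIX}{day:%Y%m%d}"


def expired_tables(existing, today, retention_days=RETENTION_DAYS):
    """Return the daily tables that are older than the retention window.

    YYYYMMDD suffixes sort lexicographically in date order, so a plain
    string comparison against the cutoff table name is enough.
    """
    cutoff = table_for(today - timedelta(days=retention_days))
    return sorted(
        name for name in existing
        if name.startswith(TABLE_PREFIX) and name < cutoff
    )
```

With this layout the daily cleanup is one table drop per expired day rather than 50 million+ individual deletes.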
Per Steffensen wrote:
Thanks for your reply!
Michel Segel wrote:
Per Steffensen,
I would urge you to step away from
I am not sure exactly what you want to do, but maybe you want to have a
look at products like elasticsearch, solandra, solr, sphinx etc.
27g wrote:
I want to use hadoop/contrib/index to create a distributed Lucene index on
Hadoop. Who can help me by giving me the source code of the
Vitalii Tymchyshyn wrote:
01.09.11 21:55, Per Steffensen wrote:
Vitalii Tymchyshyn wrote:
Hello.
AFAIK you still have the HDFS NameNode for now, and as soon as the
NameNode is down, your cluster is down. So putting scheduling on the same
machine as the NameNode won't make your cluster worse in terms
of a distributed timer framework
running in a cluster, so that I could just register a timer job with
the cluster, and then be sure that it is invoked every 5th minute, no
matter if one or two particular machines in the cluster are down.
Any suggestions are very welcome.
Regards, Per Steffensen
not trigger anything if this machine is down. Can you
confirm that the Coordinator Application-role is distributed in a
distributed Oozie setup, so that jobs get triggered even if one or two
machines are down?
Regards, Per Steffensen
Ronen Itkin wrote:
Hi
Try to use Oozie for job coordination
Thanks for your response. See comments below.
Regards, Per Steffensen
Alejandro Abdelnur wrote:
[moving common-user@ to BCC]
Oozie is not HA yet. But it would be relatively easy to make it. It was
designed with that in mind, we even did a prototype.
Ok, so if it isn't HA out-of-the-box I
want my timer framework to also be
clustered, distributed and coordinated, so that I will also have my
timer jobs triggered even though 3 out of 10 machines are down.
Regards, Per Steffensen
Ronen Itkin wrote:
If I understand you correctly, you are asking about installing Oozie as
distributed and/or HA
Vitalii Tymchyshyn wrote:
01.09.11 18:14, Per Steffensen wrote:
Well, I am not sure I understand you correctly, but basically I want a
timer framework that triggers my jobs. And the triggering of the jobs
needs to work even if one or two particular machines go down. So
the timer