Team,
I wrote a MapReduce program. The scenario of my program is to emit:
Total no. of users: 825
Total no. of seqids: 6583100
The number of map outputs the program will emit is 825 * 6583100.
I have an HBase table called ObjectSequence, which consists of 6583100 rows.
I used TableMapper and
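One thing worth checking at that scale: 825 * 6583100 emitted pairs is larger than Integer.MAX_VALUE, so any counter for the expected map output must be 64-bit. A minimal sketch (the class and method names here are mine, not from the original program):

```java
public class EmitCount {
    // Total map outputs when every user is paired with every seqid.
    // 825 * 6583100 = 5,431,057,500, which exceeds Integer.MAX_VALUE
    // (2,147,483,647), so the multiplication must be done in long arithmetic.
    static long totalEmits(long users, long seqids) {
        return users * seqids;
    }

    public static void main(String[] args) {
        System.out.println(totalEmits(825, 6_583_100L)); // 5431057500
    }
}
```

An int counter here would silently wrap around, which is easy to mistake for a job that "lost" output.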
Which version of HBase are you using? Did any RegionServer fail?
On Tue, Jul 17, 2012 at 5:35 AM, deanforwever2010 <
deanforwever2...@gmail.com> wrote:
> in my client it printed:
> org.apache.hadoop.hbase.NotServingRegionException: Region is not online
> on the server, it kept printing: Region i
This sounds similar to something I've seen before, but in that case I found the
winning GC arguments to be something like
-XX:+UseParallelGC -XX:+UseParallelOldGC -XX:MaxDirectMemorySize=128m
(note the old gen parallel compacting collector rather than the ParNew
collector which IIRC is used wit
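For reference, GC flags like these would typically be set in hbase-env.sh; a sketch assuming the flags quoted above (the 128m direct-memory value is just what this thread mentions, tune it for your own setup):

```shell
# hbase-env.sh -- pass GC options to the HBase JVMs.
# UseParallelOldGC enables the parallel compacting old-gen collector
# discussed above, instead of the ParNew/CMS combination.
export HBASE_OPTS="$HBASE_OPTS -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:MaxDirectMemorySize=128m"
```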
Hey guys,
thanks for the help, but I am stuck.
I tried changing the GC:
" instead of CMSIncrementalMode try UseParNewGC"
I also checked for swap, which in vmstat is always zero, and analyzing top
is not an option.
Load factor never gets higher than 10.0 on a 16-CPU machine, and usually it
is around 1.5.
Finally
If you are just trying to find certain text in the data files and you just want
to do bulk process to create reports once a day or so, and prefer to use Hive:
you can create a table with a single string column. You need to pre-process
your data to replace the default column delimiter in your
Hi Anand,
As usual, the answer is that 'it depends' :)
I think that the main question here is: why are you afraid that this setup
would lead to region server hotspotting? Is it because you don't know what
your production data will look like?
Based on what you told about your rowkey, you will query m
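If hotspotting does turn out to be a concern, a common mitigation is to salt the rowkey into a fixed number of buckets. A minimal sketch, assuming 16 buckets (the helper name and the two-digit prefix format are mine, not from this thread):

```java
public class SaltedKey {
    // Prefix the rowkey with a deterministic bucket id so that
    // monotonically increasing keys spread across regions instead of
    // all landing on one region server.
    static String salt(String rowkey, int buckets) {
        int bucket = Math.floorMod(rowkey.hashCode(), buckets);
        return String.format("%02d|%s", bucket, rowkey);
    }

    public static void main(String[] args) {
        System.out.println(salt("user0001-20120717", 16));
    }
}
```

The trade-off is that range scans must then fan out over all 16 prefixes, so salting suits write-heavy workloads better than scan-heavy ones.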
Hi all,
I couldn't find anything like this, so I've put together what I hope is a
fairly simple but comprehensive test script to evaluate write performance on
an HBase cluster that is running Thrift:
https://gist.github.com/3085350
This is written in Python, and requires the installation of Happ
Hi Yogesh
If you are looking at some indexing and search kind of operation, you can take
a look at Lucene.
Whether you are using Hive or HBase, you cannot do any operation without having
a table structure defined for the data. So you need to create tables for each
dataset and then only you can g
* The first lock guards closes of a Region, i.e., it forbids reading from or
writing to a Region that is being closed.
* The second lock is row lock.
Alex Baranau
--
Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
Solr
On Mon, Jul 16, 2012 at 10:14 AM, Howard w
*When I use HBase, I found an error log:*
*2012-07-14 15:41:04,023 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
java.lang.NullPointerException
at
java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:922)
at
org.apache.hadoop.hbase.regionserver.HRegion.rel
Some mapred jobs running scans on our HBase could not succeed because of
the dreaded LeaseException or ScannerTimeoutException, even with
hbase.client.scanner.caching set to 1 and long timeout properties. Mind you
that no row is ever bigger than 5 MB (sure, it's bigger than most use cases
but still i
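Besides scanner caching, the usual knobs for scans like these are the region server lease period and the RPC timeout. A sketch of the lease property in hbase-site.xml, assuming the 0.90.x-era property name and a placeholder value (not taken from this thread; tune to your job):

```xml
<!-- hbase-site.xml: lengthen the scanner lease (in ms) so that slow
     per-row processing does not let the lease expire between next() calls. -->
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>900000</value>
</property>
```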
Hi,
Thank you for the response. Can you please point me to the issue that was
resolved in 0.92? We use Cloudera CDH3U3 with HBase 0.90.4.
I ran a lot of tests, and increasing the parameters to 90 resolved the issue.
conf.set("hbase.regionserver.lease.period" , "90");
conf.set("hbase.