Dear Tatsuya,
> 1. Delete the Hadoop data directory
> 2. bin/hadoop namenode -format
> 3. bin/start-all.sh
> -> the NameNode will start immediately and go into service, but the
> DataNode will pause for a long time (almost seven minutes) in the middle
> of its startup.
>
> 4. Before the DataNode becomes ready, do an HDFS write operation
> (e.g. "bin/hadoop fs -put conf input"); the write will then fail with
> the following error:
>
Today I tried restarting Hadoop and HBase while skipping step #1 and step #2.
First I stopped HBase, then Hadoop, then started Hadoop, waited for 10 minutes,
and started HBase - it worked. No data was lost, and everything was available
for reading and so on. Then I scanned the table with 6,000,000 rows several
times, and HBase hung again with the same exceptions as in my previous post
(see the post from Thu, 29 Oct, 10:06).
hbase(main):006:0> list
> NativeException: org.apache.hadoop.hbase.client.RetriesExhaustedException:
> Trying to contact region server 127.0.0.1:57613 for region .META.,,1, row
> '', but failed after 5 attempts.
> Exceptions:
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
> java.net.ConnectException: Connection refused
>
> from org/apache/hadoop/hbase/client/HConnectionManager.java:1001:in
> `getRegionServerWithRetries'
> from org/apache/hadoop/hbase/client/MetaScanner.java:55:in `metaScan'
> from org/apache/hadoop/hbase/client/MetaScanner.java:28:in `metaScan'
> from org/apache/hadoop/hbase/client/HConnectionManager.java:432:in
> `listTables'
> from org/apache/hadoop/hbase/client/HBaseAdmin.java:127:in `listTables'
> from sun/reflect/NativeMethodAccessorImpl.java:-2:in `invoke0'
> from sun/reflect/NativeMethodAccessorImpl.java:39:in `invoke'
> from sun/reflect/DelegatingMethodAccessorImpl.java:25:in `invoke'
> from java/lang/reflect/Method.java:597:in `invoke'
> from org/jruby/javasupport/JavaMethod.java:298:in
> `invokeWithExceptionHandling'
> from org/jruby/javasupport/JavaMethod.java:259:in `invoke'
> from org/jruby/java/invokers/InstanceMethodInvoker.java:36:in `call'
> from org/jruby/runtime/callsite/CachingCallSite.java:70:in `call'
> from org/jruby/ast/CallNoArgNode.java:61:in `interpret'
> from org/jruby/ast/ForNode.java:104:in `interpret'
> from org/jruby/ast/NewlineNode.java:104:in `interpret'
> ... 110 levels...
> from hadoop/hbase/bin/$_dot_dot_/bin/hirb#start:-1:in `call'
> from org/jruby/internal/runtime/methods/DynamicMethod.java:226:in
> `call'
> from org/jruby/internal/runtime/methods/CompiledMethod.java:211:in
> `call'
> from org/jruby/internal/runtime/methods/CompiledMethod.java:71:in
> `call'
> from org/jruby/runtime/callsite/CachingCallSite.java:253:in
> `cacheAndCall'
> from org/jruby/runtime/callsite/CachingCallSite.java:72:in `call'
> from hadoop/hbase/bin/$_dot_dot_/bin/hirb.rb:497:in `__file__'
> from hadoop/hbase/bin/$_dot_dot_/bin/hirb.rb:-1:in `load'
> from org/jruby/Ruby.java:577:in `runScript'
> from org/jruby/Ruby.java:480:in `runNormally'
> from org/jruby/Ruby.java:354:in `runFromMain'
> from org/jruby/Main.java:229:in `run'
> from org/jruby/Main.java:110:in `run'
> from org/jruby/Main.java:94:in `main'
> from /hadoop/hbase/bin/../bin/hirb.rb:338:in `list'
> from (hbase):7
> hbase(main):007:0> status
> 0 servers, 0 dead, NaN average load
> hbase(main):008:0> exit
>
Full HBase and Hadoop logs can be found in my post from Thu, 29 Oct, 07:52.
The main issue for now is that HBase hangs every time I scan the table a few
times (it goes down on the second or third scan). By the way, this time it was
enough to restart HBase only, and scan/get/put operations became available
again.
Table structure:
hbase(main):003:0> describe 'channel_products'
> DESCRIPTION                                                          ENABLED
> {NAME => 'channel_products', FAMILIES => [{NAME => 'active',         true
>   VERSIONS => '3', COMPRESSION => 'NONE', TTL => '2147483647',
>   BLOCKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'},
>  {NAME => 'channel_category_id', VERSIONS => '3', COMPRESSION => 'NONE',
>   TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>   BLOCKCACHE => 'true'},
>  {NAME => 'channel_id', VERSIONS => '3', COMPRESSION => 'NONE',
>   TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>   BLOCKCACHE => 'true'},
>  {NAME => 'contract_id', VERSIONS => '3', COMPRESSION => 'NONE',
>   TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>   BLOCKCACHE => 'true'},
>  {NAME => 'created_at', VERSIONS => '3', COMPRESSION => 'NONE',
>   TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>   BLOCKCACHE => 'true'},
>  {NAME => 'shop_category_id', VERSIONS => '3', COMPRESSION => 'NONE',
>   TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>   BLOCKCACHE => 'true'},
>  {NAME => 'shop_id', VERSIONS => '3', COMPRESSION => 'NONE',
>   TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>   BLOCKCACHE => 'true'},
>  {NAME => 'shop_product_id', VERSIONS => '3', COMPRESSION => 'NONE',
>   TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>   BLOCKCACHE => 'true'},
>  {NAME => 'updated_at', VERSIONS => '3', COMPRESSION => 'NONE',
>   TTL => '2147483647', BLOCKSIZE => '65536', IN_MEMORY => 'false',
>   BLOCKCACHE => 'true'}]}
>
> 1 row(s) in 0.0630 seconds
>
The table contains ~6,000,000 rows; each value is a String.
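For reference, each attribute is stored in its own column family with an empty
qualifier and a String value, so a row is written roughly like this (a
simplified sketch; the variables are placeholders, not the actual loader code):

    HTable table = new HTable(this.configuration, "channel_products");
    // one column family per attribute, empty qualifier, String values
    Put put = new Put(Bytes.toBytes(rowKey));
    put.add(Bytes.toBytes("shop_id"), Bytes.toBytes(""), Bytes.toBytes(shopId));
    put.add(Bytes.toBytes("shop_product_id"), Bytes.toBytes(""), Bytes.toBytes(shopProductId));
    put.add(Bytes.toBytes("channel_id"), Bytes.toBytes(""), Bytes.toBytes(channelId));
    put.add(Bytes.toBytes("channel_category_id"), Bytes.toBytes(""), Bytes.toBytes(channelCategoryId));
    // ... likewise for active, contract_id, created_at, shop_category_id, updated_at
    table.put(put);
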
Code to scan the table:
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    Date startDate = new Date();
    Date finishDate;
    log(startDate + ": Get activation status started");

    // Request parameters; shop_product_ids may arrive as one comma-separated value.
    String shop_id = request.getParameter("shop_id");
    String[] shop_product_ids = request.getParameterValues("shop_product_ids");
    if (shop_product_ids != null && shop_product_ids.length == 1) {
        shop_product_ids = shop_product_ids[0].split(",");
    }
    String channel_id = request.getParameter("channel_id");
    String channel_category_id = request.getParameter("channel_category_id");
    String tableName = "channel_products";

    StringBuffer result = new StringBuffer("<?xml version=\"1.0\"?>");
    if (this.admin.tableExists(tableName)) {
        result.append("<result>");
        HTable table = new HTable(this.configuration, tableName);
        Scan scan = new Scan();

        // Each attribute is its own column family with an empty qualifier,
        // so every filter compares the single cell of that family.
        FilterList mainFilterList = new FilterList();
        if (shop_id != null) {
            mainFilterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes("shop_id"),
                    Bytes.toBytes(""), CompareFilter.CompareOp.EQUAL, Bytes.toBytes(shop_id)));
        }
        if (channel_id != null) {
            mainFilterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes("channel_id"),
                    Bytes.toBytes(""), CompareFilter.CompareOp.EQUAL, Bytes.toBytes(channel_id)));
        }
        if (channel_category_id != null) {
            mainFilterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes("channel_category_id"),
                    Bytes.toBytes(""), CompareFilter.CompareOp.EQUAL, Bytes.toBytes(channel_category_id)));
        }
        if (shop_product_ids != null && shop_product_ids.length > 0) {
            // Any of the requested product ids may match (OR semantics).
            List<Filter> filterList = new ArrayList<Filter>();
            for (String shop_product_id : shop_product_ids) {
                filterList.add(new SingleColumnValueFilter(Bytes.toBytes("shop_product_id"),
                        Bytes.toBytes(""), CompareFilter.CompareOp.EQUAL, Bytes.toBytes(shop_product_id)));
            }
            FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ONE, filterList);
            mainFilterList.addFilter(filters);
        }
        scan.setFilter(mainFilterList);

        ResultScanner scanner = null;
        try {
            scanner = table.getScanner(scan);
            for (Result item : scanner) {
                getItemXml(result, item);
            }
        } catch (Exception e) {
            logError("Error during table scan: ", e);
            result.append("<error>").append("Error during table scan: " + e).append("</error>");
        } finally {
            try {
                scanner.close();
            } catch (Exception e1) {
                // Scanner can be null, skip
            }
            result.append("</result>");
        }
    } else {
        result.append("<result>").append("Table " + tableName + " not exists!").append("</result>");
    }

    finishDate = new Date();
    log(finishDate + ": Get activation status finished, duration: "
            + (finishDate.getTime() - startDate.getTime()) + " ms");
    response.getOutputStream().print(result.toString());
}
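One thing I also plan to try on the client side (just a sketch of the idea; the
caching value of 100 and the lease-period note are my own assumptions, not
something taken from the logs):

    Scan scan = new Scan();
    scan.setFilter(mainFilterList);
    // Fetch rows in batches per RPC (HTable.setScannerCaching(100) should do the
    // same if Scan.setCaching is not available in this version). Note that the
    // SingleColumnValueFilters still make the region server read all ~6,000,000
    // rows, so a single next() call can run for a while; if it ever exceeds the
    // scanner lease (hbase.regionserver.lease.period, 60 seconds by default if I
    // recall correctly), the server drops the scanner.
    scan.setCaching(100);

    ResultScanner scanner = null;
    try {
        scanner = table.getScanner(scan);
        for (Result item : scanner) {
            getItemXml(result, item);
        }
    } finally {
        if (scanner != null) {
            scanner.close(); // release the server-side scanner instead of waiting for the lease to expire
        }
    }
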
I checked the region server logs, but the region server was not started:
> 2009-10-29 13:34:13,754 WARN
> org.apache.hadoop.hbase.regionserver.HRegionServer: Not starting a distinct
> region server because hbase.cluster.distributed is false
HBase and Hadoop were configured according to the "Getting Started" section on *
hadoop.org*. They are both running in pseudo-distributed mode.
Maybe I should set *hbase.cluster.distributed* to true?
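If true is indeed the right value here, then as far as I understand the
pseudo-distributed setup from the docs, conf/hbase-site.xml would look roughly
like this (a sketch; the port 9000 is my assumption, it has to match
fs.default.name in my Hadoop configuration):

    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <!-- must match fs.default.name of the Hadoop NameNode; the port is my assumption -->
        <value>hdfs://localhost:9000/hbase</value>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
    </configuration>
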
I'll try increasing the RAM capacity, and then I'll post the results here.
-------------------------------------------------
Best wishes, Artyom Shvedchikov