JvmMetrics

2009-02-13 Thread David Alves

Hi

	I ran into a use case where I need to keep two metrics contexts: one being Ganglia and the other being a file context (to do offline metrics analysis).
	I altered JvmMetrics to allow the user to supply a context instead of it getting one by name, and altered the file context so it can timestamp metrics collections (like log4j does).
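	Roughly the idea, as an untested sketch against the old org.apache.hadoop.metrics API (names are from memory and the context/record names are placeholders, this is not the actual patch):

import org.apache.hadoop.metrics.MetricsContext;
import org.apache.hadoop.metrics.MetricsRecord;
import org.apache.hadoop.metrics.MetricsUtil;

public class DualContextJvmMetrics {
  // Push the same JVM numbers to any context the caller hands us,
  // instead of hard-coding a single context looked up by name.
  public static void report(MetricsContext context, String processName) {
    MetricsRecord record = context.createRecord("jvm");
    record.setTag("processName", processName);
    Runtime rt = Runtime.getRuntime();
    record.setMetric("memHeapUsedM",
        (rt.totalMemory() - rt.freeMemory()) / (1024f * 1024f));
    record.update();
  }

  public static void main(String[] args) {
    // one call per context, e.g. the ganglia context and a file context
    // (which context each name maps to is whatever hadoop-metrics.properties says)
    report(MetricsUtil.getContext("jvm"), "MyDaemon");
    report(MetricsUtil.getContext("offline"), "MyDaemon");
  }
}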

Would be glad to submit a patch if anyone is interested.

Regards


Java RMI and Hadoop RecordIO

2009-01-19 Thread David Alves

Hi
	I've been testing some different serialization techniques to go along with a research project.
	I know the motivation behind Hadoop's serialization mechanism (e.g. Writable), and that the enhancement of this feature through Record I/O is not only about performance but also about control of the input/output.
	Still, I've been running some simple tests and I've found that plain RMI beats Hadoop Record I/O almost every time (14-16% faster).
	In my test I have a simple Java class with 14 int fields and 1 long field, and I'm serializing around 35,000 instances.
	Am I doing anything wrong? Are there ways to improve performance in Record I/O? Have I got the use case wrong?
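	For reference, the record I'm serializing is shaped roughly like the class below (a simplified, untested sketch that just writes the fields with a DataOutputStream; the real test drives the same fields through RMI and through a generated Record I/O type):

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Shape of the test record: 14 ints and 1 long.
public class TestRecord {
  private final int[] intFields = new int[14];
  private final long longField;

  public TestRecord(long longField) {
    this.longField = longField;
  }

  // Hand-rolled binary serialization, for comparison with RMI / Record I/O.
  public void write(DataOutputStream out) throws IOException {
    for (int v : intFields) {
      out.writeInt(v);
    }
    out.writeLong(longField);
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buffer);
    for (int i = 0; i < 35000; i++) {   // ~35,000 instances, as in the test
      new TestRecord(i).write(out);
    }
    out.flush();
    System.out.println("serialized bytes: " + buffer.size());
  }
}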


Regards
David Alves



Re: Map merge part makes the task timeout

2008-11-21 Thread David Alves

Hi again

	Browsing the source code (the Merger class) I see that the merger actually calls reporter.progress(), so shouldn't this keep the task reported as still working?
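	(For context, my understanding is that any long-running stretch of work just has to keep calling Reporter.progress() to avoid the timeout; an untested sketch of what I mean, with a made-up mapper:)

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class LongRunningMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    // Expensive per-record work would go here; calling progress() tells the
    // TaskTracker we are still alive even when nothing is written for a while.
    reporter.progress();
    output.collect(new Text("k"), value);
  }
}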


Regards
David Alves


On Nov 20, 2008, at 6:29 PM, David Alves wrote:


Hi all

	I have a big map task that takes a long time to complete and  
produces a lot of information.
	The merge part makes the task time out (you can see this at the end of this email; the merge is aborted after ten minutes, the default).
	I've increased the mapred.task.timeout property to 30 min instead of 10 in hadoop-site.xml, as follows:

	<property>
	  <name>mapred.task.timeout</name>
	  <value>180</value>
	</property>

But still the task fails with:
	Task attempt_200811201704_0001_m_00_0 failed to report status  
for 603 seconds. Killing!


Is there any other property I should change?

Regards
David Alves



17:35:40,697 INFO  [MapTask] Starting flush of map output
17:35:40,697 INFO  [MapTask] bufstart = 17472260; bufend = 51354916;  
bufvoid = 99614720
17:35:40,697 INFO  [MapTask] kvstart = 39119; kvend = 39468; length  
= 327680

17:35:40,950 INFO  [MapTask] Index: (0, 33884416, 33884416)
17:35:40,950 INFO  [MapTask] Finished spill 152
17:35:45,333 INFO  [Merger] Merging 153 sorted segments
17:35:46,337 INFO  [Merger] Merging 9 intermediate segments out of a  
total of 153
17:36:16,849 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 145
17:36:47,615 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 136
17:37:21,529 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 127
17:37:59,883 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 118
17:38:35,370 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 109
17:39:14,795 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 100
17:39:51,787 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 91
17:40:28,721 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 82
17:41:05,650 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 73
17:41:43,285 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 64
17:42:23,531 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 55
17:43:01,709 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 46
17:43:40,209 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 37
17:44:20,707 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 28
17:44:57,700 INFO  [Merger] Merging 10 intermediate segments out of  
a total of 19




Again UnknownScannerException

2008-11-21 Thread David Alves

Hi

	I've seen this issue a lot on the mailing list, but I still have a doubt.
	My map tasks keep failing with UnknownScannerException (2 map tasks on the same node, over a 3-node cluster with 4 GB of memory each, scanning almost 40 GB of data, running Hadoop 0.18.0 and HBase 0.18.0). This happened in the past, but since attempts still passed about 50% of the time an M/R task rarely failed completely; as the data increased, the UnknownScannerException now completely prevents the maps from running to completion. I'm only scanning the table, there are no inserts at the same time.
	I've previously seen mentioned a lease period I could increase. Is this the hbase.regionserver.lease.period property? Should I upgrade to HBase 0.18.1, and if so must I also update Hadoop?


Regards
David Alves




Map merge part makes the task timeout

2008-11-20 Thread David Alves

Hi all

	I have a big map task that takes a long time to complete and produces  
a lot of information.
	The merge part makes the task time out (you can see this at the end of this email; the merge is aborted after ten minutes, the default).
	I've increased the mapred.task.timeout property to 30 min instead of 10 in hadoop-site.xml, as follows:

	<property>
	  <name>mapred.task.timeout</name>
	  <value>180</value>
	</property>
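	(For jobs submitted from code, I believe the equivalent is something like the untested sketch below, with the timeout expressed in milliseconds:)

import org.apache.hadoop.mapred.JobConf;

public class TimeoutConfig {
  public static void main(String[] args) {
    JobConf conf = new JobConf(TimeoutConfig.class);
    // 30 minutes, in milliseconds; same effect as setting mapred.task.timeout
    // in hadoop-site.xml
    conf.setLong("mapred.task.timeout", 30L * 60L * 1000L);
    System.out.println(conf.get("mapred.task.timeout"));
  }
}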

But still the task fails with:
	Task attempt_200811201704_0001_m_00_0 failed to report status for  
603 seconds. Killing!


Is there any other property I should change?

Regards
David Alves



17:35:40,697 INFO  [MapTask] Starting flush of map output
17:35:40,697 INFO  [MapTask] bufstart = 17472260; bufend = 51354916;  
bufvoid = 99614720
17:35:40,697 INFO  [MapTask] kvstart = 39119; kvend = 39468; length =  
327680

17:35:40,950 INFO  [MapTask] Index: (0, 33884416, 33884416)
17:35:40,950 INFO  [MapTask] Finished spill 152
17:35:45,333 INFO  [Merger] Merging 153 sorted segments
17:35:46,337 INFO  [Merger] Merging 9 intermediate segments out of a  
total of 153
17:36:16,849 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 145
17:36:47,615 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 136
17:37:21,529 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 127
17:37:59,883 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 118
17:38:35,370 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 109
17:39:14,795 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 100
17:39:51,787 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 91
17:40:28,721 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 82
17:41:05,650 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 73
17:41:43,285 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 64
17:42:23,531 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 55
17:43:01,709 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 46
17:43:40,209 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 37
17:44:20,707 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 28
17:44:57,700 INFO  [Merger] Merging 10 intermediate segments out of a  
total of 19




Full table scan fails during map

2008-11-20 Thread David Alves

Hi guys

	We've had HBase (0.18.0, r695089) and Hadoop (0.18.0, r686010) running for a while, and apart from the occasional regionserver stopping without notice (and without explanation from what we can see in the logs), a problem we solve easily just by restarting it, we have now come to face a more serious problem of what I think is data loss.
	We use HBase as a links and documents database (similar to Nutch) in a 3-node cluster (4 GB of memory on each node); the links database has 14 regions and the document database now has 200 regions, for a total of 216 (with meta and root).
	After the crawl task, which went OK (we now have 60 GB of the 300 GB in HDFS full), we proceeded to do a full table scan to create the indexes, and that's where things started to fail.
	We are seeing a problem in the logs (at the end of this email). This repeats until there is a RetriesExhaustedException and the task fails in the map phase. The Hadoop fsck tool tells us that HDFS is OK. I still have to explore the rest of the logs searching for some kind of error; I will post a new mail if I find anything.


Any help would be greatly appreciated.

Regards
David Alves

2008-11-19 19:47:52,664 DEBUG org.apache.hadoop.dfs.DFSClient: DataStreamer block blk_-4521866854383825816_55401 wrote packet seqno:0 size:38 offsetInBlock:0 lastPacketInBlock:true
2008-11-19 19:47:52,676 DEBUG org.apache.hadoop.dfs.DFSClient: DFSClient received ack for seqno 0
2008-11-19 19:47:52,676 DEBUG org.apache.hadoop.dfs.DFSClient: Closing old block blk_-4521866854383825816_55401
2008-11-19 19:47:52,769 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Added /hbase/links/1617869663/docDatum/mapfiles/7718188406431341070 with 20622 entries, sequence id 5289673, data size 5.6m, file size 6.0m
2008-11-19 19:47:52,770 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache flush for region links,ext://myrepo/mypath/MYDOC.pdf,1227122254743 in 3015ms, sequence id=5289673, compaction requested=false
2008-11-19 19:53:17,524 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Error opening scanner (fsOk: true)
java.io.IOException: HStoreScanner failed construction
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:70)
	at org.apache.hadoop.hbase.regionserver.HStoreScanner.<init>(HStoreScanner.java:68)
	at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:1916)
	at org.apache.hadoop.hbase.regionserver.HRegion$HScanner.<init>(HRegion.java:1954)
	at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1345)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1170)
	at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:554)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://cyclops-prod-1:9000/hbase/document/153945136/docDatum/mapfiles/5163556575658593611/data
	at org.apache.hadoop.dfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:394)
	at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:695)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
	at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1414)
	at org.apache.hadoop.io.MapFile$Reader.createDataFileReader(MapFile.java:301)
	at org.apache.hadoop.hbase.regionserver.HStoreFile$HbaseMapFile$HbaseReader.createDataFileReader(HStoreFile.java:650)
	at org.apache.hadoop.io.MapFile$Reader.open(MapFile.java:283)
	at org.apache.hadoop.hbase.regionserver.HStoreFile$HbaseMapFile$HbaseReader.<init>(HStoreFile.java:632)
	at org.apache.hadoop.hbase.regionserver.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:714)
	at org.apache.hadoop.hbase.regionserver.HStoreFile$HalfMapFileReader.<init>(HStoreFile.java:908)
	at org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:408)
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:96)
	at org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:67)
	... 10 more




Compound filters

2008-05-21 Thread David Alves
Hi Guys

I currently need to build some compound filters for column values (I need OR, but it could easily be extended to AND, OR, NOT grouped in any way) comparing byte[] values. The objective is to filter the dataset that is fed into M/R jobs.
Would this be an interesting feature, or is it already planned/implemented in some way that I don't know about?
I'm thinking that with this functionality RegExpRowFilter could just focus on matching row keys and leave column matching to these filters.
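Roughly the shape I have in mind (an untested sketch; ColumnValueFilter is a hypothetical minimal interface just to show the composition idea, not the existing RowFilterInterface):

import java.util.Arrays;
import java.util.List;

public class CompoundFilterSketch {

  // Hypothetical minimal filter contract: does this column value pass?
  interface ColumnValueFilter {
    boolean accept(byte[] column, byte[] value);
  }

  // Passes when a given column's value equals an expected byte[].
  static ColumnValueFilter valueEquals(final byte[] column, final byte[] expected) {
    return new ColumnValueFilter() {
      public boolean accept(byte[] col, byte[] value) {
        return Arrays.equals(col, column) && Arrays.equals(value, expected);
      }
    };
  }

  // OR composition; AND and NOT would follow the same pattern.
  static ColumnValueFilter or(final List<ColumnValueFilter> filters) {
    return new ColumnValueFilter() {
      public boolean accept(byte[] col, byte[] value) {
        for (ColumnValueFilter f : filters) {
          if (f.accept(col, value)) {
            return true;
          }
        }
        return false;
      }
    };
  }

  public static void main(String[] args) {
    ColumnValueFilter f = or(Arrays.asList(
        valueEquals("cf1:a".getBytes(), "myvalue1".getBytes()),
        valueEquals("cf2:b".getBytes(), "myvalue2".getBytes())));
    System.out.println(f.accept("cf1:a".getBytes(), "myvalue1".getBytes())); // true
    System.out.println(f.accept("cf2:b".getBytes(), "other".getBytes()));    // false
  }
}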

Regards
David



NotServingRegionException revisited

2008-04-28 Thread David Alves
Hi Guys

I have found what I think is a strange case. Last Friday an M/R task failed constantly (if a task fails for some reason it is later rerun a number of times to make sure service outages won't stop the process) with NotServingRegionException.
The thing here is that that particular region is ONLINE (at least that's what I can tell from a select * from .META.), it is not a split, and it is not retiring (no retiring info in the logs).
It is not an occasional thing, because the task keeps failing (even after a cluster restart).
So how can an ONLINE region (as reported by a .META. scanner) not be in the onlineRegions map in HRegionServer?
Any ideas?

Regards
David Alves

Partial Logs/Info (this keeps appearing so only one result is shown):
Master:

2008-04-28 18:44:59,235 DEBUG
org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner
regioninfo: {regionname: cyclops-documents-database,,1209061263654,
startKey: , endKey:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/DEVELOPING
 INTRANET APPLICATIONS WITH JAVA/ch5.htm, encodedName: 485063880, tableDesc: 
{name: cyclops-documents-database, families: {documentDbContent:={name: 
documentDbContent, max versions: 3, compression: NONE, in memory: false, block 
cache enabled: false, max length: 2147483647, bloom filter: none}, 
documentDbCrawlDatum:={name: documentDbCrawlDatum, max versions: 3, 
compression: NONE, in memory: false, block cache enabled: false, max length: 
2147483647, bloom filter: none}, documentDbMetadata:={name: documentDbMetadata, 
max versions: 3, compression: NONE, in memory: false, block cache enabled: 
false, max length: 2147483647, bloom filter: none}, documentDbRepoDatum:={name: 
documentDbRepoDatum, max versions: 3, compression: NONE, in memory: false, 
block cache enabled: false, max length: 2147483647, bloom filter: none, 
server: 10.0.0.1:60020, startCode: 1209390438896

Region Server:

2008-04-28 18:45:29,028 INFO org.apache.hadoop.ipc.Server: IPC Server
handler 6 on 60020, call
batchUpdate(cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/DEVELOPING
 INTRANET APPLICATIONS WITH JAVA/ch5.htm,1209061263655, [EMAIL PROTECTED]) from 
10.0.0.2:47636: error: org.apache.hadoop.hbase.NotServingRegionException: 
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/DEVELOPING
 INTRANET APPLICATIONS WITH JAVA/ch5.htm,1209061263655
org.apache.hadoop.hbase.NotServingRegionException:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/DEVELOPING
 INTRANET APPLICATIONS WITH JAVA/ch5.htm,1209061263655
at
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1318)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:1280)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServer.java:1098)
at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HbaseRPC
$Server.call(HbaseRPC.java:413)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)







RE: NotServingRegionException revisited

2008-04-28 Thread David Alves
Hi Again 

After going through the logs a bit more carefully I found an FNFE (FileNotFoundException) while trying to do a compaction on that particular region. The relevant log follows attached.
After the compaction failed because of the FNFE, the region is still online in .META. but no longer among the online regions in the region server, which I suspect causes my problem, right?

Regards
David Alves



 -Original Message-
 From: David Alves [mailto:[EMAIL PROTECTED]
 Sent: Monday, April 28, 2008 6:31 PM
 To: hbase-user@hadoop.apache.org
 Subject: NotServingRegionException revisited
 
 Hi Guys
 
   I have found, what I think is a strange case. Last Friday a M/R task
 failed constantly (if a task fails for some reason it is later reran a
 number of times to make sure service outages won't stop the process)
 with NotServingRegionException.
   The thing here is that that particular region is ONLINE (at least
 its
 what I can tell from a select * from .META. and it is not a split, and
 it is not retiring (no retiring info in logs).
   It is not a ocasional thing because the task keeps failing (even
 after
 a cluster restart).
   So how can a ONLINE region (as reported by a .META. scanner) not be
 on
 the onlineRegions map in HRegionServer?
   Any ideas?
 
 Regards
 David Alves
 
 Partial Logs/Info (this keeps appearing so only one result is shown):
 Master:
 
 2008-04-28 18:44:59,235 DEBUG
 org.apache.hadoop.hbase.master.BaseScanner: RegionManager.metaScanner
 regioninfo: {regionname: cyclops-documents-database,,1209061263654,
 startKey: , endKey:
 smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-
 CyclopsRepoLocation/EBooks/DEVELOPING INTRANET APPLICATIONS WITH
 JAVA/ch5.htm, encodedName: 485063880, tableDesc: {name: cyclops-
 documents-database, families: {documentDbContent:={name:
 documentDbContent, max versions: 3, compression: NONE, in memory: false,
 block cache enabled: false, max length: 2147483647, bloom filter: none},
 documentDbCrawlDatum:={name: documentDbCrawlDatum, max versions: 3,
 compression: NONE, in memory: false, block cache enabled: false, max
 length: 2147483647, bloom filter: none}, documentDbMetadata:={name:
 documentDbMetadata, max versions: 3, compression: NONE, in memory: false,
 block cache enabled: false, max length: 2147483647, bloom filter: none},
 documentDbRepoDatum:={name: documentDbRepoDatum, max versions: 3,
 compression: NONE, in memory: false, block cache enabled: false, max
 length: 2147483647, bloom filter: none, server: 10.0.0.1:60020,
 startCode: 1209390438896
 
 Region Server:
 
 2008-04-28 18:45:29,028 INFO org.apache.hadoop.ipc.Server: IPC Server
 handler 6 on 60020, call
 batchUpdate(cyclops-documents-
 database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-
 /Critical/Biblioteca-CyclopsRepoLocation/EBooks/DEVELOPING INTRANET
 APPLICATIONS WITH JAVA/ch5.htm,1209061263655,
 [EMAIL PROTECTED]) from 10.0.0.2:47636: error:
 org.apache.hadoop.hbase.NotServingRegionException: cyclops-documents-
 database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-
 /Critical/Biblioteca-CyclopsRepoLocation/EBooks/DEVELOPING INTRANET
 APPLICATIONS WITH JAVA/ch5.htm,1209061263655
 org.apache.hadoop.hbase.NotServingRegionException:
 cyclops-documents-
 database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-
 /Critical/Biblioteca-CyclopsRepoLocation/EBooks/DEVELOPING INTRANET
 APPLICATIONS WITH JAVA/ch5.htm,1209061263655
 at
 org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer
 .java:1318)
 at
 org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer
 .java:1280)
 at
 org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdate(HRegionServ
 er.java:1098)
 at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorIm
 pl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hbase.ipc.HbaseRPC
 $Server.call(HbaseRPC.java:413)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)
 
 
 



Lost Rows

2008-04-23 Thread David Alves
Hi Guys

Regarding my previous problems, I'm glad to say that I can now crawl an entire repository with only a small percentage of failed tasks; the latest HBase version plus the correction of the replication property seemed to solve it for me.
Still, I have two issues I'd appreciate your input on.
The first one regards splits. I've made a small tool (built upon stack's one) that checks DB state and can online/offline tables, merge regions, etc. This tool gives me the report at the end of this email. The question here is that I seem to have lost 144 rows (comparing the output format's output records and the actual rows in the table from a select count(*)). I suspect these rows are in the offline splits. Can I use my tool to merge the splits against their online parents using HRegion.merge()? Or is it a big no-no?
The second issue is more problematic. I misconfigured my last job and it ran 10 maps instead of the 1 it should have, but under that kind of load HBase completely failed and regionservers went down; at one time I had to completely erase the database because it wouldn't start again (I suspect .META. was offline), the other time I was able to recover all the data by simply restarting it. Is there any kind of procedure I should use in this situation?

Best Regards
David Alves

Log Trace:
Found region: cyclops-documents-database,,1208892792201
Id: 1208892792201
Start Key: 
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/HOW
 TO USE HTML 3.2/ch6.htm
Online/Offline Status: ONLINE
Split?: FALSE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/HOW
 TO USE HTML 3.2/ch6.htm,1208892792202
Id: 1208892792202
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/HOW
 TO USE HTML 3.2/ch6.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm
Online/Offline Status: ONLINE
Split?: FALSE
DEBUG 23-04 14:54:50,744 (DFSClient.java:readChunk:934)  -DFSClient
readChunk got seqno 2 offsetInBlock 8192 lastPacketInBlock false
packetLen 4132
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm,1208891918491
Id: 1208891918491
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION USING MICROSOFT BACKOFFICE, VOLUME 1/ch05/06.htm
Online/Offline Status: OFFLINE
Split?: TRUE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm,1208893494772
Id: 1208893494772
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
 SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm
Online/Offline Status: ONLINE
Split?: FALSE
DEBUG 23-04 14:54:50,754 (DFSClient.java:readChunk:934)  -DFSClient
readChunk got seqno 3 offsetInBlock 12288 lastPacketInBlock false
packetLen 4132
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm,1208893494773
Id: 1208893494773
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/SPECIAL
 EDITION USING MICROSOFT BACKOFFICE, VOLUME 1/ch05/06.htm
Online/Offline Status: OFFLINE
Split?: TRUE
Found region:
cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm,1208894034845
Id: 1208894034845
Start Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium Edition Using VB 5/ch14/09.htm
End Key:
smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/Platinium
 Edition Using VB 5/Books/Platinium

Re: Lost Rows

2008-04-23 Thread David Alves
On Wed, 2008-04-23 at 11:22 -0700, stack wrote:
 Here's a few things David.
 
 Regards your tool, you could have just done 'select info:regioninfo from 
 .META.;' and it would output same data (If you did something like echo 
 'select info:regioninfo from .META.;' |./bin/hbase shell --html  
 /tmp/meta.html, the output would be html'ized and easier to read than 
 an ascii table).
Regarding the tool: well, I knew that I could just do a select * from .META. to get the info; the point here was that I needed to do stuff on the regions based on their state. Besides, now I use a SOCKS proxy with my tool, which allows me to check on the cluster from my laptop :) The output you saw was from logs; I actually pretty-print the info in my application (both web and console). As HQL will be deprecated anyhow, it seemed like a good idea.
 
 If you want to do merging of regions, check out the main on 
 org.apache.hadoop.hbase.util.Merge.
 
will check it out
 Regards offline regions, looking at your report below, all offlined 
 regions look legit. Their online status is offline but they also have 
 the split attribute set (On split, the parent is offlined. The daughter 
 regions take its place.  The parent hangs around until such time as the 
 daughters no longer hold reference to the parent.  Then the parent is 
 deleted).
 
Ok.
 Regards the 144 missing rows, is it possible you fed your map task 
 duplicates?  The duplicates would increment the map count of inputs 
 processed but reduce would squash the duplicates together and output a 
 single row.  If you don't have that many rows, perhaps output inputs and 
 outputs and try to figure where the 144 are going missing?
 
The missing rows were counted from TableOutputFormat reduce output
records (from the M/R job) and matched against a select count(*) so even
if the maps were fed duplicates there are still missing rows.
 Regards hbase buckling under load, please send us logs.  If you are 
 using TRUNK, it should be able to easily carry ten concurrent clients 
 and where it can't, it puts up a gate to block updates.  It shouldn't be 
 falling over.
 
Well, one of the times (the one I could recover from) I saw a lot of NotServingRegionExceptions in the logs, which I think falls into the graceful failure category you mentioned; the other time all hell broke loose (like EOFExceptions reading from .META.), but I still saw a thread dump in the logs so maybe it just OOMEd. I will send the relevant part of the logs separately because they are quite huge.

On another matter, must HBase really log (even at debug) all filter calls? That accounts for about 70% of my logs.

Best Regards
David

 Thanks D,
 St.Ack
 
 David Alves wrote:
  Hi Guys
 
  Regarding my previous problems  I'm glad to say that I can now crawl an
  entire repository with only a small percentage of failed tasks, last
  hbase version plus the correction of replication property seemed to
  solve it for me.
  Still I have two issues I'd appreciate your input in. 
  The first one regards splits. I've made a small tool (built upon
  stack's one) that checks DB state, and can online/offline tables and
  merge regions etc. This tool gives me the report ant the end of this
  email. The question here Is that I seem to have lost 144 rows (comparing
  the output formats output records and the actual rows in the table from
  a select count(*)). I suspect these rows are in the offline splits. Can
  I use my tool to merge the splits against their online parents using
  HRegion.merge() ? Or is it a big no no.
  The second issue is more problematic, I misconfigured my last job and
  it ran 10 maps instead of the 1 it should, but when under that kind of
  load hbase completely failed, regionservers went down, at one time I had
  to completely erase the database because it wouldn't start again (I
  suspect .META. was offline) the other time I was able to recover all the
  data by simply restarting it. Is there any kind of procedure I should
  use in this situation?
  Best Regards
  David Alves
 
  Log Trace:
  Found region: cyclops-documents-database,,1208892792201
  Id: 1208892792201
  Start Key: 
  End Key:
  smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/HOW
   TO USE HTML 3.2/ch6.htm
  Online/Offline
   Status: ONLINE
  Split?: FALSE
  Found region:
  cyclops-documents-database,smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/HOW
   TO USE HTML 3.2/ch6.htm,1208892792202
  Id: 1208892792202
  Start Key:
  smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/HOW
   TO USE HTML 3.2/ch6.htm
  End Key:
  smb://cbrfileserv.critical.pt/CyclopsRepoLocation-/Critical/Biblioteca-CyclopsRepoLocation/EBooks/LINUX
   SYSTEM ADMINISTRATOR'S SURVIVAL GUIDE TABLE OF CONTENTS/lsg14.htm
  Online/Offline Status: ONLINE
  Split?: FALSE
  DEBUG 23-04 14:54:50,744 (DFSClient.java:readChunk:934

RE: Lost Rows

2008-04-23 Thread David Alves
agreed. thanks Jim.
On Wed, 2008-04-23 at 12:13 -0700, Jim Kellerman wrote:
 While log4j supports TRACE, apache commons logging does not, so those trace 
 messages will come out when DEBUG is set. To disable the filter messages, 
 just add the following to your log4j.properties file:
 
 log4j.logger.org.apache.hadoop.hbase.filter=INFO
 
 ---
 Jim Kellerman, Senior Engineer; Powerset
 
 
  -Original Message-
  From: Clint Morgan [mailto:[EMAIL PROTECTED]
  Sent: Wednesday, April 23, 2008 12:01 PM
  To: hbase-user@hadoop.apache.org
  Subject: Re: Lost Rows
 
  On Wed, Apr 23, 2008 at 11:58 AM, David Alves
  [EMAIL PROTECTED] wrote:
 
On another matter must hbase really log (even in debug)
  all filter
   calls? Thats stands for about 70% of my logs.
 
  Agreed, I'll drop those messages to trace.
 
 



Concurrent Modification Exceptions in logs

2008-04-21 Thread David Alves
Hi Guys 

My NPE problem on online table lookup seems to have gone away (at least until now); I think the cause was different dfs.replication values for Hadoop and HBase (thanks st.ack for pointing it out). Now I'm just struggling with region offline exceptions :).
I'm seeing some CMEs (ConcurrentModificationExceptions) in the logs. They occurred while I still had mismatched dfs.replication settings between Hadoop and HBase, but I still thought you should know.

Regards
David Alves

Trace:
2008-04-21 13:20:46,443 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message
(Retry: 0)
java.io.IOException: java.io.IOException:
java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
at java.util.HashMap$ValueIterator.next(HashMap.java:822)
at
org.apache.hadoop.hbase.master.ServerManager.processMsgs(ServerManager.java:350)
at
org.apache.hadoop.hbase.master.ServerManager.processRegionServerAllsWell(ServerManager.java:299)
at
org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:217)
at
org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:560)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.ipc.HbaseRPC$Server.call(HbaseRPC.java:413)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:896)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:82)
at
org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:48)
at
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:388)
at java.lang.Thread.run(Thread.java:619)







Make NameNode listen on multiple interfaces

2008-04-18 Thread David Alves
Hi
In my setup I have a cluster in which each server has two network interfaces: one for Hadoop network traffic (let's call it A) and one for traffic to the rest of the network (let's call it B).
Until now I only needed to make the nodes communicate with the master and vice versa (through the A interface), so no problem there. But now I need to submit jobs and access the filesystem itself from outside machines (through the B interface), so my question is: can I make the NameNode listen on both interfaces?

Regards
David Alves



Strange logging behaviour

2008-04-18 Thread David Alves
Hi Again

On another note from the previous email: even though I followed the FAQ, my ..master...log is showing only one error entry and the .out is showing only INFO entries. Is this normal? (I'm logging to a different dir than the default by altering the relevant property in hbase-env.sh.)

Regards
David Alves



Re: Strange logging behaviour

2008-04-18 Thread David Alves

The NPE problem I'm currently having didn't happen with the released version, but I still got timeouts and offline region problems.
So I tried migrating, and have already migrated most of my code to comply with the new APIs (which are a lot better, by the way, congrats). Due to this I would like to keep using trunk, but I will go back to the released version if you think that is better.

Best Regards
David


On Fri, 2008-04-18 at 10:49 -0700, stack wrote:
 It didn't?  Wasn't it the same issue of table regions being offlined or was
 it something else?
 Thanks,
 St.Ack
 
 On Tue, Apr 29, 2008 at 10:28 AM, David Alves [EMAIL PROTECTED]
 wrote:
 
  Hi Again
 
 On another note from the previous email my even though I followed
  the
  faq my ..master...log is showing only one error entry and the .out is
  showing only INFO entries is this normal? (i'm logging to a different
  dir that default by altering the relevant property in hbase-env.sh)
 
  Regards
  David Alves
 
 



Re: Strange logging behaviour

2008-04-18 Thread David Alves
I fully understand your point; I know that trunk is not guaranteed to be stable, and by no means was I expecting it to be.
As you may imagine this is not a mission-critical application and it is still in the inception phase. In fact, when I refer to production I mean future production that, for the time being, is only available to a limited set of beta users. I rolled HBase trunk onto the production cluster in order to check whether the timeout and region offline issues would go away, as the servers are better there, but ran into the NPE problem happening every time.
Still, I think this must be a relevant problem, so I thought I could get your help debugging/solving it so that both my application and HBase would progress forward, I wouldn't need to revert my app to the old APIs, and I would learn a bit more about HBase in the process.

Regards
David
 
On Fri, 2008-04-18 at 11:21 -0700, stack wrote:
 TRUNK comes with the usual disclaimer: no guarantees that its stable.
 Whereas with releases, if they are not stable, we'll stop work on TRUNK to
 fix release problems and try and roll a new one quicIkly.  If you're trying
 to run hbase in a production context, would suggest you use release unless
 there is an explicit feature you need that is only in TRUNK.
 
 If logging is not working correctly in TRUNK then its going to be hard for
 us to help you out since you can't pass us detail of sufficient detail (Its
 broke for you, right)?  I was going to look at trying to figure it in a
 bit
 
 St.Ack
 
 
 On Tue, Apr 29, 2008 at 11:01 AM, David Alves [EMAIL PROTECTED]
 wrote:
 
 
  The NPE problem I'm currently having didn't happen with the released
  version but I still got time outs and offline region problems.
  So I tryed migrating and have already migrated most of my code to comply
  to the new APIs (which are a lot better by the way congrats). Due to
  this fact I would like to keep using trunk but will get back to the
  released version if you think is better.
 
  Best Regards
  David
 
 
  On Fri, 2008-04-18 at 10:49 -0700, stack wrote:
   It didn't?  Wasn't it the same issue of table regions being offlined or
  was
   it something else?
   Thanks,
   St.Ack
  
   On Tue, Apr 29, 2008 at 10:28 AM, David Alves 
  [EMAIL PROTECTED]
   wrote:
  
Hi Again
   
   On another note from the previous email my even though I
  followed
the
faq my ..master...log is showing only one error entry and the .out is
showing only INFO entries is this normal? (i'm logging to a different
dir that default by altering the relevant property in hbase-env.sh)
   
Regards
David Alves
   
   
 
 



Regions Offline

2008-04-17 Thread David Alves
Hi

My system is quite simple:
- two servers (one quad core, one dual core) with 2 GB of memory and 150 GB allocated to DFS.
- I use it to crawl multiple sources, but mainly filesystems, and save the results into HBase (not too many files, < 100,000, but rows can easily get to 30 MB each).

I'm constantly getting NullPointerExceptions (on the client, caused by NotServingRegionExceptions on the regionserver) when creating tables, RegionOfflineExceptions when doing puts, or sometimes just timeouts.
When I started with HBase I developed in 'local' mode. I then migrated to a small 2-server dev cluster (weaker than production is now) where I tested the functionality, and it worked fine, but, my bad, due to a pressing schedule I didn't do any real load tests, so the system is now continuously going under in production. I've only been able to do a full crawl by resetting the cluster to one node and putting it in 'local' mode.

My question is: what can cause regions to be offline in regionservers?

I ask so that I can investigate the matter further, but I need a starting point.

I'm willing to help any way I can, but I would really appreciate any help and/or a starting point and tools for my investigation.


Best Regards
David Alves



Batch update gain

2008-04-15 Thread David Alves
Hi All

I'm currently rewriting my own TableOutputFormat classes to comply with the new APIs introduced in the latest version, and I was wondering if it would be valuable to rewrite them as buffered writers, meaning keeping a predetermined set of records (bounded by size, to avoid OOMEs) before committing them to HBase. A rough sketch of what I mean is below.
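Something like this untested sketch is what I mean; commitBatch() is just a stand-in for whatever HTable/BatchUpdate call the output format would actually use, and I buffer by row count here where in practice it would be by byte size:

import java.util.ArrayList;
import java.util.List;

// Sketch of a bounded buffering writer: collect rows until the buffer
// reaches a threshold, then push them to HBase in one go.
public class BufferedTableWriterSketch<ROW> {

  private final List<ROW> buffer = new ArrayList<ROW>();
  private final int maxBufferedRows;

  public BufferedTableWriterSketch(int maxBufferedRows) {
    this.maxBufferedRows = maxBufferedRows;
  }

  public void write(ROW row) {
    buffer.add(row);
    if (buffer.size() >= maxBufferedRows) {
      flush();
    }
  }

  public void close() {
    flush(); // make sure the tail of the buffer is committed
  }

  private void flush() {
    if (buffer.isEmpty()) {
      return;
    }
    commitBatch(buffer);
    buffer.clear();
  }

  // Placeholder for the actual HBase commit of the buffered rows.
  protected void commitBatch(List<ROW> rows) {
    System.out.println("committing " + rows.size() + " rows");
  }
}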

What are your thoughts about this?

On another note, I think it would be valuable to make the TableInputFormat class extendable. For example, in my case I needed a filtered (RegExpRowFilter) TableInputFormat and could not extend the original because its HTable instance is package protected.

 

Best regards

David Alves



Re: Batch update gain

2008-04-15 Thread David Alves
Hi
Yes, I was thinking of batch (multiple-row) updates, but only then did I realize that the old commit-with-lock methods were deprecated, so forget I mentioned it.
About HBASE-581, I'll drop my comments in JIRA.
On another note, in my application I have a region that became offline. Is there a way of making it online again (I restarted the application several times and it didn't help)?

Regards
David Alves
On Tue, 2008-04-15 at 09:09 -0700, stack wrote:
 David Alves wrote:
  Hi All
 
  I'm currently rewriting my own TableOutputFormat classes to
  comply with the new APIs introduced in the latest version and I was
  wondering if it would be valuable to rewrite them as buffered writers,
  meaning keeping a predetermined set of records (set by size to avoid OOME)
  before commiting them to HBase.

 
 Commits are by row.  Are you talking of batching up rows before 
 forwarding them to hbase?
 
  What are your thoughs about this?
 
  In another note I think it would be valuable to rewrite the
  TableInputFormat class to be extendable. For example in my case I needed a
  Filtered (RegExpRowFilter) TableInputFormat and could not extend the
  original because its instance of HTable is package protected.

 This needs to be done before 0.2.0 release.   Its been on my mind.  I 
 just made a JIRA for it.  Dump any thoughts you have on how it might 
 work into hbase-581.  At a minimum, at note on what currently prevents 
 your being able to subclass.
 
 If you are currently working on this, I could do the hbase end for you.  
 Just say.
 
 St.Ack



RE: StackOverFlow Error in HBase

2008-04-03 Thread David Alves
Hi Jim and all

I'll commit to testing the patch under the same conditions as it failed before (with around 36,000 records), but at this precise moment I'm preparing my next development iteration, which means a lot of meetings.
By the end of the day tomorrow (Friday) I should have a confirmation of whether the patch worked (or not).

Regards
David Alves

On Thu, 2008-04-03 at 09:12 -0700, Jim Kellerman wrote:
 David,
 
 Have you had a chance to try this patch? We are about to release hbase-0.1.1 
 and until we receive a confirmation in HBASE-554 from another person who has 
 tried it and verifies that it works, we cannot include it in this release. If 
 it is not in this release, there will be a significant wait for it to appear 
 in an hbase release. hbase-0.1.2 will not happen anytime soon unless there 
 are critical issues that arise that have not been fixed in 0.1.1. hbase-0.2.0 
 is also some time in the future. There are a significant number of issues to 
 address before that release is ready.
 
 Frankly, I'd like to see this patch in 0.1.1, because it is an issue for 
 people that use filters.
 
 The alternative would be for Clint to supply a test case that fails without 
 the patch but passes with the patch.
 
 We will hold up the release, but need a commitment either from David to test 
 the patch or for Clint to supply a test. We need that commitment by the end 
 of the day today 2008/04/03 along with an eta as to when it will be completed.
 
 ---
 Jim Kellerman, Senior Engineer; Powerset
 
 
  -Original Message-
  From: David Alves [mailto:[EMAIL PROTECTED]
  Sent: Tuesday, April 01, 2008 2:36 PM
  To: hbase-user@hadoop.apache.org
  Subject: RE: StackOverFlow Error in HBase
 
  Hi
 
  I just deployed the unpatched version.
  Tomorrow I'll rebuild the system with the patch and
  try it out.
  Thanks again.
 
  Regards
  David Alves
 
   -Original Message-
   From: Jim Kellerman [mailto:[EMAIL PROTECTED]
   Sent: Tuesday, April 01, 2008 10:04 PM
   To: hbase-user@hadoop.apache.org
   Subject: RE: StackOverFlow Error in HBase
  
   David,
  
   Have you tried this patch and does it work for you? If so we'll
   include it
   hbase-0.1.1
  
   ---
   Jim Kellerman, Senior Engineer; Powerset
  
  
-Original Message-
From: David Alves [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 01, 2008 10:44 AM
To: hbase-user@hadoop.apache.org
Subject: RE: StackOverFlow Error in HBase
   
Hi
Thanks for the prompt path Clint, St.Ack and all you guys.
   
Regards
David Alves
   
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
  On Behalf
 Of Clint Morgan
 Sent: Tuesday, April 01, 2008 2:04 AM
 To: hbase-user@hadoop.apache.org
 Subject: Re: StackOverFlow Error in HBase

 Try the patch at
  https://issues.apache.org/jira/browse/HBASE-554.

 cheers,
 -clint

 On Mon, Mar 31, 2008 at 5:39 AM, David Alves
 [EMAIL PROTECTED] wrote:
  Hi ... again
 
  In my previous mail I stated that increasing the
stack size
 solved the
   problem, well I jumped a little bit to the conclusion,
in fact it
  didn't, the StackOverFlowError always occurs at the end
of the cycle
  when no more records match the filter. Anyway I've
  rewritten my
  application to use a normal scanner and and do the
filtering after
  which is not optimal but it works.
  I'm just saying this because it might be a clue,
in previous
 versions
   (!= 0.1.0) even though a more serious problem happened
  (regionservers  became irresponsive after so many
  records) this
  didn't happen. Btw in  current version I notice no, or
very small,
  decrease of thoughput with  time, great work!
 
   Regards
   David Alves
 
 
 
 
 
 
 
   On Mon, 2008-03-31 at 05:18 +0100, David Alves wrote:
Hi again
   
  As I was almost at the end (80%) of indexable
docs, for the
 time
being I simply increased the stack size, which
  seemed to work.
  Thanks for your input St.Ack really helped me
solve the problem
 at
least for the moment.
  On another note in the same method I changed
  the way the
 scanner was
obtained when htable.getStartKeys() would be more than
1, so that
  I
 could
limit the records read each time to a single
  region, and the
  scanning
 would
start at the last region, strangely the number of keys
obtained
  by   htable.getStartKeys() was always 1 even though
  by the end
  there are
 already
21 regions.
  Any thoughts?
   
Regards
David Alves
   
 -Original Message-
 From: stack [mailto:[EMAIL PROTECTED]Sent:
Sunday, March

RE: StackOverFlow Error in HBase

2008-04-01 Thread David Alves
Hi
Thanks for the prompt patch, Clint, St.Ack and all you guys.

Regards
David Alves

 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Clint
 Morgan
 Sent: Tuesday, April 01, 2008 2:04 AM
 To: hbase-user@hadoop.apache.org
 Subject: Re: StackOverFlow Error in HBase
 
 Try the patch at https://issues.apache.org/jira/browse/HBASE-554.
 
 cheers,
 -clint
 
 On Mon, Mar 31, 2008 at 5:39 AM, David Alves
 [EMAIL PROTECTED] wrote:
  Hi ... again
 
  In my previous mail I stated that increasing the stack size
 solved the
   problem, well I jumped a little bit to the conclusion, in fact it
   didn't, the StackOverFlowError always occurs at the end of the cycle
   when no more records match the filter. Anyway I've rewritten my
   application to use a normal scanner and and do the filtering after
   which is not optimal but it works.
  I'm just saying this because it might be a clue, in previous
 versions
   (!= 0.1.0) even though a more serious problem happened (regionservers
   became irresponsive after so many records) this didn't happen. Btw in
   current version I notice no, or very small, decrease of thoughput with
   time, great work!
 
   Regards
   David Alves
 
 
 
 
 
 
 
   On Mon, 2008-03-31 at 05:18 +0100, David Alves wrote:
Hi again
   
  As I was almost at the end (80%) of indexable docs, for the
 time
being I simply increased the stack size, which seemed to work.
  Thanks for your input St.Ack really helped me solve the problem
 at
least for the moment.
  On another note in the same method I changed the way the
 scanner was
obtained when htable.getStartKeys() would be more than 1, so that I
 could
limit the records read each time to a single region, and the scanning
 would
start at the last region, strangely the number of keys obtained by
htable.getStartKeys() was always 1 even though by the end there are
 already
21 regions.
  Any thoughts?
   
Regards
David Alves
   
 -Original Message-
 From: stack [mailto:[EMAIL PROTECTED]
 Sent: Sunday, March 30, 2008 9:36 PM
 To: hbase-user@hadoop.apache.org
 Subject: Re: StackOverFlow Error in HBase

 You're doing nothing wrong.

 The filters as written recurse until they find a match.  If long
 stretches between matching rows, then you will get a
 StackOverflowError.  Filters need to be changed.  Thanks for
 pointing
 this out.  Can you do without them for the moment until we get a
 chance
 to fix it?  (HBASE-554)

 Thanks,
 St.Ack



 David Alves wrote:
  Hi St.Ack and all
 
The error always occurs when trying to see if there are more
 rows to
  process.
Yes I'm using a filter(RegExpRowFilter) to select only the rows
 (any
  row key) that match a specific value in one of the columns.
Then I obtain the scanner just test the hasNext method, close
 the
  scanner and return.
Am I doing something wrong?
Still StackOverflowError is not supposed to happen right?
 
  Regards
  David Alves
  On Thu, 2008-03-27 at 12:36 -0700, stack wrote:
 
  You are using a filter?  If so, tell us more about it.
  St.Ack
 
  David Alves wrote:
 
  Hi guys
 
  I 'm using HBase to keep data that is later indexed.
  The data is indexed in chunks so the cycle is get 
 records index
  them check for more records etc...
  When I tryed the candidate-2 instead of the old 0.16.0
 (which I
  switched to do to the regionservers becoming unresponsive) I
 got the
  error in the end of this email well into an indexing job.
  So you have any idea why? Am I doing something wrong?
 
  David Alves
 
  java.lang.RuntimeException:
 org.apache.hadoop.ipc.RemoteException:
  java.io.IOException: java.lang.StackOverflowError
  at
 java.io.DataInputStream.readFully(DataInputStream.java:178)
  at
 java.io.DataInputStream.readLong(DataInputStream.java:399)
  at org.apache.hadoop.dfs.DFSClient
  $BlockReader.readChunk(DFSClient.java:735)
  at
 

 org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:
 234)
  at
 
 org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
  at
 
 org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
  at
 
 org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:157)
  at org.apache.hadoop.dfs.DFSClient
  $BlockReader.read(DFSClient.java:658)
  at org.apache.hadoop.dfs.DFSClient
  $DFSInputStream.readBuffer(DFSClient.java:1130)
  at org.apache.hadoop.dfs.DFSClient
  $DFSInputStream.read(DFSClient.java:1166)
  at
 java.io.DataInputStream.readFully

RE: StackOverFlow Error in HBase

2008-03-31 Thread David Alves
Hi ... again

In my previous mail I stated that increasing the stack size solved the problem; well, I jumped to that conclusion a little too quickly. In fact it didn't: the StackOverflowError always occurs at the end of the cycle, when no more records match the filter. Anyway, I've rewritten my application to use a normal scanner and do the filtering afterwards, which is not optimal but it works.
I'm just saying this because it might be a clue: in previous versions (!= 0.1.0), even though a more serious problem happened (regionservers became unresponsive after so many records), this didn't happen. Btw, in the current version I notice no, or a very small, decrease of throughput over time. Great work!

Regards
David Alves





On Mon, 2008-03-31 at 05:18 +0100, David Alves wrote:
 Hi again
 
   As I was almost at the end (80%) of indexable docs, for the time
 being I simply increased the stack size, which seemed to work.
   Thanks for your input St.Ack really helped me solve the problem at
 least for the moment.
   On another note in the same method I changed the way the scanner was
 obtained when htable.getStartKeys() would be more than 1, so that I could
 limit the records read each time to a single region, and the scanning would
 start at the last region, strangely the number of keys obtained by
 htable.getStartKeys() was always 1 even though by the end there are already
 21 regions.
   Any thoughts?
 
 Regards
 David Alves
 
  -Original Message-
  From: stack [mailto:[EMAIL PROTECTED]
  Sent: Sunday, March 30, 2008 9:36 PM
  To: hbase-user@hadoop.apache.org
  Subject: Re: StackOverFlow Error in HBase
  
  You're doing nothing wrong.
  
  The filters as written recurse until they find a match.  If long
  stretches between matching rows, then you will get a
  StackOverflowError.  Filters need to be changed.  Thanks for pointing
  this out.  Can you do without them for the moment until we get a chance
  to fix it?  (HBASE-554)
  
  Thanks,
  St.Ack
  
  
  
  David Alves wrote:
   Hi St.Ack and all
  
 The error always occurs when trying to see if there are more rows to
   process.
 Yes I'm using a filter(RegExpRowFilter) to select only the rows (any
   row key) that match a specific value in one of the columns.
 Then I obtain the scanner just test the hasNext method, close the
   scanner and return.
 Am I doing something wrong?
 Still StackOverflowError is not supposed to happen right?
  
   Regards
   David Alves
   On Thu, 2008-03-27 at 12:36 -0700, stack wrote:
  
   You are using a filter?  If so, tell us more about it.
   St.Ack
  
   David Alves wrote:
  
   Hi guys
  
   I 'm using HBase to keep data that is later indexed.
   The data is indexed in chunks so the cycle is get  records 
   index
   them check for more records etc...
   When I tryed the candidate-2 instead of the old 0.16.0 (which I
   switched to do to the regionservers becoming unresponsive) I got the
   error in the end of this email well into an indexing job.
   So you have any idea why? Am I doing something wrong?
  
   David Alves
  
   java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException:
   java.io.IOException: java.lang.StackOverflowError
   at java.io.DataInputStream.readFully(DataInputStream.java:178)
   at java.io.DataInputStream.readLong(DataInputStream.java:399)
   at org.apache.hadoop.dfs.DFSClient
   $BlockReader.readChunk(DFSClient.java:735)
   at
  
  org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:
  234)
   at
   org.apache.hadoop.fs.FSInputChecker.fill(FSInputChecker.java:176)
   at
   org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:193)
   at
   org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:157)
   at org.apache.hadoop.dfs.DFSClient
   $BlockReader.read(DFSClient.java:658)
   at org.apache.hadoop.dfs.DFSClient
   $DFSInputStream.readBuffer(DFSClient.java:1130)
   at org.apache.hadoop.dfs.DFSClient
   $DFSInputStream.read(DFSClient.java:1166)
   at java.io.DataInputStream.readFully(DataInputStream.java:178)
   at org.apache.hadoop.io.DataOutputBuffer
   $Buffer.write(DataOutputBuffer.java:56)
   at
   org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
   at org.apache.hadoop.io.SequenceFile
   $Reader.next(SequenceFile.java:1829)
   at org.apache.hadoop.io.SequenceFile
   $Reader.next(SequenceFile.java:1729)
   at org.apache.hadoop.io.SequenceFile
   $Reader.next(SequenceFile.java:1775)
   at org.apache.hadoop.io.MapFile$Reader.next(MapFile.java:461)
   at org.apache.hadoop.hbase.HStore
   $StoreFileScanner.getNext(HStore.java:2350)
   at
  
  org.apache.hadoop.hbase.HAbstractScanner.next(HAbstractScanner.java:256)
   at org.apache.hadoop.hbase.HStore
   $HStoreScanner.next(HStore.java:2561

Doubt in RegExpRowFilter and RowFilters in general

2008-02-11 Thread David Alves
Hi Guys
	In my previous email I might have misunderstood the roles of the RowFilterInterfaces, so I'll pose my question more clearly (since the last one wasn't in question form :)).
	I have a setup where a table has two columns belonging to different column families (table A, cf1:a and cf2:b).

I'm trying to build a filter so that a scanner only returns the rows where cf1:a = myvalue1 and cf2:b = myvalue2.

I've built a RegExpRowFilter like this:

Map<Text, byte[]> conditionalsMap = new HashMap<Text, byte[]>();
conditionalsMap.put(new Text("cf1:a"), new myvalue1.getBytes());
conditionalsMap.put(new Text("cf2:b"), myvalue2.getBytes());
return new RegExpRowFilter(".*", conditionalsMap);

My problem is that this filter always fails when I know for sure that there are rows whose columns match my values.

I'm building the scanner like this (the purpose in this case is to find out whether there are more values that match my filter):

final Text startKey = this.htable.getStartKeys()[0];
HScannerInterface scanner = htable.obtainScanner(
    new Text[] {new Text("cf1:a"), new Text("cf2:b")}, startKey, rowFilterInterface);
return scanner.iterator().hasNext();

Can anyone give me a hand, please?

Thanks in advance
David Alves





Re: Doubt in RegExpRowFilter and RowFilters in general

2008-02-11 Thread David Alves
Hi Again

In my previous example I seem to have misplaced a new keyword (new
myvalue1.getBytes() where it should have been myvalue1.getBytes()).

On another note, my program hangs when I supply my own filter to the scanner (I suppose it's clear that the nodes don't know my class, so there should be a ClassNotFoundException, right?).

Regards
David Alves 


On Mon, 2008-02-11 at 16:51 +, David Alves wrote: 
 Hi Guys
   In my previous email I might have misunderstood the roles of the
 RowFilterInterfaces so I'll pose my question more clearly (since the
 last one wasn't in question form :)).
   I save a setup when a table has to columns belonging to different
 column families (Table A cf1:a cf2:b));
 
 I'm trying to build a filter so that a scanner only returns the rows
 where cf1:a = myvalue1 and cf2:b = myvalue2.
 
 I've build a RegExpRowFilter like this;
 MapText, byte[] conditionalsMap = new HashMapText, byte[]();
   conditionalsMap.put(new Text(cf1:a), new myvalue1.getBytes());
   conditionalsMap.put(new Text(cf2:b), myvalue2.getBytes());
   return new RegExpRowFilter(.*, conditionalsMap);
 
 My problem is this filter always fails when I know for sure that there
 are rows whose columns match my values.
 
 I'm building the the scanner like this (the purpose in this case is to
 find if there are more values that match my filter):
 
 final Text startKey = this.htable.getStartKeys()[0];
   HScannerInterface scanner = htable.obtainScanner(new 
 Text[] {new
 Text(cf1:a), new Text(cf2:b)}, startKey, rowFilterInterface);
   return scanner.iterator().hasNext();
 
 Can anyone give me a hand please.
 
 Thanks in advance
 David Alves
 
 
 



Re: RegExpRowFilter with multiple conditions on rows matching both

2008-02-09 Thread David Alves
I now realize the text is a bit confusing, sorry for that.
Also, that last paragraph should end with: ... at the same time.

Regards 
David



On Sun, 2008-02-10 at 01:03 +, David Alves wrote:
 Hi All!
 
 First of all congrats for the great piece of software.
 I have a table with two column families (A,B) each with a column when I
 build a RegExpRowFilter to select only rows whose columns A AND B match
 the criteria (lets say A:a = 1 and B:b = 2) all the rows are filtered.
 This is strange because if I build the map required by the constructor
 with only one or the other of the conditionals the rows that match won't
 be filtered, meaning that if they pass one and the other conditionals in
 different runs the should pass them both in the same run right?
 
 More concisely when running with both conditionals they are able to pass
 the filter() method for both columns but fail to pass the
 filterNotNull() method. The debug log tells me that the
 TreeMapText,byte[] passed to filterNotNull() by the HStore scanner
 doesn't contain both columns at the same time (the method is called two
 time first with one column and then with the other).
 
 Finally when running with only one of conditionals the filterNotNull()
 method still returns true once but returns false the second time
 (therefore returning the record) meaning that not all columns of the
 same row are passing through the cycle.
 
 Regards
 David Alves
 
 
 
 
 
 



Re: Skip Reduce Phase

2008-02-07 Thread David Alves
Great!

Thanks Owen, Ted and Jason

On Thu, 2008-02-07 at 10:07 -0800, Owen O'Malley wrote:
 On Feb 7, 2008, at 9:59 AM, Ted Dunning wrote:
 
 
  I think that setting the parameter to 0 skips most of the overhead  
  of the
  later stages.
 
 Setting it to 0 skips all of the buffering, sorting, merging, and  
 shuffling. It passes the objects straight from the mapper to the  
 output format, which writes it straight to hdfs.
 
 -- Owen
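
(For the archive, the map-only setup described above boils down to something like this untested sketch; it is written against a later 0.1x mapred API than the one discussed here, and the paths are placeholders:)

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;

public class MapOnlyJob {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(MapOnlyJob.class);
    conf.setJobName("map-only-example");
    // Zero reducers: no buffering, sorting, merging or shuffling; the map
    // output goes straight through the OutputFormat to HDFS.
    conf.setNumReduceTasks(0);
    conf.setMapperClass(IdentityMapper.class);
    conf.setOutputKeyClass(LongWritable.class);
    conf.setOutputValueClass(Text.class);
    FileInputFormat.setInputPaths(conf, new Path("/example/in"));    // placeholder paths
    FileOutputFormat.setOutputPath(conf, new Path("/example/out"));
    JobClient.runJob(conf);
  }
}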



Re: Skip Reduce Phase

2008-02-07 Thread David Alves
Hi Ted

But wouldn't that still go through the intermediate phases and do the
merge sort and copy to the local filesystem (which is the reduce input)?

Is there a way to provide the direct map output (saved onto DFS) to another map task, or does your suggestion already do this and this is a moot point?

David

On Thu, 2008-02-07 at 09:39 -0800, Ted Dunning wrote:
 Set numReducers to 0.
 
 
 On 2/7/08 9:35 AM, David Alves [EMAIL PROTECTED] wrote:
 
  Hi All
  First of all since this is my first post I must say congrats for the
  great piece of software (both Hadoop and HBase).
  I've been using HadoopHBase for a while and I have a question, let me
  just explain a little my setup:
  
  I have an HBase Database that holds information that I want to process
  in a Map/Reduce job but that before needs to be a little processed.
  
  So I built another Map/Reduce Job that uses a Specific (Filtered)
  TableInputFormat and then pre processes the information in a Map phase.
  
  As I don't need none of the intermediate phases (like merge sort) and I
  don't need to do anything on the reduce phase I was wondering If I could
  just save the Map phase output and start the second Map/Reduce job using
  that as an input (but still saving the splits to DFS for
  backtracking/reliability reasons).
  
  Is this possible?
  
  Thanks in advance, and again great piece of software.
  David Alves
  
  
  
 



Skip Reduce Phase

2008-02-07 Thread David Alves
Hi All
First of all, since this is my first post I must say congrats for the great piece of software (both Hadoop and HBase).
I've been using Hadoop/HBase for a while and I have a question; let me just explain my setup a little:

I have an HBase database that holds information that I want to process in a Map/Reduce job, but that first needs a little processing.

So I built another Map/Reduce job that uses a specific (filtered) TableInputFormat and then pre-processes the information in a Map phase.

As I don't need any of the intermediate phases (like the merge sort) and I don't need to do anything in the reduce phase, I was wondering if I could just save the Map phase output and start the second Map/Reduce job using that as input (but still saving the splits to DFS for backtracking/reliability reasons).

Is this possible?

Thanks in advance, and again great piece of software.
David Alves