I took a look at your attached configuration files. You have very little customization in them. Given you are running 0.19.x, you are missing some critical configuration. See http://wiki.apache.org/hadoop/Hbase/Troubleshooting, in particular #5, #6, and #7. What about the file descriptor count? Did you up it above its default of 1024? See http://wiki.apache.org/hadoop/Hbase/FAQ#6 (a lot of these issues fade in 0.20.0).
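For what it's worth, here is a rough sketch of the kind of settings usually called out for 0.19-era clusters (I have not checked these against the exact wiki item numbers, and the values and the user name are only starting points, not something verified against your cluster):

  <!-- hadoop-site.xml on every datanode: raise the xceiver limit;
       the shipped default (256) is far too low for an HBase load -->
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>2048</value>
  </property>

  # /etc/security/limits.conf on every node, for whichever user runs
  # hadoop/hbase (shown here as 'hadoop'); log out and back in, then
  # confirm with 'ulimit -n'
  hadoop  -  nofile  32768

Below your quoted mail I have also put two small sketches: one for the -conf/Tool question in ISSUE 1 and one for keeping hold of uncommitted BatchUpdates in ISSUE 3.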
St.Ack

On Wed, Sep 16, 2009 at 8:35 AM, <[email protected]> wrote:
> Hi all,
> We are in the process of evaluating HBase for managing "bigtable"-scale
> data (to give an idea, ~1G entries of 500 bytes). We are now facing some
> issues and I would like to have comments on what I have noticed.
> Our configuration is Hadoop 0.19.1 and HBase 0.19.3; both
> hadoop-default/site.xml and hbase-default/site.xml are attached. We have
> 15 nodes (16 or 8 GB RAM and 1.3 TB disk each, Linux kernel
> 2.6.24-standard, java version "1.6.0_12").
> For now the test case is one IndexedTable (without, at the moment, using
> the indexed column) with 25M entries/rows: the map formats the data and
> 15 reduces issue BatchUpdates of the textual data (fields like URL and
> simple text, < 500 bytes each).
> All processes (hadoop/hbase) are started with -Xmx1000m, and the
> IndexedTable is configured with autoCommit set to false.
>
> ISSUE 1: We need one indexed column to get "fast" UI queries (for
> instance, as an answer to a web form we could expect waiting at most
> 30 seconds). The only documentation I found concerning indexed columns
> comes from
> http://rajeev1982.blogspot.com/2009/06/secondary-indexes-in-hbase.html
> Instead of putting the index-table properties in hbase-site.xml (which I
> have tested, but which gives very poor performance and also loses
> entries...), I pass the properties to the job through
> -conf indextable_properties.xml (file is attached). I suppose that
> putting the index-table properties into hbase-site.xml applies them to
> the whole HBase cluster, making overall performance drop significantly?
> The best performance was reached by passing them through the -conf
> option of the Tool.run method.
>
> ISSUE 2: We are facing serious regionserver problems, often leading to a
> regionserver shutdown, like:
>
> 2009-09-16 10:21:15,887 INFO
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Too many store files
> for region urlsdata-validation,
> forum.telecharger.01net.com/index.php?page=01net_voter&forum=microhebdo&category=5&topic=344142&post=5653085,1253089082422:
> 23, waiting
>
> or
>
> 2009-09-14 16:39:24,611 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Blocking updates for 'IPC Server handler 1 on 60020' on region
> urlsdata-validation,
> www.abovetopsecret.com/forum/thread119/pg1&title=Underground+Communities,1252939031807:
> Memcache size 128.0m is >= than blocking 128.0m size
> 2009-09-14 16:39:24,942 INFO org.apache.hadoop.hdfs.DFSClient: Exception in
> createBlockOutputStream java.io.IOException: Could not read from stream
> 2009-09-14 16:39:24,942 INFO org.apache.hadoop.hdfs.DFSClient: Abandoning
> block blk_-873614322830930554_111500
> 2009-09-14 16:39:31,180 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_-873614322830930554_111500 bad datanode[0] nodes ==
> null
> 2009-09-14 16:39:31,181 WARN org.apache.hadoop.hdfs.DFSClient: Could not
> get block locations. Source file
> "/hbase/urlsdata-validation/1733902030/info/mapfiles/2690714750206504745/data"
> - Aborting...
> 2009-09-14 16:39:31,241 FATAL
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher: Replay of hlog
> required. Forcing server shutdown
>
> I've read some HBase JIRA issues (HBASE-1415, HBASE-1058, HBASE-1084...)
> concerning similar problems, but I cannot get a clear idea of what kind
> of fix is proposed.
>
> ISSUE 3: These problems cause a table.commit() IOException that loses all
> the entries:
>
> org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact
> region server 192.168.255.8:60020 for region urlsdata-validation,
> twitter.com/statuses/434272962,1253089707924, row
> 'www.harmonicasurcher.com', but failed after 10 attempts.
> Exceptions:
> java.io.IOException: Call to /192.168.255.8:60020 failed on local
> exception: java.io.EOFException
> java.net.ConnectException: Call to /192.168.255.8:60020 failed on
> connection exception: java.net.ConnectException: Connection refused
>
> Is there a way to get back the uncommitted entries (there are many of
> them, because we run with autoCommit false) so we can resubmit them later?
> To give an idea, we sometimes lose about 170,000 entries out of 25M due to
> this commit exception.
>
> Guillaume Viland ([email protected])
> FT/TGPF/OPF/PORTAIL/DOP Sophia Antipolis
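On ISSUE 1: passing the index-table properties per job rather than cluster-wide sounds right. A bare-bones sketch of the Tool/ToolRunner shape that makes -conf work; the class name and the job setup are illustrative, not taken from your attachments:

  import org.apache.hadoop.conf.Configured;
  import org.apache.hadoop.util.Tool;
  import org.apache.hadoop.util.ToolRunner;

  public class IndexedLoadJob extends Configured implements Tool {

    public int run(String[] args) throws Exception {
      // getConf() already holds whatever GenericOptionsParser picked up from
      //   hadoop jar load.jar IndexedLoadJob -conf indextable_properties.xml
      // so the index-table properties apply to this job only, not to the
      // cluster-wide hbase-site.xml.
      // ... build the JobConf from getConf() and wire up the map/reduce ...
      return 0;
    }

    public static void main(String[] args) throws Exception {
      System.exit(ToolRunner.run(new IndexedLoadJob(), args));
    }
  }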
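On ISSUE 3: one way to avoid losing the whole uncommitted batch is to keep the BatchUpdates on the client and hold on to whatever commit() rejects, so they can be resubmitted (or dumped to a side file) later. A rough, untested sketch against the 0.19 client API; the class name, the flush size, and the in-memory 'failed' list are just one way of doing it:

  import java.io.IOException;
  import java.util.ArrayList;
  import java.util.List;

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.RetriesExhaustedException;
  import org.apache.hadoop.hbase.io.BatchUpdate;

  public class ResubmittingWriter {
    private final HTable table;
    private final List<BatchUpdate> pending = new ArrayList<BatchUpdate>();
    private final List<BatchUpdate> failed = new ArrayList<BatchUpdate>();
    private final int flushSize;

    public ResubmittingWriter(String tableName, int flushSize) throws IOException {
      this.table = new HTable(new HBaseConfiguration(), tableName);
      this.flushSize = flushSize;
    }

    // Buffer an update; flush once the buffer reaches flushSize rows.
    public void add(BatchUpdate bu) {
      pending.add(bu);
      if (pending.size() >= flushSize) {
        flush();
      }
    }

    // Commit each buffered update; anything the cluster refuses is kept in
    // 'failed' instead of being lost, so it can be retried once the
    // regionservers are healthy again.
    public void flush() {
      for (BatchUpdate bu : pending) {
        try {
          table.commit(bu);
        } catch (RetriesExhaustedException e) {
          failed.add(bu);
        } catch (IOException e) {
          failed.add(bu);
        }
      }
      pending.clear();
    }

    public List<BatchUpdate> getFailed() {
      return failed;
    }
  }

In the reduce you would call add() per row, flush() in close(), and then either resubmit getFailed() or write those rows somewhere durable so a later job can reload them.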
