Hi all,

I found a bug in KfsClientInt.h. KFS uses select() to wait on socket connects. As is well known, select() can only handle file descriptors whose value is less than 1024 (the usual FD_SETSIZE). I replaced select() with poll(), and now replay_log() can finish. The RangeServer seems to work again.
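For anyone hitting the same thing, here is a minimal sketch of the idea. It is not the actual change to KfsClientInt.h: the helper name wait_for_connect and its return convention are my own assumptions for illustration. It waits for a non-blocking connect() to finish using poll(), which takes the descriptor as a plain int and so has no FD_SETSIZE ceiling.

```cpp
#include <cassert>
#include <cerrno>
#include <poll.h>
#include <sys/socket.h>

// Hypothetical helper, not taken from the KFS source.
// Waits until a non-blocking connect() on `fd` has finished.
// Returns 0 on success, ETIMEDOUT if nothing happened within
// timeout_ms, or another errno value describing the failure.
int wait_for_connect(int fd, int timeout_ms) {
    assert(fd >= 0);

    struct pollfd pfd;
    pfd.fd = fd;           // any descriptor value works, including >= 1024
    pfd.events = POLLOUT;  // the socket turns writable when connect() completes
    pfd.revents = 0;

    int rc;
    do {
        // Note: a retry after EINTR restarts the full timeout.
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0) return errno;
    if (rc == 0) return ETIMEDOUT;

    // poll() only reports that the attempt finished;
    // SO_ERROR reports whether it succeeded.
    int so_error = 0;
    socklen_t len = sizeof(so_error);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) < 0) return errno;
    return so_error;
}
```

With select(), calling FD_SET() on a descriptor at or above FD_SETSIZE writes outside the fd_set, which is undefined behavior. That may be why the problem only shows up once a process has many files and sockets open.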
But I am not sure this is the same bug that caused Hypertable to hang in the first place.

Thanks.

-- kuer

On Aug 2, 3:33 PM, kuer <[email protected]> wrote:
> Hi, all,
>
> This error message appeared in the log files:
> 2009-08-01 22:09:12,870 47660526791088 Hypertable.RangeServer [INFO]
> (RangeServer/Range.cc:231) Loading CellStore
> /hypertable/tables/storage_se/agPORT/AB2A0D28DE6B77FFDD6C72AF/cs65
> 2009-08-01 22:09:12,875 47660526791088 Hypertable.RangeServer [ERROR]
> replay_load_range (RangeServer/RangeServer.cc:1753):
> Hypertable::Exception: Problem reading trailer for CellStore file
> '/hypertable/tables/storage_se/agPORT/AB2A0D28DE6B77FFDD6C72AF/cs65' -
> only read 0 of 512 bytes - DFS BROKER i/o error
>     at static Hypertable::CellStore*
> Hypertable::CellStoreFactory::open(const Hypertable::String&,
> const char*, const char*) (RangeServer/CellStoreFactory.cc:63)
>
> I listed the file info from KFS:
>
> KFS >>> ls /hypertable/tables/storage_se/agPORT/AB2A0D28DE6B77FFDD6C72AF/
> TYPE  CREATION TIME        MODIFICATION TIME    FID    SIZE  NAME
> ----  -------------------  -------------------  -----  ----  ----
> file  2009-07-31 13:48:17  2009-07-31 13:48:17  29205  3641  cs65
> file  2009-07-31 15:21:15  2009-07-31 15:21:15  30339   788  cs66
> file  2009-07-31 15:55:19  2009-07-31 15:55:19  30807   551  cs67
> file  2009-07-31 16:32:06  2009-07-31 16:32:06  31461   426  cs68
> file  2009-07-31 17:59:11  2009-07-31 17:59:11  32769   415  cs70
> file  2009-07-31 19:01:29  2009-07-31 19:01:29  33644   420  cs71
> file  2009-07-31 23:53:26  2009-07-31 23:53:26  38295   356  cs74
> file  2009-08-01 08:48:47  2009-08-01 08:48:47  44692   522  cs76
>
> The size of cs65 is 3641 bytes. I think there should be enough data
> in cs65 to read the 512-byte trailer, even if some of the bytes are
> bad. So why did the DFS broker read 0 bytes?
>
> -- kuer
>
> On Aug 2, 12:38 PM, kuer <[email protected]> wrote:
>
> > $ ./hypertable --version
> > Hypertable 0.9.2.4 (tarball)
>
> > I read the release announcement for 0.9.2.5, but I could not tell which
> > change relates to my problem, so I am posting the details.
>
> > Thanks, Luke.
>
> > -- kuer
>
> > On Aug 2, 12:05 AM, Luke <[email protected]> wrote:
>
> > > Which hypertable version? (from bin/hypertable --version). The latest
> > > version fixed a couple of corruption issues.
>
> > > On Sat, Aug 1, 2009 at 8:01 AM, kuer<[email protected]> wrote:
>
> > > > Hi, all,
>
> > > > In my testing environment, there are 4 boxes running
> > > > kfs-chunkserver + hypertable-rangeserver.
>
> > > > The rangeserver on the 2nd box reported errors, and the client got
> > > > stuck while feeding data into Hypertable.
>
> > > > Here are the error logs of the rangeserver on the 2nd box:
>
> > > > 2009-08-01 21:52:05,363 1352718656 Hypertable.RangeServer [ERROR]
> > > > operator() (RangeServer/MaintenanceQueue.h:126):
> > > > Hypertable::Exception:
> > > > shrink failed - DFS BROKER bad filename
> > > >     at void Hypertable::AccessGroup::shrink(Hypertable::String&,
> > > > bool)
> > > > (RangeServer/AccessGroup.cc:634)
> > > >     at virtual int64_t Hypertable::DfsBroker::Client::length(const
> > > > Hypertable::String&) (Lib/Client.cc:431): Error getting length of
> > > > DFS file:
> > > > /hypertable/tables/storage_se/agPORT/B6B3D15F8A7C517A21353A32/cs11
> > > >     at virtual int64_t Hypertable::DfsBroker::Client::length(const
> > > > Hypertable::String&) (Lib/Client.cc:425): No such file or
> > > > directory
> > > > 2009-08-01 21:52:05,363 1352718656 Hypertable.RangeServer [ERROR]
> > > > (RangeServer/MaintenanceQueue.h:138) Maintenance Task 'SPLIT
> > > > storage_se[cc.changjiang.www/bbs/viewthread.php?tid=313155&extra=page
> > > > %3D1..cc.dazhihui.bbs/thread-130962-1-1.html]'
> > > > failed, dropping task ...
>
> > > > I did not find any other meaningful error message in the logs of the
> > > > other Hypertable or KFS processes.
>
> > > > First, I restarted the ThriftBroker process, but that did NOT change
> > > > the situation. So I restarted Hypertable.RangeServer, and bad news
> > > > came:
>
> > > > 2009-08-01 22:09:12,870 47660526791088 Hypertable.RangeServer [INFO]
> > > > (RangeServer/Range.cc:231) Loading CellStore
> > > > /hypertable/tables/storage_se/agPORT/AB2A0D28DE6B77FFDD6C72AF/cs65
> > > > 2009-08-01 22:09:12,875 47660526791088 Hypertable.RangeServer [ERROR]
> > > > replay_load_range (RangeServer/RangeServer.cc:1753):
> > > > Hypertable::Exception: Problem reading trailer for CellStore file
> > > > '/hypertable/tables/storage_se/agPORT/AB2A0D28DE6B77FFDD6C72AF/cs65' -
> > > > only read 0 of 512 bytes - DFS BROKER i/o error
> > > >     at static Hypertable::CellStore*
> > > > Hypertable::CellStoreFactory::open(const Hypertable::String&,
> > > > const char*, const char*) (RangeServer/CellStoreFactory.cc:63)
> > > > 2009-08-01 22:09:12,875 47660526791088 Hypertable.RangeServer [INFO]
> > > > (RangeServer/RangeServer.cc:408) replay log
> > > > log_dir=/hypertable/servers/221.194.134.174_31060/log/user
> > > > 2009-08-01 22:09:14,495 47660526791088 Hypertable.RangeServer [FATAL]
> > > > (Lib/CommitLogBlockStream.cc:116) failed expectation: nread ==
> > > > header->get_data_zlength()
>
> > > > Now the rangeserver dumps core and will not come back up.
>
> > > > My guesses:
> > > > 1. maybe the restart lost some data in KFS
> > > > 2. maybe some error corrupted data in KFS before the restart
>
> > > > I would like to get the system working again, even if that means
> > > > losing the data that caused the failure.
> > > > How can I do that?
>
> > > > thanks
>
> > > > -- kuer

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Hypertable Development" group.
