No, but I did run a major compaction.
As I explained initially, I disabled the table so I could change its TTL,
re-enabled it, and then ran a major compaction so it would clean up the data
that had expired due to the TTL change.
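
In HBase shell terms the sequence was roughly the following (family name and
TTL value as they appear in the table descriptor in the logs further down):

  disable 'gs_raw_events'
  alter 'gs_raw_events', {NAME => 'events', TTL => '604800'}
  enable 'gs_raw_events'
  major_compact 'gs_raw_events'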

-eran



On Wed, Jul 6, 2011 at 02:43, Ted Yu <[email protected]> wrote:

> Eran:
> You didn't run hbck while the gs_raw_events table was being enabled, right?
>
> I saw:
> 2011-06-29 16:43:50,395 DEBUG
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction (major)
> requested for
>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> because User-triggered major compaction; priority=1, compaction queue
> size=1248
>
> The above might be related to:
> >> 2011-06-29 16:43:57,880 INFO
> org.apache.hadoop.hbase.
> master.AssignmentManager: Region has been
> PENDING_OPEN for too long, reassigning
>
> region=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
>
> Thanks
>
> On Tue, Jul 5, 2011 at 7:19 AM, Ted Yu <[email protected]> wrote:
>
> > Eran:
> > I logged https://issues.apache.org/jira/browse/HBASE-4060 for you.
> >
> >
> > On Mon, Jul 4, 2011 at 2:30 AM, Ted Yu <[email protected]> wrote:
> >
> >> Thanks for the understanding.
> >>
> >> Can you log a JIRA and put your ideas below in it?
> >>
> >>
> >>
> >> On Jul 4, 2011, at 12:42 AM, Eran Kutner <[email protected]> wrote:
> >>
> >> > Thanks for the explanation, Ted.
> >> >
> >> > I will try to apply HBASE-3789 and hope for the best, but my understanding
> >> > is that it doesn't really solve the problem, it only reduces the
> >> > probability of it happening, at least in one particular scenario. I would
> >> > hope for a more robust solution.
> >> > My concern is that the region allocation process seems to rely too much on
> >> > timing considerations and doesn't seem to take enough measures to
> >> > guarantee that conflicts do not occur. I understand that in a distributed
> >> > environment, when you don't get a timely response from a remote machine
> >> > you can't know for sure whether it did or did not receive the request;
> >> > however, there are things that can be done to mitigate this and reduce the
> >> > conflict time significantly. For example, when I run hbck it knows that
> >> > some regions are multiply assigned, so the master could do the same and
> >> > try to resolve the conflict. Another approach would be to handle late
> >> > responses: even if the response from the remote machine arrives after it
> >> > was assumed to be dead, the master should have enough information to know
> >> > it had created a conflict by assigning the region to another server. An
> >> > even better solution, I think, is for the RS to periodically verify that
> >> > it is indeed the rightful owner of every region it holds and relinquish
> >> > control over any region that it is not.
> >> > Obviously a state where two RSs hold the same region is pathological and
> >> > can lead to data loss, as demonstrated in my case. The system should be
> >> > able to actively protect itself against such a scenario. It probably
> >> > doesn't need saying, but there is really nothing worse for a data storage
> >> > system than data loss.
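
To sketch what I mean by that last suggestion - this is pseudo-code only, and
the helper names below are made up rather than actual HBase APIs:

    // hypothetical periodic chore running on each region server
    for (String regionName : regionsCurrentlyOnline()) {
      // authoritative view of the assignment, e.g. from .META. or ZooKeeper
      String owner = assignedServerAccordingToMeta(regionName);
      if (!owner.equals(myServerName())) {
        // someone else owns the region now - stop serving it rather than
        // risk a double assignment and the data loss that can follow
        closeAndRelinquish(regionName);
      }
    }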
> >> >
> >> > In my case the problem didn't happen in the initial phase but after
> >> > disabling and enabling a table with about 12K regions.
> >> >
> >> > -eran
> >> >
> >> >
> >> >
> >> > On Sun, Jul 3, 2011 at 23:49, Ted Yu <[email protected]> wrote:
> >> >
> >> >> Let me try to answer some of your questions.
> >> >> The two paragraphs below follow my reasoning, which is in reverse order of
> >> >> the actual call sequence.
> >> >>
> >> >> For #4 below, the log indicates that the following was executed:
> >> >> private void assign(final RegionState state, final boolean setOfflineInZK,
> >> >>     final boolean forceNewPlan) {
> >> >>   for (int i = 0; i < this.maximumAssignmentAttempts; i++) {
> >> >>     if (setOfflineInZK && !*setOfflineInZooKeeper*(state)) return;
> >> >>
> >> >> The above was due to the timeout you noted in #2, which would have caused
> >> >> TimeoutMonitor.chore() to run this code (line 1787):
> >> >>
> >> >>     for (Map.Entry<HRegionInfo, Boolean> e: assigns.entrySet()){
> >> >>       assign(e.getKey(), false, e.getValue());
> >> >>     }
> >> >>
> >> >> This means there is a lack of coordination between
> >> >> AssignmentManager.TimeoutMonitor and OpenedRegionHandler.
> >> >>
> >> >> The reason I mention HBASE-3789 is that it is marked as an Incompatible
> >> >> change and is in TRUNK already.
> >> >> Applying HBASE-3789 to the 0.90 branch would change the behavior (timing)
> >> >> of region assignment.
> >> >>
> >> >> I think it makes sense to evaluate the effect of HBASE-3789 in 0.90.4.
> >> >>
> >> >> BTW, were the incorrect region assignments observed for a table with
> >> >> multiple initial regions?
> >> >> If so, I have HBASE-4010 in TRUNK, which speeds up initial region
> >> >> assignment by about 50%.
> >> >>
> >> >> Cheers
> >> >>
> >> >> On Sun, Jul 3, 2011 at 12:02 PM, Eran Kutner <[email protected]> wrote:
> >> >>
> >> >>> Ted,
> >> >>> So if I understand correctly, the theory is that because of the issue
> >> >>> fixed in HBASE-3789 the master took too long to detect that the region
> >> >>> was successfully opened by the first server, so it force-closed it and
> >> >>> transitioned it to a second server. But there are a few things about this
> >> >>> scenario I don't understand, probably because I don't know enough about
> >> >>> the inner workings of the region transition process, and I would
> >> >>> appreciate it if you could help me understand:
> >> >>> 1. The RS opened the region at 16:37:49.
> >> >>> 2. The master started handling the opened event at 16:39:54 - this delay
> >> >>> can probably be explained by HBASE-3789.
> >> >>> 3. At 16:39:54 the master log says: Opened region gs_raw_events,..... on
> >> >>> hadoop1-s05.farm-ny.gigya.com
> >> >>> 4. Then at 16:40:00 the master log says: master:60000-0x13004a31d7804c4
> >> >>> Creating (or updating) unassigned node for 584dac5cc70d8682f71c4675a843c309
> >> >>> with OFFLINE state - why did it decide to take the region offline after
> >> >>> learning it was successfully opened?
> >> >>> 5. Then it tries to reopen the region on hadoop1-s05, which indicates in
> >> >>> its log that the open request failed because the region was already open -
> >> >>> why didn't the master use that information to learn that the region was
> >> >>> already open?
> >> >>> 6. At 16:43:57 the master decides the region transition timed out and
> >> >>> starts forcing the transition - HBASE-3789 again?
> >> >>> 7. Now the master forces the transition of the region to hadoop1-s02, but
> >> >>> there is no sign of that on hadoop1-s05 - why doesn't the old RS
> >> >>> (hadoop1-s05) detect that it is no longer the region's owner and
> >> >>> relinquish control of the region?
> >> >>>
> >> >>> Thanks.
> >> >>>
> >> >>> -eran
> >> >>>
> >> >>>
> >> >>>
> >> >>> On Sun, Jul 3, 2011 at 20:09, Ted Yu <[email protected]> wrote:
> >> >>>
> >> >>>> HBASE-3789 should have sped up region assignment.
> >> >>>> The patch for 0.90 is attached to that JIRA.
> >> >>>>
> >> >>>> You may prudently apply that patch.
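
For reference, applying an attached 0.90 patch to a source checkout follows
the usual routine - the file name below is only a placeholder for whatever is
attached to the JIRA, and -p0 vs -p1 depends on how the diff was generated:

  patch -p0 < HBASE-3789-0.90.patch   # patch downloaded from the JIRA attachment
  mvn clean package -DskipTests       # rebuild, then roll the new jar out and restart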
> >> >>>>
> >> >>>> Regards
> >> >>>>
> >> >>>> On Sun, Jul 3, 2011 at 10:01 AM, Eran Kutner <[email protected]>
> wrote:
> >> >>>>
> >> >>>>> Thanks Ted, but, as stated before, I'm already using 0.90.3, so either
> >> >>>>> it's not fixed or it's not the same thing.
> >> >>>>>
> >> >>>>> -eran
> >> >>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> On Sun, Jul 3, 2011 at 17:27, Ted Yu <[email protected]> wrote:
> >> >>>>>
> >> >>>>>> Eran:
> >> >>>>>> I was thinking of this:
> >> >>>>>> HBASE-3789  Cleanup the locking contention in the master
> >> >>>>>>
> >> >>>>>> though it doesn't directly handle the 'PENDING_OPEN for too long' case.
> >> >>>>>>
> >> >>>>>> https://issues.apache.org/jira/browse/HBASE-3741 is in 0.90.3 and is
> >> >>>>>> actually close to the symptom you described.
> >> >>>>>>
> >> >>>>>> On Sun, Jul 3, 2011 at 12:00 AM, Eran Kutner <[email protected]>
> >> >> wrote:
> >> >>>>>>
> >> >>>>>>> It does seem that both servers opened the same region around the same
> >> >>>>>>> time. The region was offline because I had disabled the table so I
> >> >>>>>>> could change its TTL.
> >> >>>>>>>
> >> >>>>>>> Here is the log from hadoop1-s05:
> >> >>>>>>> 2011-06-29 16:37:12,576 INFO
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> >> >>> request
> >> >>>> to
> >> >>>>>>> open
> >> >>>>>>> region:
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 16:37:12,680 DEBUG
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler:
> >> >>>>>> Processing
> >> >>>>>>> open of
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 16:37:12,680 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x33004a38816050b Attempting to transition
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from M_ZK_REGION_OFFLINE to
> >> >>>>>>> RS_ZK_REGION_OPENING
> >> >>>>>>> 2011-06-29 16:37:12,711 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x33004a38816050b Successfully transitioned
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from M_ZK_REGION_OFFLINE to
> >> >>>>>>> RS_ZK_REGION_OPENING
> >> >>>>>>> 2011-06-29 16:37:12,711 DEBUG
> >> >>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
> >> >>>>>>> Opening region: REGION => {NAME =>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> 'gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.',
> >> >>>>>>> STARTKEY => 'GSLoad_1308518553_168_WEB204', ENDKEY =>
> >> >>>>>>> 'GSLoad_1308518810_1249_WEB204', ENCODED =>
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309, TABLE => {{NAME =>
> >> >>> 'gs_raw_events',
> >> >>>>>>> FAMILIES => [{NAME => 'events', BLOOMFILTER => 'NONE',
> >> >>>>> REPLICATION_SCOPE
> >> >>>>>> =>
> >> >>>>>>> '1', VERSIONS => '3', COMPRESSION => 'LZO', TTL => '604800',
> >> >>>> BLOCKSIZE
> >> >>>>> =>
> >> >>>>>>> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
> >> >>>>>>> 2011-06-29 16:37:12,711 DEBUG
> >> >>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
> >> >>>>>>> Instantiated
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 16:37:12,847 DEBUG
> >> >>>>> org.apache.hadoop.hbase.regionserver.Store:
> >> >>>>>>> loaded
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> hdfs://hadoop1-m1:8020/hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360,
> >> >>>>>>> isReference=false, isBulkLoadResult=false, seqid=1162228062,
> >> >>>>>>> majorCompaction=false
> >> >>>>>>> 2011-06-29 16:37:12,848 INFO
> >> >>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
> >> >>>>>>> Onlined
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.;
> >> >>>>>>> next sequenceid=1162228063
> >> >>>>>>> 2011-06-29 16:37:12,849 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x33004a38816050b Attempting to transition
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to
> >> >>>>>>> RS_ZK_REGION_OPENING
> >> >>>>>>> 2011-06-29 16:37:12,875 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x33004a38816050b Successfully transitioned
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to
> >> >>>>>>> RS_ZK_REGION_OPENING
> >> >>>>>>> 2011-06-29 16:37:12,951 INFO
> >> >>>>> org.apache.hadoop.hbase.catalog.MetaEditor:
> >> >>>>>>> Updated row
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> in region .META.,,1 with server=
> >> >>> hadoop1-s05.farm-ny.gigya.com:60020,
> >> >>>>>>> startcode=1307349217076
> >> >>>>>>> 2011-06-29 16:37:12,951 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x33004a38816050b Attempting to transition
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to
> >> >>>>>>> RS_ZK_REGION_OPENED
> >> >>>>>>> 2011-06-29 16:37:12,964 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x33004a38816050b Successfully transitioned
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to
> >> >>>>>>> RS_ZK_REGION_OPENED
> >> >>>>>>> 2011-06-29 16:37:12,964 DEBUG
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler:
> >> >>>> Opened
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 16:40:00,878 INFO
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> >> >>> request
> >> >>>> to
> >> >>>>>>> open
> >> >>>>>>> region:
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 16:40:00,878 DEBUG
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler:
> >> >>>>>> Processing
> >> >>>>>>> open of
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 16:40:01,079 WARN
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler:
> >> >>>>> Attempted
> >> >>>>>>> open of
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> but already online on this server
> >> >>>>>>> 2011-06-29 16:43:50,395 DEBUG
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.CompactSplitThread:
> >> >> Compaction
> >> >>>>>> (major)
> >> >>>>>>> requested for
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> because User-triggered major compaction; priority=1, compaction
> >> >>> queue
> >> >>>>>>> size=1248
> >> >>>>>>> 2011-06-29 20:19:49,906 INFO
> >> >>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
> >> >>>>>>> Starting major compaction on region
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 20:19:49,906 INFO
> >> >>>>> org.apache.hadoop.hbase.regionserver.Store:
> >> >>>>>>> Started compaction of 1 file(s) in cf=events  into
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> hdfs://hadoop1-m1:8020/hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/.tmp,
> >> >>>>>>> seqid=1162228062, totalSize=98.3m
> >> >>>>>>> 2011-06-29 20:19:49,906 DEBUG
> >> >>>>> org.apache.hadoop.hbase.regionserver.Store:
> >> >>>>>>> Compacting
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> hdfs://hadoop1-m1:8020/hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360,
> >> >>>>>>> keycount=6882816, bloomtype=NONE, size=98.3m
> >> >>>>>>> 2011-06-29 20:19:59,920 INFO
> >> >>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
> >> >>>>>>> completed compaction on region
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> after 10sec
> >> >>>>>>>
> >> >>>>>>> And here is the one from hadoop1-s02:
> >> >>>>>>> 2011-06-29 16:43:57,935 INFO
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Received
> >> >>> request
> >> >>>> to
> >> >>>>>>> open
> >> >>>>>>> region:
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 16:43:58,990 DEBUG
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler:
> >> >>>>>> Processing
> >> >>>>>>> open of
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 16:43:58,990 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x23004a31d8904de Attempting to transition
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from M_ZK_REGION_OFFLINE to
> >> >>>>>>> RS_ZK_REGION_OPENING
> >> >>>>>>> 2011-06-29 16:43:59,002 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x23004a31d8904de Successfully transitioned
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from M_ZK_REGION_OFFLINE to
> >> >>>>>>> RS_ZK_REGION_OPENING
> >> >>>>>>> 2011-06-29 16:43:59,002 DEBUG
> >> >>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
> >> >>>>>>> Opening region: REGION => {NAME =>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> 'gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.',
> >> >>>>>>> STARTKEY => 'GSLoad_1308518553_168_WEB204', ENDKEY =>
> >> >>>>>>> 'GSLoad_1308518810_1249_WEB204', ENCODED =>
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309, TABLE => {{NAME =>
> >> >>> 'gs_raw_events',
> >> >>>>>>> FAMILIES => [{NAME => 'events', BLOOMFILTER => 'NONE',
> >> >>>>> REPLICATION_SCOPE
> >> >>>>>> =>
> >> >>>>>>> '1', VERSIONS => '3', COMPRESSION => 'LZO', TTL => '604800',
> >> >>>> BLOCKSIZE
> >> >>>>> =>
> >> >>>>>>> '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}}
> >> >>>>>>> 2011-06-29 16:43:59,003 DEBUG
> >> >>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
> >> >>>>>>> Instantiated
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> 2011-06-29 16:43:59,204 DEBUG
> >> >>>>> org.apache.hadoop.hbase.regionserver.Store:
> >> >>>>>>> loaded
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> hdfs://hadoop1-m1:8020/hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360,
> >> >>>>>>> isReference=false, isBulkLoadResult=false, seqid=1162228062,
> >> >>>>>>> majorCompaction=false
> >> >>>>>>> 2011-06-29 16:43:59,205 INFO
> >> >>>>>> org.apache.hadoop.hbase.regionserver.HRegion:
> >> >>>>>>> Onlined
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.;
> >> >>>>>>> next sequenceid=1162228063
> >> >>>>>>> 2011-06-29 16:43:59,205 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x23004a31d8904de Attempting to transition
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to
> >> >>>>>>> RS_ZK_REGION_OPENING
> >> >>>>>>> 2011-06-29 16:43:59,212 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x23004a31d8904de Successfully transitioned
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to
> >> >>>>>>> RS_ZK_REGION_OPENING
> >> >>>>>>> 2011-06-29 16:43:59,214 INFO
> >> >>>>> org.apache.hadoop.hbase.catalog.MetaEditor:
> >> >>>>>>> Updated row
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> in region .META.,,1 with server=
> >> >>> hadoop1-s02.farm-ny.gigya.com:60020,
> >> >>>>>>> startcode=1306919627544
> >> >>>>>>> 2011-06-29 16:43:59,214 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x23004a31d8904de Attempting to transition
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to
> >> >>>>>>> RS_ZK_REGION_OPENED
> >> >>>>>>> 2011-06-29 16:43:59,224 DEBUG
> >> >>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>> regionserver:60020-0x23004a31d8904de Successfully transitioned
> >> >> node
> >> >>>>>>> 584dac5cc70d8682f71c4675a843c309 from RS_ZK_REGION_OPENING to
> >> >>>>>>> RS_ZK_REGION_OPENED
> >> >>>>>>> 2011-06-29 16:43:59,224 DEBUG
> >> >>>>>>> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler:
> >> >>>> Opened
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>> java.io.IOException: Got error in response to OP_READ_BLOCK
> >> >> self=/
> >> >>>>>>> 10.1.104.2:33356, remote=/10.1.104.2:50010 for file
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> /hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360
> >> >>>>>>> for block 3674866614142268536_674205
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> Ted, can you please point me to J-D's bug fix that you mentioned? Are
> >> >>>>>>> you positive it's the same scenario? Data loss is a very serious
> >> >>>>>>> problem for a DB.
> >> >>>>>>> I'd really like to apply that patch ASAP, because when I run hbck I get
> >> >>>>>>> over 400 regions which are multiply assigned.
> >> >>>>>>> Last question: I understand the region's data is lost, but is there a
> >> >>>>>>> way to at least make the table consistent again by somehow removing the
> >> >>>>>>> lost region?
> >> >>>>>>>
> >> >>>>>>> Thanks.
> >> >>>>>>>
> >> >>>>>>> -eran
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> On Sat, Jul 2, 2011 at 01:46, Ted Yu <[email protected]>
> >> >> wrote:
> >> >>>>>>>
> >> >>>>>>>>>> 2011-06-29 16:43:57,880 INFO
> >> >>>>>>>> org.apache.hadoop.hbase.
> >> >>>>>>>> master.AssignmentManager: Region has been
> >> >>>>>>>> PENDING_OPEN for too long, reassigning
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> region=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>
> >> >>>>>>>> The double assignment should have been fixed by J-D's recent checkin.
> >> >>>>>>>>
> >> >>>>>>>> On Fri, Jul 1, 2011 at 3:14 PM, Stack <[email protected]>
> >> >> wrote:
> >> >>>>>>>>
> >> >>>>>>>>> Is
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>> the region that was having the issue? If so, if you looked in
> >> >>>>>>>>> hadoop1-s05's logs, was this region opened around 2011-06-29 16:43:57?
> >> >>>>>>>>> Was it also opened on hadoop1-s02 not long after? Did you say what
> >> >>>>>>>>> version of HBase you are on?
> >> >>>>>>>>>
> >> >>>>>>>>> St.Ack
> >> >>>>>>>>>
> >> >>>>>>>>> On Fri, Jul 1, 2011 at 5:08 AM, Eran Kutner <[email protected]>
> >> >>>>> wrote:
> >> >>>>>>>>>> Hi Stack,
> >> >>>>>>>>>> I'm not sure what the log means. I do see references to two different
> >> >>>>>>>>>> servers, but that would probably happen if there was a normal
> >> >>>>>>>>>> transition, I assume. I'm using version 0.90.3.
> >> >>>>>>>>>> Here are the relevant lines from the master logs:
> >> >>>>>>>>>>
> >> >>>>>>>>>> 2011-06-19 21:39:37,164 INFO
> >> >>>>>>>>> org.apache.hadoop.hbase.master.ServerManager:
> >> >>>>>>>>>> Received REGION_SPLIT:
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533691659.9000a5d8df9502efc90d2c23567e4658.:
> >> >>>>>>>>>> Daughters;
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.,
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518810_1249_WEB204,1308533970928.46f876a4e97be04edb35eb8f8959d482.
> >> >>>>>>>>>> from hadoop1-s05.farm-ny.gigya.com,60020,1307349217076
> >> >>>>>>>>>> 2011-06-19 21:43:12,983 INFO
> >> >>>>>>>> org.apache.hadoop.hbase.catalog.MetaEditor:
> >> >>>>>>>>>> Deleted daughter reference
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.,
> >> >>>>>>>>>> qualifier=splitA, from parent
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533691659.9000a5d8df9502efc90d2c23567e4658.
> >> >>>>>>>>>> 2011-06-29 16:29:36,143 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Starting
> >> >>>>>>> unassignment
> >> >>>>>>>>> of
> >> >>>>>>>>>> region
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> (offlining)
> >> >>>>>>>>>> 2011-06-29 16:29:36,146 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Sent
> >> >> CLOSE
> >> >>> to
> >> >>>>>>>>> serverName=
> >> >>>>>>>>>> hadoop1-s05.farm-ny.gigya.com,60020,1307349217076,
> >> >>>>>> load=(requests=0,
> >> >>>>>>>>>> regions=1654, usedHeap=1870, maxHeap=12483) for region
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> 2011-06-29 16:29:38,327 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> >> >>> new
> >> >>>>>>>> unassigned
> >> >>>>>>>>>> node: /hbase/unassigned/584dac5cc70d8682f71c4675a843c309
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> (region=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.,
> >> >>>>>>>>>> server=hadoop1-s05.farm-ny.gigya.com,60020,1307349217076,
> >> >>>>>>>>>> state=RS_ZK_REGION_CLOSED)
> >> >>>>>>>>>> 2011-06-29 16:29:38,327 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> >> >>>>>>>>>> transition=RS_ZK_REGION_CLOSED,
> >> >>>>>>>>>> server=hadoop1-s05.farm-ny.gigya.com,60020,1307349217076,
> >> >>>>>>>>>> region=584dac5cc70d8682f71c4675a843c309
> >> >>>>>>>>>> 2011-06-29 16:30:53,742 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.handler.ClosedRegionHandler:
> >> >>>>>> Handling
> >> >>>>>>>>> CLOSED
> >> >>>>>>>>>> event for 584dac5cc70d8682f71c4675a843c309
> >> >>>>>>>>>> 2011-06-29 16:30:53,742 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Table
> >> >> being
> >> >>>>>>> disabled
> >> >>>>>>>> so
> >> >>>>>>>>>> deleting ZK node and removing from regions in transition,
> >> >>>>> skipping
> >> >>>>>>>>>> assignment of region
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> 2011-06-29 16:30:53,742 DEBUG
> >> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>>>>> master:60000-0x13004a31d7804c4 Deleting existing unassigned
> >> >>>> node
> >> >>>>>> for
> >> >>>>>>>>>> 584dac5cc70d8682f71c4675a843c309 that is in expected state
> >> >>>>>>>>>> RS_ZK_REGION_CLOSED
> >> >>>>>>>>>> 2011-06-29 16:30:53,801 DEBUG
> >> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>>>>> master:60000-0x13004a31d7804c4 Successfully deleted
> >> >>> unassigned
> >> >>>>> node
> >> >>>>>>> for
> >> >>>>>>>>>> region 584dac5cc70d8682f71c4675a843c309 in expected state
> >> >>>>>>>>>> RS_ZK_REGION_CLOSED
> >> >>>>>>>>>> 2011-06-29 16:34:01,453 INFO
> >> >>>>>>>> org.apache.hadoop.hbase.catalog.MetaEditor:
> >> >>>>>>>>>> Updated region
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> in META
> >> >>>>>>>>>> 2011-06-29 16:37:12,247 DEBUG
> >> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>>>>> master:60000-0x13004a31d7804c4 Creating (or updating)
> >> >>>> unassigned
> >> >>>>>> node
> >> >>>>>>>> for
> >> >>>>>>>>>> 584dac5cc70d8682f71c4675a843c309 with OFFLINE state
> >> >>>>>>>>>> 2011-06-29 16:37:12,576 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: No
> >> >> previous
> >> >>>>>>>> transition
> >> >>>>>>>>>> plan was found (or we are ignoring an existing plan) for
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> so generated a random one;
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> hri=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.,
> >> >>>>>>>>>> src=, dest=hadoop1-s05.farm-ny.gigya.com
> >> >>> ,60020,1307349217076;
> >> >>>> 5
> >> >>>>>>>>> (online=5,
> >> >>>>>>>>>> exclude=null) available servers
> >> >>>>>>>>>> 2011-06-29 16:37:12,576 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Assigning
> >> >>>>> region
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> to hadoop1-s05.farm-ny.gigya.com,60020,1307349217076
> >> >>>>>>>>>> 2011-06-29 16:37:13,102 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> >> >>>>>>>>>> transition=RS_ZK_REGION_OPENED,
> >> >>>>>>>>>> server=hadoop1-s05.farm-ny.gigya.com,60020,1307349217076,
> >> >>>>>>>>>> region=584dac5cc70d8682f71c4675a843c309
> >> >>>>>>>>>> 2011-06-29 16:39:54,075 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler:
> >> >>>>>> Handling
> >> >>>>>>>>> OPENED
> >> >>>>>>>>>> event for 584dac5cc70d8682f71c4675a843c309; deleting
> >> >>> unassigned
> >> >>>>>> node
> >> >>>>>>>>>> 2011-06-29 16:39:54,075 DEBUG
> >> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>>>>> master:60000-0x13004a31d7804c4 Deleting existing unassigned
> >> >>>> node
> >> >>>>>> for
> >> >>>>>>>>>> 584dac5cc70d8682f71c4675a843c309 that is in expected state
> >> >>>>>>>>>> RS_ZK_REGION_OPENED
> >> >>>>>>>>>> 2011-06-29 16:39:54,192 DEBUG
> >> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>>>>> master:60000-0x13004a31d7804c4 Successfully deleted
> >> >>> unassigned
> >> >>>>> node
> >> >>>>>>> for
> >> >>>>>>>>>> region 584dac5cc70d8682f71c4675a843c309 in expected state
> >> >>>>>>>>>> RS_ZK_REGION_OPENED
> >> >>>>>>>>>> 2011-06-29 16:39:54,326 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler:
> >> >>>>> Opened
> >> >>>>>>>> region
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> on hadoop1-s05.farm-ny.gigya.com,60020,1307349217076
> >> >>>>>>>>>> 2011-06-29 16:40:00,598 DEBUG
> >> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>>>>> master:60000-0x13004a31d7804c4 Creating (or updating)
> >> >>>> unassigned
> >> >>>>>> node
> >> >>>>>>>> for
> >> >>>>>>>>>> 584dac5cc70d8682f71c4675a843c309 with OFFLINE state
> >> >>>>>>>>>> 2011-06-29 16:40:00,877 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: No
> >> >> previous
> >> >>>>>>>> transition
> >> >>>>>>>>>> plan was found (or we are ignoring an existing plan) for
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> so generated a random one;
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> hri=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.,
> >> >>>>>>>>>> src=, dest=hadoop1-s05.farm-ny.gigya.com
> >> >>> ,60020,1307349217076;
> >> >>>> 5
> >> >>>>>>>>> (online=5,
> >> >>>>>>>>>> exclude=null) available servers
> >> >>>>>>>>>> 2011-06-29 16:40:00,877 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Assigning
> >> >>>>> region
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> to hadoop1-s05.farm-ny.gigya.com,60020,1307349217076
> >> >>>>>>>>>> 2011-06-29 16:43:57,879 INFO
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Regions
> >> >> in
> >> >>>>>>> transition
> >> >>>>>>>>>> timed out:
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> state=PENDING_OPEN, ts=1309380052723
> >> >>>>>>>>>> 2011-06-29 16:43:57,880 INFO
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Region
> >> >> has
> >> >>>> been
> >> >>>>>>>>>> PENDING_OPEN for too long, reassigning
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> region=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> 2011-06-29 16:43:57,936 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Forcing
> >> >>>>> OFFLINE;
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> was=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> state=PENDING_OPEN, ts=1309380052723
> >> >>>>>>>>>> 2011-06-29 16:43:57,936 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: No
> >> >> previous
> >> >>>>>>>> transition
> >> >>>>>>>>>> plan was found (or we are ignoring an existing plan) for
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> so generated a random one;
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> hri=gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.,
> >> >>>>>>>>>> src=, dest=hadoop1-s02.farm-ny.gigya.com
> >> >>> ,60020,1306919627544;
> >> >>>> 5
> >> >>>>>>>>> (online=5,
> >> >>>>>>>>>> exclude=null) available servers
> >> >>>>>>>>>> 2011-06-29 16:43:57,936 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Assigning
> >> >>>>> region
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> to hadoop1-s02.farm-ny.gigya.com,60020,1306919627544
> >> >>>>>>>>>> 2011-06-29 16:43:59,022 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> >> >>>>>>>>>> transition=RS_ZK_REGION_OPENING,
> >> >>>>>>>>>> server=hadoop1-s02.farm-ny.gigya.com,60020,1306919627544,
> >> >>>>>>>>>> region=584dac5cc70d8682f71c4675a843c309
> >> >>>>>>>>>> 2011-06-29 16:43:59,221 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> >> >>>>>>>>>> transition=RS_ZK_REGION_OPENING,
> >> >>>>>>>>>> server=hadoop1-s02.farm-ny.gigya.com,60020,1306919627544,
> >> >>>>>>>>>> region=584dac5cc70d8682f71c4675a843c309
> >> >>>>>>>>>> 2011-06-29 16:43:59,226 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager: Handling
> >> >>>>>>>>>> transition=RS_ZK_REGION_OPENED,
> >> >>>>>>>>>> server=hadoop1-s02.farm-ny.gigya.com,60020,1306919627544,
> >> >>>>>>>>>> region=584dac5cc70d8682f71c4675a843c309
> >> >>>>>>>>>> 2011-06-29 16:43:59,274 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler:
> >> >>>>>> Handling
> >> >>>>>>>>> OPENED
> >> >>>>>>>>>> event for 584dac5cc70d8682f71c4675a843c309; deleting
> >> >>> unassigned
> >> >>>>>> node
> >> >>>>>>>>>> 2011-06-29 16:43:59,274 DEBUG
> >> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>>>>> master:60000-0x13004a31d7804c4 Deleting existing unassigned
> >> >>>> node
> >> >>>>>> for
> >> >>>>>>>>>> 584dac5cc70d8682f71c4675a843c309 that is in expected state
> >> >>>>>>>>>> RS_ZK_REGION_OPENED
> >> >>>>>>>>>> 2011-06-29 16:43:59,296 DEBUG
> >> >>>>>>>> org.apache.hadoop.hbase.zookeeper.ZKAssign:
> >> >>>>>>>>>> master:60000-0x13004a31d7804c4 Successfully deleted
> >> >>> unassigned
> >> >>>>> node
> >> >>>>>>> for
> >> >>>>>>>>>> region 584dac5cc70d8682f71c4675a843c309 in expected state
> >> >>>>>>>>>> RS_ZK_REGION_OPENED
> >> >>>>>>>>>> 2011-06-29 16:43:59,375 WARN
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.AssignmentManager:
> >> >> Overwriting
> >> >>>>>>>>>> 584dac5cc70d8682f71c4675a843c309 on
> >> >>>>>>>>>> serverName=hadoop1-s05.farm-ny.gigya.com
> >> >>> ,60020,1307349217076,
> >> >>>>>>>>>> load=(requests=0, regions=1273, usedHeap=2676,
> >> >> maxHeap=12483)
> >> >>>>>>>>>> 2011-06-29 16:43:59,375 DEBUG
> >> >>>>>>>>>> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler:
> >> >>>>> Opened
> >> >>>>>>>> region
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> gs_raw_events,GSLoad_1308518553_168_WEB204,1308533970928.584dac5cc70d8682f71c4675a843c309.
> >> >>>>>>>>>> on hadoop1-s02.farm-ny.gigya.com,60020,1306919627544
> >> >>>>>>>>>>
> >> >>>>>>>>>> Thanks.
> >> >>>>>>>>>>
> >> >>>>>>>>>> -eran
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>> On Fri, Jul 1, 2011 at 09:05, Stack <[email protected]>
> >> >>> wrote:
> >> >>>>>>>>>>
> >> >>>>>>>>>>> So, Eran, it seems as though two RegionServers were carrying the
> >> >>>>>>>>>>> region? One deleted a file (compaction on its side)? Can you figure
> >> >>>>>>>>>>> out if indeed two servers had the same region? (Check the master
> >> >>>>>>>>>>> logs for this region's assignments.)
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> What version of hbase?
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> St.Ack
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> On Thu, Jun 30, 2011 at 3:58 AM, Eran Kutner <
> >> >>> [email protected]>
> >> >>>>>>> wrote:
> >> >>>>>>>>>>>> Hi,
> >> >>>>>>>>>>>> I have a cluster of 5 nodes with one large table that currently has
> >> >>>>>>>>>>>> around 12000 regions. Everything was working fine for a relatively
> >> >>>>>>>>>>>> long time, until now.
> >> >>>>>>>>>>>> Yesterday I significantly reduced the TTL on the table and initiated
> >> >>>>>>>>>>>> a major compaction. This should have reduced the table size to about
> >> >>>>>>>>>>>> 20% of its original size.
> >> >>>>>>>>>>>> Today, I'm getting errors about inaccessible files on HDFS, for
> >> >>>>>>>>>>>> example:
> >> >>>>>>>>>>>> java.io.IOException: Got error in response to
> >> >>> OP_READ_BLOCK
> >> >>>>>> self=/
> >> >>>>>>>>>>>> 10.1.104.2:58047, remote=/10.1.104.2:50010 for file
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> /hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/events/1971818821800304360
> >> >>>>>>>>>>>> for block 3674866614142268536_674205
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1487)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1811)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>
> >> >>>>>
> >> >>>
> >>
> org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1948)
> >> >>>>>>>>>>>>       at
> >> >>>>> java.io.DataInputStream.read(DataInputStream.java:132)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:105)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>
> >> >> java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>
> >> >> java.io.BufferedInputStream.read(BufferedInputStream.java:237)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:128)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:68)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>
> >> >> java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>
> >> >> java.io.BufferedInputStream.read(BufferedInputStream.java:317)
> >> >>>>>>>>>>>>       at
> >> >>>>>> org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:102)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>
> >> >>>>>
> >> >>>
> >>
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1094)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>
> >> >>>>>
> >> >>>
> >> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:1036)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1433)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:139)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:96)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:77)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>
> >> >>>>>
> >> >>
> org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1341)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.<init>(HRegion.java:2269)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.regionserver.HRegion.instantiateInternalScanner(HRegion.java:1126)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1118)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1102)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1781)
> >> >>>>>>>>>>>>       at
> >> >>>> sun.reflect.GeneratedMethodAccessor46.invoke(Unknown
> >> >>>>>>>> Source)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >> >>>>>>>>>>>>       at
> >> >> java.lang.reflect.Method.invoke(Method.java:597)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>
> >> >> org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
> >> >>>>>>>>>>>>       at
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> I checked, and the file indeed doesn't exist on HDFS. Here are the
> >> >>>>>>>>>>>> name node logs for this block; apparently it was deleted:
> >> >>>>>>>>>>>> 2011-06-19 21:39:36,651 INFO
> >> >>>>> org.apache.hadoop.hdfs.StateChange:
> >> >>>>>>>>> BLOCK*
> >> >>>>>>>>>>>> NameSystem.allocateBlock:
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> /hbase/gs_raw_events/584dac5cc70d8682f71c4675a843c309/.tmp/2096863423111131624.
> >> >>>>>>>>>>>> blk_3674866614142268536_674205
> >> >>>>>>>>>>>> 2011-06-19 21:40:11,954 INFO
> >> >>>>> org.apache.hadoop.hdfs.StateChange:
> >> >>>>>>>>> BLOCK*
> >> >>>>>>>>>>>> NameSystem.addStoredBlock: blockMap updated:
> >> >>>> 10.1.104.2:50010
> >> >>>>> is
> >> >>>>>>>>> added
> >> >>>>>>>>>>> to
> >> >>>>>>>>>>>> blk_3674866614142268536_674205 size 67108864
> >> >>>>>>>>>>>> 2011-06-19 21:40:11,954 INFO
> >> >>>>> org.apache.hadoop.hdfs.StateChange:
> >> >>>>>>>>> BLOCK*
> >> >>>>>>>>>>>> NameSystem.addStoredBlock: blockMap updated:
> >> >>>> 10.1.104.3:50010
> >> >>>>> is
> >> >>>>>>>>> added
> >> >>>>>>>>>>> to
> >> >>>>>>>>>>>> blk_3674866614142268536_674205 size 67108864
> >> >>>>>>>>>>>> 2011-06-19 21:40:11,955 INFO
> >> >>>>> org.apache.hadoop.hdfs.StateChange:
> >> >>>>>>>>> BLOCK*
> >> >>>>>>>>>>>> NameSystem.addStoredBlock: blockMap updated:
> >> >>>> 10.1.104.5:50010
> >> >>>>> is
> >> >>>>>>>>> added
> >> >>>>>>>>>>> to
> >> >>>>>>>>>>>> blk_3674866614142268536_674205 size 67108864
> >> >>>>>>>>>>>> 2011-06-29 20:20:01,662 INFO
> >> >>>>> org.apache.hadoop.hdfs.StateChange:
> >> >>>>>>>>> BLOCK*
> >> >>>>>>>>>>> ask
> >> >>>>>>>>>>>> 10.1.104.2:50010 to delete
> >> >>> blk_3674866614142268536_674205
> >> >>>>>>>>>>>> 2011-06-29 20:20:13,671 INFO
> >> >>>>> org.apache.hadoop.hdfs.StateChange:
> >> >>>>>>>>> BLOCK*
> >> >>>>>>>>>>> ask
> >> >>>>>>>>>>>> 10.1.104.5:50010 to delete
> >> >>> blk_-4056387895369608597_675174
> >> >>>>>>>>>>>> blk_-5017882805850873821_672281
> >> >>>> blk_702373987100607684_672288
> >> >>>>>>>>>>>> blk_-5357157478043290010_668506
> >> >>>> blk_7118175133735412789_674903
> >> >>>>>>>>>>>> blk_-3569812563715986384_675231
> >> >>>> blk_8296855057240604851_669285
> >> >>>>>>>>>>>> blk_-6483679172530609101_674268
> >> >>>> blk_8738539715363739108_673682
> >> >>>>>>>>>>>> blk_1744841904626813502_675238
> >> >>>> blk_-6035315106100051103_674266
> >> >>>>>>>>>>>> blk_-1789501623010070237_674908
> >> >>>> blk_1944054629336265129_673689
> >> >>>>>>>>>>>> blk_3674866614142268536_674205
> >> >>>> blk_7930425446738143892_647410
> >> >>>>>>>>>>>> blk_-3007186753042268449_669296
> >> >>>>> blk_-5482302621772778061_647416
> >> >>>>>>>>>>>> blk_-3765735404924932181_672004
> >> >>>> blk_7476090998956811081_675169
> >> >>>>>>>>>>>> blk_7862291659285127712_646890
> >> >>>> blk_-2666244746343584727_672013
> >> >>>>>>>>>>>> blk_6039172613960915602_674206
> >> >>>> blk_-8470884397893086564_646899
> >> >>>>>>>>>>>> blk_4558230221166712802_668510
> >> >>>>>>>>>>>> 2011-06-29 20:20:46,698 INFO
> >> >>>>> org.apache.hadoop.hdfs.StateChange:
> >> >>>>>>>>> BLOCK*
> >> >>>>>>>>>>> ask
> >> >>>>>>>>>>>> 10.1.104.3:50010 to delete
> >> >>> blk_-7851606440036350812_671552
> >> >>>>>>>>>>>> blk_9214649160203453845_647566
> >> >>> blk_702373987100607684_672288
> >> >>>>>>>>>>>> blk_5958099369749234073_668143
> >> >>>> blk_-5172218034084903173_673109
> >> >>>>>>>>>>>> blk_-2934555181472719276_646476
> >> >>>>> blk_-1409986679370073931_672552
> >> >>>>>>>>>>>> blk_-2786034325506235869_669086
> >> >>>> blk_3674866614142268536_674205
> >> >>>>>>>>>>>> blk_510158930393283118_673225
> >> >>> blk_916244738216205237_677068
> >> >>>>>>>>>>>> blk_-4317027806407316617_670379
> >> >>>> blk_8555705688850972639_673485
> >> >>>>>>>>>>>> blk_-3765735404924932181_672004
> >> >>>>> blk_-5482302621772778061_647416
> >> >>>>>>>>>>>> blk_-2461801145731752623_674605
> >> >>>>> blk_-8737702908048998927_672549
> >> >>>>>>>>>>>> blk_-8470884397893086564_646899
> >> >>>> blk_4558230221166712802_668510
> >> >>>>>>>>>>>> blk_-4056387895369608597_675174
> >> >>>>> blk_-8675430610673886073_647695
> >> >>>>>>>>>>>> blk_-6642870230256028318_668211
> >> >>>>> blk_-3890408516362176771_677483
> >> >>>>>>>>>>>> blk_-3569812563715986384_675231
> >> >>>>> blk_-5007142629771321873_674548
> >> >>>>>>>>>>>> blk_-3345355191863431669_667066
> >> >>>> blk_8296855057240604851_669285
> >> >>>>>>>>>>>> blk_-6595462308187757470_672420
> >> >>>>> blk_-2583945228783203947_674607
> >> >>>>>>>>>>>> blk_-346988625120916345_677063
> >> >>>> blk_4449525876338684218_674496
> >> >>>>>>>>>>>> blk_2617172363857549730_668201
> >> >>>> blk_8738539715363739108_673682
> >> >>>>>>>>>>>> blk_-208904675456598428_679286
> >> >>>> blk_-497549341281882641_646477
> >> >>>>>>>>>>>> blk_-6035315106100051103_674266
> >> >>>>> blk_-2356539038067297411_672388
> >> >>>>>>>>>>>> blk_-3881703084497103249_668137
> >> >>>> blk_2214397881104950315_646643
> >> >>>>>>>>>>>> blk_-5907671443455357710_673223
> >> >>>>> blk_-2431880309956605679_669204
> >> >>>>>>>>>>>> blk_6039172613960915602_674206
> >> >>>> blk_5053643911633142711_669194
> >> >>>>>>>>>>>> blk_-2636977729205236686_674664
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> I assume the file loss is somehow related to this change and the
> >> >>>>>>>>>>>> major compaction that followed, because the same scan that is
> >> >>>>>>>>>>>> failing now was working fine yesterday, and that is the only change
> >> >>>>>>>>>>>> that happened on the cluster. Any suggestions on what to do now?
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> Thanks.
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>> -eran
> >> >>>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>
> >> >>>>>>
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>
> >>
> >
> >
>
