The table is ~10 TB of SNAPPY-compressed data. I don't have a big enough maintenance window on production to re-insert all of it.

I don't know how we got those cells. I can only assume it was Phoenix and/or WAL replay after a region server crash.
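(For anyone picking this thread up later: a minimal single-process sketch of the scan-and-rewrite idea quoted below might look like the following. For a 10 TB table you would want the MapReduce form instead, but the per-cell logic is the same. The temp table name TRACET_TMP is a placeholder; only TRACET and the timestamp behaviour come from this thread.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class RewriteToTempTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table src = conn.getTable(TableName.valueOf("TRACET"));
         BufferedMutator dst = conn.getBufferedMutator(TableName.valueOf("TRACET_TMP"))) {
      Scan scan = new Scan();
      scan.setMaxVersions();        // copy every version, not just the latest
      scan.setCacheBlocks(false);   // full-table scan; keep the block cache clean
      try (ResultScanner scanner = src.getScanner(scan)) {
        for (Result row : scanner) {
          Put put = new Put(row.getRow());
          for (Cell cell : row.rawCells()) {
            // Cells stamped Long.MAX_VALUE hit the server-side
            // "replace LATEST_TIMESTAMP with now" logic on insert
            // (the HRegion.java#L3395 link quoted below); all other
            // cells keep their original timestamps.
            put.addColumn(CellUtil.cloneFamily(cell),
                          CellUtil.cloneQualifier(cell),
                          cell.getTimestamp(),
                          CellUtil.cloneValue(cell));
          }
          dst.mutate(put);
        }
      }
    }
  }
}

The BufferedMutator batches the puts client-side; in the MapReduce version this loop body would become the map function.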
> On 12 May 2020, at 18:25, Wellington Chevreuil <[email protected]> wrote:
>
> How large is this table? Can you afford to re-insert all current data into a
> new, temp table? If so, you could write a mapreduce job that scans this
> table and rewrites all its cells to this new, temp table. I had verified
> that 1.4.10 does have the timestamp-replacing logic here:
> https://github.com/apache/hbase/blob/branch-1.4/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java#L3395
>
> So if you re-insert all this table's cells into a new one, the timestamps
> would be inserted correctly and you would then be able to delete those.
> Now, how did those cells manage to get inserted with the max timestamp? Was
> this cluster running an old version that then got upgraded to 1.4.10?
>
> On Tue, 12 May 2020 at 13:49, Alexander Batyrshin <[email protected]> wrote:
>
>> Any ideas how to delete these rows?
>>
>> I see only this way:
>> - back up the data from the region that contains the "damaged" rows
>> - close the region
>> - remove the region files from HDFS
>> - assign the region
>> - copy the needed rows from the backup into the recreated region
>>
>>> On 30 Apr 2020, at 21:00, Alexander Batyrshin <[email protected]> wrote:
>>>
>>> The same effect for the whole CF:
>>>
>>> d = org.apache.hadoop.hbase.client.Delete.new("\x0439d58wj434dd".to_s.to_java_bytes)
>>> d.deleteFamily("d".to_s.to_java_bytes, 9223372036854775807.to_java(Java::long))
>>> table.delete(d)
>>>
>>> ROW                COLUMN+CELL
>>>  \x0439d58wj434dd  column=d:, timestamp=1588269277879, type=DeleteFamily
>>>
>>>> On 29 Apr 2020, at 18:30, Wellington Chevreuil <[email protected]> wrote:
>>>>
>>>> Well, it's weird that puts with such TS values were allowed, according to
>>>> the current state of the code. Can you afford to delete the whole CF for
>>>> those rows?
>>>>
>>>> On Wed, 29 Apr 2020 at 14:41, junhyeok park <[email protected]> wrote:
>>>>
>>>>> I've been through the same thing. I use 2.2.0.
>>>>>
>>>>> On Wed, 29 Apr 2020 at 22:32, Alexander Batyrshin <[email protected]> wrote:
>>>>>
>>>>>> As you can see in the example, I already tried the DELETE operation with
>>>>>> timestamp = Long.MAX_VALUE, without any success.
>>>>>>
>>>>>>> On 29 Apr 2020, at 12:41, Wellington Chevreuil <[email protected]> wrote:
>>>>>>>
>>>>>>> That's expected behaviour [1]. If you are "travelling to the future", you
>>>>>>> need to do a delete specifying Long.MAX_VALUE as the optional timestamp
>>>>>>> parameter in the delete operation [2]. If you don't specify a timestamp
>>>>>>> on the delete, it will assume the current time for the delete marker,
>>>>>>> which will be smaller than the Long.MAX_VALUE set on your cells, so
>>>>>>> scans wouldn't filter them out.
>>>>>>>
>>>>>>> [1] https://hbase.apache.org/book.html#version.delete
>>>>>>> [2] https://github.com/apache/hbase/blob/branch-1.4/hbase-client/src/main/java/org/apache/hadoop/hbase/client/Delete.java#L98
>>>>>>>
>>>>>>> On Wed, 29 Apr 2020 at 08:57, Alexander Batyrshin <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello all,
>>>>>>>> We have run into a strange situation: a table has rows with a
>>>>>>>> Long.MAX_VALUE timestamp.
>>>>>>>> These rows are impossible to delete, because the DELETE mutation uses
>>>>>>>> the System.currentTimeMillis() timestamp.
>>>>>>>> Is there any way to delete these rows?
>>>>>>>> We use HBase-1.4.10.
>>>>>>>>
>>>>>>>> Example:
>>>>>>>>
>>>>>>>> hbase(main):037:0> scan 'TRACET', { ROWPREFIXFILTER => "\x0439d58wj434dd", RAW => true, VERSIONS => 10 }
>>>>>>>> ROW                COLUMN+CELL
>>>>>>>>  \x0439d58wj434dd  column=d:_0, timestamp=9223372036854775807, value=x
>>>>>>>>
>>>>>>>> hbase(main):045:0* delete 'TRACET', "\x0439d58wj434dd", "d:_0"
>>>>>>>> 0 row(s) in 0.0120 seconds
>>>>>>>>
>>>>>>>> hbase(main):046:0> scan 'TRACET', { ROWPREFIXFILTER => "\x0439d58wj434dd", RAW => true, VERSIONS => 10 }
>>>>>>>> ROW                COLUMN+CELL
>>>>>>>>  \x0439d58wj434dd  column=d:_0, timestamp=9223372036854775807, value=x
>>>>>>>>  \x0439d58wj434dd  column=d:_0, timestamp=1588146570005, type=Delete
>>>>>>>>
>>>>>>>> hbase(main):047:0> delete 'TRACET', "\x0439d58wj434dd", "d:_0", 9223372036854775807
>>>>>>>> 0 row(s) in 0.0110 seconds
>>>>>>>>
>>>>>>>> hbase(main):048:0> scan 'TRACET', { ROWPREFIXFILTER => "\x0439d58wj434dd", RAW => true, VERSIONS => 10 }
>>>>>>>> ROW                COLUMN+CELL
>>>>>>>>  \x0439d58wj434dd  column=d:_0, timestamp=9223372036854775807, value=x
>>>>>>>>  \x0439d58wj434dd  column=d:_0, timestamp=1588146678086, type=Delete
>>>>>>>>  \x0439d58wj434dd  column=d:_0, timestamp=1588146570005, type=Delete
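(For completeness, the Java-client form of the Long.MAX_VALUE delete that [2] points at might look like the sketch below. The row-key construction, a raw 0x04 byte followed by "39d58wj434dd", is my reading of the \x04 prefix in the shell output; everything beyond the table, family, and qualifier names in the thread is an assumption. Note that on this 1.4.10 cluster the transcript above shows the marker still coming back stamped with the current server time, so even this form did not cover the Long.MAX_VALUE cell.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DeleteFutureCell {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("TRACET"))) {
      // Row key begins with the raw byte 0x04, as in the shell examples.
      byte[] row = Bytes.add(new byte[] { 0x04 }, Bytes.toBytes("39d58wj434dd"));
      // The Delete(byte[], long) constructor referenced in [2].
      Delete d = new Delete(row, Long.MAX_VALUE);
      // Target the specific cell version at Long.MAX_VALUE in d:_0
      // (addColumn deletes exactly one version at the given timestamp).
      d.addColumn(Bytes.toBytes("d"), Bytes.toBytes("_0"), Long.MAX_VALUE);
      table.delete(d);
    }
  }
}

The server-side rewrite of LATEST_TIMESTAMP on the delete marker appears to be why this route fails here, and why the scan-and-rewrite approach at the top of the thread ends up being the practical fix.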
