[jira] [Commented] (HBASE-9582) MapReduce Scan gives different output every times

Nick Dimiduk (JIRA) Fri, 27 Sep 2013 09:54:31 -0700

    [ https://issues.apache.org/jira/browse/HBASE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780088#comment-13780088 ]


Nick Dimiduk commented on HBASE-9582:
-------------------------------------

The reduced level of scanner caching results in more, smaller RPCs. This has 
the effect of throttling the client throughput and easing the concentrated 
burden on the RS.
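
As a rough back-of-the-envelope sketch of that trade-off, assuming each scanner
call returns at most "caching" rows (the row count below is hypothetical, not
taken from this report, and the class and method names are illustrative only):

```java
public class ScanRpcEstimate {
    // Approximate number of next() calls needed to stream `rows` rows
    // when each call returns at most `caching` rows (ceiling division).
    static long rpcCount(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical row count, for illustration only
        for (int caching : new int[] {10, 50, 500}) {
            System.out.println("caching=" + caching + " -> ~"
                    + rpcCount(rows, caching) + " RPCs");
        }
    }
}
```

Each call does less work on the RS when caching is lower, at the cost of more
round trips for the client.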

To solve the correctness problem this issue describes, I suggest that 
TableMapper be pedantic about failures and fail the task rather than produce 
incomplete results.
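
To illustrate the difference between tolerating and surfacing a read failure,
here is a minimal plain-Java sketch. All names (RowSource, flakySource,
countLenient, countStrict) are hypothetical stand-ins; this is not the
TableMapper or HBase client code, and it does not claim to reproduce the
actual cause of this issue:

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FailFastReader {
    /** Hypothetical stand-in for a scanner: next row, or null at end of input. */
    interface RowSource {
        String next() throws IOException;
    }

    /** Lenient: treats a failure as end of input, so rows go missing silently. */
    static int countLenient(RowSource source) {
        int count = 0;
        try {
            while (source.next() != null) {
                count++;
            }
        } catch (IOException e) {
            // Swallowed: the caller sees a short but "successful" result.
        }
        return count;
    }

    /** Strict: propagates the failure so the caller (e.g. a task) can fail. */
    static int countStrict(RowSource source) throws IOException {
        int count = 0;
        while (source.next() != null) {
            count++;
        }
        return count;
    }

    /** Source over `rows` that throws once `failAfter` rows have been served. */
    static RowSource flakySource(List<String> rows, int failAfter) {
        Iterator<String> it = rows.iterator();
        int[] served = {0};
        return () -> {
            if (served[0] == failAfter) {
                throw new IOException("simulated scanner failure");
            }
            if (!it.hasNext()) {
                return null;
            }
            served[0]++;
            return it.next();
        };
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("r1", "r2", "r3", "r4");
        System.out.println("lenient count: " + countLenient(flakySource(rows, 2)));
        try {
            countStrict(flakySource(rows, 2));
        } catch (IOException e) {
            System.out.println("strict reader failed: " + e.getMessage());
        }
    }
}
```

With the strict variant the framework can retry the failed task, so a job
either completes with full results or fails visibly.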

There may also be improvements to be made in the client-side scanning code. 
Thoughts?
                
> MapReduce Scan gives different output every times
> -------------------------------------------------
>
>                 Key: HBASE-9582
>                 URL: https://issues.apache.org/jira/browse/HBASE-9582
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.5
>         Environment: hadoop 1.0.3
>            Reporter: Igor Vikhrov
>
> I have this Scan
> Scan scan = new Scan();
> scan.setCaching(50);
> scan.setCacheBlocks(false);
> scan.setMaxVersions();
> scan.setTimeRange(Long.valueOf(args[7] + "000"), Long.valueOf(args[8] + "000"));
> SingleColumnValueFilter filter = new SingleColumnValueFilter(
>     Bytes.toBytes(args[1]), Bytes.toBytes(args[2]),
>     CompareFilter.CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes(args[3])));
> filter.setFilterIfMissing(true);
> scan.setFilter(filter);
> It works without any warnings or errors on the command line. But when the 
> regionservers' CPU is heavily loaded, a Scan with the same parameters 
> (column, value, timestamps) gives different results. For example:
> first time - Map output records=571374
> second time - Map output records=777620
> third time - Map output records=776099
> The regionserver logs include WARNs such as:
> 2013-09-19 13:29:44,827 WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":30759,"call":"next(-308003858163246780, 10), rpc version=1, client version=29, methodsFingerPrint=-1368823753","client":"10.10.54.22:53361","starttimems":1379582954067,"queuetimems":1,"class":"HRegionServer","responsesize":51343,"method":"next"}
> and these ERRORs:
> 2013-09-19 13:26:18,202 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call next(-9095740742796333934, 10), rpc version=1, client version=29, methodsFingerPrint=-1368823753 from 10.10.54.22:32914 after 60059 ms, since caller disconnected
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:436)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3723)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3643)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRaw(HRegion.java:3635)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:2483)
>         at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
>
> When the regionservers' CPU is not loaded, the Scan gives the same results 
> every time. In this case the regionserver logs don't include any WARNs.
> Why does this happen? I want to be sure that the Scan gives me all the data 
> I request, no matter how heavily the CPU is being used.

