Re: Issues with import from 0.92 into 0.98

2015-05-27 Thread apache
So more experimentation over the long weekend on this. If I load sample data into the new cluster table manually through the shell, column filters work as expected. Obviously that is not a solution to the problem. Anyone have any ideas or things I should be looking at? The regionserver logs show ...

Re: Issues with import from 0.92 into 0.98

2015-05-27 Thread Dave Latham
It looks like the HBase shell (beginning with 0.96) parses column names as FAMILY:QUALIFIER[:FORMATTER] due to work from HBASE-6592. As a result, the shell basically doesn't support specifying any columns (for gets/puts/scans/etc.) that include a colon in the qualifier. I filed HBASE-13788. For ...
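
Since the shell splits column specifications on colons, there is no way to name such a qualifier from the shell; the Java client API is unaffected because it takes the family and qualifier as separate byte arrays. A minimal sketch of that workaround against the 0.98 client API (table and column names hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ColonQualifierScan {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // In the shell, COLUMNS => ['x:foo:bar'] would treat 'bar' as a
            // FORMATTER; here the family and qualifier never pass through
            // that parser, so a colon in the qualifier is harmless.
            HTable table = new HTable(conf, "content");  // hypothetical table
            try {
                Scan scan = new Scan();
                scan.addColumn(Bytes.toBytes("x"), Bytes.toBytes("foo:bar"));
                ResultScanner scanner = table.getScanner(scan);
                for (Result result : scanner) {
                    System.out.println(result);
                }
                scanner.close();
            } finally {
                table.close();
            }
        }
    }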

Re: Issues with import from 0.92 into 0.98

2015-05-27 Thread apache
On Wed, May 27, 2015, at 01:54 PM, Dave Latham wrote: It looks like the HBase shell (beginning with 0.96) parses column names as FAMILY:QUALIFIER[:FORMATTER] due to work from HBASE-6592. As a result, the shell basically doesn't support specifying any columns (for gets/puts/scans/etc.) that ...

Re: Issues with import from 0.92 into 0.98

2015-05-27 Thread Dave Latham
On Wed, May 27, 2015 at 11:17 AM, apa...@borkbork.net wrote: Thanks! I want to make sure I've got it right: When I import the 0.92 data into 0.98, the columns are defined properly in the 0.98 table, but I cannot perform a scan with a column filter in the shell as the shell interprets the ...

Re: Issues with import from 0.92 into 0.98

2015-05-27 Thread Dave Latham
Sounds like quite a puzzle. You mentioned that you can read data written through manual Puts from the shell - but not data from the Import. There must be something different about the data itself once it's in the table. Can you compare a row that was imported to a row that was manually written ...
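
One way to make that comparison is to fetch every stored version of each cell in a row and print the raw bytes, so families, qualifiers, timestamps, and values can be diffed directly. A sketch against the 0.98 client API (table and row keys hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellUtil;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class DumpRow {
        // Print every cell of a row, all versions, in a byte-exact form so an
        // imported row can be compared against a manually written one.
        static void dump(HTable table, String row) throws Exception {
            Get get = new Get(Bytes.toBytes(row));
            get.setMaxVersions();
            Result result = table.get(get);
            for (Cell cell : result.rawCells()) {
                System.out.println(
                    Bytes.toStringBinary(CellUtil.cloneFamily(cell)) + ":"
                    + Bytes.toStringBinary(CellUtil.cloneQualifier(cell))
                    + " ts=" + cell.getTimestamp()
                    + " value=" + Bytes.toStringBinary(CellUtil.cloneValue(cell)));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "content");  // hypothetical table
            dump(table, "imported-row");                 // hypothetical row keys
            dump(table, "manual-row");
            table.close();
        }
    }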

Re: Issues with import from 0.92 into 0.98

2015-05-27 Thread Nick Dimiduk
Scanning without the column filter produces data? The content table on the new cluster has the same column family names ('x', in your example above)? On Wed, May 27, 2015 at 8:35 AM, Dave Latham lat...@davelink.net wrote: Sounds like quite a puzzle. You mentioned that you can read data ...

Re: Issues with import from 0.92 into 0.98

2015-05-27 Thread apache
On Wed, May 27, 2015, at 11:41 AM, Nick Dimiduk wrote: Scanning without the column filter produces data? The content table on the new cluster has the same column family names ('x', in your example above)? Yes, if I scan without a column filter (and I should probably try some other filters ...

Re: Issues with import from 0.92 into 0.98

2015-05-27 Thread apache
On Wed, May 27, 2015, at 11:35 AM, Dave Latham wrote: Sounds like quite a puzzle. You mentioned that you can read data written through manual Puts from the shell - but not data from the Import. There must be something different about the data itself once it's in the table. Can you compare ...

Impact of using a higher hbase.hregion.memstore.flush.size=512MB

2015-05-27 Thread Gautam Borah
Hi all, The default value of hbase.hregion.memstore.flush.size is defined as 128 MB. Could anyone kindly explain what the impact would be if we increase this to a higher value such as 512 MB or 800 MB or more? We have a very write-heavy cluster. Also we run periodic endpoint coprocessor based jobs ...
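
For reference, the setting lives in hbase-site.xml; a sketch with the 512 MB value under discussion:

    <!-- hbase-site.xml: per-region memstore flush threshold (default 128 MB).
         A larger value means fewer but bigger flushes and more heap held in
         memstores, so the regionserver heap must be sized accordingly. -->
    <property>
      <name>hbase.hregion.memstore.flush.size</name>
      <value>536870912</value> <!-- 512 MB -->
    </property>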

Re: Impact of using a higher hbase.hregion.memstore.flush.size=512MB

2015-05-27 Thread Esteban Gutierrez
Gautam, Yes, you can increase the size of the memstore to values larger than 128 MB, but usually you do that by increasing hbase.hregion.memstore.block.multiplier only. Depending on the version of HBase you are running many things can happen, e.g. multiple memstores can be flushed at once and/or the ...
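
The multiplier sets the point at which writes to a region stall: updates block once the region's memstore reaches flush.size times the multiplier. A sketch of raising it instead of the flush size (value illustrative; the 0.98-era default is 2):

    <!-- hbase-site.xml: block updates to a region only when its memstore
         reaches flush.size * multiplier, giving write bursts more headroom
         without changing the flush threshold itself. -->
    <property>
      <name>hbase.hregion.memstore.block.multiplier</name>
      <value>4</value>
    </property>

Either way, total memstore usage is still capped regionserver-wide by hbase.regionserver.global.memstore.upperLimit (0.4 of heap by default), so per-region headroom stops helping once that global limit is hit.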

Re: readAtOffset error when reading from HFiles

2015-05-27 Thread Nick Dimiduk
s3:// and s3n:// are different things. Given the structure of your S3 key, I think you should be using s3n. For reference, http://wiki.apache.org/hadoop/AmazonS3
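
In that era of Hadoop, s3:// was the block filesystem (a proprietary block format laid over S3) while s3n:// read and wrote plain S3 objects. Credentials for the native scheme go in core-site.xml, roughly as follows (keys hypothetical):

    <!-- core-site.xml: credentials for the s3n (native) filesystem; paths
         then take the form s3n://mybucket/path/to/hfile -->
    <property>
      <name>fs.s3n.awsAccessKeyId</name>
      <value>YOUR_ACCESS_KEY</value>
    </property>
    <property>
      <name>fs.s3n.awsSecretAccessKey</name>
      <value>YOUR_SECRET_KEY</value>
    </property>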

RE: How to set Timeout for get/scan operations without impacting others

2015-05-27 Thread Fang, Mike
Thanks, Ted. From: Ted Yu [mailto:yuzhih...@gmail.com] Sent: Thursday, May 28, 2015 6:12 AM To: Fang, Mike Cc: user@hbase.apache.org; Dai, Kevin; Huang, Jianshi Subject: Re: How to set Timeout for get/scan operations without impacting others Mike: Please take a look at HBASE-13783 FYI On Mon, ...

Re: Impact of using a higher hbase.hregion.memstore.flush.size=512MB

2015-05-27 Thread Gautam Borah
Hi Esteban, Thanks for your response. hbase.rs.cacheblocksonwrite would be very useful for us. We have set hbase.regionserver.maxlogs appropriately to avoid forced flushes across memstores. We have also set hbase.regionserver.optionalcacheflushinterval to 0 to disable periodic flushing, as we do not write anything ...
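
Put together, the settings described above would look roughly like this in hbase-site.xml (values illustrative):

    <property>
      <name>hbase.rs.cacheblocksonwrite</name>
      <value>true</value> <!-- populate the block cache as blocks are written -->
    </property>
    <property>
      <name>hbase.regionserver.maxlogs</name>
      <value>64</value> <!-- high enough that WAL count never forces early flushes -->
    </property>
    <property>
      <name>hbase.regionserver.optionalcacheflushinterval</name>
      <value>0</value> <!-- 0 disables time-based periodic flushing -->
    </property>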

Re: How to set Timeout for get/scan operations without impacting others

2015-05-27 Thread Ted Yu
Mike: Please take a look at HBASE-13783. FYI. On Mon, May 18, 2015 at 6:44 PM, Fang, Mike chuf...@paypal.com wrote: Hi Ted, Thanks. HBase version is: HBase 0.98.0.2.1.2.0-402-hadoop2 Data block encoding: DATA_BLOCK_ENCODING = 'DIFF' I tried to run the hfile tool to scan, and it ...
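
HBASE-13783 tracks a proper per-operation setting; on 0.98 a common workaround is a second connection whose Configuration carries tighter timeouts, so they apply only to operations issued through it while everything else keeps the defaults. A minimal sketch (timeout values and table name hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HConnection;
    import org.apache.hadoop.hbase.client.HConnectionManager;
    import org.apache.hadoop.hbase.client.HTableInterface;

    public class TimeoutIsolation {
        public static void main(String[] args) throws Exception {
            // A separate Configuration, and therefore a separate connection,
            // so the tighter timeouts apply only to operations issued here.
            Configuration strict = HBaseConfiguration.create();
            strict.setInt("hbase.rpc.timeout", 5000);                    // per-RPC cap
            strict.setInt("hbase.client.operation.timeout", 10000);      // whole get/put
            strict.setInt("hbase.client.scanner.timeout.period", 10000); // per scanner next()
            HConnection conn = HConnectionManager.createConnection(strict);
            try {
                HTableInterface table = conn.getTable("mytable"); // hypothetical
                // ... gets/scans issued through this table fail fast;
                // connections built from the default Configuration are unaffected ...
                table.close();
            } finally {
                conn.close();
            }
        }
    }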