[jira] [Created] (HBASE-7924) thrift interface is inconsistently implemented on timestamp/range filtering

2013-02-25 Thread Guido Serra aka Zeph (JIRA)
Guido Serra aka Zeph created HBASE-7924:
---

 Summary: thrift interface is inconsistently implemented on 
timestamp/range filtering
 Key: HBASE-7924
 URL: https://issues.apache.org/jira/browse/HBASE-7924
 Project: HBase
  Issue Type: Bug
  Components: Thrift
Affects Versions: 0.94.5, 0.92.0
Reporter: Guido Serra aka Zeph


According to the documentation and the .thrift description file, 
getRowsWithColumnsTs() and the Scan object are exposed only as *exact* timestamp 
matchers; no time range functionality is (supposedly) exposed

In practice, the Scan object behaves as documented, 
but getRowsWithColumnsTs() underneath has time range behaviour

see: HBASE-5694, HBASE-7907

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-7924) thrift interface is inconsistently implemented on timestamp/range scan

2013-02-25 Thread Guido Serra aka Zeph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guido Serra aka Zeph updated HBASE-7924:


Summary: thrift interface is inconsistently implemented on timestamp/range 
scan  (was: thrift interface is inconsistently implemented on timestamp/range 
filtering)



[jira] [Updated] (HBASE-7924) thrift interface is inconsistently implemented on timestamp/range scan

2013-02-25 Thread Guido Serra aka Zeph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guido Serra aka Zeph updated HBASE-7924:


Description: 
According to the documentation and the .thrift description file, 
getRowsWithColumnsTs() and the Scan object are exposed only as *exact* timestamp 
matchers; no time range functionality is (supposedly) exposed - see: HBASE-7907

In practice, the Scan object behaves as documented, 
but getRowsWithColumnsTs() underneath has time range behaviour

{code}
  if (tScan.isSetTimestamp()) {
    scan.setTimeRange(Long.MIN_VALUE, tScan.getTimestamp());
  }
{code}

see: HBASE-5694



[jira] [Updated] (HBASE-7924) thrift interface is inconsistently implemented on timestamp/range scan

2013-02-25 Thread Guido Serra aka Zeph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guido Serra aka Zeph updated HBASE-7924:


Description: 
According to the documentation and the .thrift description file, 
getRowsWithColumnsTs() and the Scan object are exposed only as *exact* timestamp 
matchers; no time range functionality is (supposedly) exposed

In practice, the Scan object behaves as documented, 
but getRowsWithColumnsTs() underneath has time range behaviour

{code}
  if (tScan.isSetTimestamp()) {
    scan.setTimeRange(Long.MIN_VALUE, tScan.getTimestamp());
  }
{code}

see: HBASE-5694, HBASE-7907



[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2013-02-25 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13585768#comment-13585768
 ] 

Guido Serra aka Zeph commented on HBASE-5694:
-

k, [~ted_yu] I opened a bug report HBASE-7924 ... fix will follow 

 getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly
 ---

 Key: HBASE-5694
 URL: https://issues.apache.org/jira/browse/HBASE-5694
 Project: HBase
  Issue Type: Bug
  Components: Thrift
Affects Versions: 0.92.1
Reporter: Wouter Bolsterlee
 Fix For: 0.94.0

 Attachments: HBASE-5694.patch, HBASE-5694-trunk-20120402.patch, 
 setTimestamp.patch


 The getRowsWithColumnsTs() method in the Thrift interface only applies the 
 timestamp if columns are explicitly specified. However, this method also 
 allows for columns to be unspecified (this is even used internally to 
 implement e.g. getRows()). The cause of the bug is a minor scoping issue: the 
 time range is set inside a wrong if statement.
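
The scoping issue described above can be sketched without any HBase dependency. This is only an illustration: `FakeGet`, `buildBuggy` and `buildFixed` are hypothetical names, not ThriftServer code, and the sketch says nothing about whether a range or an exact match is the right semantics.

```java
import java.util.ArrayList;
import java.util.List;

final class ScopingSketch {
    /** Hypothetical stand-in for a Get: remembers columns and an upper time bound. */
    static final class FakeGet {
        final List<String> columns = new ArrayList<>();
        Long upperBound; // null means "no time restriction applied"
    }

    /** Buggy shape: the time bound is only set when columns are given. */
    static FakeGet buildBuggy(List<String> columns, long timestamp) {
        FakeGet get = new FakeGet();
        if (columns != null) {
            get.columns.addAll(columns);
            get.upperBound = timestamp; // wrong scope: skipped when columns == null
        }
        return get;
    }

    /** Fixed shape: the time bound is applied whether or not columns are given. */
    static FakeGet buildFixed(List<String> columns, long timestamp) {
        FakeGet get = new FakeGet();
        if (columns != null) {
            get.columns.addAll(columns);
        }
        get.upperBound = timestamp;
        return get;
    }

    public static void main(String[] args) {
        System.out.println(buildBuggy(null, 42L).upperBound); // prints "null"
        System.out.println(buildFixed(null, 42L).upperBound); // prints "42"
    }
}
```

With no columns, the buggy variant silently drops the timestamp, which matches the behaviour reported in the description.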



[jira] [Commented] (HBASE-5154) Can't put small timestamp after delete the column

2013-02-22 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584254#comment-13584254
 ] 

Guido Serra aka Zeph commented on HBASE-5154:
-

seems that someone is trying to fix it HBASE-5241 

 Can't put small timestamp after delete the column
 -

 Key: HBASE-5154
 URL: https://issues.apache.org/jira/browse/HBASE-5154
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.3
 Environment: OS: Linux 2.6.32-33-server #70-Ubuntu SMP
 JRE: Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
 Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
 Hadoop: Version: 0.20-append-r1056497, r1056491
 Hbase run on 4 HRegion + 1 HMaster cluster.
Reporter: robi
Priority: Critical

 1. Call put to insert some value in column 'fm:a', like:
 Put.add('fm', 'a', 1000, 'abc'), here timestamp = 1000.
 2. Delete the column 'fm:a'.
 3. Try to do #1 again. (It doesn't work, but a put that uses a 
 timestamp > 1000 can be inserted.)
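
A hedged explanation of the steps above, as far as I understand HBase deletes: a column delete writes a marker stamped with the current time, and until a major compaction removes it, every cell at or below that timestamp stays hidden, including cells written afterwards. The toy model below only illustrates that rule; `TombstoneSketch` and its methods are hypothetical names, not HBase API, and 5000 merely stands in for "now" at delete time.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one column: versions keyed by timestamp plus one delete marker. */
final class TombstoneSketch {
    private final TreeMap<Long, String> versions = new TreeMap<>();
    private long deleteMarker = Long.MIN_VALUE; // no delete issued yet

    void put(long ts, String value) {
        versions.put(ts, value);
    }

    /** Column delete: hides every version with timestamp <= markerTs. */
    void deleteColumn(long markerTs) {
        deleteMarker = Math.max(deleteMarker, markerTs);
    }

    /** Newest visible value, or null when every version is masked. */
    String get() {
        Map.Entry<Long, String> newest = versions.lastEntry();
        if (newest == null || newest.getKey() <= deleteMarker) {
            return null;
        }
        return newest.getValue();
    }

    public static void main(String[] args) {
        TombstoneSketch col = new TombstoneSketch();
        col.put(1000L, "abc");         // step 1
        col.deleteColumn(5000L);       // step 2
        col.put(1000L, "abc");         // step 3: same small timestamp again
        System.out.println(col.get()); // prints "null": still masked
        col.put(6000L, "abc");         // timestamp above the marker
        System.out.println(col.get()); // prints "abc"
    }
}
```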



[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

2013-02-22 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584331#comment-13584331
 ] 

Guido Serra aka Zeph commented on HBASE-4155:
-

similar issue...
{code}
hbase(main):007:0> scan 'AAA_customer', {TIMERANGE => [0, 1360032970]}
ROW  COLUMN+CELL
0 row(s) in 1.5590 seconds

hbase(main):008:0> scan 'AAA_customer'
ROW  COLUMN+CELL
 1   column=mysql:birthday, timestamp=1360292144, value=1999-01-01
{code}

 the problem in hbase thrift client when scan/get rows by timestamp
 --

 Key: HBASE-4155
 URL: https://issues.apache.org/jira/browse/HBASE-4155
 Project: HBase
  Issue Type: Bug
  Components: Thrift
Affects Versions: 0.90.0
Reporter: zezhou
 Attachments: 4155.txt, patch.txt, patch.txt.svn

   Original Estimate: 1m
  Remaining Estimate: 1m

 I want to scan rows by a specified timestamp. I use the following hbase shell 
 command:
 scan 'testcrawl', {TIMESTAMP => 1312268202071}
 ROW        COLUMN+CELL
  put1.com  column=crawl:data, timestamp=1312268202071, value=<html>put1</html>
  put1.com  column=crawl:type, timestamp=1312268202071, value=html
  put1.com  column=links:outlinks, timestamp=1312268202071, value=www.163.com;www.sina.com
 As expected, I get the rows whose timestamp is 1312268202071.
 But when I use the Thrift client to do the same thing, the returned data is the 
 rows whose timestamps are before the specified timestamp, which differs from the 
 hbase shell. These are the timestamps of the returned data:
 131217917
 1312268202059
 I looked up the source in 
 hbase/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java; it uses the 
 following code to set the time parameter:
 scan.setTimeRange(Long.MIN_VALUE, timestamp);
 This causes the Thrift client to return rows before the specified timestamp, not 
 the rows with the specified timestamp.
 But the HBase client and the Avro client use the following code to set the time 
 parameter:
 scan.setTimeStamp(timestamp);
 This returns the rows with the specified timestamp.
 Is this a feature or a bug in the Thrift client?
 If this is a feature, which method in the Thrift client gets the rows by a 
 specified timestamp?
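
The difference reported above can be reproduced with plain Java, assuming (as the HBase documentation describes) that a time range is half-open, [min, max), so the upper bound itself is excluded. `TimestampFilterSketch` is a hypothetical helper, not part of HBase; the two timestamps are the complete ones quoted in the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TimestampFilterSketch {
    /** Half-open range [min, max): models setTimeRange(min, max). */
    static List<Long> inRange(List<Long> cells, long min, long max) {
        List<Long> out = new ArrayList<>();
        for (long ts : cells) {
            if (ts >= min && ts < max) {
                out.add(ts);
            }
        }
        return out;
    }

    /** Exact match: models an exact-timestamp lookup. */
    static List<Long> exact(List<Long> cells, long wanted) {
        List<Long> out = new ArrayList<>();
        for (long ts : cells) {
            if (ts == wanted) {
                out.add(ts);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> cells = Arrays.asList(1312268202059L, 1312268202071L);
        long asked = 1312268202071L;
        System.out.println(inRange(cells, Long.MIN_VALUE, asked)); // [1312268202059]
        System.out.println(exact(cells, asked));                   // [1312268202071]
    }
}
```

The range call returns only the older cell and never the one at the requested timestamp, which is the mismatch between the Thrift client and the shell described above.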



[jira] [Commented] (HBASE-4155) the problem in hbase thrift client when scan/get rows by timestamp

2013-02-22 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584334#comment-13584334
 ] 

Guido Serra aka Zeph commented on HBASE-4155:
-

while, this other works

{code}
hbase(main):009:0> scan 'AAA_customer', {TIMESTAMP => 1360032970}
ROW  COLUMN+CELL
0 row(s) in 1.3960 seconds

hbase(main):010:0> scan 'AAA_customer', {TIMESTAMP => 1360292144}
ROW  COLUMN+CELL
 1   column=mysql:birthday, timestamp=1360292144, 1999-01-01
{code}



[jira] [Created] (HBASE-7907) time range Scan to be made available via Thrift

2013-02-22 Thread Guido Serra aka Zeph (JIRA)
Guido Serra aka Zeph created HBASE-7907:
---

 Summary: time range Scan to be made available via Thrift
 Key: HBASE-7907
 URL: https://issues.apache.org/jira/browse/HBASE-7907
 Project: HBase
  Issue Type: New Feature
Reporter: Guido Serra aka Zeph


this is the mapping of the Scan Object in Thrift as of today at
 - 
http://svn.apache.org/viewvc/hbase/trunk/hbase-server/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift?view=markup
{code}
/**
 * A Scan object is used to specify scanner parameters when opening a scanner.
 */
struct TScan {
  1:optional Text startRow,
  2:optional Text stopRow,
  3:optional i64 timestamp,
  4:optional list<Text> columns,
  5:optional i32 caching,
  6:optional Text filterString
}
{code}

this is the Scan Object
 - http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html

which has: 
bq. To only retrieve columns within a specific range of version timestamps, 
execute setTimeRange.
and
bq. To only retrieve columns with a specific timestamp, execute setTimestamp.

the second functionality/method is reachable via Thrift; the first one, 
setTimeRange(), is not (or at least not for me)
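
One way the requested feature could look, purely as a sketch: if TScan grew two optional fields (called `minStamp` and `maxStamp` here, which are hypothetical and not in Hbase.thrift), the server could resolve them together with the existing `timestamp` field into a single half-open window. The precedence rule and the default window of [0, Long.MAX_VALUE) are my assumptions.

```java
final class ScanTimeSketch {
    /** Resolved half-open window [min, max). */
    static final class Window {
        final long min;
        final long max;

        Window(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    /**
     * Hypothetical resolution: an explicit range wins, a lone timestamp means
     * exact match, and nothing set means all versions. Null means "field unset".
     */
    static Window resolve(Long timestamp, Long minStamp, Long maxStamp) {
        if (minStamp != null || maxStamp != null) {
            long min = (minStamp != null) ? minStamp : 0L;
            long max = (maxStamp != null) ? maxStamp : Long.MAX_VALUE;
            if (min > max) {
                throw new IllegalArgumentException("minStamp > maxStamp");
            }
            return new Window(min, max);
        }
        if (timestamp != null) {
            if (timestamp == Long.MAX_VALUE) {
                throw new IllegalArgumentException("timestamp too large for [ts, ts + 1)");
            }
            return new Window(timestamp, timestamp + 1); // exact match as [ts, ts + 1)
        }
        return new Window(0L, Long.MAX_VALUE);
    }

    public static void main(String[] args) {
        Window exact = resolve(1360292144L, null, null);
        System.out.println(exact.min + " " + exact.max); // prints "1360292144 1360292145"
        Window range = resolve(null, 0L, 1360032970L);
        System.out.println(range.min + " " + range.max); // prints "0 1360032970"
    }
}
```

Keeping `timestamp` as an exact match would preserve the documented meaning of the existing field while making ranges an explicit, separate request.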



[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2013-02-22 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584370#comment-13584370
 ] 

Guido Serra aka Zeph commented on HBASE-5694:
-

Confirmed on version 0.92.1-cdh4.1.2. Even worse: given a timestamp but no 
columns, it behaves like a range filter from 0 (epoch) to timestamp - 1 
(basically an "until", with the timestamp itself excluded)



[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2013-02-22 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584412#comment-13584412
 ] 

Guido Serra aka Zeph commented on HBASE-5694:
-

[~uws] if this is actually the patch (taken from above)
{code}
--- ThriftServer.java.orig  2012-04-01 23:41:16.881172406 +0200
+++ ThriftServer.java   2012-04-01 23:41:30.177238337 +0200
@@ -477,8 +477,8 @@
 get.addColumn(famAndQf[0], famAndQf[1]);
   }
 }
-get.setTimeRange(Long.MIN_VALUE, timestamp);
   }
+  get.setTimeRange(Long.MIN_VALUE, timestamp);
   gets.add(get);
 }
 Result[] result = table.get(gets);
{code}

then it produces the wrong behavior that I'm seeing, as it is inconsistent with 
scannerOpenWithScan

we should not use setTimeRange but setTimestamp... as the signature in 
Thrift states:
{code}
 * Get the specified columns for the specified table and rows at the specified
 * timestamp. Returns an empty list if no rows exist.
{code}

and not a range scan from Long.MIN_VALUE to timestamp as implemented above



[jira] [Updated] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2013-02-22 Thread Guido Serra aka Zeph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guido Serra aka Zeph updated HBASE-5694:


Attachment: setTimestamp.patch

in my opinion the correct patch is setTimestamp.patch, which I computed 
against origin/0.92.0rc4 from the GitHub repository

{code}
index 231a564..4c46a4f 100644
--- a/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java
+++ b/src/main/java/org/apache/hadoop/hbase/thrift/ThriftServer.java
@@ -413,7 +413,7 @@ public class ThriftServer {
 HTable table = getTable(tableName);
 if (columns == null) {
   Get get = new Get(getBytes(row));
-  get.setTimeRange(Long.MIN_VALUE, timestamp);
+  get.setTimestamp(timestamp);
   Result result = table.get(get);
   return ThriftUtilities.rowResultFromHBase(result);
 }
@@ -426,7 +426,7 @@ public class ThriftServer {
   get.addColumn(famAndQf[0], famAndQf[1]);
   }
 }
-get.setTimeRange(Long.MIN_VALUE, timestamp);
+get.setTimestamp(timestamp);
 Result result = table.get(get);
 return ThriftUtilities.rowResultFromHBase(result);
   } catch (IOException e) {
{code}



[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2013-02-22 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584468#comment-13584468
 ] 

Guido Serra aka Zeph commented on HBASE-5694:
-

argh... leave it... all of this is just WRONG...
{code}
  if (tScan.isSetTimestamp()) {
    scan.setTimeRange(Long.MIN_VALUE, tScan.getTimestamp());
  }
{code}

instead of exposing the setTimeRange on the Thrift interface someone decided to 
hide it this way



[jira] [Commented] (HBASE-5694) getRowsWithColumnsTs() in Thrift service handles timestamps incorrectly

2013-02-22 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13584472#comment-13584472
 ] 

Guido Serra aka Zeph commented on HBASE-5694:
-

[~yuzhih...@gmail.com] I will
(sorry for this thread, I'd better go home and enjoy the weekend)



[jira] [Commented] (HBASE-4769) Abort RegionServer Immediately on OOME

2013-02-20 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582112#comment-13582112
 ] 

Guido Serra aka Zeph commented on HBASE-4769:
-

guys... this is so stupid... I lost the whole morning because HBase's 
RegionServer was dying with no logs, no nothing... how am I supposed to debug 
the issue if you do not even generate a core dump? or a log message? ... argh

 Abort RegionServer Immediately on OOME
 --

 Key: HBASE-4769
 URL: https://issues.apache.org/jira/browse/HBASE-4769
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.0
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
 Fix For: 0.92.0, 0.94.0

 Attachments: HBASE-4769.patch, HBASE-4769.patch


 Currently, when the HRegionServer runs out of memory, it will call 
 master, which will cause more heap allocations and throw a second exception 
 that it's run out of memory again. The easiest & safest way to avoid this 
 OOME storm is to abort the RegionServer immediately when it hits the memory 
 boundary.  Part of the 89-fb to trunk port.
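
The idea of aborting immediately can be sketched as below. This is not the HBASE-4769 patch; `OomeAbortSketch` and `abortIfOome` are hypothetical names, and the halt action is passed in so the check can be exercised without killing the JVM. In a real process the action would presumably be something like `Runtime.getRuntime().halt(1)`, which skips shutdown hooks.

```java
final class OomeAbortSketch {
    /**
     * Runs the halt action and returns true when t, or any of its causes,
     * is an OutOfMemoryError. It does nothing else first, so no reporting
     * or logging gets a chance to allocate more heap.
     */
    static boolean abortIfOome(Throwable t, Runnable halt) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof OutOfMemoryError) {
                halt.run();
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed production action: stop the JVM at once.
        Runnable halt = () -> Runtime.getRuntime().halt(1);
        // An ordinary exception does not trigger the halt.
        System.out.println(abortIfOome(new IllegalStateException("ok"), halt)); // prints "false"
    }
}
```

As the comment above complains, halting this way leaves nothing behind to debug with; a single preformatted line on stderr before the halt would be a cheap compromise, at the cost of a small allocation risk.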



[jira] [Created] (HBASE-7645) put without timestamp duplicates the record/row

2013-01-22 Thread Guido Serra aka Zeph (JIRA)
Guido Serra aka Zeph created HBASE-7645:
---

 Summary: put without timestamp duplicates the record/row
 Key: HBASE-7645
 URL: https://issues.apache.org/jira/browse/HBASE-7645
 Project: HBase
  Issue Type: Brainstorming
  Components: Client
Reporter: Guido Serra aka Zeph


if I call a couple of times SQOOP on the same dataset, outputting to HBase,
I will end up with duplicated data...

{code}
hbase(main):030:0> get 'dump_HKFAS.sales_order', '1', {COLUMN => 'mysql:created_at', VERSIONS => 4}
COLUMN            CELL
mysql:created_at  timestamp=1358853505756, value=2011-12-21 18:07:38.0
mysql:created_at  timestamp=1358790515451, value=2011-12-21 18:07:38.0
2 row(s) in 0.0040 seconds

today's sqoop run
hbase(main):031:0> Date.new(1358853505756).toString()
=> Tue Jan 22 11:18:25 UTC 2013

yesterday's sqoop run
hbase(main):032:0> Date.new(1358790515451).toString()
=> Mon Jan 21 17:48:35 UTC 2013
{code}

the fact that the Put.add() method writes the kv without checking if, apart of 
the timestamp, the value has not changed, is it by design? or a bug?

from: trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/Put.java
{code}
  /**
   * Add the specified column and value to this Put operation.
   * @param family family name
   * @param qualifier column qualifier
   * @param value column value
   * @return this
   */
  public Put add(byte [] family, byte [] qualifier, byte [] value) {
    return add(family, qualifier, this.ts, value);
  }

  /**
   * Add the specified column and value, with the specified timestamp as
   * its version to this Put operation.
   * @param family family name
   * @param qualifier column qualifier
   * @param ts version timestamp
   * @param value column value
   * @return this
   */
  public Put add(byte [] family, byte [] qualifier, long ts, byte [] value) {
    List<KeyValue> list = getKeyValueList(family);
    KeyValue kv = createPutKeyValue(family, qualifier, ts, value);
    list.add(kv);
    familyMap.put(kv.getFamily(), list);
    return this;
  }
{code}
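
As far as I understand the data model, this looks like design rather than a bug: a cell is identified by row, column and timestamp, so two puts with different timestamps are two versions regardless of the value. The toy model below illustrates that, and shows that reusing one explicit timestamp makes the re-import idempotent. `VersionSketch` is a hypothetical class, and the explicit timestamp 1000 is an arbitrary placeholder for something a client might derive from the source row.

```java
import java.util.TreeMap;

/** Toy model of a single row/column: timestamp -> value. */
final class VersionSketch {
    private final TreeMap<Long, String> versions = new TreeMap<>();

    /** The value is never compared; only the timestamp decides where it lands. */
    void put(long ts, String value) {
        versions.put(ts, value);
    }

    int versionCount() {
        return versions.size();
    }

    public static void main(String[] args) {
        String value = "2011-12-21 18:07:38.0";

        // Timestamps assigned at write time: each run adds a version.
        VersionSketch implicit = new VersionSketch();
        implicit.put(1358790515451L, value); // yesterday's sqoop run
        implicit.put(1358853505756L, value); // today's sqoop run
        System.out.println(implicit.versionCount()); // prints "2"

        // Same explicit timestamp on every run: the second put replaces the first.
        VersionSketch explicit = new VersionSketch();
        long sourceTs = 1000L; // hypothetical, e.g. derived from the source row
        explicit.put(sourceTs, value);
        explicit.put(sourceTs, value);
        System.out.println(explicit.versionCount()); // prints "1"
    }
}
```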



[jira] [Updated] (HBASE-7645) put without timestamp duplicates the record/row

2013-01-22 Thread Guido Serra aka Zeph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guido Serra aka Zeph updated HBASE-7645:


Description: 
if I call a couple of times SQOOP on the same dataset, outputting to HBase,
I will end up with duplicated data...

{code}
hbase(main):030:0> get 'dump_HKFAS.sales_order', '1', {COLUMN => 'mysql:created_at', VERSIONS => 4}
COLUMN            CELL
mysql:created_at  timestamp=1358853505756, value=2011-12-21 18:07:38.0
mysql:created_at  timestamp=1358790515451, value=2011-12-21 18:07:38.0
2 row(s) in 0.0040 seconds

today's sqoop run
hbase(main):031:0> Date.new(1358853505756).toString()
=> Tue Jan 22 11:18:25 UTC 2013

yesterday's sqoop run
hbase(main):032:0> Date.new(1358790515451).toString()
=> Mon Jan 21 17:48:35 UTC 2013
{code}

the fact that the Put.add() method writes the kv without checking if, apart of 
the timestamp, the value has not changed, is it by design? or a bug?

from: trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/Put.java
{code}

  public Put add(byte [] family, byte [] qualifier, byte [] value) {
    return add(family, qualifier, this.ts, value);
  }

  public Put add(byte [] family, byte [] qualifier, long ts, byte [] value) {
    List<KeyValue> list = getKeyValueList(family);
    KeyValue kv = createPutKeyValue(family, qualifier, ts, value);
    list.add(kv);
    familyMap.put(kv.getFamily(), list);
    return this;
  }
{code}


[jira] [Updated] (HBASE-7645) put without timestamp duplicates the record/row

2013-01-22 Thread Guido Serra aka Zeph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guido Serra aka Zeph updated HBASE-7645:


Description: 
if I run SQOOP a couple of times on the same dataset, outputting to HBase,
I end up with duplicated data...

{code}
hbase(main):030:0> get 'dump_HKFAS.sales_order', '1', {COLUMN => 'mysql:created_at', VERSIONS => 4}
COLUMN                CELL
 mysql:created_at     timestamp=1358853505756, value=2011-12-21 18:07:38.0
 mysql:created_at     timestamp=1358790515451, value=2011-12-21 18:07:38.0
2 row(s) in 0.0040 seconds

today's sqoop run
hbase(main):031:0> Date.new(1358853505756).toString()
=> "Tue Jan 22 11:18:25 UTC 2013"

yesterday's sqoop run
hbase(main):032:0> Date.new(1358790515451).toString()
=> "Mon Jan 21 17:48:35 UTC 2013"
{code}

the Put.add() method writes the kv without checking whether, apart from 
the timestamp, the value has actually changed: is that by design, or a bug?

I mean, what's the idea behind it? Is SQOOP (the client application) 
supposed to read the current value before issuing an add() call?

from: trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/client/Put.java
{code}

  public Put add(byte [] family, byte [] qualifier, byte [] value) {
    return add(family, qualifier, this.ts, value);
  }

  public Put add(byte [] family, byte [] qualifier, long ts, byte [] value) {
    List<KeyValue> list = getKeyValueList(family);
    KeyValue kv = createPutKeyValue(family, qualifier, ts, value);
    list.add(kv);
    familyMap.put(kv.getFamily(), list);
    return this;
  }
{code}




[jira] [Updated] (HBASE-7645) put without timestamp duplicates the record/row

2013-01-22 Thread Guido Serra aka Zeph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guido Serra aka Zeph updated HBASE-7645:


Priority: Trivial  (was: Major)

 put without timestamp duplicates the record/row
 ---

 Key: HBASE-7645
 URL: https://issues.apache.org/jira/browse/HBASE-7645
 Project: HBase
  Issue Type: Brainstorming
  Components: Client
Reporter: Guido Serra aka Zeph
Priority: Trivial




[jira] [Commented] (HBASE-7645) put without timestamp duplicates the record/row

2013-01-22 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559783#comment-13559783
 ] 

Guido Serra aka Zeph commented on HBASE-7645:
-

[~anoopsamjohn] uh, ok... so that is expected then. Thanks for clarifying :)

I'll handle it on the client side then
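
One way the client-side handling could look is to read the newest stored value first and skip the write when nothing has changed. The sketch below uses an in-memory map as a hypothetical stand-in for a table row; PutIfChangedSketch and putIfChanged are illustrative names, not HBase API. Against a real cluster the same idea would be a Get followed by a Put, which costs an extra round trip per cell and is not atomic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class PutIfChangedSketch {
    // column name -> (timestamp -> value); hypothetical stand-in for one table row
    private final Map<String, TreeMap<Long, String>> row = new HashMap<>();

    /** Writes only if the newest stored value differs; returns true if a version was added. */
    public boolean putIfChanged(String column, long ts, String value) {
        TreeMap<Long, String> versions = row.computeIfAbsent(column, c -> new TreeMap<>());
        if (!versions.isEmpty() && versions.lastEntry().getValue().equals(value)) {
            return false; // unchanged value: skip the write, no new version
        }
        versions.put(ts, value);
        return true;
    }

    public int versionCount(String column) {
        TreeMap<Long, String> versions = row.get(column);
        return versions == null ? 0 : versions.size();
    }

    public static void main(String[] args) {
        PutIfChangedSketch salesOrder = new PutIfChangedSketch();
        String col = "mysql:created_at";
        System.out.println(salesOrder.putIfChanged(col, 1358790515451L, "2011-12-21 18:07:38.0")); // true
        System.out.println(salesOrder.putIfChanged(col, 1358853505756L, "2011-12-21 18:07:38.0")); // false
        System.out.println(salesOrder.versionCount(col)); // 1
    }
}
```

With this guard, the second import run leaves a single version of the unchanged cell instead of two.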




[jira] [Resolved] (HBASE-7645) put without timestamp duplicates the record/row

2013-01-22 Thread Guido Serra aka Zeph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guido Serra aka Zeph resolved HBASE-7645.
-

Resolution: Not A Problem

