I have not tried the master branch yet; however, on Phoenix 4.13 this storage discrepancy is still present: the extra column=M:\x00\x00\x00\x00 cells appear in HBase when loading with psql or sqlline. Does anyone understand the meaning of the column qualifier \x00\x00\x00\x00?

---------- Original Message ----------
From: "Lew Jackman" <lew9...@netzero.net>
To: user@phoenix.apache.org
Cc: user@phoenix.apache.org
Subject: Re: hbase cell storage different between bulk load and direct api
Date: Thu, 19 Apr 2018 13:59:16 GMT

The upsert statement gives the same result as psql, i.e. extra cells. I will try the master branch next. Thanks for the tip.

---------- Original Message ----------
From: Sergey Soldatov <sergeysolda...@gmail.com>
To: user@phoenix.apache.org
Subject: Re: hbase cell storage different between bulk load and direct api
Date: Thu, 19 Apr 2018 12:26:25 +0600

Hi Lew,

No. The first one looks incorrect. You may file a bug on that (I believe that the second case is correct, but you may also check by uploading data using regular upserts). Also, you may check whether the master branch has this issue.

Thanks,
Sergey

On Thu, Apr 19, 2018 at 10:19 AM, Lew Jackman <lew9...@netzero.net> wrote:

Under Phoenix 4.11 we are seeing some storage discrepancies in HBase between a load via psql and a bulk load.

To illustrate in a simple case, we have modified the example table from the load reference https://phoenix.apache.org/bulk_dataload.html

CREATE TABLE example (
    my_pk bigint not null,
    m.first_name varchar(50),
    m.last_name varchar(50)
    CONSTRAINT pk PRIMARY KEY (my_pk))
    IMMUTABLE_ROWS=true,
    IMMUTABLE_STORAGE_SCHEME = SINGLE_CELL_ARRAY_WITH_OFFSETS,
    COLUMN_ENCODED_BYTES = 1;

HBase rows when loading via psql:

 \x80\x00\x00\x00\x00\x0009     column=M:\x00\x00\x00\x00, timestamp=1524109827690, value=x
 \x80\x00\x00\x00\x00\x0009     column=M:1, timestamp=1524109827690, value=xJohnDoe\x00\x00\x00\x01\x00\x05\x00\x00\x00\x08\x00\x00\x00\x03\x02
 \x80\x00\x00\x00\x00\x01\x092  column=M:\x00\x00\x00\x00, timestamp=1524109827690, value=x
 \x80\x00\x00\x00\x00\x01\x092  column=M:1, timestamp=1524109827690, value=xMaryPoppins\x00\x00\x00\x01\x00\x05\x00\x00\x00\x0C\x00\x00\x00\x03\x02

HBase rows when loading via MapReduce using CsvBulkLoadTool:

 \x80\x00\x00\x00\x00\x0009     column=M:1, timestamp=1524110486638, value=xJohnDoe\x00\x00\x00\x01\x00\x05\x00\x00\x00\x08\x00\x00\x00\x03\x02
 \x80\x00\x00\x00\x00\x01\x092  column=M:1, timestamp=1524110486638, value=xMaryPoppins\x00\x00\x00\x01\x00\x05\x00\x00\x00\x0C\x00\x00\x00\x03\x02

So the table loaded via psql has 4 cells for the two rows, whereas the bulk-loaded table has only 2: it lacks the cells with column qualifier \x00\x00\x00\x00.

Is this behavior correct? Thanks much for any insight.
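
For anyone comparing the two dumps, it can help to confirm that both loads wrote the same row keys. Below is a minimal Python sketch that turns a row key as printed by the HBase shell back into the bigint value. It assumes the usual Phoenix BIGINT serialization (8 bytes, big-endian, sign bit flipped so keys sort correctly); the helper names `shell_to_bytes` and `decode_phoenix_bigint` are made up for this illustration and are not part of Phoenix or HBase.

```python
import re
import struct

def shell_to_bytes(printed: str) -> bytes:
    """Convert HBase-shell style output (\\xNN escapes mixed with
    printable ASCII) back into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(printed):
        m = re.match(r"\\x([0-9A-Fa-f]{2})", printed[i:])
        if m:
            out.append(int(m.group(1), 16))  # escaped byte
            i += 4
        else:
            out.append(ord(printed[i]))      # printable ASCII byte
            i += 1
    return bytes(out)

def decode_phoenix_bigint(key: bytes) -> int:
    """Decode an 8-byte key, assuming Phoenix's BIGINT encoding:
    big-endian two's complement with the sign bit flipped."""
    if len(key) != 8:
        raise ValueError("expected 8 bytes, got %d" % len(key))
    flipped = bytes([key[0] ^ 0x80]) + key[1:]
    return struct.unpack(">q", flipped)[0]

# Row keys copied from the shell output in this thread.
for printed in (r"\x80\x00\x00\x00\x00\x0009", r"\x80\x00\x00\x00\x00\x01\x092"):
    print(printed, "->", decode_phoenix_bigint(shell_to_bytes(printed)))
```

Under that assumption the two keys decode to 12345 and 67890, so the psql and bulk-load dumps refer to the same two rows and differ only in the extra M:\x00\x00\x00\x00 cells.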