Under Phoenix 4.11 we are seeing a storage discrepancy in HBase between a load via psql and a bulk load of the same data.
To illustrate with a simple case, we have modified the example table from the bulk load reference, https://phoenix.apache.org/bulk_dataload.html

CREATE TABLE example (
    my_pk bigint not null,
    m.first_name varchar(50),
    m.last_name varchar(50)
    CONSTRAINT pk PRIMARY KEY (my_pk))
    IMMUTABLE_ROWS=true,
    IMMUTABLE_STORAGE_SCHEME = SINGLE_CELL_ARRAY_WITH_OFFSETS,
    COLUMN_ENCODED_BYTES = 1;

HBase rows when loading via psql:

 \x80\x00\x00\x00\x00\x0009      column=M:\x00\x00\x00\x00, timestamp=1524109827690, value=x
 \x80\x00\x00\x00\x00\x0009      column=M:1, timestamp=1524109827690, value=xJohnDoe\x00\x00\x00\x01\x00\x05\x00\x00\x00\x08\x00\x00\x00\x03\x02
 \x80\x00\x00\x00\x00\x01\x092   column=M:\x00\x00\x00\x00, timestamp=1524109827690, value=x
 \x80\x00\x00\x00\x00\x01\x092   column=M:1, timestamp=1524109827690, value=xMaryPoppins\x00\x00\x00\x01\x00\x05\x00\x00\x00\x0C\x00\x00\x00\x03\x02

HBase rows when loading via MapReduce using CsvBulkLoadTool:

 \x80\x00\x00\x00\x00\x0009      column=M:1, timestamp=1524110486638, value=xJohnDoe\x00\x00\x00\x01\x00\x05\x00\x00\x00\x08\x00\x00\x00\x03\x02
 \x80\x00\x00\x00\x00\x01\x092   column=M:1, timestamp=1524110486638, value=xMaryPoppins\x00\x00\x00\x01\x00\x05\x00\x00\x00\x0C\x00\x00\x00\x03\x02

So the table loaded via psql has 4 cells for the two rows, whereas the bulk-loaded table has only 2: it is missing the two cells with column qualifier \x00\x00\x00\x00 (value=x).

Is this behavior correct?

Thanks much for any insight.
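P.S. For anyone reading the scans, here is a minimal Python sketch that maps the row keys back to the my_pk values. It assumes Phoenix serializes a BIGINT key as 8 big-endian bytes with the sign bit flipped (my understanding of the encoding, not something stated in the load reference), and that the scan output uses the usual HBase shell escapes, where printable bytes such as "0", "9" and "2" appear as plain characters. The helper name decode_phoenix_bigint is made up for illustration.

```python
import struct


def decode_phoenix_bigint(key: bytes) -> int:
    """Decode an 8-byte row key, ASSUMING Phoenix BIGINT encoding:
    big-endian two's complement with the sign bit flipped."""
    if len(key) != 8:
        raise ValueError(f"expected 8 bytes, got {len(key)}")
    (raw,) = struct.unpack(">Q", key)  # read as unsigned 64-bit, big-endian
    raw ^= 1 << 63                     # flip the sign bit back
    if raw >= 1 << 63:                 # reinterpret as signed 64-bit
        raw -= 1 << 64
    return raw


# Row keys copied from the scan output, written as Python byte literals.
row_keys = [
    b"\x80\x00\x00\x00\x00\x0009",
    b"\x80\x00\x00\x00\x00\x01\x092",
]

for key in row_keys:
    print(key, "->", decode_phoenix_bigint(key))
```

Under that assumption the two keys come out as 12345 and 67890, so both loads wrote the same two rows and the only difference is the extra \x00\x00\x00\x00 cell per row.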
