@James, is this similar to https://issues.apache.org/jira/browse/PHOENIX-3112? @Mac, can you try if increasing hbase.client.scanner.max.result.size helps?
On Tue, Dec 6, 2016 at 10:53 PM, James Taylor <[email protected]> wrote: > Looks like a bug to me. If you can reproduce the issue outside of Python > phoenixdb, using the Query Server directly, please file a JIRA. If not, it > sounds like a Phython phoenixdb issue. > > Thanks, > James > > On Tue, Dec 6, 2016 at 8:58 AM, Mark Heppner <[email protected]> > wrote: > >> I encountered something interesting and I'm not sure if it's a bug in >> Phoenix itself, the query server, or just a side effect of using a large >> binary column. If I create a table like this (in sqlline): >> create table test1 ( >> id integer not null primary key, >> image varbinary, >> height unsigned_int, >> width unsigned_int >> ) >> >> And insert with the Python phoenixdb client (uses the query server): >> with phoenixdb.connect('http://localhost:8765', autocommit=True) as >> con: >> with con.cursor() as cur: >> cur.execute('upsert into test1 values(?, ?, ?, ?)', [1, img, >> 900, 900]) >> >> When I "select * from test1" using sqlline, the width column appears as >> null: >> +-----+--------------+---------+--------+ >> | ID | IMAGE | HEIGHT | WIDTH | >> +-----+--------------+---------+--------+ >> | 1 | [B@5a2bd7c8 | 900 | null | >> +-----+--------------+---------+--------+ >> >> However, using "select width from test1", it returns the value: >> +--------+ >> | WIDTH | >> +--------+ >> | 900 | >> +--------+ >> >> Filtering by null using "select * from test1 where width is null" gives 0 >> rows, so it's like the value is there but doesn't get recognized when using >> a "select *". >> >> The value does show up when the binary column is not included in the >> select (select id, height, width). If the binary column is present, it >> appears as null (select image, height, width). >> >> This behavior can be confirmed with phoenixdb too: >> > cur.execute('select width from test1') >> > cur.fetchall() >> [[900]] >> >> > cur.execute('select * from test1') >> > cur.fetchall() >> [[1, '\x00\x01 ...', 900, None]] >> >> I don't think this is an issue with phoenixdb, since I can see the >> protobuf message serializing both the height and width columns the same >> way. Is this an issue with Phoenix itself, or just something the query >> server did on upsert? >> >> Is this a side effect of using a large binary column (typical length is >> 6480000)? In sqlline, using this works: >> >> upsert into test1 values (2, '\x00\x01\x02', 900, 900); >> select * from test1; >> >> +-----+--------------+---------+--------+ >> | ID | IMAGE | HEIGHT | WIDTH | >> +-----+--------------+---------+--------+ >> | 1 | [B@1835d3ed | 900 | null | >> | 2 | [B@66213a0d | 900 | 900 | >> +-----+--------------+---------+--------+ >> >> Also worth mentioning, this never happens to the height column; it always >> has the correct value. It's only happening on width, the second of the >> unsigned_int columns. >> >> HBase 1.1.2 and Phoenix 4.7 from HDP 2.5. >> > >
