Yes, AVG(TO_NUMBER(s_price)) would work fine. FWIW, even with the BigDecimal not in the row key, it would be a problem if a secondary index were ever added on it.
One other option, if TO_NUMBER doesn't meet your perf requirements, would be to store it as a VARBINARY and create a couple of new built-in functions to convert it to/from a BigDecimal. See http://phoenix-hbase.blogspot.com/2013/04/how-to-add-your-own-built-in-function.html?m=1

Thanks,
James

On Friday, February 27, 2015, Matthew Johnson <[email protected]> wrote:

> Thanks James,
>
> We are not using BigDecimal for the primary key at all. The main use case
> would be something like *(rowid varchar primary key, producttype varchar,
> price decimal)*, and then we would want to use aggregate number functions
> like AVG, MAX, etc. So the vast majority would be full table scans anyway.
>
> If I had a separate column where the price was stored as a string, e.g.
> *(rowid varchar primary key, producttype varchar, price decimal, s_price
> varchar)*, will Phoenix allow me to use aggregate number functions along
> with the TO_NUMBER function? Something like AVG(TO_NUMBER(s_price))?
>
> Thanks!
> Matt
>
> -----Original Message-----
> From: James Taylor [mailto:[email protected]]
> Sent: 26 February 2015 18:02
> To: user
> Subject: Re: BigDecimal casting issue?
>
> The problem with the old way that HBase represents BigDecimal is that the
> serialized bytes don't sort the same way that the BigDecimal does (FWIW,
> but orthogonal to this discussion and not something that will help you with
> this particular situation, a new type system was introduced in HBase to fix
> this). If it's a matter of writing a conversion script that runs overnight,
> I'd encourage you to do that.
>
> If that's not possible, there are a few alternatives, but performance
> won't be as good.
>
> Phoenix requires that the serialized byte sort order matches the sort
> order of the objects they represent, so that range scans and order bys work
> correctly.
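James's point about sort order can be sketched in plain Java. The snippet below uses a "naive" encoding (a 4-byte scale followed by the unscaled value's two's-complement bytes — similar in spirit to the HBase client's Bytes.toBytes(BigDecimal), but written here without the HBase dependency, so the exact layout is an assumption for illustration) and shows that unsigned lexicographic comparison of the serialized bytes disagrees with numeric comparison:

```java
import java.math.BigDecimal;
import java.nio.ByteBuffer;

// Demonstrates why a naively serialized BigDecimal cannot be used as a
// Phoenix key: the bytes do not sort the way the numbers do.
public class NaiveDecimalBytes {

    // Naive layout (illustrative): 4-byte scale, then unscaled-value bytes.
    public static byte[] serialize(BigDecimal d) {
        byte[] unscaled = d.unscaledValue().toByteArray();
        return ByteBuffer.allocate(4 + unscaled.length)
                .putInt(d.scale())
                .put(unscaled)
                .array();
    }

    // Unsigned lexicographic comparison, as HBase applies to row keys.
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        BigDecimal big = new BigDecimal("102.1"); // scale 1, unscaled 1021 -> {0x03, 0xFD}
        BigDecimal small = new BigDecimal("2.5"); // scale 1, unscaled 25   -> {0x19}
        System.out.println(big.compareTo(small) > 0);                             // true: 102.1 > 2.5
        System.out.println(compareUnsigned(serialize(big), serialize(small)) < 0); // true: bytes sort backwards
    }
}
```

Because the byte order disagrees with numeric order, a row-key range scan over such bytes would return the wrong rows — which is exactly why Phoenix defines its own DECIMAL encoding.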
> For example, let's say you had the following schema:
>
> CREATE TABLE t (n DECIMAL PRIMARY KEY);
>
> and issued the following query:
>
> SELECT * FROM t WHERE n > 0.5 AND n < 0.9;
>
> Phoenix will compile this into a scan that has a start row of the
> serialized bytes of 0.5 and a stop row of the serialized bytes of 0.9.
> If the serialized byte sort order doesn't match the object sort order,
> you'd get incorrect results. We do support UNSIGNED_* types for byte,
> short, int, and long because the positive numbers meet this criterion.
> We can't have an UNSIGNED_DECIMAL because, even restricted to just
> positive BigDecimal values, this contract is not met.
>
> 1) You could declare your data type for BigDecimal as VARBINARY and do the
> byte[] -> BigDecimal conversion in your code. Note that the filtering shown
> above wouldn't work correctly, but an equality filter would be ok.
>
> CREATE TABLE t (n VARBINARY PRIMARY KEY);
> SELECT * FROM t WHERE n = 0.5;
>
> 2) You could stringify your BigDecimal and convert it on the fly using our
> TO_NUMBER built-in:
>
> CREATE TABLE t (n VARCHAR PRIMARY KEY);
> SELECT * FROM t WHERE TO_NUMBER(n) > 0.5 AND TO_NUMBER(n) < 0.9;
>
> Note that this query would be a full table scan.
>
> HTH,
> James
>
> On Thu, Feb 26, 2015 at 6:39 AM, Matthew Johnson <[email protected]> wrote:
> > Thanks guys,
> >
> > Unfortunately I cannot easily change the way the value is serialized
> > into HBase, because that value is used by multiple other projects that
> > don't use Phoenix. Over time I could potentially migrate all these
> > projects, but I was hoping for a slightly more immediate solution.
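An aside on option 2 above (not from the original thread): a TO_NUMBER predicate can't be turned into a row-key range because lexicographic VARCHAR order differs from numeric order, which is why the query degrades to a full table scan. A minimal illustration:

```java
import java.util.Arrays;

// Shows that stringified numbers sort lexicographically, not numerically,
// so a VARCHAR key cannot support numeric range scans.
public class StringifiedOrder {

    public static String[] lexSorted(String... prices) {
        String[] copy = prices.clone();
        Arrays.sort(copy); // VARCHAR (lexicographic) order
        return copy;
    }

    public static void main(String[] args) {
        // Numerically: 0.9 < 2.5 < 102.1; lexicographically "102.1" lands in the middle.
        System.out.println(Arrays.toString(lexSorted("102.1", "2.5", "0.9"))); // [0.9, 102.1, 2.5]
    }
}
```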
> > > I could write an overnight batch job that scans through the entire
> > > table, and for every value that is a BigDecimal I could duplicate it
> > > into a new column with the PDataType DECIMAL, but this seems kinda
> > > inefficient and would duplicate a lot of data.
> > >
> > > I had a look at the documentation (and tried a few combinations) but
> > > could not find an UNSIGNED_DECIMAL type for Phoenix.
> > >
> > > Is there any other way I can retrieve BigDecimal values with Phoenix?
> > > If I wanted to write my own wrapper for Phoenix that supports
> > > BigDecimal, would I have to patch just the client, or would I have to
> > > patch the server as well?
> > >
> > > Cheers,
> > > Matt
> > >
> > > From: anil gupta [mailto:[email protected]]
> > > Sent: 26 February 2015 13:44
> > > To: [email protected]
> > > Subject: Re: BigDecimal casting issue?
> > >
> > > Hi Matthew,
> > >
> > > If you want Phoenix to read a BigDecimal value out of HBase, you will
> > > need to use PDataType.DECIMAL.toBytes(BigDecimal) to serialize that
> > > value in HBase. IMO, serialization of BigDecimal via PDataType is
> > > better because it enables byte array comparison on BigDecimal values.
> > >
> > > Thanks,
> > > Anil
> > >
> > > On Thu, Feb 26, 2015 at 4:30 AM, Gabriel Reid <[email protected]> wrote:
> > >
> > > Hi Matt,
> > >
> > > Although the object representation of the Phoenix DECIMAL type is
> > > BigDecimal, the byte-level encoding is different from that of
> > > Bytes.toBytes(BigDecimal). The reason for this is to allow for
> > > ordering of these values based on comparison of binary values.
> > > Sorting the values with binary value comparison based on the return value of
> > > Bytes.toBytes(BigDecimal) will not result in the correct ordering of
> > > values.
> > >
> > > As you may have noticed, many data types in Phoenix have an UNSIGNED_*
> > > counterpart which uses the same underlying encoding as Bytes.toBytes,
> > > although these datatypes suffer from the same binary comparison issue
> > > as outlined above.
> > >
> > > - Gabriel
> > >
> > > On Thu, Feb 26, 2015 at 12:06 PM, Matthew Johnson <[email protected]> wrote:
> > >> Hi all,
> > >>
> > >> I am trying to map an HBase column where I store java.math.BigDecimal
> > >> values using:
> > >>
> > >> Bytes.toBytes(myBigDecimalValueInJava)
> > >>
> > >> My understanding from the “Data Types” page
> > >> (http://phoenix.apache.org/language/datatypes.html#decimal_type) is
> > >> that the DECIMAL type in Phoenix should map to this. However, when I
> > >> store:
> > >>
> > >> 102.1
> > >>
> > >> in HBase, Phoenix reads it back as:
> > >>
> > >> -1.02020201E+126
> > >>
> > >> Not sure whether I am using it wrong? I have tried creating the view
> > >> with data type DECIMAL and also DECIMAL(15,5), but both give the same
> > >> problem. Is anyone else able to successfully insert BigDecimal values
> > >> via the HBase client and retrieve them using Phoenix?
> > >>
> > >> FYI, when I retrieve the value using the HBase client:
> > >>
> > >> BigDecimal bd = Bytes.toBigDecimal(value);
> > >>
> > >> it correctly prints the value of 102.1.
> > >>
> > >> Any thoughts?
> > >>
> > >> Cheers,
> > >> Matt
> > >
> > > --
> > > Thanks & Regards,
> > > Anil Gupta
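To see why the HBase client round-trips the value while Phoenix reads garbage, here is a plain-Java sketch of a scale-plus-unscaled-bytes layout (the real methods are Bytes.toBytes(BigDecimal) and Bytes.toBigDecimal in org.apache.hadoop.hbase.util.Bytes; the exact layout below is an assumption for illustration, written without the HBase dependency). The encoding is lossless for round trips — the 102.1 comes back exactly — even though, as Gabriel notes, its bytes don't sort correctly and Phoenix's DECIMAL codec expects a different layout entirely:

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

// Sketch of a scale + unscaled-bytes BigDecimal codec: lossless on
// round trip, which is why the HBase client prints 102.1 back correctly.
public class DecimalRoundTrip {

    public static byte[] encode(BigDecimal d) {
        byte[] unscaled = d.unscaledValue().toByteArray();
        return ByteBuffer.allocate(4 + unscaled.length)
                .putInt(d.scale())
                .put(unscaled)
                .array();
    }

    public static BigDecimal decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int scale = buf.getInt();
        byte[] unscaled = new byte[buf.remaining()];
        buf.get(unscaled);
        return new BigDecimal(new BigInteger(unscaled), scale);
    }

    public static void main(String[] args) {
        // Decoding with the same layout recovers the value exactly;
        // decoding with a *different* layout (as Phoenix's DECIMAL codec
        // does) yields nonsense like -1.02020201E+126.
        System.out.println(decode(encode(new BigDecimal("102.1")))); // 102.1
    }
}
```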
