Option 1 (throwing an error) is bad. It violates "Pigs eat anything" (see http://pig.apache.org/philosophy.html).
Do we need to give users the ability to name this unknown column? Why not just label it "unknown" and be done?

Alan.

On Jun 6, 2012, at 2:24 PM, Prasanth J wrote:

> Hello everyone
>
> I would like to bring up this discussion about the ways of handling NULL
> values in dimensions specified for cubing. For example, if we have a
> dimension color with the following values
>
> red
> blue
> null
> green
>
> how do we differentiate whether the null value represents the rollup of all
> color values or an actual null value?
>
> SQL way:
> There are 2 ways in which SQL Server Analysis Services handles null values in
> dimensions
> 1) Throw an error when it encounters null values in dimension values
> 2) Ignore the error by adding the null values to UnknownMembers. By default
> UnknownMembers will be named "Unknown". The name for UnknownMembers can
> also be specified by the user.
>
> Do we need to handle both ways in Pig? I think the first way (throwing an error)
> is pretty straightforward.
> For the second way (ignoring the error), what is the best way to provide support
> for a user-specified name for UnknownMembers?
>
> Please share your thoughts about how we can handle this scenario for
> different datatypes in Pig.
>
> Thanks
> -- Prasanth
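
To make the ambiguity concrete, here is a rough sketch in Python (not Pig code) of the "label it unknown" approach. The names label_nulls, cube_counts, and UNKNOWN are hypothetical and exist only for illustration; nothing here reflects how Pig actually implements cubing. A None in a result key stands for a rolled-up dimension.

```python
from collections import Counter
from itertools import combinations

UNKNOWN = "unknown"  # hypothetical fixed label, not an existing Pig setting


def label_nulls(rows, label=UNKNOWN):
    """Replace None dimension values with a fixed label before cubing."""
    return [tuple(label if v is None else v for v in row) for row in rows]


def cube_counts(rows):
    """Count rows for every subset of dimensions.

    In a result key, None marks a dimension that was rolled up.
    """
    counts = Counter()
    for row in rows:
        n = len(row)
        for k in range(n + 1):
            for kept in combinations(range(n), k):
                key = tuple(row[i] if i in kept else None for i in range(n))
                counts[key] += 1
    return counts


colors = [("red",), ("blue",), (None,), ("green",)]

# Without labeling, the real null and the rollup share the key (None,).
ambiguous = cube_counts(colors)

# With labeling, the real null gets its own key and (None,) is only the rollup.
labeled = cube_counts(label_nulls(colors))
```

With the input nulls relabeled first, the key (None,) can only mean "all colors", and the actual null rows are counted under ("unknown",). A real value that happens to equal the label would still collide, which is one argument for letting users pick the name.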