[jira] [Commented] (PHOENIX-3295) Remove ReplaceArrayColumnWithKeyValueColumnExpressionVisitor
[ https://issues.apache.org/jira/browse/PHOENIX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560604#comment-15560604 ] James Taylor commented on PHOENIX-3295: --- Thanks, [~tdsilva]. Here's some feedback: - Is the column qualifier name of the single key value the family name? Does that create any issues for [~samarthjain]'s column encoding scheme and would it be easier to use a known/static column qualifier name like QueryConstants.EMPTY_COLUMN_BYTES? More of a question, as I think Samarth can probably change if need be. {code} public ArrayColumnExpression(PDatum column, byte[] cf, int index) { -super(column); +super(column, cf, cf); {code} - Is this TODO still relevant? {code} +// TODO: samarth think about the case when the encodedcolumn qualifier is null. Probably best to add a check here for encodedcolumnname to be true {code} - Do we still need this method in ArrayColumnExpression? I can't find a call to it in your patch. Would be good if this wasn't needed (and if it is need, then do we need to create a new instance on each call?). {code} public KeyValueColumnExpression getKeyValueExpression() { {code} - Can we use ImmutableBytesPtr as the key to this Map instead of ByteBuffer? {code} +Map>> familyToColListMap = Maps.newHashMap(); {code} - Would it be possible for PArrayDataType.positionAtArrayElement() to differentiate between the absence of a column value and a null value? We'll need something like that when support for DEFAULT comes in (PHOENIX-476). Maybe something like serializing a -1 value for the length versus a 0 and having PArrayDataType.positionAtArrayElement return true or false which would be the return value of evaluate? Another alternative would be to serialize the default value for immutable tables (but that'd be kind of wasteful space-wise if the value is large). For now, please add a TODO and a JIRA too so we don't forget. {code} @Override public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) { -return PArrayDataType.positionAtArrayElement(tuple, ptr, index, arrayExpression, PVarbinary.INSTANCE, null); -} + if (!super.evaluate(tuple, ptr)) { +return false; +} else if (ptr.getLength() == 0) { + return true; +} -@Override -public T accept(ExpressionVisitor visitor) { -return visitor.visit(this); +// Given a ptr to the entire array, set ptr to point to a particular element within that array +// given the type of an array element (see comments in PDataTypeForArray) + PArrayDataType.positionAtArrayElement(ptr, index, PVarbinary.INSTANCE, null); +return true; } {code} > Remove ReplaceArrayColumnWithKeyValueColumnExpressionVisitor > - > > Key: PHOENIX-3295 > URL: https://issues.apache.org/jira/browse/PHOENIX-3295 > Project: Phoenix > Issue Type: Improvement >Reporter: Thomas D'Silva >Assignee: Thomas D'Silva > Attachments: PHOENIX-3295-v2.patch, PHOENIX-3295.patch > > > ReplaceArrayColumnWithKeyValueColumnExpressionVisitor is only used in one > place in IndexUtil.generateIndexData because we use a ValueGetter to get the > value of the data table column using the original data table column > reference. This is also why ArrayColumnExpression needs to keep track of the > original key value column expression. > If we don't replace the array column expression with the original column > expression when it looks up the column by the qualifier it won't find it. > {code} > ValueGetter valueGetter = new ValueGetter() { > > @Override > public byte[] getRowKey() { > return dataMutation.getRow(); > } > > @Override > public ImmutableBytesWritable > getLatestValue(ColumnReference ref) { > // Always return null for our empty key value, as > this will cause the index > // maintainer to always treat this Put as a new > row. > if (isEmptyKeyValue(table, ref)) { > return null; > } > byte[] family = ref.getFamily(); > byte[] qualifier = ref.getQualifier(); > RowMutationState rowMutationState = > valuesMap.get(ptr); > PColumn column = null; > try { > column = > table.getColumnFamily(family).getPColumnForColumnQualifier(qualifier); >
[jira] [Updated] (PHOENIX-3295) Remove ReplaceArrayColumnWithKeyValueColumnExpressionVisitor
[ https://issues.apache.org/jira/browse/PHOENIX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas D'Silva updated PHOENIX-3295: Attachment: PHOENIX-3295-v2.patch [~jamestaylor] Attached a v2 patch with ArrayColumnExpression derived from KeyValueColumnExpression. [~samarthjain] This also fixes the immutable local index test failures. > Remove ReplaceArrayColumnWithKeyValueColumnExpressionVisitor > - > > Key: PHOENIX-3295 > URL: https://issues.apache.org/jira/browse/PHOENIX-3295 > Project: Phoenix > Issue Type: Improvement >Reporter: Thomas D'Silva >Assignee: Thomas D'Silva > Attachments: PHOENIX-3295-v2.patch, PHOENIX-3295.patch > > > ReplaceArrayColumnWithKeyValueColumnExpressionVisitor is only used in one > place in IndexUtil.generateIndexData because we use a ValueGetter to get the > value of the data table column using the original data table column > reference. This is also why ArrayColumnExpression needs to keep track of the > original key value column expression. > If we don't replace the array column expression with the original column > expression when it looks up the column by the qualifier it won't find it. > {code} > ValueGetter valueGetter = new ValueGetter() { > > @Override > public byte[] getRowKey() { > return dataMutation.getRow(); > } > > @Override > public ImmutableBytesWritable > getLatestValue(ColumnReference ref) { > // Always return null for our empty key value, as > this will cause the index > // maintainer to always treat this Put as a new > row. > if (isEmptyKeyValue(table, ref)) { > return null; > } > byte[] family = ref.getFamily(); > byte[] qualifier = ref.getQualifier(); > RowMutationState rowMutationState = > valuesMap.get(ptr); > PColumn column = null; > try { > column = > table.getColumnFamily(family).getPColumnForColumnQualifier(qualifier); > } catch (ColumnNotFoundException e) { > } catch (ColumnFamilyNotFoundException e) { > } > if (rowMutationState!=null && column!=null) { > byte[] value = > rowMutationState.getColumnValues().get(column); > ImmutableBytesPtr ptr = new > ImmutableBytesPtr(); > ptr.set(value==null ? > ByteUtil.EMPTY_BYTE_ARRAY : value); > > SchemaUtil.padData(table.getName().getString(), column, ptr); > return ptr; > } > return null; > } > > }; > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PHOENIX-476) Support declaration of DEFAULT in CREATE statement
[ https://issues.apache.org/jira/browse/PHOENIX-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560493#comment-15560493 ] James Taylor commented on PHOENIX-476: -- Looking really good, [~kliew]. Thanks for the updates! Here's some feedback: - Don't bother caching the default Expression. It's not worth the trouble as the parsing cost is minuscule. Your technique will break down if a column is dropped as the position changes in that case. This could lead to a default value of what was in a lower position to be used mistakenly. Just not caching it is fine. If it ever shows up as a perf issue, we can deal with it then. - Instead of forcing STORE_NULLS to be true for a table with default values (and I should have thought about this before), just tweak the code in PRowImpl.setValue() based on if the PColumn has a default value. Something like this: {code} @Override public void setValue(PColumn column, byte[] byteValue) { deleteRow = null; byte[] family = column.getFamilyName().getBytes(); byte[] qualifier = column.getName().getBytes(); PDataType type = column.getDataType(); // Check null, since some types have no byte representation for null boolean isNull = type.isNull(byteValue); if (isNull && !column.isNullable()) { throw new ConstraintViolationException(name.getString() + "." + column.getName().getString() + " may not be null"); } else if (isNull && PTableImpl.this.isImmutableRows() && column.getExpressionStr() == null) { // Need to preserve explicit nulls for immutable tables otherwise default value would be used removeIfPresent(setValues, family, qualifier); removeIfPresent(unsetValues, family, qualifier); } else if (isNull && !getStoreNulls() && && column.getExpressionStr() == null) { {code} - Add some test cases where the table is immutable (i.e. set IMMUTABLE_ROWS=true on DDL statement) - Add a test for the case of a default value being set on a PK column in the middle. For example: PK1, PK2 DEFAULT 'foo', PK3. - Also, have a test that uses different types for default values (i.e. VARCHAR, DECIMAL, TINYINT, and ARRAY would be interesting too). - Make sure you've imported all of our default settings in terms of keeping imports explicitly rather than wildcarding, import order, etc. For example, I noticed that this was incorrectly change by your IDE: {code} -import static org.junit.Assert.assertEquals; -import static org.junit.Assert.assertNotEquals; -import static org.junit.Assert.assertNotNull; -import static org.junit.Assert.assertTrue; -import static org.junit.Assert.fail; - -import java.sql.Connection; -import java.sql.DriverManager; -import java.sql.SQLException; -import java.sql.Statement; +import static org.junit.Assert.*; {code} See https://phoenix.apache.org/contributing.html#Code_conventions > Support declaration of DEFAULT in CREATE statement > -- > > Key: PHOENIX-476 > URL: https://issues.apache.org/jira/browse/PHOENIX-476 > Project: Phoenix > Issue Type: Task >Affects Versions: 3.0-Release >Reporter: James Taylor >Assignee: Kevin Liew > Labels: enhancement > Attachments: PHOENIX-476.2.patch, PHOENIX-476.3.patch, > PHOENIX-476.4.patch, PHOENIX-476.5.patch, PHOENIX-476.patch > > > Support the declaration of a default value in the CREATE TABLE/VIEW statement > like this: > CREATE TABLE Persons ( > Pid int NOT NULL PRIMARY KEY, > LastName varchar(255) NOT NULL, > FirstName varchar(255), > Address varchar(255), > City varchar(255) DEFAULT 'Sandnes' > ) > To implement this, we'd need to: > 1. add a new DEFAULT_VALUE key value column in SYSTEM.TABLE and pass through > the value when the table is created (in MetaDataClient). > 2. always set NULLABLE to ResultSetMetaData.columnNoNulls if a default value > is present, since the column will never be null. > 3. add a getDefaultValue() accessor in PColumn > 4. for a row key column, during UPSERT use the default value if no value was > specified for that column. This could be done in the PTableImpl.newKey method. > 5. for a key value column with a default value, we can get away without > incurring any storage cost. Although a little bit of extra effort than if we > persisted the default value on an UPSERT for key value columns, this approach > has the benefit of not incurring any storage cost for a default value. > * serialize any default value into KeyValueColumnExpression > * in the evaluate method of KeyValueColumnExpression, conditionally use > the default value if the column value is not present. If doing partial > evaluation, you should not yet return the default value, as we
[jira] [Comment Edited] (PHOENIX-476) Support declaration of DEFAULT in CREATE statement
[ https://issues.apache.org/jira/browse/PHOENIX-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559451#comment-15559451 ] Kevin Liew edited comment on PHOENIX-476 at 10/9/16 7:03 AM: - I attached a patch with additional test coverage and all test cases passing. Expressions are now cached outside of PColumn and ColumnDef. The driver sends the default value expression as part of the projector scan and the coprocessors reply with the result sets with default values substituted in (previously I was trying to get the driver to differentiate between nulls and default values at the client side to reduce the resultset size, but it was causing issues). Let me know your feedback. was (Author: kliew): I attached a patch with additional test coverage and all test cases passing. Expressions are now cached outside of PColumn and ColumnDef. The driver sends the default value expression as part of the projector scan and the coprocessors reply with the result sets with default values substituted in (previously I was trying to get the driver to differentiate between nulls and default values at the client side, which was causing issues). Let me know your feedback. > Support declaration of DEFAULT in CREATE statement > -- > > Key: PHOENIX-476 > URL: https://issues.apache.org/jira/browse/PHOENIX-476 > Project: Phoenix > Issue Type: Task >Affects Versions: 3.0-Release >Reporter: James Taylor >Assignee: Kevin Liew > Labels: enhancement > Attachments: PHOENIX-476.2.patch, PHOENIX-476.3.patch, > PHOENIX-476.4.patch, PHOENIX-476.5.patch, PHOENIX-476.patch > > > Support the declaration of a default value in the CREATE TABLE/VIEW statement > like this: > CREATE TABLE Persons ( > Pid int NOT NULL PRIMARY KEY, > LastName varchar(255) NOT NULL, > FirstName varchar(255), > Address varchar(255), > City varchar(255) DEFAULT 'Sandnes' > ) > To implement this, we'd need to: > 1. add a new DEFAULT_VALUE key value column in SYSTEM.TABLE and pass through > the value when the table is created (in MetaDataClient). > 2. always set NULLABLE to ResultSetMetaData.columnNoNulls if a default value > is present, since the column will never be null. > 3. add a getDefaultValue() accessor in PColumn > 4. for a row key column, during UPSERT use the default value if no value was > specified for that column. This could be done in the PTableImpl.newKey method. > 5. for a key value column with a default value, we can get away without > incurring any storage cost. Although a little bit of extra effort than if we > persisted the default value on an UPSERT for key value columns, this approach > has the benefit of not incurring any storage cost for a default value. > * serialize any default value into KeyValueColumnExpression > * in the evaluate method of KeyValueColumnExpression, conditionally use > the default value if the column value is not present. If doing partial > evaluation, you should not yet return the default value, as we may not have > encountered the the KeyValue for the column yet (since a filter evaluates > each time it sees each KeyValue, and there may be more than one KeyValue > referenced in the expression). Partial evaluation is determined by calling > Tuple.isImmutable(), where false means it is NOT doing partial evaluation, > while true means it is. > * modify EvaluateOnCompletionVisitor by adding a visitor method for > RowKeyColumnExpression and KeyValueColumnExpression to set > evaluateOnCompletion to true if they have a default value specified. This > will cause filter evaluation to execute one final time after all KeyValues > for a row have been seen, since it's at this time we know we should use the > default value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (PHOENIX-476) Support declaration of DEFAULT in CREATE statement
[ https://issues.apache.org/jira/browse/PHOENIX-476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Liew updated PHOENIX-476: --- Attachment: PHOENIX-476.5.patch I attached a patch with additional test coverage and all test cases passing. Expressions are now cached outside of PColumn and ColumnDef. The driver sends the default value expression as part of the projector scan and the coprocessors reply with the result sets with default values substituted in (previously I was trying to get the driver to differentiate between nulls and default values at the client side, which was causing issues). Let me know your feedback. > Support declaration of DEFAULT in CREATE statement > -- > > Key: PHOENIX-476 > URL: https://issues.apache.org/jira/browse/PHOENIX-476 > Project: Phoenix > Issue Type: Task >Affects Versions: 3.0-Release >Reporter: James Taylor >Assignee: Kevin Liew > Labels: enhancement > Attachments: PHOENIX-476.2.patch, PHOENIX-476.3.patch, > PHOENIX-476.4.patch, PHOENIX-476.5.patch, PHOENIX-476.patch > > > Support the declaration of a default value in the CREATE TABLE/VIEW statement > like this: > CREATE TABLE Persons ( > Pid int NOT NULL PRIMARY KEY, > LastName varchar(255) NOT NULL, > FirstName varchar(255), > Address varchar(255), > City varchar(255) DEFAULT 'Sandnes' > ) > To implement this, we'd need to: > 1. add a new DEFAULT_VALUE key value column in SYSTEM.TABLE and pass through > the value when the table is created (in MetaDataClient). > 2. always set NULLABLE to ResultSetMetaData.columnNoNulls if a default value > is present, since the column will never be null. > 3. add a getDefaultValue() accessor in PColumn > 4. for a row key column, during UPSERT use the default value if no value was > specified for that column. This could be done in the PTableImpl.newKey method. > 5. for a key value column with a default value, we can get away without > incurring any storage cost. Although a little bit of extra effort than if we > persisted the default value on an UPSERT for key value columns, this approach > has the benefit of not incurring any storage cost for a default value. > * serialize any default value into KeyValueColumnExpression > * in the evaluate method of KeyValueColumnExpression, conditionally use > the default value if the column value is not present. If doing partial > evaluation, you should not yet return the default value, as we may not have > encountered the the KeyValue for the column yet (since a filter evaluates > each time it sees each KeyValue, and there may be more than one KeyValue > referenced in the expression). Partial evaluation is determined by calling > Tuple.isImmutable(), where false means it is NOT doing partial evaluation, > while true means it is. > * modify EvaluateOnCompletionVisitor by adding a visitor method for > RowKeyColumnExpression and KeyValueColumnExpression to set > evaluateOnCompletion to true if they have a default value specified. This > will cause filter evaluation to execute one final time after all KeyValues > for a row have been seen, since it's at this time we know we should use the > default value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)