[jira] [Commented] (PHOENIX-3295) Remove ReplaceArrayColumnWithKeyValueColumnExpressionVisitor

2016-10-09 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560604#comment-15560604
 ] 

James Taylor commented on PHOENIX-3295:
---

Thanks, [~tdsilva]. Here's some feedback:
- Is the column qualifier name of the single key value the family name? Does 
that create any issues for [~samarthjain]'s column encoding scheme and would it 
be easier to use a known/static column qualifier name like 
QueryConstants.EMPTY_COLUMN_BYTES? More of a question, as I think Samarth can 
probably change if need be.
{code}
 public ArrayColumnExpression(PDatum column, byte[] cf, int index) {
-super(column);
+super(column, cf, cf);
{code}
- Is this TODO still relevant?
{code}
+// TODO: samarth think about the case when the encodedcolumn qualifier 
is null. Probably best to add a check here for encodedcolumnname to be true
{code}
- Do we still need this method in ArrayColumnExpression? I can't find a call to 
it in your patch. Would be good if this wasn't needed (and if it is need, then 
do we need to create a new instance on each call?).
{code}
 public KeyValueColumnExpression getKeyValueExpression() {
{code}
- Can we use ImmutableBytesPtr as the key to this Map instead of ByteBuffer?
{code}
+Map>> 
familyToColListMap = Maps.newHashMap();
{code}
- Would it be possible for PArrayDataType.positionAtArrayElement() to 
differentiate between the absence of a column value and a null value? We'll 
need something like that when support for DEFAULT comes in (PHOENIX-476). Maybe 
something like serializing a -1 value for the length versus a 0 and having 
PArrayDataType.positionAtArrayElement return true or false which would be the 
return value of evaluate? Another alternative would be to serialize the default 
value for immutable tables (but that'd be kind of wasteful space-wise if the 
value is large). For now, please add a TODO and a JIRA too so we don't forget.
{code}
 @Override
 public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) {
-return PArrayDataType.positionAtArrayElement(tuple, ptr, index, 
arrayExpression, PVarbinary.INSTANCE, null);
-}
+   if (!super.evaluate(tuple, ptr)) {
+return false;
+} else if (ptr.getLength() == 0) { 
+   return true; 
+}
 
-@Override
-public  T accept(ExpressionVisitor visitor) {
-return visitor.visit(this);
+// Given a ptr to the entire array, set ptr to point to a particular 
element within that array
+// given the type of an array element (see comments in 
PDataTypeForArray)
+   PArrayDataType.positionAtArrayElement(ptr, index, PVarbinary.INSTANCE, 
null);
+return true;
 }
{code}

> Remove ReplaceArrayColumnWithKeyValueColumnExpressionVisitor 
> -
>
> Key: PHOENIX-3295
> URL: https://issues.apache.org/jira/browse/PHOENIX-3295
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Thomas D'Silva
>Assignee: Thomas D'Silva
> Attachments: PHOENIX-3295-v2.patch, PHOENIX-3295.patch
>
>
> ReplaceArrayColumnWithKeyValueColumnExpressionVisitor is only used in one 
> place in IndexUtil.generateIndexData because we use a ValueGetter to get the 
> value of the data table column using the original data table column 
> reference. This is also why ArrayColumnExpression needs to keep track of the 
> original key value column expression. 
> If we don't replace the array column expression with the original column 
> expression when it looks up the column by the qualifier it won't find it. 
> {code}
> ValueGetter valueGetter = new ValueGetter() {
>   
>   @Override
> public byte[] getRowKey() {
>   return dataMutation.getRow();
>   }
> 
> @Override
> public ImmutableBytesWritable 
> getLatestValue(ColumnReference ref) {
> // Always return null for our empty key value, as 
> this will cause the index
> // maintainer to always treat this Put as a new 
> row.
> if (isEmptyKeyValue(table, ref)) {
> return null;
> }
> byte[] family = ref.getFamily();
> byte[] qualifier = ref.getQualifier();
> RowMutationState rowMutationState = 
> valuesMap.get(ptr);
> PColumn column = null;
> try {
> column = 
> table.getColumnFamily(family).getPColumnForColumnQualifier(qualifier);
>  

[jira] [Updated] (PHOENIX-3295) Remove ReplaceArrayColumnWithKeyValueColumnExpressionVisitor

2016-10-09 Thread Thomas D'Silva (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas D'Silva updated PHOENIX-3295:

Attachment: PHOENIX-3295-v2.patch

[~jamestaylor]

Attached a v2 patch with ArrayColumnExpression derived from 
KeyValueColumnExpression. 

[~samarthjain]

This also fixes the immutable local index test failures.

> Remove ReplaceArrayColumnWithKeyValueColumnExpressionVisitor 
> -
>
> Key: PHOENIX-3295
> URL: https://issues.apache.org/jira/browse/PHOENIX-3295
> Project: Phoenix
>  Issue Type: Improvement
>Reporter: Thomas D'Silva
>Assignee: Thomas D'Silva
> Attachments: PHOENIX-3295-v2.patch, PHOENIX-3295.patch
>
>
> ReplaceArrayColumnWithKeyValueColumnExpressionVisitor is only used in one 
> place in IndexUtil.generateIndexData because we use a ValueGetter to get the 
> value of the data table column using the original data table column 
> reference. This is also why ArrayColumnExpression needs to keep track of the 
> original key value column expression. 
> If we don't replace the array column expression with the original column 
> expression when it looks up the column by the qualifier it won't find it. 
> {code}
> ValueGetter valueGetter = new ValueGetter() {
>   
>   @Override
> public byte[] getRowKey() {
>   return dataMutation.getRow();
>   }
> 
> @Override
> public ImmutableBytesWritable 
> getLatestValue(ColumnReference ref) {
> // Always return null for our empty key value, as 
> this will cause the index
> // maintainer to always treat this Put as a new 
> row.
> if (isEmptyKeyValue(table, ref)) {
> return null;
> }
> byte[] family = ref.getFamily();
> byte[] qualifier = ref.getQualifier();
> RowMutationState rowMutationState = 
> valuesMap.get(ptr);
> PColumn column = null;
> try {
> column = 
> table.getColumnFamily(family).getPColumnForColumnQualifier(qualifier);
> } catch (ColumnNotFoundException e) {
> } catch (ColumnFamilyNotFoundException e) {
> }
> if (rowMutationState!=null && column!=null) {
> byte[] value = 
> rowMutationState.getColumnValues().get(column);
> ImmutableBytesPtr ptr = new 
> ImmutableBytesPtr();
> ptr.set(value==null ? 
> ByteUtil.EMPTY_BYTE_ARRAY : value);
> 
> SchemaUtil.padData(table.getName().getString(), column, ptr);
> return ptr;
> }
> return null;
> }
> 
> };
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PHOENIX-476) Support declaration of DEFAULT in CREATE statement

2016-10-09 Thread James Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15560493#comment-15560493
 ] 

James Taylor commented on PHOENIX-476:
--

Looking really good, [~kliew]. Thanks for the updates! Here's some feedback:
- Don't bother caching the default Expression. It's not worth the trouble as 
the parsing cost is minuscule. Your technique will break down if a column is 
dropped as the position changes in that case. This could lead to a default 
value of what was in a lower position to be used mistakenly. Just not caching 
it is fine. If it ever shows up as a perf issue, we can deal with it then.
- Instead of forcing STORE_NULLS to be true for a table with default values 
(and I should have thought about this before), just tweak the code in 
PRowImpl.setValue() based on if the PColumn has a default value. Something like 
this:
{code}
@Override
public void setValue(PColumn column, byte[] byteValue) {
deleteRow = null;
byte[] family = column.getFamilyName().getBytes();
byte[] qualifier = column.getName().getBytes();
PDataType type = column.getDataType();
// Check null, since some types have no byte representation for null
boolean isNull = type.isNull(byteValue);
if (isNull && !column.isNullable()) {
throw new ConstraintViolationException(name.getString() + "." + 
column.getName().getString() + " may not be null");
} else if (isNull && PTableImpl.this.isImmutableRows() && 
column.getExpressionStr() == null) { // Need to preserve explicit nulls for 
immutable tables otherwise default value would be used
removeIfPresent(setValues, family, qualifier);
removeIfPresent(unsetValues, family, qualifier);
} else if (isNull && !getStoreNulls() &&  && 
column.getExpressionStr() == null) {
{code}
- Add some test cases where the table is immutable (i.e. set 
IMMUTABLE_ROWS=true on DDL statement)
- Add a test for the case of a default value being set on a PK column in the 
middle. For example: PK1, PK2 DEFAULT 'foo', PK3.
- Also, have a test that uses different types for default values (i.e. VARCHAR, 
DECIMAL, TINYINT, and ARRAY would be interesting too).
- Make sure you've imported all of our default settings in terms of keeping 
imports explicitly rather than wildcarding, import order, etc. For example, I 
noticed that this was incorrectly change by your IDE:
{code}
-import static org.junit.Assert.assertEquals;
-import static org.junit.Assert.assertNotEquals;
-import static org.junit.Assert.assertNotNull;
-import static org.junit.Assert.assertTrue;
-import static org.junit.Assert.fail;
-
-import java.sql.Connection;
-import java.sql.DriverManager;
-import java.sql.SQLException;
-import java.sql.Statement;
+import static org.junit.Assert.*;
{code}
See https://phoenix.apache.org/contributing.html#Code_conventions

> Support declaration of DEFAULT in CREATE statement
> --
>
> Key: PHOENIX-476
> URL: https://issues.apache.org/jira/browse/PHOENIX-476
> Project: Phoenix
>  Issue Type: Task
>Affects Versions: 3.0-Release
>Reporter: James Taylor
>Assignee: Kevin Liew
>  Labels: enhancement
> Attachments: PHOENIX-476.2.patch, PHOENIX-476.3.patch, 
> PHOENIX-476.4.patch, PHOENIX-476.5.patch, PHOENIX-476.patch
>
>
> Support the declaration of a default value in the CREATE TABLE/VIEW statement 
> like this:
> CREATE TABLE Persons (
> Pid int NOT NULL PRIMARY KEY,
> LastName varchar(255) NOT NULL,
> FirstName varchar(255),
> Address varchar(255),
> City varchar(255) DEFAULT 'Sandnes'
> )
> To implement this, we'd need to:
> 1. add a new DEFAULT_VALUE key value column in SYSTEM.TABLE and pass through 
> the value when the table is created (in MetaDataClient).
> 2. always set NULLABLE to ResultSetMetaData.columnNoNulls if a default value 
> is present, since the column will never be null.
> 3. add a getDefaultValue() accessor in PColumn
> 4.  for a row key column, during UPSERT use the default value if no value was 
> specified for that column. This could be done in the PTableImpl.newKey method.
> 5.  for a key value column with a default value, we can get away without 
> incurring any storage cost. Although a little bit of extra effort than if we 
> persisted the default value on an UPSERT for key value columns, this approach 
> has the benefit of not incurring any storage cost for a default value.
> * serialize any default value into KeyValueColumnExpression
> * in the evaluate method of KeyValueColumnExpression, conditionally use 
> the default value if the column value is not present. If doing partial 
> evaluation, you should not yet return the default value, as we 

[jira] [Comment Edited] (PHOENIX-476) Support declaration of DEFAULT in CREATE statement

2016-10-09 Thread Kevin Liew (JIRA)

[ 
https://issues.apache.org/jira/browse/PHOENIX-476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559451#comment-15559451
 ] 

Kevin Liew edited comment on PHOENIX-476 at 10/9/16 7:03 AM:
-

I attached a patch with additional test coverage and all test cases passing.
Expressions are now cached outside of PColumn and ColumnDef. The driver sends 
the default value expression as part of the projector scan and the coprocessors 
reply with the result sets with default values substituted in (previously I was 
trying to get the driver to differentiate between nulls and default values at 
the client side to reduce the resultset size, but it was causing issues).
Let me know your feedback.


was (Author: kliew):
I attached a patch with additional test coverage and all test cases passing.
Expressions are now cached outside of PColumn and ColumnDef. The driver sends 
the default value expression as part of the projector scan and the coprocessors 
reply with the result sets with default values substituted in (previously I was 
trying to get the driver to differentiate between nulls and default values at 
the client side, which was causing issues).
Let me know your feedback.

> Support declaration of DEFAULT in CREATE statement
> --
>
> Key: PHOENIX-476
> URL: https://issues.apache.org/jira/browse/PHOENIX-476
> Project: Phoenix
>  Issue Type: Task
>Affects Versions: 3.0-Release
>Reporter: James Taylor
>Assignee: Kevin Liew
>  Labels: enhancement
> Attachments: PHOENIX-476.2.patch, PHOENIX-476.3.patch, 
> PHOENIX-476.4.patch, PHOENIX-476.5.patch, PHOENIX-476.patch
>
>
> Support the declaration of a default value in the CREATE TABLE/VIEW statement 
> like this:
> CREATE TABLE Persons (
> Pid int NOT NULL PRIMARY KEY,
> LastName varchar(255) NOT NULL,
> FirstName varchar(255),
> Address varchar(255),
> City varchar(255) DEFAULT 'Sandnes'
> )
> To implement this, we'd need to:
> 1. add a new DEFAULT_VALUE key value column in SYSTEM.TABLE and pass through 
> the value when the table is created (in MetaDataClient).
> 2. always set NULLABLE to ResultSetMetaData.columnNoNulls if a default value 
> is present, since the column will never be null.
> 3. add a getDefaultValue() accessor in PColumn
> 4.  for a row key column, during UPSERT use the default value if no value was 
> specified for that column. This could be done in the PTableImpl.newKey method.
> 5.  for a key value column with a default value, we can get away without 
> incurring any storage cost. Although a little bit of extra effort than if we 
> persisted the default value on an UPSERT for key value columns, this approach 
> has the benefit of not incurring any storage cost for a default value.
> * serialize any default value into KeyValueColumnExpression
> * in the evaluate method of KeyValueColumnExpression, conditionally use 
> the default value if the column value is not present. If doing partial 
> evaluation, you should not yet return the default value, as we may not have 
> encountered the the KeyValue for the column yet (since a filter evaluates 
> each time it sees each KeyValue, and there may be more than one KeyValue 
> referenced in the expression). Partial evaluation is determined by calling 
> Tuple.isImmutable(), where false means it is NOT doing partial evaluation, 
> while true means it is.
> * modify EvaluateOnCompletionVisitor by adding a visitor method for 
> RowKeyColumnExpression and KeyValueColumnExpression to set 
> evaluateOnCompletion to true if they have a default value specified. This 
> will cause filter evaluation to execute one final time after all KeyValues 
> for a row have been seen, since it's at this time we know we should use the 
> default value.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (PHOENIX-476) Support declaration of DEFAULT in CREATE statement

2016-10-09 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/PHOENIX-476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated PHOENIX-476:
---
Attachment: PHOENIX-476.5.patch

I attached a patch with additional test coverage and all test cases passing.
Expressions are now cached outside of PColumn and ColumnDef. The driver sends 
the default value expression as part of the projector scan and the coprocessors 
reply with the result sets with default values substituted in (previously I was 
trying to get the driver to differentiate between nulls and default values at 
the client side, which was causing issues).
Let me know your feedback.

> Support declaration of DEFAULT in CREATE statement
> --
>
> Key: PHOENIX-476
> URL: https://issues.apache.org/jira/browse/PHOENIX-476
> Project: Phoenix
>  Issue Type: Task
>Affects Versions: 3.0-Release
>Reporter: James Taylor
>Assignee: Kevin Liew
>  Labels: enhancement
> Attachments: PHOENIX-476.2.patch, PHOENIX-476.3.patch, 
> PHOENIX-476.4.patch, PHOENIX-476.5.patch, PHOENIX-476.patch
>
>
> Support the declaration of a default value in the CREATE TABLE/VIEW statement 
> like this:
> CREATE TABLE Persons (
> Pid int NOT NULL PRIMARY KEY,
> LastName varchar(255) NOT NULL,
> FirstName varchar(255),
> Address varchar(255),
> City varchar(255) DEFAULT 'Sandnes'
> )
> To implement this, we'd need to:
> 1. add a new DEFAULT_VALUE key value column in SYSTEM.TABLE and pass through 
> the value when the table is created (in MetaDataClient).
> 2. always set NULLABLE to ResultSetMetaData.columnNoNulls if a default value 
> is present, since the column will never be null.
> 3. add a getDefaultValue() accessor in PColumn
> 4.  for a row key column, during UPSERT use the default value if no value was 
> specified for that column. This could be done in the PTableImpl.newKey method.
> 5.  for a key value column with a default value, we can get away without 
> incurring any storage cost. Although a little bit of extra effort than if we 
> persisted the default value on an UPSERT for key value columns, this approach 
> has the benefit of not incurring any storage cost for a default value.
> * serialize any default value into KeyValueColumnExpression
> * in the evaluate method of KeyValueColumnExpression, conditionally use 
> the default value if the column value is not present. If doing partial 
> evaluation, you should not yet return the default value, as we may not have 
> encountered the the KeyValue for the column yet (since a filter evaluates 
> each time it sees each KeyValue, and there may be more than one KeyValue 
> referenced in the expression). Partial evaluation is determined by calling 
> Tuple.isImmutable(), where false means it is NOT doing partial evaluation, 
> while true means it is.
> * modify EvaluateOnCompletionVisitor by adding a visitor method for 
> RowKeyColumnExpression and KeyValueColumnExpression to set 
> evaluateOnCompletion to true if they have a default value specified. This 
> will cause filter evaluation to execute one final time after all KeyValues 
> for a row have been seen, since it's at this time we know we should use the 
> default value.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)