[jira] [Commented] (TRAFODION-2038) ERROR[2006] Internal error: assertion failure ((Int32)outputDataLen >= 0) during compile

2016-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321745#comment-15321745
 ] 

ASF GitHub Bot commented on TRAFODION-2038:
---

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-trafodion/pull/529


>  ERROR[2006] Internal error: assertion failure ((Int32)outputDataLen >= 0) 
> during compile
> -
>
> Key: TRAFODION-2038
> URL: https://issues.apache.org/jira/browse/TRAFODION-2038
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: sql-cmp
>Affects Versions: 1.3-incubating
>Reporter: Suresh Subbiah
>Assignee: Howard Qin
>
> When certain queries with a long varchar column in a predicate are compiled, 
> we get the following error intermittently:
> *** ERROR[2006] Internal error: assertion failure ((Int32)outputDataLen >= 0) 
> in file ../exp/exp_conv.cpp at line 4557.
> The problem occurs when the SQL compiler gets a query that is similar to one 
> it has seen before and can therefore return a cached plan. It shows up when 
> the plan is backpatched to account for differences (i.e. it does not show 
> on an initial compile, or when a query gets an identical text cache hit).
> It can be worked around with any one of these cqds, each one being more 
> specific than the previous one.
> cqd query_cache '0';
> cqd hybrid_query_cache 'off' ;
> cqd QUERY_CACHE_USE_CONVDOIT_FOR_BACKPATCH 'off' ;
> Analysis by Sandhya and Arvind is below.
> (897)
> ssubbiah  (developer) 
> 2016-06-06 14:14
> This issue needs another round of debugging, but from the Qcache point of 
> view, here is what we found. The assertion failure happens at this point 
> (highlighted) in the expressions code exp/exp_conv.cpp:
> if (rc || (targetPrecision && targetPrecision < (Lng32) translatedCharCnt))
> {
>   Lng32 canIgnoreOverflowChars = FALSE;
>   Lng32 remainingScanStringLen = 0;
>  
>   if (rc == 0)
> {
>   // We failed because the target has too many characters. Find
>   // the first offending character.
>   input_to_scan = utf8Tgt;
>   input_length = outputDataLen;
>   outputDataLen = lightValidateUTF8Str(utf8Tgt,
>                                        (Int32)outputDataLen,
>                                        (Int32)targetPrecision,
>                                        ignoreTrailingBlanksInSource);
>   assert((Int32)outputDataLen >= 0); //LCOV_EXCL_LINE : rfi
>  
>  
> The following is the stack trace when query cache is on :
>  
> #0 convCharToChar (source=0x7fd270730d50 "reason 25", sourceLen=9, 
> sourceType=0, sourcePrecision=-1, sourceScale=1, 
> target=0x7fd269d7231a "reason 25", targetLen=31999, targetType=64, 
> targetPrecision=-1, targetScale=15, heap=0x0, diagsArea=0x0, 
> dataConversionErrorFlag=0x7fd286fa9fc8, actualTargetLen=0x7fd286fa9cb8, 
> blankPadResult=0, ignoreTrailingBlanksInSource=1, allowInvalidCodePoint=0)
> at ../exp/exp_conv.cpp:4524
> 001 0x7fd29657a34f in convDoIt (source=0x7fd270730d50 "reason 25", 
> sourceLen=9, sourceType=0, sourcePrecision=-1, sourceScale=1, 
> target=0x7fd269d7231a "reason 25", targetLen=31999, targetType=64, 
> targetPrecision=-1, targetScale=15, varCharLen=0x7fd269d72318 "\t", 
> varCharLenSize=2, heap=0x0, diagsArea=0x0, index=CONV_ASCII_F_V, 
> dataConversionErrorFlag=0x7fd286fa9fc8, flags=0)
> at ../exp/exp_conv.cpp:9272
> 002 0x7fd29432b74f in CacheData::backpatchParams 
> (this=0x7fd2, 
> listOfConstantParameters=..., 
> listOfDynamicParameters=, bindWA=..., 
> params=@0x7fd286faa0d8, parameterBufferSize=@0x7fd286faa0f8)
> at ../sqlcomp/QCache.cpp:1454
> 003 0x7fd2941d216f in CmpMain::compileFromCache (this=0x7fd286fad8e0, 
> sText=0x7fd270714da8 " SELECT SS_CUSTOMER_SK\n ,SUM(ACT_SALES) SUMSALES\n 
> FROM (SELECT SS_ITEM_SK\n ,SS_TICKET_NUMBER\n ,SS_CUSTOMER_SK\n ,CASE WHEN 
> SR_RETURN_QUANTITY IS NOT NULL THEN 
> (SS_QUANTITY-SR_RETURN_QUANTITY)*SS_SALES_"..., 
> charset=15, queryExpr=, bindWA=..., cachewa=..., 
> plan=0x7fd26c680a68, pLen=0x7fd26c680a60, heap=0x7fd285ad9440, op=3004, 
> bPatchOK=@0x7fd286faca8c, begTime=...) at ../sqlcomp/CmpMain.cpp:1545
> 004 0x7fd2941d7169 in CmpMain::compile (this=0x7fd286fad8e0, 
> input_str=0x7fd270714da8 " SELECT SS_CUSTOMER_SK\n ,SUM(ACT_SALES) 
> SUMSALES\n FROM (SELECT SS_ITEM_SK\n ,SS_TICKET_NUMBER\n ,SS_CUSTOMER_SK\n 
> ,CASE WHEN SR_RETURN_QUANTITY IS NOT NULL THEN 
> (SS_QUANTITY-SR_RETURN_QUANTITY)*SS_SALES_"..., charset=15, 
> queryExpr=@0x7fd286fad818, 

[jira] [Resolved] (TRAFODION-2041) UPDATE STATS uses select count(*) on large tables on HDP 2.3.4

2016-06-08 Thread David Wayne Birdsall (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Wayne Birdsall resolved TRAFODION-2041.
-
Resolution: Fixed

> UPDATE STATS uses select count(*) on large tables on HDP 2.3.4
> --
>
> Key: TRAFODION-2041
> URL: https://issues.apache.org/jira/browse/TRAFODION-2041
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: sql-cmp
>Affects Versions: 2.1-incubating
>Reporter: David Wayne Birdsall
>Assignee: David Wayne Birdsall
>Priority: Critical
> Fix For: 2.1-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-2041) UPDATE STATS uses select count(*) on large tables on HDP 2.3.4

2016-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321561#comment-15321561
 ] 

ASF GitHub Bot commented on TRAFODION-2041:
---

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-trafodion/pull/532


> UPDATE STATS uses select count(*) on large tables on HDP 2.3.4
> --
>
> Key: TRAFODION-2041
> URL: https://issues.apache.org/jira/browse/TRAFODION-2041
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: sql-cmp
>Affects Versions: 2.1-incubating
>Reporter: David Wayne Birdsall
>Assignee: David Wayne Birdsall
>Priority: Critical
> Fix For: 2.1-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TRAFODION-2041) UPDATE STATS uses select count(*) on large tables on HDP 2.3.4

2016-06-08 Thread David Wayne Birdsall (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321209#comment-15321209
 ] 

David Wayne Birdsall edited comment on TRAFODION-2041 at 6/8/16 6:56 PM:
-

I was able to reproduce this issue on a workstation by adding the following to 
hbase-site.xml in the usual install_local_hadoop configuration:

 
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>1000</value>
</property>

(I had to experiment with the hbase.bucketcache.size parameter a few times to 
make it small enough that HMaster could come up.)

I found that the original suggested fix still failed with an OOM. Therefore, I 
changed the fix to simply unset the hbase.bucketcache.ioengine property in 
HBaseClient.estimateRowCount if it is found.
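
A minimal Java sketch of the shape of that change (illustrative only; the
class and method names are invented, and it assumes the Hadoop Configuration
that HBaseClient already holds):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.io.hfile.CacheConfig;

    class RowCountEstimateConfig {
      // Build the CacheConfig used for the row-count estimate from a private
      // copy of the configuration with the bucket-cache engine removed, so no
      // offheap bucket cache is allocated on the client side.
      static CacheConfig withoutBucketCache(Configuration config) {
        Configuration estConf = new Configuration(config); // copy; don't mutate the shared config
        if (estConf.get("hbase.bucketcache.ioengine") != null)
          estConf.unset("hbase.bucketcache.ioengine");
        return new CacheConfig(estConf);
      }
    }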


was (Author: davebirdsall):
I was able to reproduce this issue on a workstation by adding the following to 
hbase-site.xml in the usual install_local_hadoop configuration:

 
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>1000</value>
</property>

(I had to experiment with the hbase.bucketcache.size parameter a few times to 
make it small enough that HMaster could come up.)

I found that the original suggested fix still failed with an OOM. Therefore, I 
changed the fix to simply unset the hbase.bucketcache.ioengine property in 
HBaseClient.estimateRowCount if it is found.

> UPDATE STATS uses select count(*) on large tables on HDP 2.3.4
> --
>
> Key: TRAFODION-2041
> URL: https://issues.apache.org/jira/browse/TRAFODION-2041
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: sql-cmp
>Affects Versions: 2.1-incubating
>Reporter: David Wayne Birdsall
>Assignee: David Wayne Birdsall
>Priority: Critical
> Fix For: 2.1-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (TRAFODION-2041) UPDATE STATS uses select count(*) on large tables on HDP 2.3.4

2016-06-08 Thread David Wayne Birdsall (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321209#comment-15321209
 ] 

David Wayne Birdsall edited comment on TRAFODION-2041 at 6/8/16 6:54 PM:
-

I was able to reproduce this issue on a workstation by adding the following to 
hbase-site.xml in the usual install_local_hadoop configuration:

 
<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>1000</value>
</property>

(I had to experiment with the hbase.bucketcache.size parameter a few times to 
make it small enough that HMaster could come up.)

I found that the original suggested fix still failed with an OOM. Therefore, I 
changed the fix to simply unset the hbase.bucketcache.ioengine property in 
HBaseClient.estimateRowCount if it is found.


was (Author: davebirdsall):
I was able to reproduce this issue on a workstation by adding the following to 
hbase-site.xml in the usual install_local_hadoop configuration:

<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>1000</value>
</property>

(I had to experiment with the hbase.bucketcache.size parameter a few times to 
make it small enough that HMaster could come up.)

I found that the original suggested fix still failed with an OOM. Therefore, I 
changed the fix to simply unset the hbase.bucketcache.ioengine property in 
HBaseClient.estimateRowCount if it is found.

> UPDATE STATS uses select count(*) on large tables on HDP 2.3.4
> --
>
> Key: TRAFODION-2041
> URL: https://issues.apache.org/jira/browse/TRAFODION-2041
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: sql-cmp
>Affects Versions: 2.1-incubating
>Reporter: David Wayne Birdsall
>Assignee: David Wayne Birdsall
>Priority: Critical
> Fix For: 2.1-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-2041) UPDATE STATS uses select count(*) on large tables on HDP 2.3.4

2016-06-08 Thread David Wayne Birdsall (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321209#comment-15321209
 ] 

David Wayne Birdsall commented on TRAFODION-2041:
-

I was able to reproduce this issue on a workstation by adding the following to 
hbase-site.xml in the usual install_local_hadoop configuration:

<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>1000</value>
</property>

(I had to experiment with the hbase.bucketcache.size parameter a few times to 
make it small enough that HMaster could come up.)

I found that the original suggested fix still failed with an OOM. Therefore, I 
changed the fix to simply unset the hbase.bucketcache.ioengine property in 
HBaseClient.estimateRowCount if it is found.

> UPDATE STATS uses select count(*) on large tables on HDP 2.3.4
> --
>
> Key: TRAFODION-2041
> URL: https://issues.apache.org/jira/browse/TRAFODION-2041
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: sql-cmp
>Affects Versions: 2.1-incubating
>Reporter: David Wayne Birdsall
>Assignee: David Wayne Birdsall
>Priority: Critical
> Fix For: 2.1-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-2041) UPDATE STATS uses select count(*) on large tables on HDP 2.3.4

2016-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321186#comment-15321186
 ] 

ASF GitHub Bot commented on TRAFODION-2041:
---

GitHub user DaveBirdsall opened a pull request:

https://github.com/apache/incubator-trafodion/pull/532

[TRAFODION-2041] Turn off bucket cache in HBaseClient estimateRowCount

This solves the issue of UPDATE STATISTICS doing select count(*) on large 
tables. Details are in the JIRA.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/DaveBirdsall/incubator-trafodion Trafodion2041

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/532.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #532


commit 8b179aca0a4adbeaf721b2e29205f6c9790ade0f
Author: Dave Birdsall 
Date:   2016-06-08T18:44:02Z

[TRAFODION-2041] Turn off bucket cache in HBaseClient estimateRowCount




> UPDATE STATS uses select count(*) on large tables on HDP 2.3.4
> --
>
> Key: TRAFODION-2041
> URL: https://issues.apache.org/jira/browse/TRAFODION-2041
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: sql-cmp
>Affects Versions: 2.1-incubating
>Reporter: David Wayne Birdsall
>Assignee: David Wayne Birdsall
>Priority: Critical
> Fix For: 2.1-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-1977) Merge 2.0 fixes forward to master branch

2016-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-1977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15321093#comment-15321093
 ] 

ASF GitHub Bot commented on TRAFODION-1977:
---

GitHub user svarnau opened a pull request:

https://github.com/apache/incubator-trafodion/pull/531

[TRAFODION-1977] Merge forward release2.0 changes to master

Merge 2.0.1 RC1 fixes forward to master
This now includes
TRAFODION-2023 Clarify License Text
TRAFODION-2024 Dynamically link SSL libraries



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/svarnau/incubator-trafodion mrg_rel201rc1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/531.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #531


commit a45063a45082674224c074199d23838f96b0a40b
Author: Steve Varnau 
Date:   2016-06-01T21:43:38Z

[TRAFODION-2023] Clarify license text

Fix issues found in review of release 2.0.0.

Clarify that BSD-4 license has a rescinded clause.

Remove the GPL text of two packages that have dual license, since we are
not redistributing under GPL.

commit 0a8507aaf680216f09340110fae857b402f303ce
Author: Steve Varnau 
Date:   2016-06-01T21:51:07Z

Clarify how to run RAT, per comments from release 2.0.0 review.

commit 4c49f6f69ff1b10a70134bd25301447fc5a7c0d9
Author: Dave Birdsall 
Date:   2016-06-02T18:20:33Z

Merge [TRAFODION-2023] PR 515 Clarify license text

commit f7c2c1e08895c3ad960626a1eb46f5cf2061a4d5
Author: Anuradha Hegde 
Date:   2016-06-04T13:32:28Z

Openssl libraries (ssl & crypto) are linked dynamically

commit e18b447c625a7b7ead8a78a93de3a630b45af5cb
Author: Dave Birdsall 
Date:   2016-06-06T16:10:02Z

Merge [TRAFODION-2024] PR 522 Change Openssl library linkage to dynamic

commit e3755ec2cdc233930a3cbe199f6a1d618098d6f8
Author: Steve Varnau 
Date:   2016-06-06T16:41:39Z

Bump Release number to 2.0.1

Patch release for linking SSL libraries.

commit 5943646e38861d4fceb92a5be314a440b4528aff
Author: Steve Varnau 
Date:   2016-06-06T19:12:41Z

Release version change - fix core regression

Backport from master branch. Same fix was needed when release version
changed there.

commit 355bb687b70acd0aba78b4436fd7bdf11dd8c9ed
Author: Dave Birdsall 
Date:   2016-06-07T00:26:28Z

Merge PR 524 Bump release number to 2.0.1

commit 4ce086c1be09ca8d6d111a13541cadc71f4da4d2
Author: Steve Varnau 
Date:   2016-06-08T17:59:52Z

[TRAFODION-1977] Merge 2.0.1 RC1 fixes forward to master

Merging all release2.0 branch fixes forward. This now includes
TRAFODION-2023 Clarify License Text
TRAFODION-2024 Dynamically link SSL libraries




> Merge 2.0 fixes forward to master branch
> 
>
> Key: TRAFODION-1977
> URL: https://issues.apache.org/jira/browse/TRAFODION-1977
> Project: Apache Trafodion
>  Issue Type: Bug
>Affects Versions: 2.1-incubating
>Reporter: Steve Varnau
>Assignee: Steve Varnau
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TRAFODION-2042) Index shall inherit compression and data_block_encoding HBase options from Table

2016-06-08 Thread Suresh Subbiah (JIRA)
Suresh Subbiah created TRAFODION-2042:
-

 Summary: Index shall inherit compression and data_block_encoding 
HBase options from Table
 Key: TRAFODION-2042
 URL: https://issues.apache.org/jira/browse/TRAFODION-2042
 Project: Apache Trafodion
  Issue Type: Improvement
  Components: sql-general
Affects Versions: 2.0-incubating
Reporter: Suresh Subbiah
Assignee: Suresh Subbiah


When an index is created on a Trafodion table using the CREATE INDEX 
statement, users may accidentally omit the HBASE_OPTIONS clause. This results 
in the index being uncompressed and occupying more space on disk than 
necessary. If the table is large, this can also lead to errors during bulkload, 
as a single region may have more than 32 HFiles, each of size 10 GB (the 
default max size for an HFile created by bulkload).

Having the index inherit the compression and encoding options used by the 
table by default is reasonable and will help.
If the user wishes the index to have a different type of compression from the 
base table, it can be explicitly specified in the CREATE INDEX statement, as 
in the sketch below.

A CQD will also be provided to disable this attribute inheritance.

An attribute inherited from the base table will take precedence over the 
compression/encoding type obtained from the HBASE_COMPRESSION_OPTION and 
HBASE_DATA_BLOCK_ENCODING_OPTION cqds.
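
A usage sketch in Java via JDBC (connection URL, index, table and option 
values are invented; the HBASE_OPTIONS clause is the one named above). Today 
the options must be spelled out explicitly, which is the step users forget; 
with this improvement, an index created without the clause would inherit them 
from the table:

    import java.sql.*;

    class CreateIndexExample {
      public static void main(String[] args) throws SQLException {
        try (Connection conn =
                 DriverManager.getConnection("jdbc:t4jdbc://localhost:23400/:");
             Statement stmt = conn.createStatement()) {
          // Explicit HBASE_OPTIONS on the index, matching the base table.
          stmt.executeUpdate(
              "CREATE INDEX myidx ON trafodion.sch.mytable (col1) "
            + "HBASE_OPTIONS (COMPRESSION = 'SNAPPY', "
            + "DATA_BLOCK_ENCODING = 'FAST_DIFF')");
        }
      }
    }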



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-2041) UPDATE STATS uses select count(*) on large tables on HDP 2.3.4

2016-06-08 Thread David Wayne Birdsall (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320835#comment-15320835
 ] 

David Wayne Birdsall commented on TRAFODION-2041:
-

UPDATE STATISTICS requires an estimate of the row count before it selects an 
algorithm to use for computing histograms. If the row count is small, it will 
choose a very fast in-memory algorithm. If the row count is larger but the 
sample size will fit in memory, it will use a sampling query that reads data 
directly into memory. If the sample size will not fit in memory, it creates a 
sample table and performs an UPSERT into it, capturing a sample of the data. 
It then calculates the histograms a few columns at a time, as many as will fit 
in memory.

If sampling is not chosen, UPDATE STATISTICS uses a select count(*) to obtain 
the row count. This is typically done only for smaller tables.

If sampling is chosen, UPDATE STATISTICS uses a Java routine, 
HBaseClient.estimateRowCount, which uses server-side HBase code to estimate 
the row count. It looks at size information stored within the files, samples a 
few rows to estimate how many cells there are per row, and, from knowledge of 
the Trafodion metadata, estimates the number of rows.
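
A rough Java sketch of that three-way choice (the class, names, and the 
memory-budget parameter are invented for illustration; the real logic lives in 
the UPDATE STATISTICS code):

    // Strategy selection as described above, driven by the row-count estimate.
    class HistogramStrategyChooser {
      enum HistStrategy { IN_MEMORY_FULL, IN_MEMORY_SAMPLE, SAMPLE_TABLE }

      static HistStrategy choose(long estimatedRows, long sampleRows,
                                 long memBudgetRows) {
        if (estimatedRows <= memBudgetRows)
          return HistStrategy.IN_MEMORY_FULL;   // small table: fast in-memory algorithm
        if (sampleRows <= memBudgetRows)
          return HistStrategy.IN_MEMORY_SAMPLE; // sampling query read directly into memory
        return HistStrategy.SAMPLE_TABLE;       // UPSERT a sample into a sample table
      }
    }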

On HDP 2.3.4, however, this routine is failing with a Java OutOfMemory 
exception. The exception is not reported to the UPDATE STATISTICS code (the 
layers between HBaseClient.estimateRowCount and the UPDATE STATISTICS code 
silently discard it); instead, the estimate of rows returned to UPDATE 
STATISTICS is zero. In this case, UPDATE STATISTICS does a select count(*) 
instead, as the error in using a small estimate is prohibitively large. 
Unfortunately, this is happening when in fact the tables are large, and select 
count(*) on large tables is a very slow operation that tends to bottleneck all 
the RegionServers in a cluster.

Here are the details:

NATable::estimateHBaseRowCount() calls ExpHbaseInterface_JNI::estimateRowCount, 
which calls HBaseClient_JNI::estimateRowCount. That function is returning an 
error HBC_ERROR_ROWCOUNT_EST_EXCEPTION, which the higher callers ignore. They 
just use a zero row count in that case, which triggers the “select count(*)”. 
Not unreasonable; it does ultimately give the correct result.
 
While I was still in the debugger in HBaseClient_JNI::estimateRowCount, 
though, I had a look at the exception details that get saved off. (In my 
stepping, I observed that we went through this code path:
 
  if (jenv_->ExceptionCheck())
  {
getExceptionDetails();
logError(CAT_SQL_HBASE, __FILE__, __LINE__);
logError(CAT_SQL_HBASE, "HBaseClient_JNI::estimateRowCount()", 
getLastError());
jenv_->PopLocalFrame(NULL);
return HBC_ERROR_ROWCOUNT_EST_EXCEPTION;
  }
 
Now, by looking at cli_globals->getJniErrorStr(), one can see the exception 
details captured.)
 
Here are the exception details:
 
(gdb) set print elements 1000
(gdb) p cli_globals->getJniErrorStr()
$25 = { = {_vptr.NABasicObject = 0x7f6c2ebaa9d0, 
h_ = 0x7f6c1b06f068}, fbstring_ = {static npos = , 
store_ = {heap_ = 0x0, {
small_ = 
"x\227'\003\000\000\000\000\257\003\000\000\000\000\000\000\360\003\000\000\000\000\000@",
 ml_ = {
  data_ = 0x3279778 "\njava.lang.OutOfMemoryError: Direct buffer 
memory\njava.nio.Bits.reserveMemory(Bits.java:658)\njava.nio.DirectByteBuffer.(DirectByteBuffer.java:123)\njava.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)\norg.apache.hadoop.hbase.util.ByteBufferArray.(ByteBufferArray.java:65)\norg.apache.hadoop.hbase.io.hfile.bucket.ByteBufferIOEngine.(ByteBufferIOEngine.java:47)\norg.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:307)\norg.apache.hadoop.hbase.io.hfile.bucket.BucketCache.(BucketCache.java:217)\norg.apache.hadoop.hbase.io.hfile.CacheConfig.getBucketCache(CacheConfig.java:614)\norg.apache.hadoop.hbase.io.hfile.CacheConfig.getL2(CacheConfig.java:553)\norg.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:637)\norg.apache.hadoop.hbase.io.hfile.CacheConfig.(CacheConfig.java:231)\norg.trafodion.sql.HBaseClient.estimateRowCount(HBaseClient.java:1155)\n",
 size_ = 943, capacity_ = 4611686018427388912}
(gdb)

Analyzing the code in this exception call stack suggests that the cause is 
that we are creating an HBase offheap bucket cache, and there is not enough 
memory to do so. The choice of which kind of cache to create is determined by 
the setting of the hbase.bucketcache.ioengine property in hbase-site.xml. 
Evidently, on HDP 2.3.4 this setting is "offheap", while on the other 
distributions we support it is "heap". So the speculation is that this 
"offheap" setting causes the problem.

In general, though, this kind of issue raises the question of why we are 
calling server-side APIs in client-side logic. It is probably better 
architecturally to package this logic in a 

[jira] [Commented] (TRAFODION-2039) Add support for ALTER LIBRARY FILE ''

2016-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320817#comment-15320817
 ] 

ASF GitHub Bot commented on TRAFODION-2039:
---

GitHub user sureshsubbiah opened a pull request:

https://github.com/apache/incubator-trafodion/pull/530

[TRAFODION-2039] Add support for ALTER LIBRARY FILE ''

The library file (.jar, .dll or .so) can now be altered for an existing 
library object. The ALTER command will check for the existence of the new 
file. The old library definition will be removed from compiler caches.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sureshsubbiah/incubator-trafodion alterlib

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/530.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #530


commit 35d28d6f6ad7eecb8c04d2bb4e9ebc1931c3c063
Author: Suresh Subbiah 
Date:   2016-06-08T15:50:13Z

[TRAFODION-2039] Add support for ALTER LIBRARY FILE ''

The location of the library file can now be altered. The ALTER command will 
check for the existence of the new file. The old library definition will be 
removed from compiler caches.
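
A hypothetical usage sketch in Java via JDBC (the connection URL, schema, 
library and jar names are all invented; only the ALTER LIBRARY ... FILE 
statement itself comes from this PR):

    import java.sql.*;

    class AlterLibraryExample {
      public static void main(String[] args) throws SQLException {
        try (Connection conn =
                 DriverManager.getConnection("jdbc:t4jdbc://localhost:23400/:");
             Statement stmt = conn.createStatement()) {
          // Re-point the existing library object at a new jar; per this PR the
          // command checks that the new file exists and removes the old
          // library definition from compiler caches.
          stmt.executeUpdate(
              "ALTER LIBRARY trafodion.sch.mylib FILE '/opt/udr/myudr-v2.jar'");
        }
      }
    }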




> Add support for ALTER LIBRARY FILE ''
> 
>
> Key: TRAFODION-2039
> URL: https://issues.apache.org/jira/browse/TRAFODION-2039
> Project: Apache Trafodion
>  Issue Type: Improvement
>  Components: sql-general
>Affects Versions: 2.0-incubating
>Reporter: Suresh Subbiah
>Assignee: Suresh Subbiah
>
> Add support for an ALTER statement that will allow the file associated with 
> a library object to be changed.
> ALTER LIBRARY FILE ' ;'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (TRAFODION-2041) UPDATE STATS uses select count(*) on large tables on HDP 2.3.4

2016-06-08 Thread David Wayne Birdsall (JIRA)
David Wayne Birdsall created TRAFODION-2041:
---

 Summary: UPDATE STATS uses select count(*) on large tables on HDP 
2.3.4
 Key: TRAFODION-2041
 URL: https://issues.apache.org/jira/browse/TRAFODION-2041
 Project: Apache Trafodion
  Issue Type: Bug
  Components: sql-cmp
Affects Versions: 2.1-incubating
Reporter: David Wayne Birdsall
Assignee: David Wayne Birdsall
Priority: Critical
 Fix For: 2.1-incubating






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (TRAFODION-2039) Add support for ALTER LIBRARY FILE ''

2016-06-08 Thread Suresh Subbiah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-2039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Subbiah reassigned TRAFODION-2039:
-

Assignee: Suresh Subbiah

> Add support for ALTER LIBRARY FILE ''
> 
>
> Key: TRAFODION-2039
> URL: https://issues.apache.org/jira/browse/TRAFODION-2039
> Project: Apache Trafodion
>  Issue Type: Improvement
>  Components: sql-general
>Affects Versions: 2.0-incubating
>Reporter: Suresh Subbiah
>Assignee: Suresh Subbiah
>
> Add support for an ALTER statement that will allow the file associated with 
> a library object to be changed.
> ALTER LIBRARY FILE ' ;'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-2011) better logging or exception messaging for getScanner issue due to lease timeout

2016-06-08 Thread liu ming (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320615#comment-15320615
 ] 

liu ming commented on TRAFODION-2011:
-

This is due to transaction lease timeout. The executor doesn't know, so it 
continues to scan, but the cleanup chore has already reclaimed the scanner and 
reports this error.
So to fix this, one must correctly handle the transaction lease timeout.
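
A minimal Java sketch of the kind of error translation this JIRA asks for 
(the class name, method name and message text are invented; the real change 
would live in the Trafodion scan path, e.g. around HTableClient.fetchRows):

    import java.io.IOException;
    import org.apache.hadoop.hbase.UnknownScannerException;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;

    class ScanWithLeaseHint {
      // Rephrase the low-level scanner error in terms of the lease timeout
      // described above, so the client sees an actionable message.
      static Result[] fetch(ResultScanner scanner, int numRows) throws IOException {
        try {
          return scanner.next(numRows);
        } catch (UnknownScannerException use) {
          throw new IOException("Scanner already closed on the server, most likely "
              + "because the transaction lease expired "
              + "(hbase.transaction.lease.timeout); abort and restart the "
              + "transaction before retrying the scan.", use);
        }
      }
    }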

> better logging or exception messaging for getScanner issue due to lease 
> timeout
> ---
>
> Key: TRAFODION-2011
> URL: https://issues.apache.org/jira/browse/TRAFODION-2011
> Project: Apache Trafodion
>  Issue Type: Improvement
>Reporter: liu ming
>Assignee: liu ming
>
> When an active transaction runs too long, its lease will time out and the 
> transaction will be retired internally; however, the client doesn't know, 
> still issues new get/put operations, and gets a strange exception that is 
> hard to understand.
> It would be better to enhance the error message or logging to help identify 
> the issue.
> Reproduce:
> simulate a long transaction: 
> >begin;
> wait for 2 hours, or change hbase.transaction.lease.timeout to a shorter 
> timeout.
> >do an update;
> an error like this appears; it is confusing:
> *** ERROR[8448] Unable to access Hbase interface. Call to 
> ExpHbaseInterface::nextRow returned error HBASE_ACCESS_ERROR(-706). Cause:
> java.util.concurrent.ExecutionException: java.io.IOException: PerformScan 
> error on coprocessor call, scannerID: 1 java.io.IOException: performScan 
> encountered Exception txID: 25769804282 Exception: 
> org.apache.hadoop.hbase.UnknownScannerException: TrxRegionEndpoint getScanner 
> - scanner id 1, already closed?
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
> java.util.concurrent.FutureTask.get(FutureTask.java:188)
> org.trafodion.sql.HTableClient.fetchRows(HTableClient.java:1251)
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (TRAFODION-2011) better logging or exception messaging for getScanner issue due to lease timeout

2016-06-08 Thread liu ming (JIRA)

 [ 
https://issues.apache.org/jira/browse/TRAFODION-2011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liu ming updated TRAFODION-2011:

Assignee: Sean Broeder  (was: liu ming)

> better logging or exception messaging for getScanner issue due to lease 
> timeout
> ---
>
> Key: TRAFODION-2011
> URL: https://issues.apache.org/jira/browse/TRAFODION-2011
> Project: Apache Trafodion
>  Issue Type: Improvement
>Reporter: liu ming
>Assignee: Sean Broeder
>
> When an active transaction runs too long, its lease will time out and the 
> transaction will be retired internally; however, the client doesn't know, 
> still issues new get/put operations, and gets a strange exception that is 
> hard to understand.
> It would be better to enhance the error message or logging to help identify 
> the issue.
> Reproduce:
> simulate a long transaction: 
> >begin;
> wait for 2 hours, or change hbase.transaction.lease.timeout to a shorter 
> timeout.
> >do an update;
> an error like this appears; it is confusing:
> *** ERROR[8448] Unable to access Hbase interface. Call to 
> ExpHbaseInterface::nextRow returned error HBASE_ACCESS_ERROR(-706). Cause:
> java.util.concurrent.ExecutionException: java.io.IOException: PerformScan 
> error on coprocessor call, scannerID: 1 java.io.IOException: performScan 
> encountered Exception txID: 25769804282 Exception: 
> org.apache.hadoop.hbase.UnknownScannerException: TrxRegionEndpoint getScanner 
> - scanner id 1, already closed?
> java.util.concurrent.FutureTask.report(FutureTask.java:122)
> java.util.concurrent.FutureTask.get(FutureTask.java:188)
> org.trafodion.sql.HTableClient.fetchRows(HTableClient.java:1251)
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TRAFODION-2038) ERROR[2006] Internal error: assertion failure ((Int32)outputDataLen >= 0) during compile

2016-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320246#comment-15320246
 ] 

ASF GitHub Bot commented on TRAFODION-2038:
---

GitHub user HowardQin opened a pull request:

https://github.com/apache/incubator-trafodion/pull/529

Fix webroot reported assertion failure

JIRA TRAFODION-2038

Webroot reported:
ERROR[2006] Internal error:assertion failure ((Int32)outputDataLen >= 0) 
during compile,
which is the same issue as mantis-262, mantis-134, cherry pick from Adv2.1

Changes:
In QCache:
//backpatch for HQC queries
NABoolean CacheData::backpatchParams()
//convDoIt
sourceType->getPrecision() ==> sourceType->getPrecisionOrMaxNumChars()

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/HowardQin/incubator-trafodion patch-incubator

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-trafodion/pull/529.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #529


commit 914e42d4b3f1730561949f8df28d3cccd6d68ad1
Author: Howard Qin 
Date:   2016-06-08T03:56:36Z

Fix webroot reported assertion failure

JIRA TRAFODION-2038

Webroot reported:
ERROR[2006] Internal error:assertion failure ((Int32)outputDataLen >= 0) 
during compile,
which is the same issue as mantis-262, mantis-134, cherry pick from Adv2.1

Changes:
In QCache:
//backpatch for HQC queries
NABoolean CacheData::backpatchParams()
//convDoIt
sourceType->getPrecision() ==> sourceType->getPrecisionOrMaxNumChars()




>  ERROR[2006] Internal error: assertion failure ((Int32)outputDataLen >= 0) 
> during compile
> -
>
> Key: TRAFODION-2038
> URL: https://issues.apache.org/jira/browse/TRAFODION-2038
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: sql-cmp
>Affects Versions: 1.3-incubating
>Reporter: Suresh Subbiah
>Assignee: Howard Qin
>
> When certain queries with a long varchar column in a predicate are compiled, 
> we get the following error intermittently:
> *** ERROR[2006] Internal error: assertion failure ((Int32)outputDataLen >= 0) 
> in file ../exp/exp_conv.cpp at line 4557.
> The problem occurs when the SQL compiler gets a query that is similar to one 
> it has seen before and can therefore return a cached plan. It shows up when 
> the plan is backpatched to account for differences (i.e. it does not show 
> on an initial compile, or when a query gets an identical text cache hit).
> It can be worked around with any one of these cqds, each one being more 
> specific than the previous one.
> cqd query_cache '0';
> cqd hybrid_query_cache 'off' ;
> cqd QUERY_CACHE_USE_CONVDOIT_FOR_BACKPATCH 'off' ;
> Analysis by Sandhya and Arvind is below.
> (897)
> ssubbiah  (developer) 
> 2016-06-06 14:14
> This issue needs another round of debugging, but from the Qcache point of 
> view, here is what we found. The assertion failure happens at this point 
> (highlighted) in the expressions code exp/exp_conv.cpp:
> if (rc || (targetPrecision && targetPrecision < (Lng32) translatedCharCnt))
> {
>   Lng32 canIgnoreOverflowChars = FALSE;
>   Lng32 remainingScanStringLen = 0;
>  
>   if (rc == 0)
> {
>   // We failed because the target has too many characters. Find
>   // the first offending character.
>   input_to_scan = utf8Tgt;
>   input_length = outputDataLen;
>   outputDataLen = lightValidateUTF8Str(utf8Tgt,
>                                        (Int32)outputDataLen,
>                                        (Int32)targetPrecision,
>                                        ignoreTrailingBlanksInSource);
>   assert((Int32)outputDataLen >= 0); //LCOV_EXCL_LINE : rfi
>  
>  
> The following is the stack trace when query cache is on :
>  
> #0 convCharToChar (source=0x7fd270730d50 "reason 25", sourceLen=9, 
> sourceType=0, sourcePrecision=-1, sourceScale=1, 
> target=0x7fd269d7231a "reason 25", targetLen=31999, targetType=64, 
> targetPrecision=-1, targetScale=15, heap=0x0, diagsArea=0x0, 
> dataConversionErrorFlag=0x7fd286fa9fc8, actualTargetLen=0x7fd286fa9cb8, 
> blankPadResult=0, ignoreTrailingBlanksInSource=1, allowInvalidCodePoint=0)
> at ../exp/exp_conv.cpp:4524
> 001 0x7fd29657a34f in convDoIt (source=0x7fd270730d50 "reason 25", 
> sourceLen=9, sourceType=0, sourcePrecision=-1, sourceScale=1, 
> target=0x7fd269d7231a "reason 25", 

[jira] [Commented] (TRAFODION-2036) Write access permission denied for user TRAFODION on "/hbase/archive/data/default"

2016-06-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TRAFODION-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320116#comment-15320116
 ] 

ASF GitHub Bot commented on TRAFODION-2036:
---

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-trafodion/pull/527


> Write access permission denied for user TRAFODION on 
> "/hbase/archive/data/default"
> --
>
> Key: TRAFODION-2036
> URL: https://issues.apache.org/jira/browse/TRAFODION-2036
> Project: Apache Trafodion
>  Issue Type: Bug
>  Components: sql-general
>Reporter: Roberta Marton
>
> Trafodion uses snapshots for loading data and building indexes.  Today, it 
> piggy-backs snapshot locations on the existing location, 
> /hbase/archive/data/default. ACL permissions for this location are not set 
> correctly and are reset at times by HBase.
> From a discussion with Cloudera:
>  deletion, snapshot drops, basically anything that would have caused HBase to 
> move files. Yes, it is periodically cleaned up, and files that don't belong 
> to a table being archived are targeted by that cleanup process*. This should 
> be considered an HBase internal repository and you shouldn't be putting 
> things in there and changing permissions. 
> *https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/backup/example/LongTermArchivingHFileCleaner.java
> Condensed e-mail exchange on this issue:
> 
> Subject: hive/test018: RE: Trafodion master Daily Test Result - 224 - Failure
> *** ERROR[8448] Unable to access Hbase interface. Call to 
> ExpHbaseInterface::scanOpen returned error HBASE_OPEN_ERROR(-704). Cause:
> > java.io.IOException: java.util.concurrent.ExecutionException: 
> > org.apache.hadoop.security.AccessControlException: Permission denied: 
> > user=TRAFODION, access=WRITE, 
> > inode="/hbase/archive/data/default":hbase:hbase:drwxr-xr-x:user:TRAFODION:rwx,group::r-x,default:user::rwx,default:user:TRAFODION:rwx,default:group::r-x,default:mask::rwx,default:other::r-x
> >  We tried to solve this issue with the installer but it can't be done. 
>  
> The installer can create /hbase/archive but not /hbase/archive/data/default. 
> Cloudera/HDFS/HDP goes and deletes these directories sometimes before we can 
> even add the correct permissions to them (the next command). I don't know 
> 'why' this happens... I think it has something to do with creating 3 levels 
> of empty directories but I am not sure. 
>  I tested this and wrote long emails about this a few months ago... I even 
> put in the changes "anyways" but it causes the installer to fail when the 
> directory is deleted before the next command (sometimes it deletes the 
> folders quickly, sometimes more slowly), so I had to take it out. 
>  I will go back to my original comment on this... who (trafodion, hive, 
> hdfs?) is using this directory? Is this a hard coded value in our code? 
> 
> Snapshot is supposed to be used by the hbase user alone because it was mostly 
> used for Admin purposes. In the Trafodion case, the snapshot is used as the 
> Trafodion user. This requires that the folders used by snapshot have read and 
> write permission for the Trafodion user.  Hence we use ACLs to provide access 
> to Trafodion. Alternatively, in the early days of Esgyn I suggested that the 
> Trafodion user belong to the hbase group and that the hbase group be given 
> read/write permissions. I believe it fell through the cracks. 
> 
> One thing to consider: for a customer who is concerned about security, will 
> it be acceptable to make the Trafodion ID belong to the HBase group? A 
> customer that has an HBase setup separate from Trafodion may not want to give 
> the Trafodion user more elevated privileges.
> 
> In that case, we need to make ACLs work somehow, otherwise we can get into 
> problems at the time of bulkloading or create index. A couple of times, I got 
> into a situation where I was not able to bring up hbase in lava @hp until I 
> changed hdfs to give write permission to everyone.  This issue needs to be 
> addressed and I hope it doesn’t fall through the cracks again.
> 
> Thanks for all the feedback. If this issue has been around for such a long 
> time, my question is why does it show up so infrequently? The tests today 
> have also failed and we do need to address this issue ASAP. But it doesn’t 
> fail all the time. 
> Were the ACLs set up manually on the build machines for that 3-level-deep 
> directory, and do those just stay around? Is this a new VM, and is that why 
> it’s showing up? 
> Amanda, is there a Cloudera dev contact who can explain this issue whom you 
> or CLR have already contacted? Or can you post the question in the usual 
> places you look for answers about CDH and HDP? 
> Roberta’s reply seems to indicate