[jira] [Commented] (HIVE-13901) Hivemetastore add partitions can be slow depending on filesystems

2016-08-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410492#comment-15410492
 ] 

Lefty Leverenz commented on HIVE-13901:
---

HIVE-14423 changes the default value of *hive.metastore.fshandler.threads* for 
release 2.2.0.
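
For anyone landing here before the wiki is updated: the property is set like any other metastore option in hive-site.xml. A minimal sketch (the value shown is an arbitrary illustration, not the release default, which this thread does not state):

```xml
<!-- hive-site.xml: sizes the metastore's filesystem-handler thread pool
     used when adding many partitions at once (e.g. msck repair).
     The value 20 below is only an example. -->
<property>
  <name>hive.metastore.fshandler.threads</name>
  <value>20</value>
</property>
```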

> Hivemetastore add partitions can be slow depending on filesystems
> -
>
> Key: HIVE-13901
> URL: https://issues.apache.org/jira/browse/HIVE-13901
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
>  Labels: TODOC2.1.1, TODOC2.2
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13901.1.patch, HIVE-13901.2.patch, 
> HIVE-13901.6.patch, HIVE-13901.7.patch, HIVE-13901.8.patch, HIVE-13901.9.patch
>
>
> Depending on the FS, creating external tables & adding partitions can be 
> expensive (e.g. msck, which adds all partitions).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14423) S3: Fetching partition sizes from FS can be expensive when stats are not available in metastore

2016-08-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410491#comment-15410491
 ] 

Lefty Leverenz commented on HIVE-14423:
---

Doc note:  This changes the default value of 
*hive.metastore.fshandler.threads*, which was introduced by HIVE-13901 (2.1.1 
and 2.2.0) but is not documented in the wiki yet.

Will this also be committed to branch-2.1 for release 2.1.1?

> S3: Fetching partition sizes from FS can be expensive when stats are not 
> available in metastore 
> 
>
> Key: HIVE-14423
> URL: https://issues.apache.org/jira/browse/HIVE-14423
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14423.1.patch, HIVE-14423.2.patch
>
>
> When partition stats are not available in the metastore, Hive tries to get the 
> file sizes from the FS, e.g.:
> {noformat}
> at org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:1487)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.getFileSizeForPartitions(StatsUtils.java:598)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:235)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:144)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:132)
> at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:126)
> at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> {noformat}
> This can be quite expensive on some filesystems like S3, especially when the 
> table is partitioned (e.g. TPC-DS store_sales, which has 1000s of partitions); a 
> query can spend 1000s of seconds just waiting for this information to be pulled in.
> Also, it would be good to remove the FS.getContentSummary usage for finding 
> file sizes.
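The report above boils down to fetching per-partition sizes one at a time. A rough standalone illustration of the thread-pool alternative, using java.nio on a local filesystem instead of Hadoop's FileSystem API (illustrative names only; this is not the patch's actual code):

```java
import java.nio.file.*;
import java.util.*;
import java.util.concurrent.*;

public class ParallelPartitionSizes {
    // Sum the file sizes under each partition directory in parallel,
    // instead of issuing one blocking size lookup per partition.
    static Map<Path, Long> sizes(List<Path> partitions, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<Path, Future<Long>> futures = new LinkedHashMap<>();
            for (Path p : partitions) {
                futures.put(p, pool.submit(() -> {
                    long total = 0;
                    try (DirectoryStream<Path> ds = Files.newDirectoryStream(p)) {
                        for (Path f : ds) {
                            if (Files.isRegularFile(f)) total += Files.size(f);
                        }
                    }
                    return total;
                }));
            }
            Map<Path, Long> out = new LinkedHashMap<>();
            for (Map.Entry<Path, Future<Long>> e : futures.entrySet()) {
                out.put(e.getKey(), e.getValue().get());
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Tiny demo partition with two files of 10 and 5 bytes.
        Path base = Files.createTempDirectory("parts");
        Path p0 = Files.createDirectory(base.resolve("ds=1"));
        Files.write(p0.resolve("f0"), new byte[10]);
        Files.write(p0.resolve("f1"), new byte[5]);
        Map<Path, Long> s = sizes(Collections.singletonList(p0), 4);
        System.out.println(s.get(p0)); // 15
    }
}
```

On a high-latency store such as S3, issuing these lookups concurrently is what turns thousands of sequential round trips into a bounded number of parallel ones.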



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14440) Fix default value of USE_DEPRECATED_CLI in cli.cmd

2016-08-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410478#comment-15410478
 ] 

Hive QA commented on HIVE-14440:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12822340/HIVE-14440.01.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/792/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/792/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-792/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-792/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at c194577 HIVE-14423 : S3: Fetching partition sizes from FS can be 
expensive when stats are not available in metastore (Rajesh Balamohan via Chris 
Nauroth, Ashutosh Chauhan)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at c194577 HIVE-14423 : S3: Fetching partition sizes from FS can be 
expensive when stats are not available in metastore (Rajesh Balamohan via Chris 
Nauroth, Ashutosh Chauhan)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12822340 - PreCommit-HIVE-MASTER-Build

> Fix default value of USE_DEPRECATED_CLI in cli.cmd
> --
>
> Key: HIVE-14440
> URL: https://issues.apache.org/jira/browse/HIVE-14440
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-14440.01.patch
>
>
> The cli.cmd script sets the default value of USE_DEPRECATED_CLI to false when it 
> is not set, unlike cli.sh, which defaults it to true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14439) LlapTaskScheduler should try scheduling tasks when a node is disabled

2016-08-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410470#comment-15410470
 ] 

Hive QA commented on HIVE-14439:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12822334/HIVE-14439.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10440 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForcedLocalityMultiplePreemptionsSameHost2
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/791/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/791/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-791/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12822334 - PreCommit-HIVE-MASTER-Build

> LlapTaskScheduler should try scheduling tasks when a node is disabled
> -
>
> Key: HIVE-14439
> URL: https://issues.apache.org/jira/browse/HIVE-14439
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14439.01.patch, HIVE-14439.02.patch
>
>
> When a node is disabled - try scheduling pending tasks. Tasks which may have 
> been waiting for the node to become available could become candidates for 
> scheduling on alternate nodes depending on the locality delay and disable 
> duration.
> This is what is causing an occasional timeout in 
> testDelayedLocalityNodeCommErrorImmediateAllocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14423) S3: Fetching partition sizes from FS can be expensive when stats are not available in metastore

2016-08-05 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14423:
--
Labels: TODOC2.2  (was: )

> S3: Fetching partition sizes from FS can be expensive when stats are not 
> available in metastore 
> 
>
> Key: HIVE-14423
> URL: https://issues.apache.org/jira/browse/HIVE-14423
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14423.1.patch, HIVE-14423.2.patch
>
>
> When partition stats are not available in the metastore, Hive tries to get the 
> file sizes from the FS, e.g.:
> {noformat}
> at org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:1487)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.getFileSizeForPartitions(StatsUtils.java:598)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:235)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:144)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:132)
> at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:126)
> at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> {noformat}
> This can be quite expensive on some filesystems like S3, especially when the 
> table is partitioned (e.g. TPC-DS store_sales, which has 1000s of partitions); a 
> query can spend 1000s of seconds just waiting for this information to be pulled in.
> Also, it would be good to remove the FS.getContentSummary usage for finding 
> file sizes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14421) FS.deleteOnExit holds references to _tmp_space.db files

2016-08-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410401#comment-15410401
 ] 

Hive QA commented on HIVE-14421:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12822327/HIVE-14421.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10440 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForcedLocalityMultiplePreemptionsSameHost2
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/790/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/790/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-790/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12822327 - PreCommit-HIVE-MASTER-Build

> FS.deleteOnExit holds references to _tmp_space.db files
> ---
>
> Key: HIVE-14421
> URL: https://issues.apache.org/jira/browse/HIVE-14421
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14421.01.patch, HIVE-14421.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14447) Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator

2016-08-05 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410394#comment-15410394
 ] 

Matt McCline commented on HIVE-14447:
-

+1 LGTM

> Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator
> ---
>
> Key: HIVE-14447
> URL: https://issues.apache.org/jira/browse/HIVE-14447
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Wei Zheng
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14447.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers

2016-08-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14453:

Status: Patch Available  (was: Open)

> refactor physical writing of ORC data and metadata to FS from the logical 
> writers
> -
>
> Key: HIVE-14453
> URL: https://issues.apache.org/jira/browse/HIVE-14453
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14453.patch
>
>
> ORC data doesn't have to go directly into an HDFS stream via buffers, it can 
> go somewhere else (e.g. a write-thru cache, or an addressable system that 
> doesn't require the stream blocks to be held in memory before writing them 
> all together).
> To that effect, it would be nice to abstract the data block/metadata 
> structure creation from the physical file concerns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers

2016-08-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14453:

Description: 
ORC data doesn't have to go directly into an HDFS stream via buffers, it can go 
somewhere else (e.g. a write-thru cache, or an addressable system that doesn't 
require the stream blocks to be held in memory before writing them all 
together).
To that effect, it would be nice to abstract the data block/metadata structure 
creation from the physical file concerns.

> refactor physical writing of ORC data and metadata to FS from the logical 
> writers
> -
>
> Key: HIVE-14453
> URL: https://issues.apache.org/jira/browse/HIVE-14453
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14453.patch
>
>
> ORC data doesn't have to go directly into an HDFS stream via buffers, it can 
> go somewhere else (e.g. a write-thru cache, or an addressable system that 
> doesn't require the stream blocks to be held in memory before writing them 
> all together).
> To that effect, it would be nice to abstract the data block/metadata 
> structure creation from the physical file concerns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14453) refactor physical writing of ORC data and metadata to FS from the logical writers

2016-08-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14453:

Attachment: HIVE-14453.patch

The patch; most of it is just moving code.
The PhysicalWriter boundary is at data blocks and metadata protobuf objects. It 
modifies protobuf objects only with regard to physical information such as sizes.
Some tests pass locally; let's see what HiveQA thinks... [~prasanth_j], can you 
take a look?
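As a rough sketch of what such a boundary could look like (names and signatures here are hypothetical illustrations, not the patch's actual API): the logical writer hands finished data blocks to a sink and never touches the output stream itself, so the sink can be an HDFS stream, a write-through cache, or an in-memory buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class PhysicalWriterSketch {
    // Hypothetical boundary: the logical ORC writer only produces blocks;
    // where the bytes physically go is the sink's concern.
    interface PhysicalSink {
        void writeBlock(byte[] block, int offset, int length) throws IOException;
        long bytesWritten();
    }

    // One possible sink: keep everything in memory (e.g. for a cache).
    static final class InMemorySink implements PhysicalSink {
        private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
        public void writeBlock(byte[] block, int offset, int length) {
            buf.write(block, offset, length);
        }
        public long bytesWritten() { return buf.size(); }
    }

    public static void main(String[] args) throws IOException {
        PhysicalSink sink = new InMemorySink();
        byte[] block = new byte[]{1, 2, 3, 4};
        sink.writeBlock(block, 0, block.length);
        System.out.println(sink.bytesWritten()); // 4
    }
}
```

Swapping `InMemorySink` for a streaming implementation changes nothing above the interface, which is the point of separating structure creation from file concerns.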

> refactor physical writing of ORC data and metadata to FS from the logical 
> writers
> -
>
> Key: HIVE-14453
> URL: https://issues.apache.org/jira/browse/HIVE-14453
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14453.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14452) Vectorization: BinarySortableDeserializeRead should delegate buffer copies to VectorDeserializeRow

2016-08-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14452:
---
Description: 
Since the VectorDeserializeRow already calls a setVal(), the copy inside the 
lower layer is entirely wasteful.

{code}
BinarySortableSerDe.deserializeText(
  inputByteBuffer, columnSortOrderIsDesc[fieldIndex], tempText);
{code}

With HIVE-14451, the copies can be avoided for some scenarios and retained for 
others.

  was:
Since the VectorDeserializeRow already calls a setVal(), the copy inside the 
lower layer is entirely wasteful.

With HIVE-14451, the copies can be avoided for some scenarios and retained for 
others.


> Vectorization: BinarySortableDeserializeRead should delegate buffer copies to 
> VectorDeserializeRow
> --
>
> Key: HIVE-14452
> URL: https://issues.apache.org/jira/browse/HIVE-14452
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Reporter: Gopal V
>
> Since the VectorDeserializeRow already calls a setVal(), the copy inside the 
> lower layer is entirely wasteful.
> {code}
> BinarySortableSerDe.deserializeText(
>   inputByteBuffer, columnSortOrderIsDesc[fieldIndex], tempText);
> {code}
> With HIVE-14451, the copies can be avoided for some scenarios and retained 
> for others.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14450) Vectorization: StringExpr::truncate() can assume 1 byte per-char minimum

2016-08-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14450:
---
Affects Version/s: 2.2.0

> Vectorization: StringExpr::truncate() can assume 1 byte per-char minimum
> 
>
> Key: HIVE-14450
> URL: https://issues.apache.org/jira/browse/HIVE-14450
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>
> {code}
> public static int truncate(byte[] bytes, int start, int length, int 
> maxLength) {
> int end = start + length;
> // count characters forward
> int j = start;
> int charCount = 0;
> while(j < end) {
>   // UTF-8 continuation bytes have 2 high bits equal to 0x80.
>   if ((bytes[j] & 0xc0) != 0x80) {
> if (charCount == maxLength) {
>   break;
> }
> ++charCount;
>   }
>   j++;
> }
> return (j - start);
>   }
> {code}
> Should not read the bytes if the maxLength is 4096 and the input string has 
> 256 bytes.
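A minimal standalone sketch of the suggested early exit (assumed shape; the real method lives in Hive's vectorized string expressions): since every UTF-8 character occupies at least one byte, a buffer of `length` bytes holds at most `length` characters, so when `length <= maxLength` the scan can be skipped entirely.

```java
public class Utf8Truncate {
    static int truncate(byte[] bytes, int start, int length, int maxLength) {
        // Fast path: at 1 byte per char minimum, this buffer cannot
        // contain more than maxLength characters, so nothing to cut.
        if (length <= maxLength) {
            return length;
        }
        // Otherwise count characters forward as in the original:
        // UTF-8 continuation bytes have their 2 high bits equal to 0x80.
        int end = start + length;
        int j = start;
        int charCount = 0;
        while (j < end) {
            if ((bytes[j] & 0xc0) != 0x80) {
                if (charCount == maxLength) break;
                ++charCount;
            }
            j++;
        }
        return j - start;
    }

    public static void main(String[] args) {
        byte[] ascii = "hello".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println(truncate(ascii, 0, ascii.length, 4096)); // 5 (fast path)
        System.out.println(truncate(ascii, 0, ascii.length, 3));    // 3
    }
}
```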



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14450) Vectorization: StringExpr::truncate() can assume 1 byte per-char minimum

2016-08-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14450:
---
Description: 
{code}
public static int truncate(byte[] bytes, int start, int length, int maxLength) {
int end = start + length;

// count characters forward
int j = start;
int charCount = 0;
while(j < end) {
  // UTF-8 continuation bytes have 2 high bits equal to 0x80.
  if ((bytes[j] & 0xc0) != 0x80) {
if (charCount == maxLength) {
  break;
}
++charCount;
  }
  j++;
}
return (j - start);
  }
{code}

Should not read the bytes if the maxLength is 4096 and the input string has 256 
bytes.

  was:
{code}
public static int truncate(byte[] bytes, int start, int length, int maxLength) {
int end = start + length;

// count characters forward
int j = start;
int charCount = 0;
while(j < end) {
  // UTF-8 continuation bytes have 2 high bits equal to 0x80.
  if ((bytes[j] & 0xc0) != 0x80) {
if (charCount == maxLength) {
  break;
}
++charCount;
  }
  j++;
}
return (j - start);
  }
{code}

Should not dirty the L1 cache if the maxLength is 4096 and the input string has 
256 bytes.


> Vectorization: StringExpr::truncate() can assume 1 byte per-char minimum
> 
>
> Key: HIVE-14450
> URL: https://issues.apache.org/jira/browse/HIVE-14450
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>
> {code}
> public static int truncate(byte[] bytes, int start, int length, int 
> maxLength) {
> int end = start + length;
> // count characters forward
> int j = start;
> int charCount = 0;
> while(j < end) {
>   // UTF-8 continuation bytes have 2 high bits equal to 0x80.
>   if ((bytes[j] & 0xc0) != 0x80) {
> if (charCount == maxLength) {
>   break;
> }
> ++charCount;
>   }
>   j++;
> }
> return (j - start);
>   }
> {code}
> Should not read the bytes if the maxLength is 4096 and the input string has 
> 256 bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14450) Vectorization: StringExpr::truncate() can assume 1 byte per-char minimum

2016-08-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14450:
---
Component/s: Vectorization

> Vectorization: StringExpr::truncate() can assume 1 byte per-char minimum
> 
>
> Key: HIVE-14450
> URL: https://issues.apache.org/jira/browse/HIVE-14450
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>
> {code}
> public static int truncate(byte[] bytes, int start, int length, int 
> maxLength) {
> int end = start + length;
> // count characters forward
> int j = start;
> int charCount = 0;
> while(j < end) {
>   // UTF-8 continuation bytes have 2 high bits equal to 0x80.
>   if ((bytes[j] & 0xc0) != 0x80) {
> if (charCount == maxLength) {
>   break;
> }
> ++charCount;
>   }
>   j++;
> }
> return (j - start);
>   }
> {code}
> Should not read the bytes if the maxLength is 4096 and the input string has 
> 256 bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings

2016-08-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410354#comment-15410354
 ] 

Ashutosh Chauhan commented on HIVE-14418:
-

+1 code changes LGTM.. please commit once you test it out : )

> Hive config validation prevents unsetting the settings
> --
>
> Key: HIVE-14418
> URL: https://issues.apache.org/jira/browse/HIVE-14418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14418.patch
>
>
> {noformat}
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=null;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> {noformat}
> unset also doesn't work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14423) S3: Fetching partition sizes from FS can be expensive when stats are not available in metastore

2016-08-05 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14423:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks, Rajesh!

> S3: Fetching partition sizes from FS can be expensive when stats are not 
> available in metastore 
> 
>
> Key: HIVE-14423
> URL: https://issues.apache.org/jira/browse/HIVE-14423
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Fix For: 2.2.0
>
> Attachments: HIVE-14423.1.patch, HIVE-14423.2.patch
>
>
> When partition stats are not available in the metastore, Hive tries to get the 
> file sizes from the FS, e.g.:
> {noformat}
> at org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:1487)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.getFileSizeForPartitions(StatsUtils.java:598)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:235)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:144)
> at org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:132)
> at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:126)
> at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> {noformat}
> This can be quite expensive on some filesystems like S3, especially when the 
> table is partitioned (e.g. TPC-DS store_sales, which has 1000s of partitions); a 
> query can spend 1000s of seconds just waiting for this information to be pulled in.
> Also, it would be good to remove the FS.getContentSummary usage for finding 
> file sizes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14448) Queries with predicate fail when ETL split strategy is chosen for ACID tables

2016-08-05 Thread Saket Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14448:
-
Description: 
When ETL split strategy is applied to ACID tables with predicate pushdown (SARG 
enabled), split generation fails for ACID. This bug will be usually exposed 
when working with data at scale, because in most otherwise cases only BI split 
strategy is chosen. My guess is that this is happening because the correct 
readerSchema is not being picked up when we try to extract SARG column names.

Quickest way to reproduce is to add the following unit test to 
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java

{code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid}
 @Test
  public void testETLSplitStrategyForACID() throws Exception {
hiveConf.setVar(HiveConf.ConfVars.HIVE_ORC_SPLIT_STRATEGY, "ETL");
hiveConf.setBoolVar(HiveConf.ConfVars.HIVEOPTINDEXFILTER, true);
runStatementOnDriver("insert into " + Table.ACIDTBL + " values(1,2)");
runStatementOnDriver("alter table " + Table.ACIDTBL + " compact 'MAJOR'");
runWorker(hiveConf);
List<String> rs = runStatementOnDriver("select * from " + Table.ACIDTBL + " where a = 1");
int[][] resultData = new int[][] {{1,2}};
Assert.assertEquals(stringifyValues(resultData), rs);
  }
{code}

Back-trace for this failed test is as follows:
{code}
exec.Task: Job Submission failed with exception 'java.lang.RuntimeException(ORC split generation failed with exception: java.lang.NegativeArraySizeException)'
java.lang.RuntimeException: ORC split generation failed with exception: java.lang.NegativeArraySizeException
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1570)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1656)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:370)
at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:488)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:329)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:321)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:197)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1297)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1294)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1294)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:417)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:141)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1962)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1653)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1389)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1131)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1119)
at org.apache.hadoop.hive.ql.TestTxnCommands2.runStatementOnDriver(TestTxnCommands2.java:1292)
at org.apache.hadoop.hive.ql.TestTxnCommands2.testETLSplitStrategyForACID(TestTxnCommands2.java:280)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
{code}

[jira] [Updated] (HIVE-12181) Change hive.stats.fetch.column.stats value to true for MiniTezCliDriver

2016-08-05 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12181:

Attachment: HIVE-12181.9.patch

> Change hive.stats.fetch.column.stats value to true for MiniTezCliDriver
> ---
>
> Key: HIVE-12181
> URL: https://issues.apache.org/jira/browse/HIVE-12181
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12181.1.patch, HIVE-12181.2.patch, 
> HIVE-12181.3.patch, HIVE-12181.4.patch, HIVE-12181.7.patch, 
> HIVE-12181.8.patch, HIVE-12181.9.patch, HIVE-12181.patch, HIVE-12181.patch
>
>
> There was a performance concern earlier, but HIVE-7587 has fixed that. We can 
> change the default to true now.
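
For reference, the setting this patch flips can also be overridden per session; this snippet is an illustration, not part of the patch itself:

```sql
-- Per-session override of the default being changed by HIVE-12181;
-- before this patch the default was false.
SET hive.stats.fetch.column.stats=true;
```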



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12181) Change hive.stats.fetch.column.stats value to true for MiniTezCliDriver

2016-08-05 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12181:

Status: Patch Available  (was: Open)

> Change hive.stats.fetch.column.stats value to true for MiniTezCliDriver
> ---
>
> Key: HIVE-12181
> URL: https://issues.apache.org/jira/browse/HIVE-12181
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12181.1.patch, HIVE-12181.2.patch, 
> HIVE-12181.3.patch, HIVE-12181.4.patch, HIVE-12181.7.patch, 
> HIVE-12181.8.patch, HIVE-12181.9.patch, HIVE-12181.patch, HIVE-12181.patch
>
>
> There was a performance concern earlier, but HIVE-7587 has fixed that. We can 
> change the default to true now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12181) Change hive.stats.fetch.column.stats value to true for MiniTezCliDriver

2016-08-05 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-12181:

Status: Open  (was: Patch Available)

> Change hive.stats.fetch.column.stats value to true for MiniTezCliDriver
> ---
>
> Key: HIVE-12181
> URL: https://issues.apache.org/jira/browse/HIVE-12181
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-12181.1.patch, HIVE-12181.2.patch, 
> HIVE-12181.3.patch, HIVE-12181.4.patch, HIVE-12181.7.patch, 
> HIVE-12181.8.patch, HIVE-12181.patch, HIVE-12181.patch
>
>
> There was a performance concern earlier, but HIVE-7587 has fixed that. We can 
> change the default to true now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14415) Upgrade qtest execution framework to junit4 - TestPerfCliDriver

2016-08-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410341#comment-15410341
 ] 

Ashutosh Chauhan commented on HIVE-14415:
-

Alright. We can check in this patch as well, since it improves at least one CLI 
driver (the qfile being temporarily broken for PerfCliDriver is OK), but if you 
are close to making it work for a few others, we can wait as well. Up to you; let 
me know.

> Upgrade qtest execution framework to junit4 - TestPerfCliDriver
> ---
>
> Key: HIVE-14415
> URL: https://issues.apache.org/jira/browse/HIVE-14415
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14415.1.patch, HIVE-14415.2.patch
>
>
> I would like to upgrade the current maven+ant+velocimacro+junit4 qtest 
> generation framework to use only junit4 - while (trying) to keep 
> all the existing features it provides.
> What I can't really do with the current one: execute easily a single qtests 
> from an IDE (as a matter of fact I can...but it's way too complicated; after 
> this it won't be a cake-walk either...but it will be a step closer ;)
> I think this change will make it more clear how these tests are configured 
> and executed.
> I will do this in two phases, currently i will only change 
> {{TestPerfCliDriver}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14415) Upgrade qtest execution framework to junit4 - TestPerfCliDriver

2016-08-05 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410335#comment-15410335
 ] 

Zoltan Haindrich commented on HIVE-14415:
-

oh... I'm a bit tired... in this patch it only doesn't work on TestPerfCliDriver ;)
I'm working on the other ones - I forgot to pass the qfile variable from 
Maven to the Java process.

When I started working on this, I was not sure whether the ptest executors would 
welcome this change or not... the patch proved that it causes no problems.

Anyway... I think I will drop this ticket and add these changes to the other 
one... at least half of these changes have become outdated.

> Upgrade qtest execution framework to junit4 - TestPerfCliDriver
> ---
>
> Key: HIVE-14415
> URL: https://issues.apache.org/jira/browse/HIVE-14415
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14415.1.patch, HIVE-14415.2.patch
>
>
> I would like to upgrade the current maven+ant+velocimacro+junit4 qtest 
> generation framework to use only junit4 - while (trying) to keep 
> all the existing features it provides.
> What I can't really do with the current one: execute easily a single qtests 
> from an IDE (as a matter of fact I can...but it's way too complicated; after 
> this it won't be a cake-walk either...but it will be a step closer ;)
> I think this change will make it more clear how these tests are configured 
> and executed.
> I will do this in two phases, currently i will only change 
> {{TestPerfCliDriver}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator

2016-08-05 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-14378:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master.

> Data size may be estimated as 0 if no columns are being projected after an 
> operator
> ---
>
> Key: HIVE-14378
> URL: https://issues.apache.org/jira/browse/HIVE-14378
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Fix For: 2.2.0
>
> Attachments: HIVE-14378.2.patch, HIVE-14378.3.patch, 
> HIVE-14378.3.patch, HIVE-14378.4.patch, HIVE-14378.patch
>
>
> in those cases we still emit rows.. but they may not have any columns within 
> it.  We shouldn't estimate 0 data size in such cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10264) Document Replication support on wiki

2016-08-05 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410332#comment-15410332
 ] 

Sushanth Sowmyan commented on HIVE-10264:
-

Filing new jira for admin/user/programmer-facing doc.

> Document Replication support on wiki
> 
>
> Key: HIVE-10264
> URL: https://issues.apache.org/jira/browse/HIVE-10264
> Project: Hive
>  Issue Type: Sub-task
>  Components: Import/Export
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Shannon Ladymon
>  Labels: TODOC1.2
> Attachments: BirdsAndBees.pdf, EXIMReplicationReplayProtocol.pdf, 
> apache_hivedr.0.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-10264) Document Replication support on wiki

2016-08-05 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410332#comment-15410332
 ] 

Sushanth Sowmyan edited comment on HIVE-10264 at 8/6/16 12:34 AM:
--

Filing new jira for admin/user/programmer-facing doc : 
https://issues.apache.org/jira/browse/HIVE-14449


was (Author: sushanth):
Filing new jira for admin/user/programmer-facing doc.

> Document Replication support on wiki
> 
>
> Key: HIVE-10264
> URL: https://issues.apache.org/jira/browse/HIVE-10264
> Project: Hive
>  Issue Type: Sub-task
>  Components: Import/Export
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Shannon Ladymon
>  Labels: TODOC1.2
> Attachments: BirdsAndBees.pdf, EXIMReplicationReplayProtocol.pdf, 
> apache_hivedr.0.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10264) Document Replication support on wiki

2016-08-05 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410325#comment-15410325
 ] 

Sushanth Sowmyan commented on HIVE-10264:
-

Oh, and [~teabot], apologies for missing you! Thank you for your work on 
https://cwiki.apache.org/confluence/display/Hive/Replication - that's very 
useful. We might want to spin that off as a more admin/programmer-facing doc (as 
it already is that way) and expand on it further - I'll create a new jira for 
that.

> Document Replication support on wiki
> 
>
> Key: HIVE-10264
> URL: https://issues.apache.org/jira/browse/HIVE-10264
> Project: Hive
>  Issue Type: Sub-task
>  Components: Import/Export
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Shannon Ladymon
>  Labels: TODOC1.2
> Attachments: BirdsAndBees.pdf, EXIMReplicationReplayProtocol.pdf, 
> apache_hivedr.0.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14415) Upgrade qtest execution framework to junit4 - TestPerfCliDriver

2016-08-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410321#comment-15410321
 ] 

Ashutosh Chauhan edited comment on HIVE-14415 at 8/6/16 12:28 AM:
--

I didn't follow you. How is it broken, and why didn't the ptests fail, since 
they do use it?


was (Author: ashutoshc):
Didn't follow you. How is it broken and why ptests failed since they do use it?

> Upgrade qtest execution framework to junit4 - TestPerfCliDriver
> ---
>
> Key: HIVE-14415
> URL: https://issues.apache.org/jira/browse/HIVE-14415
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14415.1.patch, HIVE-14415.2.patch
>
>
> I would like to upgrade the current maven+ant+velocimacro+junit4 qtest 
> generation framework to use only junit4 - while (trying) to keep 
> all the existing features it provides.
> What I can't really do with the current one: execute easily a single qtests 
> from an IDE (as a matter of fact I can...but it's way too complicated; after 
> this it won't be a cake-walk either...but it will be a step closer ;)
> I think this change will make it more clear how these tests are configured 
> and executed.
> I will do this in two phases, currently i will only change 
> {{TestPerfCliDriver}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14415) Upgrade qtest execution framework to junit4 - TestPerfCliDriver

2016-08-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410321#comment-15410321
 ] 

Ashutosh Chauhan commented on HIVE-14415:
-

Didn't follow you. How is it broken and why ptests failed since they do use it?

> Upgrade qtest execution framework to junit4 - TestPerfCliDriver
> ---
>
> Key: HIVE-14415
> URL: https://issues.apache.org/jira/browse/HIVE-14415
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14415.1.patch, HIVE-14415.2.patch
>
>
> I would like to upgrade the current maven+ant+velocimacro+junit4 qtest 
> generation framework to use only junit4 - while (trying) to keep 
> all the existing features it provides.
> What I can't really do with the current one: execute easily a single qtests 
> from an IDE (as a matter of fact I can...but it's way too complicated; after 
> this it won't be a cake-walk either...but it will be a step closer ;)
> I think this change will make it more clear how these tests are configured 
> and executed.
> I will do this in two phases, currently i will only change 
> {{TestPerfCliDriver}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14448) Queries with predicate fail when ETL split strategy is chosen for ACID tables

2016-08-05 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-14448:
--
Target Version/s: 2.2.0

> Queries with predicate fail when ETL split strategy is chosen for ACID tables
> -
>
> Key: HIVE-14448
> URL: https://issues.apache.org/jira/browse/HIVE-14448
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Saket Saurabh
>
> When ETL split strategy is applied to ACID tables with predicate pushdown 
> (SARG enabled), split generation fails for ACID. This bug will usually be 
> exposed when working with data at scale, because in most other cases only the 
> BI split strategy is chosen. My guess is that this is happening because the 
> correct readerSchema is not being picked up when we try to extract SARG 
> column names.
> Quickest way to reproduce is to add the following unit test to 
> ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java
> {code:title=ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java|borderStyle=solid}
>  @Test
>   public void testETLSplitStrategyForACID() throws Exception {
> hiveConf.setVar(HiveConf.ConfVars.HIVE_ORC_SPLIT_STRATEGY, "ETL");
> hiveConf.setBoolVar(HiveConf.ConfVars.HIVEOPTINDEXFILTER, true);
> runStatementOnDriver("insert into " + Table.ACIDTBL + " values(1,2)");
> runStatementOnDriver("alter table " + Table.ACIDTBL + " compact 'MAJOR'");
> runWorker(hiveConf);
> List<String> rs = runStatementOnDriver("select * from " + Table.ACIDTBL 
> + " where a = 1");
> int[][] resultData = new int[][] {{1,2}};
> Assert.assertEquals(stringifyValues(resultData), rs);
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14415) Upgrade qtest execution framework to junit4 - TestPerfCliDriver

2016-08-05 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410319#comment-15410319
 ] 

Zoltan Haindrich commented on HIVE-14415:
-

[~ashutoshc] I think this patch has a major issue: I'm afraid {{-Dqfile}} will 
be broken with it... I think it would be best to add all these changes, too, to 
HIVE-1

The ptest executor had no problem with this change... so I think it will have 
no problems with a more drastic change either. 


> Upgrade qtest execution framework to junit4 - TestPerfCliDriver
> ---
>
> Key: HIVE-14415
> URL: https://issues.apache.org/jira/browse/HIVE-14415
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14415.1.patch, HIVE-14415.2.patch
>
>
> I would like to upgrade the current maven+ant+velocimacro+junit4 qtest 
> generation framework to use only junit4 - while (trying) to keep 
> all the existing features it provides.
> What I can't really do with the current one: execute easily a single qtests 
> from an IDE (as a matter of fact I can...but it's way too complicated; after 
> this it won't be a cake-walk either...but it will be a step closer ;)
> I think this change will make it more clear how these tests are configured 
> and executed.
> I will do this in two phases, currently i will only change 
> {{TestPerfCliDriver}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10264) Document Replication support on wiki

2016-08-05 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410317#comment-15410317
 ] 

Sushanth Sowmyan commented on HIVE-10264:
-

Thanks, [~sladymon] - your doc is definitely much better organized and detailed 
than my original attempts, and it's great to finally be able to resolve 
this jira. :)

> Document Replication support on wiki
> 
>
> Key: HIVE-10264
> URL: https://issues.apache.org/jira/browse/HIVE-10264
> Project: Hive
>  Issue Type: Sub-task
>  Components: Import/Export
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Shannon Ladymon
>  Labels: TODOC1.2
> Attachments: BirdsAndBees.pdf, EXIMReplicationReplayProtocol.pdf, 
> apache_hivedr.0.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10264) Document Replication support on wiki

2016-08-05 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan resolved HIVE-10264.
-
Resolution: Fixed

> Document Replication support on wiki
> 
>
> Key: HIVE-10264
> URL: https://issues.apache.org/jira/browse/HIVE-10264
> Project: Hive
>  Issue Type: Sub-task
>  Components: Import/Export
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Shannon Ladymon
>  Labels: TODOC1.2
> Attachments: BirdsAndBees.pdf, EXIMReplicationReplayProtocol.pdf, 
> apache_hivedr.0.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7973) Hive Replication Support

2016-08-05 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-7973:
---
Assignee: Shannon Ladymon  (was: Sushanth Sowmyan)

> Hive Replication Support
> 
>
> Key: HIVE-7973
> URL: https://issues.apache.org/jira/browse/HIVE-7973
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Reporter: Sushanth Sowmyan
>Assignee: Shannon Ladymon
>
> A need for replication is a common one in many database management systems, 
> and it's important for hive to evolve support for such a tool as part of its 
> ecosystem. Hive already supports an EXPORT and IMPORT command, which can be 
> used to dump out tables, distcp them to another cluster, and 
> import/create from that. If we had a mechanism by which exports and imports 
> could be automated, it establishes the base with which replication can be 
> developed.
> One place where this kind of automation can be developed is with aid of the 
> HiveMetaStoreEventHandler mechanisms, to generate notifications when certain 
> changes are committed to the metastore, and then translate those 
> notifications to export actions, distcp actions, and import actions on 
> another cluster.
> Part of that already exists is with the Notification system that is part of 
> hcatalog-server-extensions. Initially, this was developed to be able to 
> trigger a JMS notification, which an Oozie workflow can use to start off 
> actions keyed on the finishing of a job that used HCatalog to write to a 
> table. While this currently lives under hcatalog, the primary reason for its 
> existence has a scope well past hcatalog alone, and can be used as-is without 
> the use of HCatalog IF/OF. This can be extended, with the help of a library 
> which does that aforementioned translation. I also think that these sections 
> should live in a core hive module, rather than being tucked away inside 
> hcatalog.
> Once we have rudimentary support for table & partition replication, we can 
> then move on to further requirements of replication, such as metadata 
> replications (such as replication of changes to roles/etc), and/or optimize 
> away the requirement to distcp and use webhdfs instead, etc.
> This Story tracks all the bits that go into development of such a system - 
> I'll create multiple smaller tasks inside this as we go on.
> Please also see HIVE-10264 for documentation-related links for this, and 
> https://cwiki.apache.org/confluence/display/Hive/HiveReplicationDevelopment 
> for associated wiki (currently in progress)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10264) Document Replication support on wiki

2016-08-05 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-10264:

Assignee: Shannon Ladymon  (was: Sushanth Sowmyan)

> Document Replication support on wiki
> 
>
> Key: HIVE-10264
> URL: https://issues.apache.org/jira/browse/HIVE-10264
> Project: Hive
>  Issue Type: Sub-task
>  Components: Import/Export
>Affects Versions: 1.2.0
>Reporter: Sushanth Sowmyan
>Assignee: Shannon Ladymon
>  Labels: TODOC1.2
> Attachments: BirdsAndBees.pdf, EXIMReplicationReplayProtocol.pdf, 
> apache_hivedr.0.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-7973) Hive Replication Support

2016-08-05 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan reassigned HIVE-7973:
--

Assignee: Sushanth Sowmyan  (was: Shannon Ladymon)

> Hive Replication Support
> 
>
> Key: HIVE-7973
> URL: https://issues.apache.org/jira/browse/HIVE-7973
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
>
> A need for replication is a common one in many database management systems, 
> and it's important for hive to evolve support for such a tool as part of its 
> ecosystem. Hive already supports an EXPORT and IMPORT command, which can be 
> used to dump out tables, distcp them to another cluster, and 
> import/create from that. If we had a mechanism by which exports and imports 
> could be automated, it establishes the base with which replication can be 
> developed.
> One place where this kind of automation can be developed is with aid of the 
> HiveMetaStoreEventHandler mechanisms, to generate notifications when certain 
> changes are committed to the metastore, and then translate those 
> notifications to export actions, distcp actions, and import actions on 
> another cluster.
> Part of that already exists is with the Notification system that is part of 
> hcatalog-server-extensions. Initially, this was developed to be able to 
> trigger a JMS notification, which an Oozie workflow can use to start off 
> actions keyed on the finishing of a job that used HCatalog to write to a 
> table. While this currently lives under hcatalog, the primary reason for its 
> existence has a scope well past hcatalog alone, and can be used as-is without 
> the use of HCatalog IF/OF. This can be extended, with the help of a library 
> which does that aforementioned translation. I also think that these sections 
> should live in a core hive module, rather than being tucked away inside 
> hcatalog.
> Once we have rudimentary support for table & partition replication, we can 
> then move on to further requirements of replication, such as metadata 
> replications (such as replication of changes to roles/etc), and/or optimize 
> away the requirement to distcp and use webhdfs instead, etc.
> This Story tracks all the bits that go into development of such a system - 
> I'll create multiple smaller tasks inside this as we go on.
> Please also see HIVE-10264 for documentation-related links for this, and 
> https://cwiki.apache.org/confluence/display/Hive/HiveReplicationDevelopment 
> for associated wiki (currently in progress)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13403) Make Streaming API not create empty buckets (at least as an option)

2016-08-05 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13403:
--
Target Version/s: 2.2.0  (was: 1.3.0, 2.2.0)

> Make Streaming API not create empty buckets (at least as an option)
> ---
>
> Key: HIVE-13403
> URL: https://issues.apache.org/jira/browse/HIVE-13403
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> As of HIVE-11983, when a TransactionBatch is opened in the Streaming API, a 
> full complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) 
> is created on disk even though some may end up receiving no data.
> It would be better to create them on demand and not clog the FS.
> Tez can handle missing (empty) buckets, and on MR, bucket join algorithms will 
> check if all buckets are there and bail out if not.  
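
The on-demand creation described above can be sketched outside Hive. The class 
and method names below are hypothetical, not the actual AbstractRecordWriter 
API - a minimal illustration of creating a per-bucket writer only when the 
first record for that bucket arrives:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: create a writer for a bucket only on first use,
// instead of eagerly creating one file per bucket when the batch opens.
public class LazyBucketWriters {
    // bucket id -> in-memory stand-in for a bucket file; populated lazily
    private final Map<Integer, StringBuilder> writers = new HashMap<>();

    public void write(int bucket, String record) {
        // computeIfAbsent creates the "file" only when this bucket
        // actually receives data, so empty buckets never hit the FS
        writers.computeIfAbsent(bucket, b -> new StringBuilder())
               .append(record).append('\n');
    }

    public int openBucketCount() {
        return writers.size();
    }

    public static int demo() {
        LazyBucketWriters w = new LazyBucketWriters();
        w.write(0, "row-a");
        w.write(0, "row-b");
        w.write(7, "row-c");
        // only buckets 0 and 7 were created; all others stay absent
        return w.openBucketCount();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 2
    }
}
```

With eager creation, every configured bucket would have a file regardless of 
data; here only the buckets that received records exist.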



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13403) Make Streaming API not create empty buckets (at least as an option)

2016-08-05 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-13403:
--
Priority: Critical  (was: Major)

> Make Streaming API not create empty buckets (at least as an option)
> ---
>
> Key: HIVE-13403
> URL: https://issues.apache.org/jira/browse/HIVE-13403
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
>
> As of HIVE-11983, when a TransactionBatch is opened in the Streaming API, a 
> full complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) 
> is created on disk even though some may end up receiving no data.
> It would be better to create them on demand and not clog the FS.
> Tez can handle missing (empty) buckets, and on MR, bucket join algorithms will 
> check if all buckets are there and bail out if not.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14423) S3: Fetching partition sizes from FS can be expensive when stats are not available in metastore

2016-08-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410314#comment-15410314
 ] 

Ashutosh Chauhan commented on HIVE-14423:
-

Gobbling InterruptedException is fine since we are collecting stats to better 
plan the query, but if we fail to gather stats we still want to continue 
compiling, not fail the query. 
+1

> S3: Fetching partition sizes from FS can be expensive when stats are not 
> available in metastore 
> 
>
> Key: HIVE-14423
> URL: https://issues.apache.org/jira/browse/HIVE-14423
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14423.1.patch, HIVE-14423.2.patch
>
>
> When partition stats are not available in metastore, it tries to get the file 
> sizes from FS.
> e.g
> {noformat}
> at 
> org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:1487)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.getFileSizeForPartitions(StatsUtils.java:598)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:235)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:144)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:132)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:126)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> {noformat}
> This can be quite expensive on some filesystems such as S3, especially when 
> the table is partitioned (e.g. TPC-DS store_sales, which has 1000s of 
> partitions); a query can spend 1000s of seconds just waiting for this 
> information to be pulled in.
> Also, it would be good to remove the FS.getContentSummary usage for finding 
> file sizes.
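
As a rough illustration of how a thread pool (cf. the related 
*hive.metastore.fshandler.threads* setting) amortizes high-latency 
per-partition size lookups, and of the "continue compiling on failure" choice 
discussed in the comment above: the filesystem call below is a stand-in, not 
Hive's actual StatsUtils code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: fetch per-partition sizes concurrently instead of
// serially, and fall back to "unknown" (-1) on failure so planning can
// continue without failing the query.
public class ParallelPartitionSizes {
    // Stand-in for an expensive FS.getContentSummary()-style call
    static long fetchSize(String partition) {
        return partition.length() * 100L; // pretend latency-bound lookup
    }

    static List<Long> sizes(List<String> partitions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (String p : partitions) {
                futures.add(pool.submit(() -> fetchSize(p)));
            }
            List<Long> out = new ArrayList<>();
            for (Future<Long> f : futures) {
                try {
                    out.add(f.get());
                } catch (InterruptedException | ExecutionException e) {
                    // swallow and continue: stats are advisory, so a
                    // failed lookup degrades the plan, not the query
                    out.add(-1L);
                }
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sizes(List.of("ds=2016-08-04", "ds=2016-08-05"), 2));
    }
}
```

On a store with tens of milliseconds per lookup, running the lookups on a 
fixed-size pool rather than one by one is what keeps 1000s of partitions from 
costing 1000s of seconds.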



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14378) Data size may be estimated as 0 if no columns are being projected after an operator

2016-08-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410311#comment-15410311
 ] 

Hive QA commented on HIVE-14378:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12821934/HIVE-14378.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10440 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testForcedLocalityMultiplePreemptionsSameHost2
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/789/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/789/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-789/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12821934 - PreCommit-HIVE-MASTER-Build

> Data size may be estimated as 0 if no columns are being projected after an 
> operator
> ---
>
> Key: HIVE-14378
> URL: https://issues.apache.org/jira/browse/HIVE-14378
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer, Statistics
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-14378.2.patch, HIVE-14378.3.patch, 
> HIVE-14378.3.patch, HIVE-14378.4.patch, HIVE-14378.patch
>
>
> in those cases we still emit rows.. but they may not have any columns within 
> it.  We shouldn't estimate 0 data size in such cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14439) LlapTaskScheduler should try scheduling tasks when a node is disabled

2016-08-05 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410277#comment-15410277
 ] 

Prasanth Jayachandran commented on HIVE-14439:
--

+1

> LlapTaskScheduler should try scheduling tasks when a node is disabled
> -
>
> Key: HIVE-14439
> URL: https://issues.apache.org/jira/browse/HIVE-14439
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14439.01.patch, HIVE-14439.02.patch
>
>
> When a node is disabled - try scheduling pending tasks. Tasks which may have 
> been waiting for the node to become available could become candidates for 
> scheduling on alternate nodes depending on the locality delay and disable 
> duration.
> This is what is causing an occasional timeout on 
> testDelayedLocalityNodeCommErrorImmediateAllocation





[jira] [Commented] (HIVE-14447) Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator

2016-08-05 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410273#comment-15410273
 ] 

Prasanth Jayachandran commented on HIVE-14447:
--

[~mmccline] Can you please review this patch?


> Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator
> ---
>
> Key: HIVE-14447
> URL: https://issues.apache.org/jira/browse/HIVE-14447
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Wei Zheng
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14447.1.patch
>
>






[jira] [Updated] (HIVE-14447) Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator

2016-08-05 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14447:
-
Status: Patch Available  (was: Open)

> Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator
> ---
>
> Key: HIVE-14447
> URL: https://issues.apache.org/jira/browse/HIVE-14447
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Wei Zheng
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14447.1.patch
>
>






[jira] [Updated] (HIVE-14447) Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator

2016-08-05 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14447:
-
Attachment: HIVE-14447.1.patch

> Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator
> ---
>
> Key: HIVE-14447
> URL: https://issues.apache.org/jira/browse/HIVE-14447
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Wei Zheng
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-14447.1.patch
>
>






[jira] [Assigned] (HIVE-14447) Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator

2016-08-05 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran reassigned HIVE-14447:


Assignee: Prasanth Jayachandran  (was: Wei Zheng)

> Set HIVE_TRANSACTIONAL_TABLE_SCAN to the correct job conf for FetchOperator
> ---
>
> Key: HIVE-14447
> URL: https://issues.apache.org/jira/browse/HIVE-14447
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Transactions
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Wei Zheng
>Assignee: Prasanth Jayachandran
>






[jira] [Commented] (HIVE-14443) Improve ide support

2016-08-05 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410259#comment-15410259
 ] 

Zoltan Haindrich commented on HIVE-14443:
-

[~spena] of course...i've added it; i think those maven vs ide related 
experiences will be very useful

> Improve ide support
> ---
>
> Key: HIVE-14443
> URL: https://issues.apache.org/jira/browse/HIVE-14443
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Zoltan Haindrich
>
> this is an umbrella ticket to enable collaboration between us...
> I think that the ability to execute qtests from eclipse or idea would be a 
> reasonable goal ;)
> feel free to add subtasks





[jira] [Updated] (HIVE-12033) Move TestCliDriver/TestNegativeCliDriver out of ANT and make it debugable

2016-08-05 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-12033:

Issue Type: Sub-task  (was: Improvement)
Parent: HIVE-14443

> Move TestCliDriver/TestNegativeCliDriver out of ANT and make it debugable
> -
>
> Key: HIVE-12033
> URL: https://issues.apache.org/jira/browse/HIVE-12033
> Project: Hive
>  Issue Type: Sub-task
>  Components: Test
>Affects Versions: 1.2.1
>Reporter: Sergio Peña
>Assignee: Sergio Peña
>Priority: Minor
> Attachments: HIVE-12033.1-spark.patch, HIVE-12033.1.patch
>
>
> The ANT auto-generated test sources make the TestCliDriver code a little 
> complicated to debug with IntelliJ and Eclipse; remote debugging is currently 
> the best way to do it.
> We need a way to move away from the ANT auto-generated source plug-in 
> and make TestCliDriver easily debuggable in current IDEs such as IntelliJ and 
> Eclipse.





[jira] [Commented] (HIVE-14440) Fix default value of USE_DEPRECATED_CLI in cli.cmd

2016-08-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410231#comment-15410231
 ] 

Sergio Peña commented on HIVE-14440:


LGTM +1

> Fix default value of USE_DEPRECATED_CLI in cli.cmd
> --
>
> Key: HIVE-14440
> URL: https://issues.apache.org/jira/browse/HIVE-14440
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-14440.01.patch
>
>
> The cli.cmd script sets the default value of USE_DEPRECATED_CLI to false when it 
> is not set, which differs from cli.sh, which sets it to true.





[jira] [Commented] (HIVE-14435) Vectorization: missed vectorization for const varchar()

2016-08-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410196#comment-15410196
 ] 

Sergey Shelukhin commented on HIVE-14435:
-

+1 pending tests

> Vectorization: missed vectorization for const varchar()
> ---
>
> Key: HIVE-14435
> URL: https://issues.apache.org/jira/browse/HIVE-14435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14435.patch
>
>
> {code}
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Failed to vectorize
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Cannot vectorize 
> select expression: Const varchar(1) f
> {code}
> The constant throws an illegal argument because the varchar precision is lost 
> in the pipeline.





[jira] [Updated] (HIVE-14394) Reduce excessive INFO level logging

2016-08-05 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-14394:

   Resolution: Fixed
Fix Version/s: 2.1.1
   2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master and branch-2.1.

> Reduce excessive INFO level logging
> ---
>
> Key: HIVE-14394
> URL: https://issues.apache.org/jira/browse/HIVE-14394
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Sushanth Sowmyan
>Assignee: Sushanth Sowmyan
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14394.2.patch, HIVE-14394.patch
>
>
> We need to cut down the number of log messages generated in HMS and HS2 that 
> are not needed.





[jira] [Updated] (HIVE-14435) Vectorization: missed vectorization for const varchar()

2016-08-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14435:
---
Status: Patch Available  (was: Open)

> Vectorization: missed vectorization for const varchar()
> ---
>
> Key: HIVE-14435
> URL: https://issues.apache.org/jira/browse/HIVE-14435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14435.patch
>
>
> {code}
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Failed to vectorize
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Cannot vectorize 
> select expression: Const varchar(1) f
> {code}
> The constant throws an illegal argument because the varchar precision is lost 
> in the pipeline.





[jira] [Updated] (HIVE-14435) Vectorization: missed vectorization for const varchar()

2016-08-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14435:
---
Affects Version/s: 2.2.0

> Vectorization: missed vectorization for const varchar()
> ---
>
> Key: HIVE-14435
> URL: https://issues.apache.org/jira/browse/HIVE-14435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14435.patch
>
>
> {code}
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Failed to vectorize
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Cannot vectorize 
> select expression: Const varchar(1) f
> {code}
> The constant throws an illegal argument because the varchar precision is lost 
> in the pipeline.





[jira] [Updated] (HIVE-14435) Vectorization: missed vectorization for const varchar()

2016-08-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14435:
---
Component/s: Vectorization

> Vectorization: missed vectorization for const varchar()
> ---
>
> Key: HIVE-14435
> URL: https://issues.apache.org/jira/browse/HIVE-14435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14435.patch
>
>
> {code}
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Failed to vectorize
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Cannot vectorize 
> select expression: Const varchar(1) f
> {code}
> The constant throws an illegal argument because the varchar precision is lost 
> in the pipeline.





[jira] [Updated] (HIVE-14435) Vectorization: missed vectorization for const varchar()

2016-08-05 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-14435:
---
Attachment: HIVE-14435.patch

> Vectorization: missed vectorization for const varchar()
> ---
>
> Key: HIVE-14435
> URL: https://issues.apache.org/jira/browse/HIVE-14435
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14435.patch
>
>
> {code}
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Failed to vectorize
> 2016-08-05T09:45:16,488  INFO [main] physical.Vectorizer: Cannot vectorize 
> select expression: Const varchar(1) f
> {code}
> The constant throws an illegal argument because the varchar precision is lost 
> in the pipeline.





[jira] [Commented] (HIVE-14342) Beeline output is garbled when executed from a remote shell

2016-08-05 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410183#comment-15410183
 ] 

Naveen Gangam commented on HIVE-14342:
--

Thanks Mohit. I ran the following test on bash 4.1.2 (I tried both conditions; 
the second condition, for Bash 4.2+, does not seem to work as expected).
The prompt variable does not seem to be set in any scenario. I ran the test 
both in the background and in the foreground with the same result, and running 
it from a remote node also gives the same result.

{code}
# cat test.sh 
#!/usr/bin/env bash

if [ -z $PS1 ] # no prompt?
#if [ -v PS1 ]   # On Bash 4.2+ ...
then
  echo "non-interactive3"
else
  echo "interactive3"
fi
# ./test.sh 
non-interactive3
# ./test.sh &
[1] 5715
# non-interactive3

[1]+  Done./test.sh
{code}

From a remote node
{code}
ngangam-MBP-2:~ ngangam$ ssh -l root  "/root/test.sh"
root@'s password: 
non-interactive3
{code}

In the link above, further down on the page, there is a sample using exactly 
what the patch does, "-p /dev/stdin". Do you know of any other means that might 
work? Thanks
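For reference, a check that side-steps $PS1 entirely: bash (and POSIX sh) record the "i" option in the special parameter $- only for interactive shells, so testing $- avoids the unset-PS1 ambiguity. A minimal sketch, not part of the patch:

```shell
#!/usr/bin/env bash
# Alternative interactivity check: the shell itself puts "i" into $- only in
# interactive mode, whereas PS1 may be unset in scripts (or leak in via the
# environment), which makes [ -z $PS1 ] unreliable.
is_interactive() {
  case $- in
    *i*) return 0 ;;   # "i" flag present: interactive shell
    *)   return 1 ;;   # no "i" flag: script, ssh command, background job, ...
  esac
}

if is_interactive; then
  echo "interactive3"
else
  echo "non-interactive3"
fi
```

Run as a script, in the background, or over ssh this prints non-interactive3; typed at an interactive prompt it prints interactive3.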

> Beeline output is garbled when executed from a remote shell
> ---
>
> Key: HIVE-14342
> URL: https://issues.apache.org/jira/browse/HIVE-14342
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14342.2.patch, HIVE-14342.patch, HIVE-14342.patch
>
>
> {code}
> use default;
> create table clitest (key int, name String, value String);
> insert into table clitest values 
> (1,"TRUE","1"),(2,"TRUE","1"),(3,"TRUE","1"),(4,"TRUE","1"),(5,"FALSE","0"),(6,"FALSE","0"),(7,"FALSE","0");
> {code}
> then run a select query
> {code} 
> # cat /tmp/select.sql 
> set hive.execution.engine=mr;
> select key,name,value 
> from clitest 
> where value="1" limit 1;
> {code}
> Then run beeline via a remote shell, for example
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/select.sql" 
> root@'s password: 
> 16/07/12 14:59:22 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> nullkey,name,value 
> 1,TRUE,1
> null   
> $
> {code}
> In older releases the output is as follows
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/run.sql" 
> Are you sure you want to continue connecting (yes/no)? yes
> root@'s password: 
> 16/07/12 14:57:55 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> key,name,value
> 1,TRUE,1
> $
> {code}
> The output contains nulls instead of blank lines. This is due to the use of 
> -Djline.terminal=jline.UnsupportedTerminal introduced in HIVE-6758 to be able 
> to run beeline as a background process. But this is the unfortunate side 
> effect of that fix.
> Running beeline in background also produces garbled output.
> {code}
> # beeline -u "jdbc:hive2://localhost:1" -n hive -p hive --silent=true 
> --outputformat=csv2 --showHeader=false -f /tmp/run.sql 2>&1 > 
> /tmp/beeline.txt &
> # cat /tmp/beeline.txt 
> null1,TRUE,1   
> #
> {code}
> So I think the use of jline.UnsupportedTerminal should be documented but not 
> used automatically by beeline under the covers.





[jira] [Updated] (HIVE-14446) Adjust bloom filter for hybrid grace hash join when row count exceeds certain limit

2016-08-05 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-14446:
-
Summary: Adjust bloom filter for hybrid grace hash join when row count 
exceeds certain limit  (was: Disable bloom filter for hybrid grace hash join 
when row count exceeds certain limit)

> Adjust bloom filter for hybrid grace hash join when row count exceeds certain 
> limit
> ---
>
> Key: HIVE-14446
> URL: https://issues.apache.org/jira/browse/HIVE-14446
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Wei Zheng
>Assignee: Wei Zheng
>
> When row count exceeds certain limit, it doesn't make sense to generate a 
> bloom filter, since its size will be a few hundred MB or even a few GB.
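The size claim follows from the standard bloom filter sizing formula, bits = -n * ln(p) / (ln 2)^2, roughly 9.6 bits per key at a 1% false-positive rate. A back-of-the-envelope check (illustrative only, not Hive's actual sizing code):

```shell
# Optimal bloom filter size for n keys at false-positive rate p:
#   bits = -n * ln(p) / (ln 2)^2
bloom_mb() {
  awk -v n="$1" -v p="$2" 'BEGIN {
    bits = -n * log(p) / (log(2) ^ 2)     # optimal bit count
    printf "%.0f\n", bits / 8 / 1048576   # convert bits to megabytes
  }'
}

bloom_mb 100000000  0.01   # ~114 MB for 100M rows
bloom_mb 1000000000 0.01   # ~1.1 GB for 1B rows
```

So once the small-table row count reaches the hundreds of millions to billions, the filter alone costs hundreds of MB to GBs, matching the motivation above.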





[jira] [Commented] (HIVE-14437) Vectorization: Optimize key misses in VectorMapJoinFastBytesHashTable

2016-08-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410154#comment-15410154
 ] 

Hive QA commented on HIVE-14437:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12822280/HIVE-14437.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10440 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/788/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/788/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-788/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12822280 - PreCommit-HIVE-MASTER-Build

> Vectorization: Optimize key misses in VectorMapJoinFastBytesHashTable
> -
>
> Key: HIVE-14437
> URL: https://issues.apache.org/jira/browse/HIVE-14437
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14437.1.patch
>
>
> Currently, the lookup in VectorMapJoinFastBytesHashTable proceeds until the 
> max number of metric put conflicts have been reached.
> This can have a fast-exit when encountering the first empty slot during the 
> probe, to speed up looking for non-existent keys.
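The proposed fast-exit works because an open-addressing lookup retraces the probe path an insert would have taken: the first empty slot proves the key was never inserted. A toy bash sketch of the idea (purely illustrative; the actual VectorMapJoinFastBytesHashTable is Java with a very different hash):

```shell
SIZE=8
slots=()                              # "key=value" entries; unset means empty

hashidx() {                           # toy hash: key length mod table size
  echo $(( ${#1} % SIZE ))
}

put() {
  local i
  i=$(hashidx "$1")
  while [ -n "${slots[i]}" ]; do      # linear probing on collision
    i=$(( (i + 1) % SIZE ))
  done
  slots[i]="$1=$2"
}

get() {
  local i n
  i=$(hashidx "$1")
  for (( n = 0; n < SIZE; n++ )); do
    if [ -z "${slots[i]}" ]; then
      return 1                        # first empty slot: fast miss
    fi
    if [ "${slots[i]%%=*}" = "$1" ]; then
      echo "${slots[i]#*=}"           # occupied and matching: hit
      return 0
    fi
    i=$(( (i + 1) % SIZE ))           # occupied, different key: keep probing
  done
  return 1                            # probed every slot: key absent
}
```

Here "one" and "two" both hash to slot 3, so "two" is placed in slot 4 by probing; a lookup for an absent key stops at the first empty slot it meets instead of scanning until a conflict limit.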





[jira] [Commented] (HIVE-14422) LLAP IF: when using LLAP IF from multiple threads in secure cluster, tokens can get mixed up

2016-08-05 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410108#comment-15410108
 ] 

Siddharth Seth commented on HIVE-14422:
---

Patch looks good to me. One minor change would be to make the ugi handling even 
more explicit - pass it in as a parameter to LlapProtocolClientImpl, instead of 
getting it via the caller ugi.

> LLAP IF: when using LLAP IF from multiple threads in secure cluster, tokens 
> can get mixed up 
> -
>
> Key: HIVE-14422
> URL: https://issues.apache.org/jira/browse/HIVE-14422
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14422.01.patch, HIVE-14422.patch
>
>






[jira] [Commented] (HIVE-14443) Improve ide support

2016-08-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-14443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409993#comment-15409993
 ] 

Sergio Peña commented on HIVE-14443:


[~kgyrtkirk] I think HIVE-12033 jira would be a good thing to add as subtask 
here. I was trying to remove ANT from q-test. Testing on IntelliJ was useful 
with the patch attached to it, but I had some issues with a hadoop script. 
There are more comments on the jira.

> Improve ide support
> ---
>
> Key: HIVE-14443
> URL: https://issues.apache.org/jira/browse/HIVE-14443
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Zoltan Haindrich
>
> this is an umbrella ticket to enable collaboration between us...
> I think that the ability to execute qtests from eclipse or idea would be a 
> reasonable goal ;)
> feel free to add subtasks





[jira] [Comment Edited] (HIVE-14433) refactor LLAP plan cache avoidance and fix issue in merge processor

2016-08-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409954#comment-15409954
 ] 

Sergey Shelukhin edited comment on HIVE-14433 at 8/5/16 7:37 PM:
-

[~prasanth_j] can you take a look? Note this also fixes the usage in Reduce and 
Merge processors where close() uses the "cache" field, but that field is never 
initialized - instead, ctors use a local variable.


was (Author: sershe):
[~prasanth_j] can you take a look?

> refactor LLAP plan cache avoidance and fix issue in merge processor
> ---
>
> Key: HIVE-14433
> URL: https://issues.apache.org/jira/browse/HIVE-14433
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14433.patch
>
>
> Map and reduce processors do this:
> {noformat}
> if (LlapProxy.isDaemon()) {
>   cache = new org.apache.hadoop.hive.ql.exec.mr.ObjectCache(); // do not 
> cache plan
> ...
> {noformat}
> but merge processor just gets the plan. If it runs in LLAP, it can get a 
> cached plan. Need to move this logic into ObjectCache itself, via an isPlan 
> arg or something. That will also fix this issue for the merge processor.





[jira] [Assigned] (HIVE-14433) refactor LLAP plan cache avoidance and fix issue in merge processor

2016-08-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-14433:
---

Assignee: Sergey Shelukhin

> refactor LLAP plan cache avoidance and fix issue in merge processor
> ---
>
> Key: HIVE-14433
> URL: https://issues.apache.org/jira/browse/HIVE-14433
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14433.patch
>
>
> Map and reduce processors do this:
> {noformat}
> if (LlapProxy.isDaemon()) {
>   cache = new org.apache.hadoop.hive.ql.exec.mr.ObjectCache(); // do not 
> cache plan
> ...
> {noformat}
> but merge processor just gets the plan. If it runs in LLAP, it can get a 
> cached plan. Need to move this logic into ObjectCache itself, via an isPlan 
> arg or something. That will also fix this issue for the merge processor.





[jira] [Updated] (HIVE-14433) refactor LLAP plan cache avoidance and fix issue in merge processor

2016-08-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14433:

Status: Patch Available  (was: Open)

[~prasanth_j] can you take a look?

> refactor LLAP plan cache avoidance and fix issue in merge processor
> ---
>
> Key: HIVE-14433
> URL: https://issues.apache.org/jira/browse/HIVE-14433
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14433.patch
>
>
> Map and reduce processors do this:
> {noformat}
> if (LlapProxy.isDaemon()) {
>   cache = new org.apache.hadoop.hive.ql.exec.mr.ObjectCache(); // do not 
> cache plan
> ...
> {noformat}
> but merge processor just gets the plan. If it runs in LLAP, it can get a 
> cached plan. Need to move this logic into ObjectCache itself, via an isPlan 
> arg or something. That will also fix this issue for the merge processor.





[jira] [Updated] (HIVE-14433) refactor LLAP plan cache avoidance and fix issue in merge processor

2016-08-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14433:

Attachment: HIVE-14433.patch

> refactor LLAP plan cache avoidance and fix issue in merge processor
> ---
>
> Key: HIVE-14433
> URL: https://issues.apache.org/jira/browse/HIVE-14433
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
> Attachments: HIVE-14433.patch
>
>
> Map and reduce processors do this:
> {noformat}
> if (LlapProxy.isDaemon()) {
>   cache = new org.apache.hadoop.hive.ql.exec.mr.ObjectCache(); // do not 
> cache plan
> ...
> {noformat}
> but merge processor just gets the plan. If it runs in LLAP, it can get a 
> cached plan. Need to move this logic into ObjectCache itself, via an isPlan 
> arg or something. That will also fix this issue for the merge processor.





[jira] [Commented] (HIVE-14391) TestAccumuloCliDriver is not executed during precommit tests

2016-08-05 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409952#comment-15409952
 ] 

Zoltan Haindrich commented on HIVE-14391:
-

i've rewritten TestAccumuloCliDriver to use junit4...but because it already 
fails - and it's not being run during tests...i don't know what to do with it ;)

> TestAccumuloCliDriver is not executed during precommit tests
> 
>
> Key: HIVE-14391
> URL: https://issues.apache.org/jira/browse/HIVE-14391
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>
> according to for example this build result:
> https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/685/testReport/org.apache.hadoop.hive.cli/
> there is no 'TestAccumuloCliDriver' being run during precommit testing...but 
> i see no reason why and how it was excluded inside the project;
> my maven executes it when i start it with {{-Dtest=TestAccumuloCliDriver}} - 
> so i think the properties/profiles aren't preventing it.
> maybe i miss something obvious ;)
> (note: my TestAccumuloCliDriver executions fail with errors.)





[jira] [Updated] (HIVE-14391) TestAccumuloCliDriver is not executed during precommit tests

2016-08-05 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14391:

Issue Type: Sub-task  (was: Bug)
Parent: HIVE-14443

> TestAccumuloCliDriver is not executed during precommit tests
> 
>
> Key: HIVE-14391
> URL: https://issues.apache.org/jira/browse/HIVE-14391
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Zoltan Haindrich
>
> according to for example this build result:
> https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/685/testReport/org.apache.hadoop.hive.cli/
> there is no 'TestAccumuloCliDriver' being run during precommit testing...but 
> i see no reason why and how it was excluded inside the project;
> my maven executes it when i start it with {{-Dtest=TestAccumuloCliDriver}} - 
> so i think the properties/profiles aren't preventing it.
> maybe i miss something obvious ;)
> (note: my TestAccumuloCliDriver executions fail with errors.)





[jira] [Commented] (HIVE-14423) S3: Fetching partition sizes from FS can be expensive when stats are not available in metastore

2016-08-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409941#comment-15409941
 ] 

Hive QA commented on HIVE-14423:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12822278/HIVE-14423.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10440 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/787/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/787/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-787/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12822278 - PreCommit-HIVE-MASTER-Build

> S3: Fetching partition sizes from FS can be expensive when stats are not 
> available in metastore 
> 
>
> Key: HIVE-14423
> URL: https://issues.apache.org/jira/browse/HIVE-14423
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14423.1.patch, HIVE-14423.2.patch
>
>
> When partition stats are not available in metastore, it tries to get the file 
> sizes from FS.
> e.g
> {noformat}
> at 
> org.apache.hadoop.fs.FileSystem.getContentSummary(FileSystem.java:1487)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.getFileSizeForPartitions(StatsUtils.java:598)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:235)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:144)
> at 
> org.apache.hadoop.hive.ql.stats.StatsUtils.collectStatistics(StatsUtils.java:132)
> at 
> org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$TableScanStatsRule.process(StatsRulesProcFactory.java:126)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
> at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
> {noformat}
> This can be quite expensive on some filesystems like S3, especially when the 
> table is partitioned (e.g. TPC-DS store_sales, which has 1000s of partitions): 
> a query can spend 1000s of seconds just waiting for this information to be 
> pulled in.
> Also, it would be good to remove FS.getContentSummary usage to find out file 
> sizes.
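One way to visualize the mitigation (an illustrative sketch only, not the attached patch): issue the per-partition size lookups in parallel and sum the results, instead of one serial getContentSummary call per partition. The sketch uses local du; against S3 the same fan-out idea applies to the FS listing calls:

```shell
# Sum the sizes (in KB) of many partition directories, fanning the per-partition
# lookups out with xargs -P instead of fetching them one at a time.
sizes_kb() {
  printf '%s\n' "$@" \
    | xargs -n1 -P8 du -sk \
    | awk '{ total += $1 } END { print total + 0 }'
}
```

For example, sizes_kb /warehouse/store_sales/part=1 /warehouse/store_sales/part=2 (hypothetical paths) prints the combined size in KB; output order from the parallel workers varies, but the sum does not.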





[jira] [Updated] (HIVE-14415) Upgrade qtest execution framework to junit4 - TestPerfCliDriver

2016-08-05 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-14415:

Issue Type: Sub-task  (was: Improvement)
Parent: HIVE-14443

> Upgrade qtest execution framework to junit4 - TestPerfCliDriver
> ---
>
> Key: HIVE-14415
> URL: https://issues.apache.org/jira/browse/HIVE-14415
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14415.1.patch, HIVE-14415.2.patch
>
>
> I would like to upgrade the current maven+ant+velocimacro+junit4 qtest 
> generation framework to use only junit4 - while (trying) to keep 
> all the existing features it provides.
> What I can't really do with the current one: easily execute a single qtest 
> from an IDE (as a matter of fact I can...but it's way too complicated; after 
> this it won't be a cake-walk either...but it will be a step closer ;)
> I think this change will make it more clear how these tests are configured 
> and executed.
> I will do this in two phases, currently i will only change 
> {{TestPerfCliDriver}}.





[jira] [Commented] (HIVE-14415) Upgrade qtest execution framework to junit4 - TestPerfCliDriver

2016-08-05 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409916#comment-15409916
 ] 

Zoltan Haindrich commented on HIVE-14415:
-

[~pvary] I'm not actively working on those other things...I've experimented 
with CLI execution from Eclipse about a month ago...and those are some minor 
issues I identified.

Wow!...I wasn't aware that beeline tests are also disabled...there are about 20 
CLI tests...TestAccumuloCliDriver is also disabled - but I was unable to 
determine how HIVE-14391 ;)

I think an umbrella JIRA ticket would be useful...I've created HIVE-14443.

I've started working on migrating the other CLI drivers one by one...I haven't 
reached beeline yet...so just submit it as a new ticket; or if you send me a 
pastebin link I'll add it to the second patch.

cheers! :)

> Upgrade qtest execution framework to junit4 - TestPerfCliDriver
> ---
>
> Key: HIVE-14415
> URL: https://issues.apache.org/jira/browse/HIVE-14415
> Project: Hive
>  Issue Type: Improvement
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Attachments: HIVE-14415.1.patch, HIVE-14415.2.patch
>
>





[jira] [Updated] (HIVE-14442) CBO: Calcite Operator To Hive Operator(Calcite Return Path): Wrong result/plan in group by with hive.map.aggr=false

2016-08-05 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14442:
---
Status: Patch Available  (was: Open)

> CBO: Calcite Operator To Hive Operator(Calcite Return Path): Wrong 
> result/plan in group by with hive.map.aggr=false
> ---
>
> Key: HIVE-14442
> URL: https://issues.apache.org/jira/browse/HIVE-14442
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14442.1.patch
>
>
> Reproducer
> {code}
> set hive.cbo.returnpath.hiveop=true;
> set hive.map.aggr=false;
> create table abcd (a int, b int, c int, d int);
> LOAD DATA LOCAL INPATH '../../data/files/in4.txt' INTO TABLE abcd;
> {code}
> {code} explain select count(distinct a) from abcd group by b; {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: abcd
> Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
> Column stats: NONE
> Select Operator
>   expressions: a (type: int)
>   outputColumnNames: a
>   Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator
> key expressions: a (type: int), a (type: int)
> sort order: ++
> Map-reduce partition columns: a (type: int)
> Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Operator Tree:
> Group By Operator
>   aggregations: count(DISTINCT KEY._col1:0._col0)
>   keys: KEY._col0 (type: int)
>   mode: complete
>   outputColumnNames: b, $f1
>   Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
> stats: NONE
>   Select Operator
> expressions: $f1 (type: bigint)
> outputColumnNames: _o__c0
> Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
> Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
> Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> {code}
> {code} explain select count(distinct a) from abcd group by c; {code}
> {code}
> STAGE PLANS:
>   Stage: Stage-1
> Map Reduce
>   Map Operator Tree:
>   TableScan
> alias: abcd
> Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
> Column stats: NONE
> Select Operator
>   expressions: a (type: int)
>   outputColumnNames: a
>   Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Output Operator
> key expressions: a (type: int), a (type: int)
> sort order: ++
> Map-reduce partition columns: a (type: int)
> Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
> Column stats: NONE
>   Reduce Operator Tree:
> Group By Operator
>   aggregations: count(DISTINCT KEY._col1:0._col0)
>   keys: KEY._col0 (type: int)
>   mode: complete
>   outputColumnNames: c, $f1
>   Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
> stats: NONE
>   Select Operator
> expressions: $f1 (type: bigint)
> outputColumnNames: _o__c0
> Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
> Column stats: NONE
> File Output Operator
>   compressed: false
>   Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
> Column stats: NONE
>   table:
>   input format: 
> org.apache.hadoop.mapred.SequenceFileInputFormat
>   output format: 
> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> {code}
> The above two cases have wrong keys in the map-side Reduce Output Operator (both 
> have a, a instead of b, a and c, a respectively).





[jira] [Updated] (HIVE-14442) CBO: Calcite Operator To Hive Operator(Calcite Return Path): Wrong result/plan in group by with hive.map.aggr=false

2016-08-05 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14442:
---
Attachment: HIVE-14442.1.patch

> CBO: Calcite Operator To Hive Operator(Calcite Return Path): Wrong 
> result/plan in group by with hive.map.aggr=false
> ---
>
> Key: HIVE-14442
> URL: https://issues.apache.org/jira/browse/HIVE-14442
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14442.1.patch
>
>





[jira] [Updated] (HIVE-14424) Address CLIRestoreTest failure

2016-08-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14424:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Address CLIRestoreTest failure
> --
>
> Key: HIVE-14424
> URL: https://issues.apache.org/jira/browse/HIVE-14424
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14424.1.patch, HIVE-14424.patch
>
>
> {noformat}
> java.lang.RuntimeException: Error applying authorization policy on hive 
> configuration: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at org.apache.hive.service.cli.CLIService.init(CLIService.java:113)
>   at 
> org.apache.hive.service.cli.CLIServiceRestoreTest.getService(CLIServiceRestoreTest.java:48)
>   at 
> org.apache.hive.service.cli.CLIServiceRestoreTest.(CLIServiceRestoreTest.java:28)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:195)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:244)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:241)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:234)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:74)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:836)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.applyAuthorizationPolicy(SessionState.java:1602)
>   at 
> org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:126)
>   at org.apache.hive.service.cli.CLIService.init(CLIService.java:110)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthorizeProviderManager(HiveUtils.java:385)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:812)
>   ... 25 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthorizeProviderManager(HiveUtils.java:375)
>   ... 26 more
> {noformat}
> But it is caused by HIVE-14221. Code changes are here: 
> https://github.com/apache/hive/commit/de5ae86ee70d9396d5cefc499507b5f31fecc916
> So the issue is that, in this patch, everywhere the class 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
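The ClassNotFoundException in the stack trace above comes from resolving the configured authorizer factory class reflectively at runtime, so a class name that is configured but absent from the classpath only fails when the session initializes. A minimal sketch of that mechanism (the class names below are illustrative, not Hive's actual configuration handling):

```java
public class AuthFactoryLoader {
    // Resolve a class by its configured name, reporting a missing class the
    // same way a reflective factory lookup would: via ClassNotFoundException.
    static String load(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return "loaded " + cls.getName();
        } catch (ClassNotFoundException e) {
            return "missing " + className;
        }
    }

    public static void main(String[] args) {
        System.out.println(load("java.lang.String"));          // prints: loaded java.lang.String
        System.out.println(load("org.example.NoSuchFactory")); // prints: missing org.example.NoSuchFactory
    }
}
```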

[jira] [Commented] (HIVE-14424) Address CLIRestoreTest failure

2016-08-05 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409887#comment-15409887
 ] 

Pengcheng Xiong commented on HIVE-14424:


Pushed to master and 2.1. Thanks [~prongs] for the patch.

> Address CLIRestoreTest failure
> --
>
> Key: HIVE-14424
> URL: https://issues.apache.org/jira/browse/HIVE-14424
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14424.1.patch, HIVE-14424.patch
>
>

[jira] [Updated] (HIVE-14422) LLAP IF: when using LLAP IF from multiple threads in secure cluster, tokens can get mixed up

2016-08-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14422:

Attachment: HIVE-14422.01.patch

> LLAP IF: when using LLAP IF from multiple threads in secure cluster, tokens 
> can get mixed up 
> -
>
> Key: HIVE-14422
> URL: https://issues.apache.org/jira/browse/HIVE-14422
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14422.01.patch, HIVE-14422.patch
>
>






[jira] [Commented] (HIVE-14422) LLAP IF: when using LLAP IF from multiple threads in secure cluster, tokens can get mixed up

2016-08-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409883#comment-15409883
 ] 

Sergey Shelukhin commented on HIVE-14422:
-

[~jdere] [~sseth] ping?

> LLAP IF: when using LLAP IF from multiple threads in secure cluster, tokens 
> can get mixed up 
> -
>
> Key: HIVE-14422
> URL: https://issues.apache.org/jira/browse/HIVE-14422
> Project: Hive
>  Issue Type: Bug
>Reporter: Jason Dere
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14422.01.patch, HIVE-14422.patch
>
>






[jira] [Assigned] (HIVE-14430) More instances of HiveConf and the associated UDFClassLoader than expected

2016-08-05 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta reassigned HIVE-14430:
---

Assignee: Vaibhav Gumashta

> More instances of HiveConf and the associated UDFClassLoader than expected
> --
>
> Key: HIVE-14430
> URL: https://issues.apache.org/jira/browse/HIVE-14430
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Siddharth Seth
>Assignee: Vaibhav Gumashta
>Priority: Critical
>
> 841 instances of HiveConf.
> 831 instances of UDFClassLoader.
> This is on an HS2 instance configured to run 10 concurrent queries with LLAP.
> 10 SessionState instances. Something is holding on to the additional 
> HiveConf and UDFClassLoader instances - potentially HMSHandler.
> This is with an embedded metastore.





[jira] [Commented] (HIVE-14440) Fix default value of USE_DEPRECATED_CLI in cli.cmd

2016-08-05 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409818#comment-15409818
 ] 

Vihang Karajgaonkar commented on HIVE-14440:


Hi [~Ferd], can you please review?

> Fix default value of USE_DEPRECATED_CLI in cli.cmd
> --
>
> Key: HIVE-14440
> URL: https://issues.apache.org/jira/browse/HIVE-14440
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-14440.01.patch
>
>
> The cli.cmd script sets the default value of USE_DEPRECATED_CLI to false when it 
> is not set, which differs from cli.sh, which sets it to true.
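The intended semantics is default-when-unset: an explicitly set value always wins, and the fallback applies only when the variable is absent or empty. A small sketch of that resolution logic (USE_DEPRECATED_CLI is the variable from the issue; the method itself is illustrative, not the script's actual code):

```java
public class DefaultFlag {
    // Return the configured value if present, otherwise fall back to the
    // default - the behavior cli.sh has and cli.cmd is being fixed to match.
    static String resolve(String value, String defaultValue) {
        return (value == null || value.isEmpty()) ? defaultValue : value;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "true"));    // prints: true  (unset -> default)
        System.out.println(resolve("false", "true")); // prints: false (explicit value wins)
    }
}
```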





[jira] [Updated] (HIVE-14424) Address CLIRestoreTest failure

2016-08-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14424:
---
Fix Version/s: (was: 1.3.0)

> Address CLIRestoreTest failure
> --
>
> Key: HIVE-14424
> URL: https://issues.apache.org/jira/browse/HIVE-14424
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14424.1.patch, HIVE-14424.patch
>
>

[jira] [Updated] (HIVE-14424) Address CLIRestoreTest failure

2016-08-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14424:
---
Affects Version/s: 2.1.0

> Address CLIRestoreTest failure
> --
>
> Key: HIVE-14424
> URL: https://issues.apache.org/jira/browse/HIVE-14424
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14424.1.patch, HIVE-14424.patch
>
>

[jira] [Commented] (HIVE-14441) TestFetchResultsOfLog failures

2016-08-05 Thread Tao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409803#comment-15409803
 ] 

Tao Li commented on HIVE-14441:
---

These test failures happen when the logging level is set to INFO for HS2. See 
the attached patch for a temporary workaround.

> TestFetchResultsOfLog failures
> --
>
> Key: HIVE-14441
> URL: https://issues.apache.org/jira/browse/HIVE-14441
> Project: Hive
>  Issue Type: Test
>Reporter: Tao Li
> Attachments: BUG-63720.1.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14441) TestFetchResultsOfLog failures

2016-08-05 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-14441:
--
Attachment: BUG-63720.1.patch

> TestFetchResultsOfLog failures
> --
>
> Key: HIVE-14441
> URL: https://issues.apache.org/jira/browse/HIVE-14441
> Project: Hive
>  Issue Type: Test
>Reporter: Tao Li
> Attachments: BUG-63720.1.patch
>
>






[jira] [Updated] (HIVE-14424) CLIRestoreTest failing

2016-08-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14424:
---
Summary: CLIRestoreTest failing  (was: CLIRestoreTest failing in branch 2.1)

> CLIRestoreTest failing
> --
>
> Key: HIVE-14424
> URL: https://issues.apache.org/jira/browse/HIVE-14424
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 1.3.0, 2.2.0, 2.1.1
>
> Attachments: HIVE-14424.1.patch, HIVE-14424.patch
>
>
> {noformat}
> java.lang.RuntimeException: Error applying authorization policy on hive 
> configuration: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at org.apache.hive.service.cli.CLIService.init(CLIService.java:113)
>   at 
> org.apache.hive.service.cli.CLIServiceRestoreTest.getService(CLIServiceRestoreTest.java:48)
>   at 
> org.apache.hive.service.cli.CLIServiceRestoreTest.(CLIServiceRestoreTest.java:28)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:195)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:244)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:241)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:234)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:74)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:836)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.applyAuthorizationPolicy(SessionState.java:1602)
>   at 
> org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:126)
>   at org.apache.hive.service.cli.CLIService.init(CLIService.java:110)
>   ... 22 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthorizeProviderManager(HiveUtils.java:385)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:812)
>   ... 25 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthorizeProviderManager(HiveUtils.java:375)
>   ... 26 more
> {noformat}
> But is caused by HIVE-14221. Code changes are here: 
> https://github.com/apache/hive/commit/de5ae86ee70d9396d5cefc499507b5f31fecc916
> So the issue is that, in this patch, everywhere the class 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
>  has been mentioned, 

[jira] [Updated] (HIVE-14393) Tuple in list feature fails if there's only 1 tuple in the list

2016-08-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14393:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Tuple in list feature fails if there's only 1 tuple in the list
> ---
>
> Key: HIVE-14393
> URL: https://issues.apache.org/jira/browse/HIVE-14393
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Carter Shanklin
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14393.01.patch
>
>
> So this works:
> {code}
> hive> select * from test where (x,y) in ((1,1),(2,2));
> OK
> 1 1
> 2 2
> Time taken: 0.063 seconds, Fetched: 2 row(s)
> {code}
> And this doesn't:
> {code}
> hive> select * from test where (x,y) in ((1,1));
> org.antlr.runtime.EarlyExitException
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.precedenceEqualExpressionMutiple(HiveParser_IdentifiersParser.java:9510)
> {code}
> If I'm generating SQL I'd like to not have to special case 1 tuple.
> As a point of comparison this works in Postgres:
> {code}
> vagrant=# select * from test where (x, y) in ((1, 1));
>  x | y
> ---+---
>  1 | 1
> (1 row)
> {code}
> Any thoughts on this [~pxiong] ?





[jira] [Updated] (HIVE-14342) Beeline output is garbled when executed from a remote shell

2016-08-05 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-14342:
-
Status: Patch Available  (was: Open)

> Beeline output is garbled when executed from a remote shell
> ---
>
> Key: HIVE-14342
> URL: https://issues.apache.org/jira/browse/HIVE-14342
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14342.2.patch, HIVE-14342.patch, HIVE-14342.patch
>
>
> {code}
> use default;
> create table clitest (key int, name String, value String);
> insert into table clitest values 
> (1,"TRUE","1"),(2,"TRUE","1"),(3,"TRUE","1"),(4,"TRUE","1"),(5,"FALSE","0"),(6,"FALSE","0"),(7,"FALSE","0");
> {code}
> then run a select query
> {code} 
> # cat /tmp/select.sql 
> set hive.execution.engine=mr;
> select key,name,value 
> from clitest 
> where value="1" limit 1;
> {code}
> Then run beeline via a remote shell, for example
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/select.sql" 
> root@'s password: 
> 16/07/12 14:59:22 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> nullkey,name,value 
> 1,TRUE,1
> null   
> $
> {code}
> In older releases the output is as follows
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/run.sql" 
> Are you sure you want to continue connecting (yes/no)? yes
> root@'s password: 
> 16/07/12 14:57:55 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> key,name,value
> 1,TRUE,1
> $
> {code}
> The output contains nulls instead of blank lines. This is due to the use of 
> -Djline.terminal=jline.UnsupportedTerminal introduced in HIVE-6758 to be able 
> to run beeline as a background process. But this is the unfortunate side 
> effect of that fix.
> Running beeline in background also produces garbled output.
> {code}
> # beeline -u "jdbc:hive2://localhost:1" -n hive -p hive --silent=true 
> --outputformat=csv2 --showHeader=false -f /tmp/run.sql 2>&1 > 
> /tmp/beeline.txt &
> # cat /tmp/beeline.txt 
> null1,TRUE,1   
> #
> {code}
> So I think the use of jline.UnsupportedTerminal should be documented but not 
> used automatically by beeline under the covers.





[jira] [Updated] (HIVE-14342) Beeline output is garbled when executed from a remote shell

2016-08-05 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-14342:
-
Status: Open  (was: Patch Available)

> Beeline output is garbled when executed from a remote shell
> ---
>
> Key: HIVE-14342
> URL: https://issues.apache.org/jira/browse/HIVE-14342
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14342.2.patch, HIVE-14342.patch, HIVE-14342.patch
>
>
> {code}
> use default;
> create table clitest (key int, name String, value String);
> insert into table clitest values 
> (1,"TRUE","1"),(2,"TRUE","1"),(3,"TRUE","1"),(4,"TRUE","1"),(5,"FALSE","0"),(6,"FALSE","0"),(7,"FALSE","0");
> {code}
> then run a select query
> {code} 
> # cat /tmp/select.sql 
> set hive.execution.engine=mr;
> select key,name,value 
> from clitest 
> where value="1" limit 1;
> {code}
> Then run beeline via a remote shell, for example
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/select.sql" 
> root@'s password: 
> 16/07/12 14:59:22 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> nullkey,name,value 
> 1,TRUE,1
> null   
> $
> {code}
> In older releases the output is as follows
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/run.sql" 
> Are you sure you want to continue connecting (yes/no)? yes
> root@'s password: 
> 16/07/12 14:57:55 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> key,name,value
> 1,TRUE,1
> $
> {code}
> The output contains nulls instead of blank lines. This is due to the use of 
> -Djline.terminal=jline.UnsupportedTerminal introduced in HIVE-6758 to be able 
> to run beeline as a background process. But this is the unfortunate side 
> effect of that fix.
> Running beeline in background also produces garbled output.
> {code}
> # beeline -u "jdbc:hive2://localhost:1" -n hive -p hive --silent=true 
> --outputformat=csv2 --showHeader=false -f /tmp/run.sql 2>&1 > 
> /tmp/beeline.txt &
> # cat /tmp/beeline.txt 
> null1,TRUE,1   
> #
> {code}
> So I think the use of jline.UnsupportedTerminal should be documented but not 
> used automatically by beeline under the covers.





[jira] [Updated] (HIVE-14342) Beeline output is garbled when executed from a remote shell

2016-08-05 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-14342:
-
Attachment: HIVE-14342.2.patch

It appears that, following HIVE-11717, we need to make an additional change to 
cli.sh. Attaching a new patch that includes that fix.

> Beeline output is garbled when executed from a remote shell
> ---
>
> Key: HIVE-14342
> URL: https://issues.apache.org/jira/browse/HIVE-14342
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.0.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-14342.2.patch, HIVE-14342.patch, HIVE-14342.patch
>
>
> {code}
> use default;
> create table clitest (key int, name String, value String);
> insert into table clitest values 
> (1,"TRUE","1"),(2,"TRUE","1"),(3,"TRUE","1"),(4,"TRUE","1"),(5,"FALSE","0"),(6,"FALSE","0"),(7,"FALSE","0");
> {code}
> then run a select query
> {code} 
> # cat /tmp/select.sql 
> set hive.execution.engine=mr;
> select key,name,value 
> from clitest 
> where value="1" limit 1;
> {code}
> Then run beeline via a remote shell, for example
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/select.sql" 
> root@'s password: 
> 16/07/12 14:59:22 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> nullkey,name,value 
> 1,TRUE,1
> null   
> $
> {code}
> In older releases the output is as follows
> {code}
> $ ssh -l root  "sudo -u hive beeline -u 
> jdbc:hive2://localhost:1 -n hive -p hive --silent=true 
> --outputformat=csv2 -f /tmp/run.sql" 
> Are you sure you want to continue connecting (yes/no)? yes
> root@'s password: 
> 16/07/12 14:57:55 WARN mapreduce.TableMapReduceUtil: The hbase-prefix-tree 
> module jar containing PrefixTreeCodec is not present.  Continuing without it.
> key,name,value
> 1,TRUE,1
> $
> {code}
> The output contains nulls instead of blank lines. This is due to the use of 
> -Djline.terminal=jline.UnsupportedTerminal introduced in HIVE-6758 to be able 
> to run beeline as a background process. But this is the unfortunate side 
> effect of that fix.
> Running beeline in background also produces garbled output.
> {code}
> # beeline -u "jdbc:hive2://localhost:1" -n hive -p hive --silent=true 
> --outputformat=csv2 --showHeader=false -f /tmp/run.sql 2>&1 > 
> /tmp/beeline.txt &
> # cat /tmp/beeline.txt 
> null1,TRUE,1   
> #
> {code}
> So I think the use of jline.UnsupportedTerminal should be documented but not 
> used automatically by beeline under the covers.





[jira] [Commented] (HIVE-14421) FS.deleteOnExit holds references to _tmp_space.db files

2016-08-05 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409795#comment-15409795
 ] 

Thejas M Nair commented on HIVE-14421:
--

Regarding - 
bq. LOG.error("Failed to delete path at {} on fs with scheme {}", path, 
 (fs == null ? "Unknown-null" : fs.getScheme()), e);
Wouldn't the exception e be considered part of the string message arguments? It 
won't get printed, as there are only two "{}", and even if we add another "{}", 
it won't print the stack trace. (Please correct me if I am wrong.)
Should we just go with the LOG.error(String, Throwable) method?
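For context: SLF4J's documented behavior since 1.6.0 is that when the last argument of a parameterized log call is a Throwable with no matching "{}", it is claimed as the throwable and its stack trace is printed anyway. The sketch below mimics that argument-handling rule using only the JDK; it is an illustration, not SLF4J or Hive code, and the helper names (throwableCandidate, format) are invented for the example:

```java
// Illustration only: mimics SLF4J's documented handling of a trailing
// Throwable in a parameterized log call (MessageFormatter-style logic).
public class Slf4jThrowableSketch {

    // If the last argument is a Throwable, SLF4J claims it as the
    // throwable to log rather than as a "{}" substitution value.
    static Throwable throwableCandidate(Object... args) {
        if (args.length == 0) {
            return null;
        }
        Object last = args[args.length - 1];
        return (last instanceof Throwable) ? (Throwable) last : null;
    }

    // Minimal "{}" substitution that skips a claimed throwable candidate.
    static String format(String pattern, Object... args) {
        int usable = (throwableCandidate(args) != null) ? args.length - 1 : args.length;
        StringBuilder sb = new StringBuilder();
        int argIdx = 0;
        for (int i = 0; i < pattern.length(); ) {
            if (argIdx < usable && pattern.startsWith("{}", i)) {
                sb.append(args[argIdx++]);
                i += 2;
            } else {
                sb.append(pattern.charAt(i++));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Exception e = new RuntimeException("boom");
        // Two "{}" and three arguments: e becomes the throwable candidate,
        // so the message still formats cleanly and the trace would be logged.
        System.out.println(format(
            "Failed to delete path at {} on fs with scheme {}", "/tmp/x", "hdfs", e));
        // -> Failed to delete path at /tmp/x on fs with scheme hdfs
        System.out.println("throwable attached: "
            + (throwableCandidate("/tmp/x", "hdfs", e) != null));
        // -> throwable attached: true
    }
}
```

With Log4j2 behind the SLF4J API the behavior is analogous; when in doubt, LOG.error(String, Throwable) is the unambiguous form.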



> FS.deleteOnExit holds references to _tmp_space.db files
> ---
>
> Key: HIVE-14421
> URL: https://issues.apache.org/jira/browse/HIVE-14421
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14421.01.patch, HIVE-14421.02.patch
>
>






[jira] [Updated] (HIVE-14442) CBO: Calcite Operator To Hive Operator(Calcite Return Path): Wrong result/plan in group by with hive.map.aggr=false

2016-08-05 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14442:
---
Description: 
Reproducer

{code} set hive.cbo.returnpath.hiveop=true
 set hive.map.aggr=false

create table abcd (a int, b int, c int, d int);
LOAD DATA LOCAL INPATH '../../data/files/in4.txt' INTO TABLE abcd;
{code}

{code} explain select count(distinct a) from abcd group by b; {code}

{code}
STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: abcd
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column 
stats: NONE
Select Operator
  expressions: a (type: int)
  outputColumnNames: a
  Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Output Operator
key expressions: a (type: int), a (type: int)
sort order: ++
Map-reduce partition columns: a (type: int)
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Operator Tree:
Group By Operator
  aggregations: count(DISTINCT KEY._col1:0._col0)
  keys: KEY._col0 (type: int)
  mode: complete
  outputColumnNames: b, $f1
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
  Select Operator
expressions: $f1 (type: bigint)
outputColumnNames: _o__c0
Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
File Output Operator
  compressed: false
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
{code}

{code} explain select count(distinct a) from abcd group by c; {code}
{code}
STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: abcd
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column 
stats: NONE
Select Operator
  expressions: a (type: int)
  outputColumnNames: a
  Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Output Operator
key expressions: a (type: int), a (type: int)
sort order: ++
Map-reduce partition columns: a (type: int)
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Operator Tree:
Group By Operator
  aggregations: count(DISTINCT KEY._col1:0._col0)
  keys: KEY._col0 (type: int)
  mode: complete
  outputColumnNames: c, $f1
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
  Select Operator
expressions: $f1 (type: bigint)
outputColumnNames: _o__c0
Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
File Output Operator
  compressed: false
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

{code}

The above two cases have wrong keys in the map-side Reduce Output Operator 
(both have a, a instead of b, a and c, a respectively).

  was:
Reproducer

{code} set hive.cbo.returnpath.hiveop=true
 set hive.map.aggr=false

create table abcd (a int, b int, c int, d int);
LOAD DATA LOCAL INPATH '../../data/files/in4.txt' INTO TABLE abcd;
{code}

{code} explain select count(distinct a) from abcd group by b; {code}

{code}
STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: abcd
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column 
stats: NONE
Select Operator
  expressions: a (type: int)
  outputColumnNames: a
  Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Output Operator
key expressions: a (type: int), a (type: int)
sort order: ++
Map-reduce partition columns: a (type: int)
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Operator Tree:
Group By Operator
  

[jira] [Updated] (HIVE-14441) TestFetchResultsOfLog failures

2016-08-05 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-14441:
--
Summary: TestFetchResultsOfLog failures  (was: TestOperationLogging fails)

> TestFetchResultsOfLog failures
> --
>
> Key: HIVE-14441
> URL: https://issues.apache.org/jira/browse/HIVE-14441
> Project: Hive
>  Issue Type: Test
>Reporter: Tao Li
>






[jira] [Updated] (HIVE-14442) CBO: Calcite Operator To Hive Operator(Calcite Return Path): Wrong result/plan in group by with hive.map.aggr=false

2016-08-05 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14442:
---
Description: 
Reproducer

{code}
set hive.cbo.returnpath.hiveop=true
set hive.map.aggr=false
{code}

{code}
create table abcd (a int, b int, c int, d int);
LOAD DATA LOCAL INPATH '../../data/files/in4.txt' INTO TABLE abcd;
{code}

{code} explain select count(distinct a) from abcd group by b; {code}

{code}
STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: abcd
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column 
stats: NONE
Select Operator
  expressions: a (type: int)
  outputColumnNames: a
  Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Output Operator
key expressions: a (type: int), a (type: int)
sort order: ++
Map-reduce partition columns: a (type: int)
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Operator Tree:
Group By Operator
  aggregations: count(DISTINCT KEY._col1:0._col0)
  keys: KEY._col0 (type: int)
  mode: complete
  outputColumnNames: b, $f1
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
  Select Operator
expressions: $f1 (type: bigint)
outputColumnNames: _o__c0
Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
File Output Operator
  compressed: false
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

{code}

{code} explain select count(distinct a) from abcd group by c; {code}
{code}
STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: abcd
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column 
stats: NONE
Select Operator
  expressions: a (type: int)
  outputColumnNames: a
  Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Output Operator
key expressions: a (type: int), a (type: int)
sort order: ++
Map-reduce partition columns: a (type: int)
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Operator Tree:
Group By Operator
  aggregations: count(DISTINCT KEY._col1:0._col0)
  keys: KEY._col0 (type: int)
  mode: complete
  outputColumnNames: c, $f1
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
  Select Operator
expressions: $f1 (type: bigint)
outputColumnNames: _o__c0
Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
File Output Operator
  compressed: false
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

{code}

The above two cases have wrong keys in the map-side Reduce Output Operator 
(both have a, a instead of b, a and c, a respectively).

  was:
Reproducer

{code} set hive.cbo.returnpath.hiveop=true {code}
{code} set hive.map.aggr=false {code}

{code}
create table abcd (a int, b int, c int, d int);
LOAD DATA LOCAL INPATH '../../data/files/in4.txt' INTO TABLE abcd;
{code}

{code} explain select count(distinct a) from abcd group by b; {code}

{code}
STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: abcd
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column 
stats: NONE
Select Operator
  expressions: a (type: int)
  outputColumnNames: a
  Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Output Operator
key expressions: a (type: int), a (type: int)
sort order: ++
Map-reduce partition columns: a (type: int)
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Operator Tree:

[jira] [Updated] (HIVE-14442) CBO: Calcite Operator To Hive Operator(Calcite Return Path): Wrong result/plan in group by with hive.map.aggr=false

2016-08-05 Thread Vineet Garg (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-14442:
---
Description: 
Reproducer

{code} set hive.cbo.returnpath.hiveop=true
 set hive.map.aggr=false

create table abcd (a int, b int, c int, d int);
LOAD DATA LOCAL INPATH '../../data/files/in4.txt' INTO TABLE abcd;
{code}

{code} explain select count(distinct a) from abcd group by b; {code}

{code}
STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: abcd
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column 
stats: NONE
Select Operator
  expressions: a (type: int)
  outputColumnNames: a
  Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Output Operator
key expressions: a (type: int), a (type: int)
sort order: ++
Map-reduce partition columns: a (type: int)
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Operator Tree:
Group By Operator
  aggregations: count(DISTINCT KEY._col1:0._col0)
  keys: KEY._col0 (type: int)
  mode: complete
  outputColumnNames: b, $f1
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
  Select Operator
expressions: $f1 (type: bigint)
outputColumnNames: _o__c0
Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
File Output Operator
  compressed: false
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

{code}

{code} explain select count(distinct a) from abcd group by c; {code}
{code}
STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: abcd
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column 
stats: NONE
Select Operator
  expressions: a (type: int)
  outputColumnNames: a
  Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Output Operator
key expressions: a (type: int), a (type: int)
sort order: ++
Map-reduce partition columns: a (type: int)
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Operator Tree:
Group By Operator
  aggregations: count(DISTINCT KEY._col1:0._col0)
  keys: KEY._col0 (type: int)
  mode: complete
  outputColumnNames: c, $f1
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
  Select Operator
expressions: $f1 (type: bigint)
outputColumnNames: _o__c0
Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE Column 
stats: NONE
File Output Operator
  compressed: false
  Statistics: Num rows: 9 Data size: 36 Basic stats: COMPLETE 
Column stats: NONE
  table:
  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
  output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

{code}

The above two cases have wrong keys in the map-side Reduce Output Operator 
(both have a, a instead of b, a and c, a respectively).

  was:
Reproducer

{code} set hive.cbo.returnpath.hiveop=true {code}
 set hive.map.aggr=false {code}

{code}
create table abcd (a int, b int, c int, d int);
LOAD DATA LOCAL INPATH '../../data/files/in4.txt' INTO TABLE abcd;
{code}

{code} explain select count(distinct a) from abcd group by b; {code}

{code}
STAGE PLANS:
  Stage: Stage-1
Map Reduce
  Map Operator Tree:
  TableScan
alias: abcd
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE Column 
stats: NONE
Select Operator
  expressions: a (type: int)
  outputColumnNames: a
  Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Output Operator
key expressions: a (type: int), a (type: int)
sort order: ++
Map-reduce partition columns: a (type: int)
Statistics: Num rows: 19 Data size: 78 Basic stats: COMPLETE 
Column stats: NONE
  Reduce Operator Tree:
Group By Operator
 

[jira] [Commented] (HIVE-14430) More instances of HiveConf and the associated UDFClassLoader than expected

2016-08-05 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409781#comment-15409781
 ] 

Siddharth Seth commented on HIVE-14430:
---

Not sure. I think the thread local instances are cleaned up at session shutdown?

cc [~vgumashta]

> More instances of HiveConf and the associated UDFClassLoader than expected
> --
>
> Key: HIVE-14430
> URL: https://issues.apache.org/jira/browse/HIVE-14430
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Reporter: Siddharth Seth
>Priority: Critical
>
> 841 instances of HiveConf.
> 831 instances of UDFClassLoader
> This is on a HS2 instance configured to run 10 concurrent queries with LLAP.
> 10 SessionState instances. Something is holding on to the additional 
> HiveConf, UDFClassLoaders - potentially HMSHandler.
> This is with an embedded metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14440) Fix default value of USE_DEPRECATED_CLI in cli.cmd

2016-08-05 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-14440:
---
Status: Patch Available  (was: Open)

> Fix default value of USE_DEPRECATED_CLI in cli.cmd
> --
>
> Key: HIVE-14440
> URL: https://issues.apache.org/jira/browse/HIVE-14440
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-14440.01.patch
>
>
> The cli.cmd script sets the default value of USE_DEPRECATED_CLI to false when 
> it is not set, which differs from cli.sh, which sets it to true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14440) Fix default value of USE_DEPRECATED_CLI in cli.cmd

2016-08-05 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-14440:
---
Description: The cli.cmd script sets the default value of USE_DEPRECATED_CLI to 
false when it is not set, which differs from cli.sh, which sets it to true.
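The discrepancy is just a defaulting rule applied when the environment variable is unset. A minimal Python model of that rule (illustrative only — the real logic lives in the cli.sh and cli.cmd launcher scripts, and the function name here is hypothetical):

```python
def resolve_use_deprecated_cli(env, script_default):
    """Return which CLI flag value the launcher ends up with.

    cli.sh behaves as if script_default were "true" (old CLI),
    while cli.cmd defaults to "false" (new Beeline-based CLI) --
    the mismatch this issue fixes. An explicit user setting always wins.
    """
    return env.get("USE_DEPRECATED_CLI", script_default)

# Unset variable: the two platforms disagree on which CLI launches.
unix_choice = resolve_use_deprecated_cli({}, "true")     # cli.sh
windows_choice = resolve_use_deprecated_cli({}, "false") # cli.cmd, pre-fix
# An explicit setting overrides either default.
explicit = resolve_use_deprecated_cli({"USE_DEPRECATED_CLI": "true"}, "false")
```

The fix simply aligns cli.cmd's fallback with cli.sh's, so the same command launches the same CLI on both platforms.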

> Fix default value of USE_DEPRECATED_CLI in cli.cmd
> --
>
> Key: HIVE-14440
> URL: https://issues.apache.org/jira/browse/HIVE-14440
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-14440.01.patch
>
>
> The cli.cmd script sets the default value of USE_DEPRECATED_CLI to false when 
> it is not set, which differs from cli.sh, which sets it to true.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14440) Fix default value of USE_DEPRECATED_CLI in cli.cmd

2016-08-05 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-14440:
---
Attachment: HIVE-14440.01.patch

> Fix default value of USE_DEPRECATED_CLI in cli.cmd
> --
>
> Key: HIVE-14440
> URL: https://issues.apache.org/jira/browse/HIVE-14440
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-14440.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14439) LlapTaskScheduler should try scheduling tasks when a node is disabled

2016-08-05 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14439:
--
Attachment: HIVE-14439.01.patch

cc [~prasanth_j], [~hagleitn] for review.

> LlapTaskScheduler should try scheduling tasks when a node is disabled
> -
>
> Key: HIVE-14439
> URL: https://issues.apache.org/jira/browse/HIVE-14439
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14439.01.patch
>
>
> When a node is disabled - try scheduling pending tasks. Tasks which may have 
> been waiting for the node to become available could become candidates for 
> scheduling on alternate nodes depending on the locality delay and disable 
> duration.
> This is what is causing an occasional timeout on 
> testDelayedLocalityNodeCommErrorImmediateAllocation
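
The idea in the description above can be sketched with a toy scheduler (illustrative only — class and method names here are invented and do not match the real LlapTaskScheduler API): disabling a node triggers an immediate scheduling pass, so tasks that were waiting on that node can fall back to an alternate node instead of timing out.

```python
class MiniLlapScheduler:
    """Toy model of the HIVE-14439 change: re-run scheduling when a node
    is disabled, not only when capacity frees up."""

    CAPACITY = 1  # one task per node, to keep the model tiny

    def __init__(self, nodes):
        self.enabled = set(nodes)
        self.running = {n: [] for n in nodes}  # node -> tasks placed on it
        self.pending = []                      # (task, preferred_node) waiting

    def submit(self, task, preferred):
        if preferred in self.enabled and len(self.running[preferred]) < self.CAPACITY:
            self.running[preferred].append(task)
        else:
            # Waiting for the preferred node (locality delay).
            self.pending.append((task, preferred))

    def disable_node(self, node):
        self.enabled.discard(node)
        # The fix: kick the scheduler right here, so tasks waiting for
        # `node` become candidates for alternate nodes immediately.
        self._schedule_pending()

    def _schedule_pending(self):
        still_waiting = []
        for task, preferred in self.pending:
            target = next((n for n in self.enabled
                           if len(self.running[n]) < self.CAPACITY), None)
            if target is not None:
                self.running[target].append(task)
            else:
                still_waiting.append((task, preferred))
        self.pending = still_waiting
```

For example, with nodes n1 and n2, a second task preferring a full n1 sits pending; disabling n1 immediately places it on n2 rather than leaving it to wait out the locality delay — the behavior the flaky test was sensitive to.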



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14439) LlapTaskScheduler should try scheduling tasks when a node is disabled

2016-08-05 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14439:
--
Status: Patch Available  (was: Open)

> LlapTaskScheduler should try scheduling tasks when a node is disabled
> -
>
> Key: HIVE-14439
> URL: https://issues.apache.org/jira/browse/HIVE-14439
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14439.01.patch
>
>
> When a node is disabled - try scheduling pending tasks. Tasks which may have 
> been waiting for the node to become available could become candidates for 
> scheduling on alternate nodes depending on the locality delay and disable 
> duration.
> This is what is causing an occasional timeout on 
> testDelayedLocalityNodeCommErrorImmediateAllocation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14439) LlapTaskScheduler should try scheduling tasks when a node is disabled

2016-08-05 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14439:
--
Attachment: HIVE-14439.02.patch

Updated to make the test failure more consistent.

> LlapTaskScheduler should try scheduling tasks when a node is disabled
> -
>
> Key: HIVE-14439
> URL: https://issues.apache.org/jira/browse/HIVE-14439
> Project: Hive
>  Issue Type: Bug
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14439.01.patch, HIVE-14439.02.patch
>
>
> When a node is disabled - try scheduling pending tasks. Tasks which may have 
> been waiting for the node to become available could become candidates for 
> scheduling on alternate nodes depending on the locality delay and disable 
> duration.
> This is what is causing an occasional timeout on 
> testDelayedLocalityNodeCommErrorImmediateAllocation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11943) Set old CLI as the default Client when using hive script

2016-08-05 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409751#comment-15409751
 ] 

Vihang Karajgaonkar commented on HIVE-11943:


Thanks for the confirmation. Created HIVE-14440 to address this.

> Set old CLI as the default Client when using hive script
> 
>
> Key: HIVE-11943
> URL: https://issues.apache.org/jira/browse/HIVE-11943
> Project: Hive
>  Issue Type: Sub-task
>  Components: CLI
>Affects Versions: beeline-cli-branch
>Reporter: Ferdinand Xu
>Assignee: Ferdinand Xu
> Fix For: beeline-cli-branch
>
> Attachments: HIVE-11943.1-beeline-cli.patch
>
>
> Since we have some concerns about deprecating the current CLI, we will set 
> the old CLI as default. Once we resolve the problems, we will set the new CLI 
> as default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10459) Add materialized views to Hive

2016-08-05 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10459:
--
Assignee: Jesus Camacho Rodriguez  (was: Julian Hyde)

> Add materialized views to Hive
> --
>
> Key: HIVE-10459
> URL: https://issues.apache.org/jira/browse/HIVE-10459
> Project: Hive
>  Issue Type: New Feature
>  Components: Views
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
>
> Materialized views are useful as ways to store either alternate versions of 
> data (e.g. same data, different sort order) or derivatives of data sets (e.g. 
> commonly used aggregates).  It is useful to store these as materialized views 
> rather than as tables because it can give the optimizer the ability to 
> understand how data sets are related.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14424) CLIRestoreTest failing in branch 2.1

2016-08-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409710#comment-15409710
 ] 

Hive QA commented on HIVE-14424:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12822040/HIVE-14424.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10440 tests 
executed
*Failed tests:*
{noformat}
TestMsgBusConnection - did not produce a TEST-*.xml file
TestQueryLifeTimeHook - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_orc_llap_counters
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/786/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/786/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-786/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12822040 - PreCommit-HIVE-MASTER-Build

> CLIRestoreTest failing in branch 2.1
> 
>
> Key: HIVE-14424
> URL: https://issues.apache.org/jira/browse/HIVE-14424
> Project: Hive
>  Issue Type: Bug
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Fix For: 1.3.0, 2.2.0, 2.1.1
>
> Attachments: HIVE-14424.1.patch, HIVE-14424.patch
>
>
> {noformat}
> java.lang.RuntimeException: Error applying authorization policy on hive 
> configuration: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at org.apache.hive.service.cli.CLIService.init(CLIService.java:113)
>   at 
> org.apache.hive.service.cli.CLIServiceRestoreTest.getService(CLIServiceRestoreTest.java:48)
>   at 
> org.apache.hive.service.cli.CLIServiceRestoreTest.(CLIServiceRestoreTest.java:28)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:195)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:244)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:241)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>   at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
>   at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:234)
>   at 
> com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:74)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassNotFoundException: 
> org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactoryForTest
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:836)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.applyAuthorizationPolicy(SessionState.java:1602)
>   at 
> org.apache.hive.service.cli.CLIService.applyAuthorizationConfigPolicy(CLIService.java:126)
>   at org.apache.hive.service.cli.CLIService.init(CLIService.java:110)

[jira] [Commented] (HIVE-14270) Write temporary data to HDFS when doing inserts on tables located on S3

2016-08-05 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409680#comment-15409680
 ] 

Ashutosh Chauhan commented on HIVE-14270:
-

Seems like failures are related.

> Write temporary data to HDFS when doing inserts on tables located on S3
> ---
>
> Key: HIVE-14270
> URL: https://issues.apache.org/jira/browse/HIVE-14270
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Sergio Peña
> Attachments: HIVE-14270.1.patch, HIVE-14270.2.patch, 
> HIVE-14270.3.patch, HIVE-14270.4.patch
>
>
> Currently, when doing INSERT statements on tables located at S3, Hive writes 
> and reads temporary (or intermediate) files to S3 as well. 
> If HDFS is still the default filesystem on Hive, then we can keep such 
> temporary files on HDFS to keep things running faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

