[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-23 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15297281#comment-15297281
 ] 

Matt McCline commented on HIVE-12643:

LGTM +1

> For self describing InputFormat don't replicate schema information in partitions
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 2.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, HIVE-12643.3.patch, HIVE-12643.3.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.
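
The idea in the description can be illustrated with a minimal sketch. This is not the HIVE-12643 patch: the class `PartitionPlanSketch`, the method `propertiesForPlan`, and the `selfDescribing` flag are hypothetical names chosen for illustration, and the property keys are assumed to be the ones that carry column schema. The sketch only shows why dropping per-partition schema entries shrinks what is shipped to tasks when the input format reads its schema from the data files.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Hive's actual plan classes.
class PartitionPlanSketch {
    // Assumed keys under which a partition carries its column schema.
    static final String COLUMNS = "columns";
    static final String COLUMN_TYPES = "columns.types";

    /**
     * Returns a copy of the partition properties to serialize into the plan.
     * For a self-describing input format the schema entries are omitted,
     * because the reader resolves the schema from the file itself.
     */
    public static Map<String, String> propertiesForPlan(
            Map<String, String> partitionProps, boolean selfDescribing) {
        Map<String, String> out = new HashMap<>(partitionProps);
        if (selfDescribing) {
            out.remove(COLUMNS);
            out.remove(COLUMN_TYPES);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(COLUMNS, "id,name");
        props.put(COLUMN_TYPES, "int:string");
        props.put("location", "/warehouse/t/ds=1");

        // Compare how many entries would be shipped per partition.
        System.out.println("full: " + propertiesForPlan(props, false).size());
        System.out.println("slim: " + propertiesForPlan(props, true).size());
    }
}
```

With many partitions, the saving is repeated once per partition, which is where the reduction in plan size would come from.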



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-17 Thread Nita Dembla (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288149#comment-15288149
 ] 

Nita Dembla commented on HIVE-12643:


I've tested a slightly modified version of the patch. The original changes to 
the following files were rejected:
- ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java
- ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java

And the following file needed modifications:
- ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java





[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-17 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15288034#comment-15288034
 ] 

Ashutosh Chauhan commented on HIVE-12643:

[~mmccline] Can you also please take a look at this one? [~ndembla] tried it 
out and reported this gives us a speed-up in query compile time.



[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2016-05-16 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284340#comment-15284340
 ] 

Hive QA commented on HIVE-12643:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12804105/HIVE-12643.3.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 76 failed/errored test(s), 9814 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
TestMiniLlapCliDriver - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_join1.q-schema_evol_text_vec_mapwork_part_all_complex.q-vector_complex_join.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_join30.q-vector_decimal_10_0.q-acid_globallimit.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_sortmerge_join_16.q-skewjoin.q-vectorization_div0.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-auto_sortmerge_join_7.q-orc_merge9.q-tez_union_dynamic_partition.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-cte_4.q-vector_non_string_partition.q-delete_where_non_partitioned.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-union5.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-load_dyn_part2.q-selectDistinctStar.q-vector_decimal_5.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-schema_evol_text_nonvec_mapwork_table.q-vector_decimal_trailing.q-subquery_in.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-smb_cache.q-transform_ppr2.q-vector_outer_join0.q-and-5-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-update_orig_table.q-union2.q-bucket4.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_distinct_2.q-tez_joins_explain.q-cte_mat_1.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vector_interval_2.q-schema_evol_text_nonvec_mapwork_part_all_primitive.q-tez_fsstat.q-and-12-more - did not produce a TEST-*.xml file
TestMiniTezCliDriver-vectorized_parquet.q-insert_values_non_partitioned.q-schema_evol_orc_nonvec_mapwork_part.q-and-12-more - did not produce a TEST-*.xml file
TestPTFRowContainer - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby10.q-groupby4_noskew.q-union5.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join9.q-join_casesensitive.q-filter_join_breaktask.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_cond_pushdown_3.q-groupby7.q-auto_join17.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_cond_pushdown_unqual4.q-bucketmapjoin12.q-avro_decimal_native.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-multi_insert.q-join5.q-groupby6.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-stats13.q-stats2.q-ppd_gby_join.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_mult_tables_compact
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join3
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join_stats2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucket_map_join_tez2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_custom_input_output_format
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby2_map
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby3_noskew_multi_distinct
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby6_map_skew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_map_multi_single_reducer
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_groupby7_map_skew
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_input14
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join26
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join39
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_limit_pushdown
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part10
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part11
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_load_dyn_part12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_merge2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_nullgroup2

[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2015-12-12 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15054367#comment-15054367
 ] 

Hive QA commented on HIVE-12643:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777237/HIVE-12643.2.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9896 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.spark.client.TestSparkClient.testAddJarsAndFiles
org.apache.hive.spark.client.TestSparkClient.testCounters
org.apache.hive.spark.client.TestSparkClient.testErrorJob
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
org.apache.hive.spark.client.TestSparkClient.testMetricsCollection
org.apache.hive.spark.client.TestSparkClient.testRemoteClient
org.apache.hive.spark.client.TestSparkClient.testSimpleSparkJob
org.apache.hive.spark.client.TestSparkClient.testSyncRpc
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6331/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6331/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6331/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777237 - PreCommit-HIVE-TRUNK-Build

> For self describing InputFormat don't replicate schema information in partitions
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 2.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.2.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.





[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2015-12-11 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15053813#comment-15053813
 ] 

Hive QA commented on HIVE-12643:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12777180/HIVE-12643.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 9894 tests executed
*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_udf_max
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_order2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_tblproperty
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_partition_diff_num_cols
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_partition_diff_num_cols
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.TestHiveMetaStorePartitionSpecs.testGetPartitionSpecs_WithAndWithoutPartitionGrouping
org.apache.hive.jdbc.TestSSL.testSSLVersion
org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6322/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6322/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6322/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12777180 - PreCommit-HIVE-TRUNK-Build

> For self describing InputFormat don't replicate schema information in partitions
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 2.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.1.patch, HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.





[jira] [Commented] (HIVE-12643) For self describing InputFormat don't replicate schema information in partitions

2015-12-10 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15052200#comment-15052200
 ] 

Hive QA commented on HIVE-12643:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12776691/HIVE-12643.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6313/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6313/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6313/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: ExecutionException: java.util.concurrent.ExecutionException: org.apache.hive.ptest.execution.ssh.SSHExecutionException: RSyncResult [localFile=/data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-6313/succeeded/TestCliDriver-cp_mj_rc.q-udf_stddev_pop.q-mapreduce2.q-and-12-more, remoteFile=/home/hiveptest/50.16.94.163-hiveptest-2/logs/, getExitCode()=12, getException()=null, getUser()=hiveptest, getHost()=50.16.94.163, getInstance()=2]: 'ssh_exchange_identification: Connection closed by remote host
rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [receiver=3.0.6]
ssh_exchange_identification: Connection closed by remote host
rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [receiver=3.0.6]
ssh_exchange_identification: Connection closed by remote host
rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [receiver=3.0.6]
ssh_exchange_identification: Connection closed by remote host
rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [receiver=3.0.6]
ssh_exchange_identification: Connection closed by remote host
rsync: connection unexpectedly closed (0 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [receiver=3.0.6]
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12776691 - PreCommit-HIVE-TRUNK-Build

> For self describing InputFormat don't replicate schema information in partitions
>
> Key: HIVE-12643
> URL: https://issues.apache.org/jira/browse/HIVE-12643
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Affects Versions: 2.0.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Attachments: HIVE-12643.patch
>
>
> Since self describing Input Formats don't use individual partition schemas 
> for schema resolution, there is no need to send that info to tasks.
> Doing this should cut down plan size.


