[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-18 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14501516#comment-14501516
 ] 

Alan Gates commented on HIVE-10228:
---

+1

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, 
 HIVE-10228.4.patch, HIVE-10228.5.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object. Also, import currently does not 
 work with dbname.tablename kind of specification, this should be fixed to 
 work.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14501585#comment-14501585
 ] 

Lefty Leverenz commented on HIVE-10228:
---

Doc note:  HIVE-10264 will document everything related to replication, 
including the configuration parameter added here 
(*hive.exim.strict.repl.tables*).

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 1.2.0

 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, 
 HIVE-10228.4.patch, HIVE-10228.5.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object. Also, import currently does not 
 work with dbname.tablename kind of specification, this should be fixed to 
 work.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-17 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14499370#comment-14499370
 ] 

Alan Gates commented on HIVE-10228:
---

Ok, so once the comments are added on why DROP TABLE FOR REPLICATION is 
different than IF EXISTS and the new JIRAs referenced in review board I filed 
I'm +1 on this patch.

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, 
 HIVE-10228.4.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object. Also, import currently does not 
 work with dbname.tablename kind of specification, this should be fixed to 
 work.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-17 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14500436#comment-14500436
 ] 

Sushanth Sowmyan commented on HIVE-10228:
-

Thanks Alan, I have created HIVE-10381 for that other issue, and added comments 
in code to expand on what DROP TABLE FOR REPLICATION is doing.

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, 
 HIVE-10228.4.patch, HIVE-10228.5.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object. Also, import currently does not 
 work with dbname.tablename kind of specification, this should be fixed to 
 work.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-17 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14500822#comment-14500822
 ] 

Sushanth Sowmyan commented on HIVE-10228:
-

The reported failed tests are not related to this patch.

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, 
 HIVE-10228.4.patch, HIVE-10228.5.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object. Also, import currently does not 
 work with dbname.tablename kind of specification, this should be fixed to 
 work.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-17 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14500790#comment-14500790
 ] 

Hive QA commented on HIVE-10228:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12726232/HIVE-10228.5.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8727 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_precision2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3480/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3480/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3480/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12726232 - PreCommit-HIVE-TRUNK-Build

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, 
 HIVE-10228.4.patch, HIVE-10228.5.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object. Also, import currently does not 
 work with dbname.tablename kind of specification, this should be fixed to 
 work.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, 

[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-16 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14497895#comment-14497895
 ] 

Alan Gates commented on HIVE-10228:
---

Wow, when I saw it was a 150K patch I was hoping it was mostly generated code.  
No such luck.

Code level comments on review board, higher level below:

This stuff needs some major doc work as you're introducing a new concept of a 
table being replicated or generated from replication.  Is there a doc JIRA for 
the replication work yet?  If so we should link it to this JIRA.

Parser changes:
I don't understand why DROP TABLE needs the replication clause.  As far as I 
can tell from the changes in DDLSemanticAnalyzer this is semantically 
equivalent to IF EXISTS.  Why not use that?

Adding METADATA and REPLICATION as keywords is not backwards compatible.  We 
either need to explicitly note that in this JIRA or add them to the list of 
reserved keywords allowed as identifiers in IdentifiersParser.g.  I suspect the 
latter is a better choice.







 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object. Also, import currently does not 
 work with dbname.tablename kind of specification, this should be fixed to 
 work.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-16 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14498877#comment-14498877
 ] 

Sushanth Sowmyan commented on HIVE-10228:
-

Sorry, yeah, this is a big patch. :)

It's really a cumulative patch of a bunch of work, but a lot of that was 
overwriting itself so much that splitting them out into a bunch of patches 
would have been difficult. Forking hive to do dev of this on a separate branch 
and merging in one go might have been easier.

I'd created https://issues.apache.org/jira/browse/HIVE-10264 as a doc jira, and 
I've attached a presentation-like document there outlining various points of 
why we're doing a bunch of what we're doing, but that still needs some 
wiki-fication that I am working on. I've also attached the replay-protocol 
document on that jira after updating it slightly with your question on DROP 
TABLE here.

I'll reply to code-level comments on review board, and reply to your 
higher-level comments here.

DROP TABLE : This is not quite a DROP TABLE IF EXISTS, it's a DROP TABLE IF 
OLDER THAN(x). There are a couple of cases this can happen in:

a) To make it more resilient in cases of parallelization of events (in the 
cases of a worker that times out and does not respond back, for eg., but might 
still be running, albeit slowly in the background), one of the goals of all 
Commands generated by Replication is that they should be idempotent, and 
reprocessing of events older than the state of an object should not cause any 
error. So, if one drone that's processing events (41,42,43) might perform 41 
and then not respond back for a significant amount of time, causing Falcon to 
queue another HiveDR job that starts performing (41,42,43), and 43 might return 
successfully before the other job performs 42, and then failing. So, one of the 
early design goals was that all commands should be resilient to repeats. This 
is a way of achieving that goal.

b) In the case of a 
CREATE1-DROP1-CREATE2-REPL(CREATE1)-REPL(DROP1)-REPL(CREATE2), since the 
REPL(CREATE1) occurs after CREATE2, it picks up a newer state of the table, and 
the destination is at a newer state than the table which was dropped. Thus, by 
making the DROP ignore the destination table if it's already newer than the 
event that spawned the DROP, we can optimize away a bit of re-importing that 
REPL(CREATE2) would have needed to do. In the future, we'll add in 
event-nullification, and can do it at a higher level if we batch events, but 
this helps out even when processing at an individual level.

c) In addition to a DROP-IF-OLDER, it also acts like a recursive 
DROP-TABLE-IF-OLDER for cases where it doesn't result in the dropping of the 
table, it will still result in dropping older partitions in a newer table. For 
eg., if a T(state=50) has partitions P1(state=45) and P2(state=53), then 
DROP_TABLE_IF_OLDER_THAN(47) will drop P1 but not P2. This is because a 
Drop-table event does not result in a series of DropPtn events that are 
associated with the appropriate table. So, given that our replication works on 
an per-object basis, if DropTable should not drop the destination table because 
the destination table is newer than the origin table at the time of the drop, 
it might still contain older partitions which should be nuked. (This mode is 
tested in one of the tests in TestCommands in HIVE-10227 if you want to have a 
look at an example of what's expected)

--

Regarding the kewword addition, thanks for the feedback, it was not my intent 
to make them reserved keywords. I talked to [~pxiong] and [~ashutoshc] about 
it, and the latter is the way that makes sense. As long as I add them to the 
nonReserved entry in IdentifiersParser.g, it should be good. So, I'll add that 
in and have another update here.


 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table 

[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-15 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14496376#comment-14496376
 ] 

Sushanth Sowmyan commented on HIVE-10228:
-

RB link : https://reviews.apache.org/r/33224/

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object. Also, import currently does not 
 work with dbname.tablename kind of specification, this should be fixed to 
 work.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-14 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14495671#comment-14495671
 ] 

Vaibhav Gumashta commented on HIVE-10228:
-

[~sushanth] If possible an RB entry will help a lot. Thanks.

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14492982#comment-14492982
 ] 

Hive QA commented on HIVE-10228:




{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12724991/HIVE-10228.2.patch

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3408/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3408/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3408/

Messages:
{noformat}
 This message was trimmed, see log for full details 
Reverted 'service/src/gen/thrift/gen-py/hive_service/constants.py'
Reverted 'service/src/gen/thrift/gen-py/hive_service/ThriftHive-remote'
Reverted 'service/src/gen/thrift/gen-py/TCLIService/ttypes.py'
Reverted 'service/src/gen/thrift/gen-py/TCLIService/TCLIService.py'
Reverted 'service/src/gen/thrift/gen-py/TCLIService/constants.py'
Reverted 'service/src/gen/thrift/gen-py/TCLIService/TCLIService-remote'
Reverted 'service/src/gen/thrift/gen-cpp/hive_service_types.cpp'
Reverted 'service/src/gen/thrift/gen-cpp/TCLIService_types.cpp'
Reverted 'service/src/gen/thrift/gen-cpp/TCLIService.h'
Reverted 'service/src/gen/thrift/gen-cpp/ThriftHive.h'
Reverted 'service/src/gen/thrift/gen-cpp/hive_service_types.h'
Reverted 'service/src/gen/thrift/gen-cpp/TCLIService_types.h'
Reverted 'service/src/gen/thrift/gen-cpp/hive_service_constants.cpp'
Reverted 'service/src/gen/thrift/gen-cpp/TCLIService_constants.cpp'
Reverted 'service/src/gen/thrift/gen-cpp/TCLIService.cpp'
Reverted 'service/src/gen/thrift/gen-cpp/ThriftHive.cpp'
Reverted 'service/src/gen/thrift/gen-cpp/hive_service_constants.h'
Reverted 'service/src/gen/thrift/gen-cpp/TCLIService_constants.h'
Reverted 'service/src/gen/thrift/gen-rb/hive_service_types.rb'
Reverted 'service/src/gen/thrift/gen-rb/t_c_l_i_service_constants.rb'
Reverted 'service/src/gen/thrift/gen-rb/hive_service_constants.rb'
Reverted 'service/src/gen/thrift/gen-rb/t_c_l_i_service.rb'
Reverted 'service/src/gen/thrift/gen-rb/thrift_hive.rb'
Reverted 'service/src/gen/thrift/gen-rb/t_c_l_i_service_types.rb'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/service/HiveClusterStatus.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/service/HiveServerException.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/service/JobTrackerState.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/service/ThriftHive.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCancelOperationReq.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStatusCode.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTypeQualifierValue.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetFunctionsReq.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTypeDesc.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCloseSessionReq.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchResultsReq.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TRowSet.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TStringColumn.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTableTypesReq.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TCLIServiceConstants.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetCatalogsResp.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetColumnsReq.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TI16Value.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TByteValue.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TMapTypeEntry.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetFunctionsResp.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TBinaryColumn.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TTypeEntry.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TFetchOrientation.java'
Reverted 
'service/src/gen/thrift/gen-javabean/org/apache/hive/service/cli/thrift/TGetTableTypesResp.java'
Reverted 

[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-13 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493208#comment-14493208
 ] 

Hive QA commented on HIVE-10228:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12725034/HIVE-10228.3.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8677 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3411/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3411/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3411/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12725034 - PreCommit-HIVE-TRUNK-Build

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property 

[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-13 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14493210#comment-14493210
 ] 

Sushanth Sowmyan commented on HIVE-10228:
-

Note : Visiting the test report page 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3411/testReport/
 shows 0 failures. The issues reported above are from TestMinimrCliDriver not 
producing TEST-*.xml files, which are unrelated to this patch.

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.2.patch, HIVE-10228.3.patch, HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10228) Changes to Hive Export/Import/DropTable/DropPartition to support replication semantics

2015-04-10 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14490439#comment-14490439
 ] 

Sushanth Sowmyan commented on HIVE-10228:
-

Jumbo patch. [~alangates], could you please take a look?

 Changes to Hive Export/Import/DropTable/DropPartition to support replication 
 semantics
 --

 Key: HIVE-10228
 URL: https://issues.apache.org/jira/browse/HIVE-10228
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10228.patch


 We need to update a couple of hive commands to support replication semantics. 
 To wit, we need the following:
 EXPORT ... [FOR [METADATA] REPLICATION(“comment”)]
 Export will now support an extra optional clause to tell it that this export 
 is being prepared for the purpose of replication. There is also an additional 
 optional clause here, that allows for the export to be a metadata-only 
 export, to handle cases of capturing the diff for alter statements, for 
 example.
 Also, if done for replication, the non-presence of a table, or a table being 
 a view/offline table/non-native table is not considered an error, and 
 instead, will result in a successful no-op.
 IMPORT ... (as normal) – but handles new semantics 
 No syntax changes for import, but import will have to change to be able to 
 handle all the permutations of export dumps possible. Also, import will have 
 to ensure that it should update the object only if the update being imported 
 is not older than the state of the object.
 DROP TABLE ... FOR REPLICATION('eventid')
 Drop Table now has an additional clause, to specify that this drop table is 
 being done for replication purposes, and that the dop should not actually 
 drop the table if the table is newer than that event id specified.
 ALTER TABLE ... DROP PARTITION (...) FOR REPLICATION('eventid')
 Similarly, Drop Partition also has an equivalent change to Drop Table.
 =
 In addition, we introduce a new property repl.last.id, which when tagged on 
 to table properties or partition properties on a replication-destination, 
 holds the effective state identifier of the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)