[jira] [Updated] (HIVE-2773) HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output
[ https://issues.apache.org/jira/browse/HIVE-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-2773:
---
Resolution: Fixed
Fix Version/s: 0.9.0
Status: Resolved (was: Patch Available)

Committed to trunk. Thanks, Francis!

HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output
---
Key: HIVE-2773
URL: https://issues.apache.org/jira/browse/HIVE-2773
Project: Hive
Issue Type: Improvement
Reporter: Francis Liu
Assignee: Francis Liu
Labels: hcatalog, storage_handler
Fix For: 0.9.0
Attachments: HIVE-2773.D1815.1.patch, HIVE-2773.D2007.1.patch, HIVE-2773.D2415.1.patch, HIVE-2773.patch

HiveStorageHandler.configureTableJobProperties() is called to allow the storage handler to set up any properties that the underlying inputformat/outputformat/serde may need. But the handler implementation does not know whether it is being called to configure input or output. This is a problem for handlers that set external state. In the case of HCatalog's HBase storageHandler, whenever a write needs to be configured we create a write transaction, which needs to be committed or aborted later on. In this case, configuring for both input and output each time configureTableJobProperties() is called would not be desirable. This has become an issue since HCatalog is dropping storageDrivers for SerDe and StorageHandler (see HCATALOG-237).

My proposal is to replace configureTableJobProperties() with two methods:

configureInputJobProperties()
configureOutputJobProperties()

Each method will have the same signature. I took a cursory look at the code and I believe the changes should be straightforward, given that we are not really changing anything, just splitting responsibility. If the community is fine with this approach I will go ahead and create a patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
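The split proposed in the issue description above can be illustrated with a minimal Java sketch. Everything in it is an assumption made for illustration: `SketchStorageHandler` is a simplified stand-in for Hive's HiveStorageHandler (a plain table name is passed here to keep the example self-contained), and `TransactionalSketchHandler`, the `sketch.*` property keys, and the transaction counter are hypothetical. It only shows why a handler with an external side effect benefits from knowing whether it is configuring input or output.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for HiveStorageHandler. Only the two
// proposed methods are modeled; both share the same signature.
interface SketchStorageHandler {
    void configureInputJobProperties(String tableName, Map<String, String> jobProperties);

    void configureOutputJobProperties(String tableName, Map<String, String> jobProperties);
}

// Hypothetical handler whose write path has an external side effect
// (modeled here as a counter of opened write transactions).
class TransactionalSketchHandler implements SketchStorageHandler {
    private int openWriteTransactions = 0;

    @Override
    public void configureInputJobProperties(String tableName, Map<String, String> jobProperties) {
        // Reads only need properties; no external state is touched.
        jobProperties.put("sketch.input.table", tableName);
    }

    @Override
    public void configureOutputJobProperties(String tableName, Map<String, String> jobProperties) {
        // Writes open a transaction that must later be committed or aborted.
        openWriteTransactions++;
        jobProperties.put("sketch.output.table", tableName);
        jobProperties.put("sketch.write.txn", Integer.toString(openWriteTransactions));
    }

    int getOpenWriteTransactions() {
        return openWriteTransactions;
    }
}

class SplitConfigDemo {
    public static void main(String[] args) {
        TransactionalSketchHandler handler = new TransactionalSketchHandler();

        Map<String, String> readProps = new HashMap<>();
        handler.configureInputJobProperties("web_logs", readProps);
        System.out.println("after input config: " + handler.getOpenWriteTransactions());

        Map<String, String> writeProps = new HashMap<>();
        handler.configureOutputJobProperties("web_logs", writeProps);
        System.out.println("after output config: " + handler.getOpenWriteTransactions());
    }
}
```

With a single combined method, the handler in this sketch would have to open a transaction on every call, including calls made only to configure reads.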
[jira] [Updated] (HIVE-2773) HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output
[ https://issues.apache.org/jira/browse/HIVE-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-2773:
--
Attachment: HIVE-2773.D2415.1.patch

toffer requested code review of HIVE-2773 [jira] HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output.

Reviewers: JIRA

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D2415

AFFECTED FILES
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/metadata/DefaultStorageHandler.java
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java

HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output
---
Key: HIVE-2773
URL: https://issues.apache.org/jira/browse/HIVE-2773
Project: Hive
Issue Type: Improvement
Reporter: Francis Liu
Assignee: Francis Liu
Labels: hcatalog, storage_handler
Attachments: HIVE-2773.D1815.1.patch, HIVE-2773.D2007.1.patch, HIVE-2773.D2415.1.patch, HIVE-2773.patch
[jira] [Updated] (HIVE-2773) HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output
[ https://issues.apache.org/jira/browse/HIVE-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-2773:
--
Attachment: HIVE-2773.D2007.1.patch

toffer requested code review of HIVE-2773 [jira] HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output.

Reviewers: JIRA, cwsteinbach, ashutoshc

This patch keeps backward compatibility for existing storage handlers that implement the deprecated method. The patch is built against 0.8-r2.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D2007

AFFECTED FILES
  hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseStorageHandler.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/metadata/DefaultStorageHandler.java
  ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java

HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output
---
Key: HIVE-2773
URL: https://issues.apache.org/jira/browse/HIVE-2773
Project: Hive
Issue Type: Improvement
Reporter: Francis Liu
Assignee: Francis Liu
Labels: hcatalog, storage_handler
Attachments: HIVE-2773.D1815.1.patch, HIVE-2773.D2007.1.patch, HIVE-2773.patch
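The review request above says the patch keeps older storage handlers working through the deprecated method, but this thread does not show how. One possible way to get that effect, sketched here under stated assumptions, is a base class whose new methods fall back to the old one. `LegacyCompatibleHandler`, `OldStyleHandler`, and the `sketch.table` key are hypothetical names, and the `String tableName` parameter is a simplification; this is not taken from the patch itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical base class: the two new methods delegate to the old combined
// method unless a subclass overrides them.
abstract class LegacyCompatibleHandler {
    /** Old combined entry point, kept so existing handlers keep working. */
    @Deprecated
    public abstract void configureTableJobProperties(String tableName, Map<String, String> jobProperties);

    public void configureInputJobProperties(String tableName, Map<String, String> jobProperties) {
        configureTableJobProperties(tableName, jobProperties);
    }

    public void configureOutputJobProperties(String tableName, Map<String, String> jobProperties) {
        configureTableJobProperties(tableName, jobProperties);
    }
}

// Hypothetical handler written before the split; it only knows the old method.
@SuppressWarnings("deprecation")
class OldStyleHandler extends LegacyCompatibleHandler {
    int legacyCalls = 0;

    @Override
    public void configureTableJobProperties(String tableName, Map<String, String> jobProperties) {
        legacyCalls++;
        jobProperties.put("sketch.table", tableName);
    }
}

class BackwardCompatDemo {
    public static void main(String[] args) {
        OldStyleHandler handler = new OldStyleHandler();
        Map<String, String> props = new HashMap<>();

        // Callers use only the new methods; the old handler still gets configured.
        handler.configureInputJobProperties("web_logs", props);
        handler.configureOutputJobProperties("web_logs", props);

        System.out.println("legacy calls: " + handler.legacyCalls);
        System.out.println("sketch.table = " + props.get("sketch.table"));
    }
}
```

A handler that needs to distinguish reads from writes would override the two new methods directly, while handlers that do not care keep their single legacy implementation.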
[jira] [Updated] (HIVE-2773) HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output
[ https://issues.apache.org/jira/browse/HIVE-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francis Liu updated HIVE-2773:
--
Attachment: HIVE-2773.patch

HiveStorageHandler.configureTableJobProperites() should let the handler know wether it is configuration for input or output
---
Key: HIVE-2773
URL: https://issues.apache.org/jira/browse/HIVE-2773
Project: Hive
Issue Type: Improvement
Reporter: Francis Liu
Labels: hcatalog, storage_handler
Attachments: HIVE-2773.patch