[jira] [Assigned] (TEZ-3420) Parallel queries to HS2/Tez not thread safe (local mode)

2019-05-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned TEZ-3420:


Assignee: (was: Todd Lipcon)

I gave up on this. See HIVE-21682 for some details on why.

> Parallel queries to HS2/Tez not thread safe (local mode)
> 
>
> Key: TEZ-3420
> URL: https://issues.apache.org/jira/browse/TEZ-3420
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.4
> Environment: HiveServer2 1.2.1 Local mode + Tez
>Reporter: Uday Chitragar
>Priority: Major
> Attachments: hive.log.submit.gz
>
>
> When running parallel queries (simultaneous connections by two beeline 
> clients to HS2), I get the following exception (full debug log attached). 
> Interestingly, running the queries one after the other completes without any 
> problem. The partition location and actual files seem to get mixed up across 
> the DAGs.
>  
> The setup is Hive (1.2.1) and Tez (0.8.4) running in local mode.
> {noformat} 
> 2016-08-25 15:45:41,333 DEBUG [TezTaskEventRouter{attempt_1472136335089_0001_1_01_00_0}]: impl.ShuffleInputEventHandlerImpl (ShuffleInputEventHandlerImpl.java:processDataMovementEvent(127)) - DME srcIdx: 0, targetIndex: 9, attemptNum: 0, payload: [hasEmptyPartitions: true, host: , port: 0, pathComponent: , runDuration: 0]
> 2016-08-25 15:45:41,557 ERROR [TezChild]: tez.MapRecordSource (MapRecordSource.java:processRow(90)) - java.lang.IllegalStateException: Invalid input path file:/acorn/QC/OraExtract/20160131/Devices/Devices_extract_20160229T080613_3
>     at org.apache.hadoop.hive.ql.exec.MapOperator.getNominalPath(MapOperator.java:415)
>     at org.apache.hadoop.hive.ql.exec.MapOperator.cleanUpInputFileChangedOp(MapOperator.java:457)
>     at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1069)
>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:501)
>
> 2016-08-25 15:45:41,817 INFO  [TezChild]: io.HiveContextAwareRecordReader (HiveContextAwareRecordReader.java:doNext(326)) - Cannot get partition description from file:/acorn/QC/reportlib/VM_ValEdit.24656because cannot find dir = file:/acorn/QC/reportlib/VM_ValEdit.24656 in pathToPartitionInfo: [file:/acorn/QC/OraExtract/20160131/Devices]
> {noformat}
> Perhaps clashing directories for intermediate data might be causing an issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (TEZ-3310) Handle splits grouping better when locality information is not available (or only when localhost is available)

2019-05-03 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832940#comment-16832940
 ] 

Todd Lipcon commented on TEZ-3310:
--

Sure enough, this is causing problems for pseudo-distributed-cluster testing. 
The "min split length" config gets ignored because all of the splits are on 
localhost, and thus queries have different behavior on this cluster than on a 
remote one.
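
To make the expectation concrete, here is a minimal sketch of grouping that honors a minimum group length without looking at locality at all. It is not Tez's split grouper: the class name, the method name, and the greedy strategy are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of Tez. */
final class MinLengthGroupingSketch {

  /**
   * Greedily packs split lengths (in bytes) into groups whose total length is
   * at least minGroupLength. Host locality is deliberately ignored, so the
   * result is the same whether every split reports "localhost" or not.
   * The final group may be shorter than minGroupLength.
   */
  static List<List<Long>> group(List<Long> splitLengths, long minGroupLength) {
    List<List<Long>> groups = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentLength = 0;
    for (long length : splitLengths) {
      current.add(length);
      currentLength += length;
      if (currentLength >= minGroupLength) {
        // The group is large enough; close it and start a new one.
        groups.add(current);
        current = new ArrayList<>();
        currentLength = 0;
      }
    }
    if (!current.isEmpty()) {
      groups.add(current); // leftover splits form a final, possibly short, group
    }
    return groups;
  }
}
```

With five splits of 4 bytes each and a minimum group length of 8, this yields three groups of 8, 8, and 4 bytes.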

> Handle splits grouping better when locality information is not available (or 
> only when localhost is available)
> --
>
> Key: TEZ-3310
> URL: https://issues.apache.org/jira/browse/TEZ-3310
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> This is a follow up JIRA to TEZ-3291. TEZ-3291 tries to handle the case when 
> only localhost is specified in the locations. It would be good to improve 
> handling of splits grouping when Tez does not have enough information about 
> the locality.





[jira] [Commented] (TEZ-3420) Parallel queries to HS2/Tez not thread safe (local mode)

2019-05-02 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831937#comment-16831937
 ] 

Todd Lipcon commented on TEZ-3420:
--

Tracked this down to a couple of things:
(1) Tez side: LocalClient uses a timestamp to generate appIds, so if multiple 
clients start at the same time, they'll conflict and cause problems.
(2) Hive side: The current implementation of IOContext uses global statics in 
the case of Tez, so tasks overwrite each other's IOContexts. Switching that to 
be keyed on an attempt ID (similar to what's done with LLAP) ought to fix it, 
but that's a Hive-side change.
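
The two causes above can be illustrated with a minimal Java sketch. It is not the LocalClient or IOContext code: the class, the method names, and the ID format below are hypothetical, and the counter-based ID is only one possible way to avoid collisions, not necessarily what an actual fix would do.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration; none of these names are Tez or Hive APIs. */
final class LocalModeIsolationSketch {

  // (1) An ID derived only from a clock reading: two clients that start in
  // the same millisecond end up with the same ID.
  static String timestampOnlyId(long timestampMillis) {
    return "application_" + timestampMillis + "_0001";
  }

  // One possible remedy (an assumption): add a process-wide counter so that
  // IDs stay distinct even when the timestamps are equal.
  private static final AtomicInteger NEXT_SEQUENCE = new AtomicInteger(1);

  static String timestampPlusCounterId(long timestampMillis) {
    return "application_" + timestampMillis + "_" + NEXT_SEQUENCE.getAndIncrement();
  }

  // (2) Per-task state kept in a registry keyed by attempt ID instead of in a
  // single global static, so concurrent tasks in one JVM do not overwrite
  // each other's state.
  static final class InputContext {
    volatile String inputPath;
  }

  private static final Map<String, InputContext> CONTEXTS = new ConcurrentHashMap<>();

  static InputContext contextFor(String attemptId) {
    return CONTEXTS.computeIfAbsent(attemptId, id -> new InputContext());
  }

  // Drop the entry when the attempt finishes so the registry does not grow.
  static void release(String attemptId) {
    CONTEXTS.remove(attemptId);
  }
}
```

In this sketch two attempts that run at the same time each see only the input path they stored themselves, which is the isolation the global-static approach lacks.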

> Parallel queries to HS2/Tez not thread safe (local mode)
> 
>
> Key: TEZ-3420
> URL: https://issues.apache.org/jira/browse/TEZ-3420
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.4
> Environment: HiveServer2 1.2.1 Local mode + Tez
>Reporter: Uday Chitragar
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive.log.submit.gz
>
>





[jira] [Assigned] (TEZ-3420) Parallel queries to HS2/Tez not thread safe (local mode)

2019-05-02 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned TEZ-3420:


Assignee: Todd Lipcon

> Parallel queries to HS2/Tez not thread safe (local mode)
> 
>
> Key: TEZ-3420
> URL: https://issues.apache.org/jira/browse/TEZ-3420
> Project: Apache Tez
>  Issue Type: Bug
>Affects Versions: 0.8.4
> Environment: HiveServer2 1.2.1 Local mode + Tez
>Reporter: Uday Chitragar
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive.log.submit.gz
>
>





[jira] [Commented] (TEZ-1348) Allow Tez local mode to run against filesystems other than local FS

2019-04-29 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829615#comment-16829615
 ] 

Todd Lipcon commented on TEZ-1348:
--

It looks like the wrong revision of the patch got committed to master (the one 
with the checkstyle warnings). 
https://issues.apache.org/jira/secure/attachment/12966987/tez-1348.txt should 
be the right one. [~sseth] mind reverting and recommitting?

> Allow Tez local mode to run against filesystems other than local FS
> ---
>
> Key: TEZ-1348
> URL: https://issues.apache.org/jira/browse/TEZ-1348
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.10.1
>
> Attachments: tez-1348.patch, tez-1348.patch, tez-1348.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In TEZ-717, I incorrectly thought setting fs.defaultFS programmatically in 
> tez-site would work for local mode.
> Currently the requirement is that tez-site.xml must have fs.defaultFS set to 
> file:///.
> While that works, it doesn't allow for seamless execution in either 
> local-mode or on a cluster.
> The main issue here is that when Inputs / Outputs are configured - they use a 
> version of configuration which reads tez-site, and do not use the 
> configuration from the client itself (which is correct behaviour).
> Not sure what a good way to fix this is:
> 1) It may be possible to override this value each time an instance of 
> Configuration/TezConfiguration is created. One possible way would be to 
> statically add a default resource to Configuration the moment a local client 
> is created.
> 2) Provide information in the contexts on whether this is local or not. This 
> is fairly ugly, and would get in the way of running mixed mode tasks.
> Does anyone have other suggestions?
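
Option 1 in the description can be pictured with a toy model of layered defaults. It only imitates the idea of a process-wide default resource that every newly created configuration picks up; LayeredConfSketch and its methods are hypothetical and are not Hadoop's Configuration or TezConfiguration, and the namenode address is made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy model; not the Hadoop or Tez configuration classes. */
final class LayeredConfSketch {

  // Process-wide default layers, applied in the order they were added.
  // A later layer overrides an earlier one for the same key.
  private static final List<Map<String, String>> DEFAULT_LAYERS = new ArrayList<>();

  static synchronized void addDefaultLayer(Map<String, String> layer) {
    DEFAULT_LAYERS.add(new HashMap<>(layer));
  }

  private final Map<String, String> values = new HashMap<>();

  // Every new instance starts from all default layers registered so far.
  LayeredConfSketch() {
    synchronized (LayeredConfSketch.class) {
      for (Map<String, String> layer : DEFAULT_LAYERS) {
        values.putAll(layer);
      }
    }
  }

  String get(String key) {
    return values.get(key);
  }
}
```

If a site layer sets fs.defaultFS to hdfs://namenode:8020 and a local client later registers a layer that sets it to file:///, every instance created after that point reads file:///, including instances created by code that never saw the client's own configuration.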





[jira] [Commented] (TEZ-1348) Setup configs required for local mode automatically, instead of relying on changes to tez-site

2019-04-25 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825776#comment-16825776
 ] 

Todd Lipcon commented on TEZ-1348:
--

I think checkstyle should be fixed now. Let's see.

> Setup configs required for local mode automatically, instead of relying on 
> changes to tez-site
> --
>
> Key: TEZ-1348
> URL: https://issues.apache.org/jira/browse/TEZ-1348
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: tez-1348.patch, tez-1348.patch, tez-1348.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Updated] (TEZ-1348) Setup configs required for local mode automatically, instead of relying on changes to tez-site

2019-04-25 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated TEZ-1348:
-
Attachment: tez-1348.txt

> Setup configs required for local mode automatically, instead of relying on 
> changes to tez-site
> --
>
> Key: TEZ-1348
> URL: https://issues.apache.org/jira/browse/TEZ-1348
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: tez-1348.patch, tez-1348.patch, tez-1348.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Updated] (TEZ-1348) Setup configs required for local mode automatically, instead of relying on changes to tez-site

2019-04-22 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated TEZ-1348:
-
Attachment: tez-1348.patch

> Setup configs required for local mode automatically, instead of relying on 
> changes to tez-site
> --
>
> Key: TEZ-1348
> URL: https://issues.apache.org/jira/browse/TEZ-1348
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: tez-1348.patch, tez-1348.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Updated] (TEZ-1348) Setup configs required for local mode automatically, instead of relying on changes to tez-site

2019-04-09 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated TEZ-1348:
-
Target Version/s: 0.10.0  (was: 0.8.6)

> Setup configs required for local mode automatically, instead of relying on 
> changes to tez-site
> --
>
> Key: TEZ-1348
> URL: https://issues.apache.org/jira/browse/TEZ-1348
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: tez-1348.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Updated] (TEZ-1348) Setup configs required for local mode automatically, instead of relying on changes to tez-site

2019-04-09 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated TEZ-1348:
-
Attachment: tez-1348.patch

> Setup configs required for local mode automatically, instead of relying on 
> changes to tez-site
> --
>
> Key: TEZ-1348
> URL: https://issues.apache.org/jira/browse/TEZ-1348
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: tez-1348.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Commented] (TEZ-1348) Setup configs required for local mode automatically, instead of relying on changes to tez-site

2019-04-05 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811446#comment-16811446
 ] 

Todd Lipcon commented on TEZ-1348:
--

Local fetch has been enabled by default since Tez 0.7, so I think that's not 
necessary anymore. I have another patch in the works which removes a bunch of 
unnecessary calls and documentation that set it to true.

> Setup configs required for local mode automatically, instead of relying on 
> changes to tez-site
> --
>
> Key: TEZ-1348
> URL: https://issues.apache.org/jira/browse/TEZ-1348
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Todd Lipcon
>Priority: Critical
>  Time Spent: 20m
>  Remaining Estimate: 0h
>





[jira] [Commented] (TEZ-1348) Setup configs required for local mode automatically, instead of relying on changes to tez-site

2019-04-03 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-1348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809292#comment-16809292
 ] 

Todd Lipcon commented on TEZ-1348:
--

It seems that if the code is changed to ensure that the Tez working directory 
is created on the local FS, Tez local mode can work even if defaultFS points 
to a remote cluster. This is actually useful, for example when testing Hive 
against a pseudo-distributed HDFS, if you don't want to also start a 
pseudo-distributed YARN. I'll work on a patch for this.
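
The underlying path behavior can be shown with plain java.net.URI. This is a sketch, not the patch and not the Hadoop Path API: the class and method names, the namenode address, and the staging path are all made up for illustration.

```java
import java.net.URI;

/** Hypothetical illustration using only java.net.URI; not the Hadoop Path API. */
final class StagingPathSketch {

  /**
   * Mimics how a path without a scheme ends up on the default filesystem,
   * while a path that already names a scheme is left where it is.
   */
  static URI qualify(URI defaultFs, String path) {
    URI candidate = URI.create(path);
    if (candidate.getScheme() != null) {
      return candidate; // already pinned to a specific filesystem
    }
    return defaultFs.resolve(candidate); // inherits scheme and authority
  }

  /** Pins an absolute path to the local filesystem regardless of defaultFS. */
  static URI pinToLocal(String absolutePath) {
    return URI.create("file://" + absolutePath);
  }
}
```

With a default filesystem of hdfs://namenode:8020/, the bare path /tmp/tez-staging qualifies to an hdfs URI, whereas the pinned form file:///tmp/tez-staging stays on the local filesystem, which is the property the working directory needs in local mode.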

> Setup configs required for local mode automatically, instead of relying on 
> changes to tez-site
> --
>
> Key: TEZ-1348
> URL: https://issues.apache.org/jira/browse/TEZ-1348
> Project: Apache Tez
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Priority: Critical
>


