Raghav Kumar Gautam created STORM-1910:
------------------------------------------
Summary: One topology can't use hdfs spout to read from two
locations
Key: STORM-1910
URL: https://issues.apache.org/jira/browse/STORM-1910
Project: Apache Storm
Issue Type: Bug
Components: storm-hdfs
Affects Versions: 1.0.1
Reporter: Raghav Kumar Gautam
Fix For: 1.1.0
The hdfs uri is passed using config:
{code}
conf.put(Configs.HDFS_URI, hdfsUri);
{code}
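To illustrate why a single config key is limiting, here is a minimal sketch using a plain HashMap in place of the topology config. The class name, the helper and the key literal are stand-ins for illustration (the literal value of Configs.HDFS_URI is an assumption here). It only shows that a map holds one value per key, so a second URI put under the same key replaces the first.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedConfigDemo {
    // Stand-in for Configs.HDFS_URI; the literal value is an assumption.
    static final String HDFS_URI_KEY = "hdfsspout.hdfs";

    // Hypothetical helper: puts every URI under the same key, the way a
    // topology with several HDFS sources would have to with one shared key.
    static Map<String, Object> buildConf(String... hdfsUris) {
        Map<String, Object> conf = new HashMap<>();
        for (String uri : hdfsUris) {
            conf.put(HDFS_URI_KEY, uri); // each put replaces the previous value
        }
        return conf;
    }

    public static void main(String[] args) {
        Map<String, Object> conf =
            buildConf("hdfs://namenode-a:8020", "hdfs://namenode-b:8020");
        // Only one URI survives: whichever was put last.
        System.out.println(conf.size() + " entry: " + conf.get(HDFS_URI_KEY));
    }
}
```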
I see two problems with this approach:
1. If someone wants to use two hdfsUri values, in the same or in different
spouts, that does not seem feasible.
https://github.com/apache/storm/blob/d17b3b9c3cbc89d854bfb436d213d11cfd4545ec/examples/storm-starter/src/jvm/storm/starter/HdfsSpoutTopology.java#L117-L117
https://github.com/apache/storm/blob/d17b3b9c3cbc89d854bfb436d213d11cfd4545ec/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HdfsSpout.java#L331-L331
{code}
if ( !conf.containsKey(Configs.SOURCE_DIR) ) {
  LOG.error(Configs.SOURCE_DIR + " setting is required");
  throw new RuntimeException(Configs.SOURCE_DIR + " setting is required");
}
this.sourceDirPath = new Path( conf.get(Configs.SOURCE_DIR).toString() );
{code}
2. It does not fail fast, i.e. at the time of topology submission. We could fail
fast if the hdfs path is invalid or the credentials/permissions are not ok.
Currently such errors can only be detected at runtime by looking at the worker
logs.
https://github.com/hortonworks/storm/blob/d17b3b9c3cbc89d854bfb436d213d11cfd4545ec/external/storm-hdfs/src/main/java/org/apache/storm/hdfs/spout/HdfsSpout.java#L297-L297
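One possible direction for both problems, sketched under assumptions: give each spout instance its own settings object and validate it when the topology is built, before submission. HdfsSourceSettings below is a hypothetical class, not part of storm-hdfs, and it only checks the URI syntax and that the source directory is an absolute path. Checking credentials or permissions would additionally require contacting HDFS from the submitting client, which this sketch does not attempt.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical per-spout settings holder; not part of storm-hdfs. */
public class HdfsSourceSettings {
    private final URI hdfsUri;
    private final String sourceDir;

    /** Validates eagerly so that bad input fails on the client, not in a worker. */
    public HdfsSourceSettings(String hdfsUri, String sourceDir) {
        if (hdfsUri == null || sourceDir == null) {
            throw new IllegalArgumentException("hdfsUri and sourceDir are required");
        }
        URI parsed;
        try {
            parsed = new URI(hdfsUri);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid HDFS URI: " + hdfsUri, e);
        }
        if (!"hdfs".equalsIgnoreCase(parsed.getScheme())) {
            throw new IllegalArgumentException("Expected an hdfs:// URI: " + hdfsUri);
        }
        if (parsed.getHost() == null) {
            throw new IllegalArgumentException("HDFS URI has no host: " + hdfsUri);
        }
        if (!sourceDir.startsWith("/")) {
            throw new IllegalArgumentException("Source dir must be absolute: " + sourceDir);
        }
        this.hdfsUri = parsed;
        this.sourceDir = sourceDir;
    }

    public URI getHdfsUri() { return hdfsUri; }

    public String getSourceDir() { return sourceDir; }

    public static void main(String[] args) {
        // Two independent sources can coexist because each has its own settings.
        HdfsSourceSettings a = new HdfsSourceSettings("hdfs://namenode-a:8020", "/data/in");
        HdfsSourceSettings b = new HdfsSourceSettings("hdfs://namenode-b:8020", "/logs/in");
        System.out.println(a.getHdfsUri() + a.getSourceDir());
        System.out.println(b.getHdfsUri() + b.getSourceDir());
    }
}
```

With settings held per spout instance rather than under one shared config key, two spouts in the same topology would no longer compete for the same value, and an invalid URI would raise an exception in the submitting JVM instead of surfacing later in the worker logs.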