I have created a JIRA for this:
https://issues.apache.org/jira/browse/STORM-2358.
For point 1:

It's a specific use case, just to show why it needs to be public.
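
To illustrate the use case: user code in another package that refers to
the reader type generically cannot compile today. A minimal sketch (the
package and class names below are only illustrative):

    package com.example.topology;  // user code, outside org.apache.storm.hdfs.spout

    import org.apache.storm.hdfs.spout.AbstractFileReader;

    // Hypothetical base class shared by several topologies. The generic
    // bound on AbstractFileReader only compiles from user code if the
    // class is public, which is what point 1 asks for.
    public abstract class ReaderTopologyBase<R extends AbstractFileReader> {

        private final Class<R> readerClass;

        protected ReaderTopologyBase(Class<R> readerClass) {
            this.readerClass = readerClass;
        }

        protected Class<R> readerClass() {
            return readerClass;
        }
    }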

For point 2:
We are limiting the code to be very specific to these 2 implementations;
we should have a generic implementation. I see there is another check-in
happening for ZippedTextFileReader. I have attached my code changes to the
JIRA, in which you provide the class; please take a look.
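
Roughly what the change amounts to, as a sketch only (the setter name is
illustrative, not necessarily what is in the attached patch):

    // HdfsSpout would hold the reader class itself instead of a String,
    // so Class.forName is no longer needed when the reader is created.
    private Class<? extends AbstractFileReader> readerType;

    public HdfsSpout withReaderType(Class<? extends AbstractFileReader> readerType) {
        this.readerType = readerType;
        return this;
    }

A new ZippedTextFileReader, or any user-defined reader, would then plug in
without touching the spout.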

For point 3:

Let's assume I have multiple topologies with different readers, so I
defined a base topology class with HdfsSpout in it. I then always need to
pass the outputFields as a separate array. This could instead be part of
each reader class, as it is very specific to the reader.
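
Concretely, something like this is what I mean (the method and field names
below are only illustrative):

    // Each reader knows its own output fields, so the base topology class
    // does not need to carry a separate array per reader type.

    // On AbstractFileReader (or on the FileReader interface):
    public abstract Fields getOutputFields();   // org.apache.storm.tuple.Fields

    // In SequenceFileReader, returning the constants it already declares
    // (the field names here are placeholders, not the real constants):
    @Override
    public Fields getOutputFields() {
        return new Fields("key", "value");
    }

HdfsSpout.declareOutputFields() would then simply delegate to the
configured reader.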

Thanks,
Sachin

On 14/02/17, 4:52 AM, "Roshan Naik" <ros...@hortonworks.com> wrote:

>
>
>On 2/13/17, 12:14 PM, "Sachin Pasalkar" <sachin_pasal...@symantec.com>
>wrote:
>
>>I have attached the updated source code of HdfsSpout for more reference. I
>>have updated the respective classes (not attached).
>
>
>Don't see any attachment. Answers are below. Better to do this discussion
>on a JIRA.
>
>
>On 2/13/17, 8:32 AM, "Sachin Pasalkar" <sachin_pasal...@symantec.com>
>wrote:
>
>>Hi,
>>
>>I was looking at the storm-hdfs spout code in the 1.x branch, and I found
>>that the below improvements can be made:
>>
>>  1.  Make org.apache.storm.hdfs.spout.AbstractFileReader public so that
>>it can be used in generics.
>
>Java generics and making a class public are unrelated, to my knowledge. But
>making it public sounds ok to me if it's useful for "user defined" readers …
>although it doesn't really have that much going on in it. For future
>built-in reader types it is immaterial, as they can derive from it anyway,
>just like the existing ones. The HdfsSpout class itself doesn't care about
>the 'AbstractFileReader' type; for that there is the 'FileReader' interface.
>
>
>
>>  2.  org.apache.storm.hdfs.spout.HdfsSpout requires readerType as a
>>String. It would be great to have Class<? extends AbstractFileReader>
>>readerType, so we do not use Class.forName in multiple places; it will
>>also help with the point below.
>
>The reason it is a string is that, for built-in readers, we wanted to
>support 'short aliases' like 'text' and 'seq' instead of the FQCN.
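
For what it is worth, the two could coexist; a rough sketch (the setter
names below are only illustrative):

    // Keep the short aliases for built-in readers and add an overload that
    // takes the class, so custom readers avoid the String -> Class.forName hop.
    HdfsSpout setReaderType(String readerTypeOrAlias);   // "text", "seq", or an FQCN
    HdfsSpout setReaderType(Class<? extends AbstractFileReader> readerClass);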
>
>
>>  3.  HdfsSpout also needs to be provided the outputFields, which are
>>declared as constants in each reader (e.g. SequenceFileReader). We can
>>have an abstract API on AbstractFileReader that returns them, to make
>>this generic.
>
>
>These consts can't go into the AbstractFileReader as they are reader
>specific.
>
>They are there just for convenience.  Users can call withOutputFields() on
>the spout and set it to these predefined names or anything else.
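
For reference, current usage looks roughly like this (the constant name
and exact method signatures are from my reading of the 1.x API and are
only illustrative):

    // The output fields are passed to the spout explicitly, alongside the
    // reader type, even though they are fixed per reader.
    HdfsSpout spout = new HdfsSpout()
            .setReaderType("seq")
            .withOutputFields(SequenceFileReader.defaultFields);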
>
>
>-Roshan
>
