I have created JIRA for this https://issues.apache.org/jira/browse/STORM-2358. For point 1:
Its specific use case just to support why it needs to be public For point 2: We are limiting code to be very specific to these 2 implementations we should have generic implementation. I see there is another check-in happening for ZippedTextFileReader.I have attached my code changes in JIRA, please take a look, where you need to provide class. For point 3: Lets assume I have multiple topologies with different readers. So I defined the a base topology class with HDFSSpout in it. Now I always needs to pass the outputFields as separate array. This actually can be part of every reader class as its very specific to it. Thanks, Sachin On 14/02/17, 4:52 AM, "Roshan Naik" <ros...@hortonworks.com> wrote: > > >On 2/13/17, 12:14 PM, "Sachin Pasalkar" <sachin_pasal...@symantec.com> >wrote: > >>I have attached updated source code of HDFSSpout for more reference. I >>have updated respective classes (not attached) > > >Don¹t see any attachment. Answers are below. Better to do this discussion >on a JIRA. > > >On 2/13/17, 8:32 AM, "Sachin Pasalkar" <sachin_pasal...@symantec.com> >wrote: > >>Hi, >> >>I was looking at storm hdfs spout code in 1.x branch, I found below >>improvements can be made in below code. >> >> 1. Make org.apache.storm.hdfs.spout.AbstractFileReader as public so >>that it can be used in generics. > >Java generics and making a class public are unrelated to my knowledge. But >making it public sounds ok to me if its useful for "user defined² readers >Š although it doesn¹t really have that much going on in it. For future >built-in reader types it is immaterial as they can derive from it anyway >just like the existing ones. HdfsSpout class itself doesn¹t care about the >ŒAbstractFileReader¹ type. For that there is the ŒFileReader¹ interface. > > > >> 2. org.apache.storm.hdfs.spout.HdfsSpout requires readerType as >>String. It will be great to have class<? extends AbstractFileReader> >>readerType; So we will not use Class.forName at multiple places also it >>will help in below point. > >The reason it is a string, is that, for built-in readers, we wanted to >support Œshort aliases¹ like Œtext¹ and Œseq¹ instead of FQCN.. > > >> 3. HdfsSpout also needs to provide outFields which are declared as >>constants in each reader(e.g.SequenceFileReader). We can have abstract >>API AbstractFileReader in which return them to user to make it generic. > > >These consts can¹t go into the AbstractFileReader as they are reader >specific. > >They are there just for convenience. Users can call withOutputFields() on >the spout and set it to these predefined names or anything else. > > >-Roshan >