Assuming you have git and maven installed:

git clone g...@github.com:apache/storm.git
cd storm
git checkout -b 1.x origin/1.x-branch
mvn install -DskipTests

That third step checks out the 1.x-branch branch, which is the base for the upcoming 1.0 release.

You can then include the storm-hdfs dependency in your project:

<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-hdfs</artifactId>
    <version>1.0.0-SNAPSHOT</version>
</dependency>

You can find more information on using the spout and other HDFS components here:

https://github.com/apache/storm/tree/1.x-branch/external/storm-hdfs#hdfs-spout

-Taylor

> On Feb 3, 2016, at 2:54 PM, K Zharas <kgzha...@gmail.com> wrote:
>
> Oh ok. Can you plz give me an idea how can I do it manually? I'm quite
> beginner :)
>
> On Thu, Feb 4, 2016 at 3:43 AM, Parth Brahmbhatt <pbrahmbh...@hortonworks.com> wrote:
> Storm-hdfs spout is not yet published in maven. You will have to checkout
> storm locally and build it to make it available for development.
>
> From: K Zharas <kgzha...@gmail.com>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Wednesday, February 3, 2016 at 11:41 AM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: Re: Storm + HDFS
>
> Yes, looks like it is. But, I have added dependencies required by storm-hdfs
> as stated in a guide.
>
> On Thu, Feb 4, 2016 at 3:33 AM, Nick R. Katsipoulakis <nick.kat...@gmail.com> wrote:
> Well,
>
> those errors look like a problem with the way you build your jar file.
> Please, make sure that you build your jar with the proper storm maven
> dependency.
>
> Cheers,
> Nick
>
> On Wed, Feb 3, 2016 at 2:31 PM, K Zharas <kgzha...@gmail.com> wrote:
> It throws an error that packages do not exist. I have also tried changing
> org.apache to backtype, still got an error but only for storm.hdfs.spout.
> Btw, I use Storm-0.10.0 and Hadoop-2.7.1
>
> package org.apache.storm does not exist
> package org.apache.storm does not exist
> package org.apache.storm.generated does not exist
> package org.apache.storm.metric does not exist
> package org.apache.storm.topology does not exist
> package org.apache.storm.utils does not exist
> package org.apache.storm.utils does not exist
> package org.apache.storm.hdfs.spout does not exist
> package org.apache.storm.hdfs.spout does not exist
> package org.apache.storm.topology.base does not exist
> package org.apache.storm.topology does not exist
> package org.apache.storm.tuple does not exist
> package org.apache.storm.task does not exist
>
> On Wed, Feb 3, 2016 at 8:57 PM, Matthias J. Sax <mj...@apache.org> wrote:
> Storm does provide HdfsSpout and HdfsBolt already. Just use those,
> instead of writing your own spout/bolt:
>
> https://github.com/apache/storm/tree/master/external/storm-hdfs
>
> -Matthias
>
>
> On 02/03/2016 12:34 PM, K Zharas wrote:
> > Can anyone help to create a Spout which reads a file from HDFS?
> > I have tried with the code below, but it is not working.
> >
> > public void nextTuple() {
> >     Path pt=new Path("hdfs://localhost:50070/user/BCpredict.txt");
> >     FileSystem fs = FileSystem.get(new Configuration());
> >     BufferedReader br = new BufferedReader(new
> >         InputStreamReader(fs.open(pt)));
> >     String line = br.readLine();
> >     while (line != null){
> >         System.out.println(line);
> >         line=br.readLine();
> >         _collector.emit(new Values(line));
> >     }
> > }
> >
> > On Tue, Feb 2, 2016 at 1:19 PM, K Zharas <kgzha...@gmail.com> wrote:
> >
> >     Hi.
> >
> >     I have a project I'm currently working on. The idea is to implement
> >     "scikit-learn" into Storm and integrate it with HDFS.
> >
> >     I've already implemented "scikit-learn". But, currently I'm using a
> >     text file to read and write. However, I need to use HDFS, but
> >     finding it hard to integrate with HDFS.
> >
> >     Here is the link to github
> >     <https://github.com/kgzharas/StormTopologyTest>. (I only included
> >     files that I used, not whole project)
> >
> >     Basically, I have a few questions if you don't mind to answer them
> >     1) How to use HDFS to read and write?
> >     2) Is my "scikit-learn" implementation correct?
> >     3) How to create a Storm project? (Currently working in "storm-starter")
> >
> >     These questions may sound a bit silly, but I really can't find a
> >     proper solution.
> >
> >     Thank you for your attention to this matter.
> >     Sincerely, Zharas.
> >
> >
> > --
> > Best regards,
> > Zharas
>
>
> --
> Best regards,
> Zharas
>
>
> --
> Nick R. Katsipoulakis,
> Department of Computer Science
> University of Pittsburgh
>
>
> --
> Best regards,
> Zharas
>
>
> --
> Best regards,
> Zharas
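
A note on the nextTuple() snippet quoted above, in case it helps anyone who still wants a hand-written reader: the loop there advances to the next line before emitting, so the first line is never emitted and a null is emitted at the end of the file. Below is a minimal sketch of the read loop only, not a Storm spout. It assumes a plain java.io.Reader in place of the HDFS stream (fs.open(pt) in the original), and the names LineEmitter and emitLines plus the Consumer<String> sink standing in for _collector.emit are hypothetical, made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: not part of Storm or Hadoop.
class LineEmitter {

    // Reads every line from the source and hands each one to the sink.
    // The sink stands in for _collector.emit(new Values(line)).
    // Returns the number of lines emitted.
    static int emitLines(Reader source, Consumer<String> sink) {
        int count = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            // Read first, test for end of input, then emit the same line.
            while ((line = br.readLine()) != null) {
                sink.accept(line);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // A StringReader stands in for the HDFS input stream (assumption).
        List<String> emitted = new ArrayList<>();
        int n = emitLines(new StringReader("first\nsecond\nthird\n"), emitted::add);
        System.out.println(n + " lines: " + emitted);
    }
}
```

Two further things to check in the original, both outside this sketch: Storm calls nextTuple() repeatedly, so reading the whole file inside it would re-read the file on every call, and 50070 is normally the NameNode web UI port in Hadoop 2.x, so the hdfs:// URI probably needs the port from fs.defaultFS instead.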