Hi Gianmarco,

Yes, it helped!
I put Storm, Hadoop and SAMOA in the same cluster, and it worked! However, I
think the execution is too slow.
For the same task and the covtypeNorm.arff dataset, SAMOA in local mode
takes 18 seconds, while in cluster mode it takes several minutes. Is this normal?

Best regards,
Eduardo.

2016-03-13 4:48 GMT-03:00 Gianmarco De Francisci Morales <[email protected]>:

> Hi Eduardo,
>
> As long as you can access the HDFS cluster from the machines composing the
> Storm cluster, there should be no problem.
> However, you need to figure out how to set the environment variables to
> point to the right installation of Hadoop (set the HADOOP_HOME variable).
> You just need to set your configuration files (e.g., hdfs-site.xml) to
> point to the correct Hadoop cluster.
>
> Hope it helps,
>
> -- Gianmarco
>
> On Sat, Mar 12, 2016 at 4:21 PM, Eduardo Costa <[email protected]>
> wrote:
>
> > Hi, Gianmarco!
> > Thank you so much for your response!
> > Now I have another question: I run SAMOA (in cluster mode) on a different
> > machine (cluster) from the Hadoop cluster, because I run SAMOA on top of
> > a Storm cluster. Is there some way to read ARFF files from this remote
> > Hadoop cluster while running SAMOA on top of the Storm cluster?
> > Sorry for bothering you so much, but I need it to continue my master's
> > thesis in Brazil at the Federal University of the State of Rio de Janeiro
> > (UNIRIO). As previously mentioned, I'm trying to build a rudimentary
> > anomaly detection system using SAMOA, but I am a novice with
> > SAMOA.
> >
> > Best regards,
> > Eduardo.
> >
> > 2016-03-06 8:59 GMT-03:00 Gianmarco De Francisci Morales <[email protected]>:
> >
> > > Hi Eduardo,
> > > Yes, it is possible to read ARFF files from HDFS.
> > > However, right now it is way more complicated than it should be, and
> > > it's not documented at all.
> > > Thanks for asking the question.
> > >
> > > I managed to do it with this command line:
> > >
> > > ./bin/samoa local target/SAMOA-Local-0.4.0-incubating-SNAPSHOT.jar
> > > "PrequentialEvaluation -s (org.apache.samoa.streams.ArffFileStream -s
> > > HDFSFileStreamSource -f /user/$USER/covtypeNorm.arff)"
> > >
> > > But I had to make a small modification to HDFSFileStreamSource to make it
> > > work, by adding this line after line 61:
> > >
> > >     config.set("fs.hdfs.impl",
> > >         org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
> > >
> > > Things to notice:
> > > - We rely on HADOOP_HOME being set to your Hadoop installation. This
> > > should be made more robust.
> > > - I explicitly used org.apache.samoa.streams.ArffFileStream, as the normal
> > > ArffFileStream does not support HDFS (this is related to SAMOA-14
> > > <https://issues.apache.org/jira/browse/SAMOA-14>, and I plan to fix it
> > > asap).
> > > - I will add the snippet of code above to the same patch for SAMOA-14.
> > >
> > > Hope it helps,
> > >
> > > -- Gianmarco
> > >
> > > On Fri, Feb 12, 2016 at 6:45 PM, Eduardo Costa <[email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Can I pass ARFF files from Hadoop HDFS to SAMOA through the "-s"
> > > > argument? If so, how?
> > > >
> > > > Best regards,
> > > > Eduardo.
> > > >
> > >
> >
>
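
A note on the HADOOP_HOME point in the quoted message above: the sketch below
shows one way a start-up check could confirm that a Hadoop client
configuration is reachable before a job is submitted. It is only an
illustration under stated assumptions. HadoopConfLocator is a hypothetical
helper, not part of SAMOA, and the lookup order (HADOOP_CONF_DIR, then
HADOOP_HOME/etc/hadoop, then HADOOP_HOME/conf) is an assumption based on the
usual Hadoop 2.x and 1.x directory layouts, not on how
HDFSFileStreamSource actually resolves its configuration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper (not part of SAMOA): finds the Hadoop client configuration directory. */
public class HadoopConfLocator {

    /**
     * Returns the first directory that contains hdfs-site.xml, trying in order:
     * $HADOOP_CONF_DIR, $HADOOP_HOME/etc/hadoop (Hadoop 2.x layout, assumed),
     * and $HADOOP_HOME/conf (Hadoop 1.x layout, assumed).
     */
    public static Optional<Path> locate(Map<String, String> env) {
        String confDir = env.get("HADOOP_CONF_DIR");
        if (confDir != null && hasHdfsSite(Paths.get(confDir))) {
            return Optional.of(Paths.get(confDir));
        }
        String home = env.get("HADOOP_HOME");
        if (home != null) {
            for (String sub : new String[] {"etc/hadoop", "conf"}) {
                Path candidate = Paths.get(home, sub);
                if (hasHdfsSite(candidate)) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }

    /** True if the directory holds a regular file named hdfs-site.xml. */
    private static boolean hasHdfsSite(Path dir) {
        return Files.isRegularFile(dir.resolve("hdfs-site.xml"));
    }

    public static void main(String[] args) {
        Optional<Path> conf = locate(System.getenv());
        if (conf.isPresent()) {
            System.out.println("Using Hadoop configuration in " + conf.get());
        } else {
            System.err.println("No hdfs-site.xml found; set HADOOP_HOME or HADOOP_CONF_DIR.");
            System.exit(1);
        }
    }
}
```

Running this on each machine of the Storm cluster would show whether the
environment there points at a Hadoop configuration at all; it does not check
that the configured cluster is the intended one or that it is reachable.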
