Thanks Jay.

Regarding the question of disk placement, that would probably depend on
how hard you're pushing the machines.  In our case, a single collector
handles 60 routers.  That's been working just fine, but the collector
is doing plenty of time-sensitive I/O, so it might make sense to leave
the disk on the collector.  Also, our nodes will be connected with Gb
Ethernet.
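For scale, here's a quick back-of-envelope (illustrative Python; the
only inputs are the daily volumes already mentioned in this thread)
showing that even our full 16GB/day averages out to a tiny fraction of
what Gb Ethernet can carry:

```python
# Convert a daily collection volume into a sustained bit rate.
# 16 GB/day is the campus-wide compressed figure from this thread;
# 2 GB/day is a single busy router's share.

def avg_mbps(gb_per_day):
    """Average bit rate for a given daily volume, in Mb/s."""
    bits = gb_per_day * 1e9 * 8        # decimal GB -> bits
    return bits / 86400 / 1e6          # per second -> Mb/s

print(round(avg_mbps(16), 2))   # campus-wide compressed: ~1.48 Mb/s
print(round(avg_mbps(2), 2))    # one busy router: ~0.19 Mb/s
```

So the sustained rates are nowhere near a problem; it's the burst I/O
during report runs that matters.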

The bigger question is how to manage the cluster's workload.  We
considered the static approach you suggested below, where each node is
assigned a dedicated list of routers.  The concern is scalability,
since router loads change over time.  Process-migration or
load-balancing solutions are nicer because they distribute load
dynamically, but from what you say they're a poor fit for I/O-bound
work.  We may go with the static approach after all.
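If we do go static, something as simple as a greedy assignment by each
router's recent daily volume might be enough, re-run periodically as
loads drift.  An illustrative sketch (router names and volumes are
made up, not our real config):

```python
# Hypothetical sketch: greedy static assignment of routers to cluster
# nodes, balancing by each router's recent daily flow volume (MB).

def assign_routers(router_volumes, nodes):
    """Assign each router to the currently least-loaded node."""
    load = {node: 0 for node in nodes}
    assignment = {}
    # Place the heaviest routers first so they spread across nodes.
    for router, volume in sorted(router_volumes.items(),
                                 key=lambda kv: kv[1], reverse=True):
        target = min(load, key=load.get)
        assignment[router] = target
        load[target] += volume
    return assignment, load

routers = {"rtr-a": 2000, "rtr-b": 300, "rtr-c": 1200, "rtr-d": 900}
assignment, load = assign_routers(routers, ["node1", "node2"])
print(assignment)
print(load)
```

Re-running this nightly against the previous day's per-router totals
would keep the "static" assignment roughly balanced without any
process migration.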

As far as you know, are all clustering solutions inherently
I/O-inefficient, or are some of them OK?  Did you check out OpenSSI?

By the way, we're running reports on daily aggregations, so for some
routers a single report covers 2GB of compressed data.  Sometimes
these processes run out of memory, and sometimes they take four hours.
I'm probably pushing this system harder than I should, but the results
are usually pretty good; I just need more processing power.

On that note, does anybody know about the inner workings of
flow-report?  My general understanding is that it loads a huge
hashtable (or similar data structure) into memory and then basically
dumps out quick stats, so it's not very CPU-intensive.  Is that
accurate?
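In other words, I'm picturing something like the following (purely
illustrative Python; the record fields are invented and this is not
flow-report's actual code):

```python
# Sketch of the aggregation model described above: stream flow
# records, accumulate counters in an in-memory hash table keyed by the
# report's key fields, then dump the totals.

from collections import defaultdict

def aggregate(flows, key_fields=("srcaddr",)):
    """One pass over the flow records; memory grows with distinct keys."""
    stats = defaultdict(lambda: {"flows": 0, "packets": 0, "octets": 0})
    for flow in flows:
        key = tuple(flow[f] for f in key_fields)
        bucket = stats[key]
        bucket["flows"] += 1
        bucket["packets"] += flow["packets"]
        bucket["octets"] += flow["octets"]
    return stats

flows = [
    {"srcaddr": "10.0.0.1", "packets": 10, "octets": 1500},
    {"srcaddr": "10.0.0.2", "packets": 4, "octets": 320},
    {"srcaddr": "10.0.0.1", "packets": 2, "octets": 96},
]
for key, s in aggregate(flows).items():
    print(key, s)
```

If that model is right, it would explain both behaviors I'm seeing:
the memory blow-ups come from the number of distinct keys, and the
runtime comes from I/O and decompression rather than CPU.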

Ari

 
-----Original Message-----
From: Jay A. Kreibich [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, October 27, 2004 4:12 PM
To: Ari Leichtberg
Cc: [EMAIL PROTECTED]
Subject: Re: [Flow-tools] Flow-tools on linux cluster (Mosix)

On Tue, Oct 26, 2004 at 05:09:04PM -0400, Ari Leichtberg scratched on
the wall:

> I'm wondering if anybody has any experience running flow-tools on a
> Linux cluster.  I have a dedicated Sun box running flow-capture,
> collecting from around 60 Cisco's campus wide, totaling over 16 GB of
> level-6 compressed data per day.  The flows are written to the
> collector's local storage, and I have enough space to hold around 12
> day's worth of data.

  We're only collecting off our exit routers and do ~14GB per day,
  although that's uncompressed.

> My plan is to have a separate Linux cluster, nfs mounted to the
> collector's storage, which runs daily and hourly flow-reports,
> flow-dscans, and other analyses.  It's not uncommon for a router to
> collect over 2GB per day, so the flow-report processes get pretty
> I/O- and memory-heavy.

  Consider this: what requires more disk I/O, the collector, which has
  an hour to do one pass on one hour's worth of data; or the analyzers,
  which have one hour to do all of your reports?  Often reports require
  multiple passes, and ideally they don't take the whole hour.

  With that in mind, if you are going to write everything to disk and
  then do post-analysis, put the disk on the analyzers, not the
  collectors. They do even more I/O and will benefit a lot more from
  the direct disk attachment.  You definitely don't want the collector
  wasting lots of resources doing NFS server traffic!

  In the bigger picture, one of the problems with clusters for flow
  analysis is the volume of data involved.  Most people run reports
  that are fairly simple, so they tend to be I/O bound (or compression
  bound) on any modern machine.  That's the worst case for clustering,
  since clusters inherently add I/O inefficiencies; for I/O-bound work
  you can actually make everything run slower on a cluster, although
  the compression helps a little there.

> Has anybody ever tried this with Mosix, or any other ideas for a
> clustering solution? 

  Because of these problems, one of the things we're looking at is
  putting in a SAN infrastructure with a clustered file system, so
  that multiple machines can access the same fibre-channel-attached
  filesystem at the same time.  The collector writes, and everything
  else reads what it needs.  More or less what you're talking about,
  but using fibre-channel rather than NFS.  Once you remove most of
  the file-transport problems, how you want to split up or distribute
  your computation is up to you.  We're looking at static assignments,
  not load-balanced clustering, mostly because we aren't looking at
  process-migration-type stuff.

  The other option is to pre-distribute the data.  Have one (or more)
  collectors with big disks that are your main collectors and
  archivers.  Configure them to filter and split their data streams
  out to multiple machines in the cluster.  Have each cluster node
  keep only one to two hours' worth of data, or better yet, do the
  reports in real time so the nodes need almost no storage at all.
  The constant data rates are not exciting: even with spikes you're
  only looking at something like 45Mb/s.  If you back-channel
  multicast the data across the cluster, that's no problem.  If you
  pre-filter it so each cluster node only services a few routers,
  it's even easier.

  There are lots of games to play here, but the big thing is to
  remember that collection data rates are almost always smaller than
  required analysis data rates.

  I should also say that we use a custom collector and tool set, so I
  have no idea how easy/hard it would be to do some of these things
  with the public tools.

   -j

-- 
                     Jay A. Kreibich | Comm. Technologies, R&D
                        [EMAIL PROTECTED] | Campus IT & Edu. Svcs.
          <http://www.uiuc.edu/~jak> | University of Illinois at U/C

_______________________________________________
Flow-tools mailing list
[EMAIL PROTECTED]
http://mailman.splintered.net/mailman/listinfo/flow-tools
