Re: [Bro-Dev] CBAN design proposal

2016-05-24 Thread Robin Sommer


On Tue, May 24, 2016 at 16:21 +, you wrote:

> Yeah, I have ideas, but seems like there might be another day of some
> discussion before I try to formally reframe a design doc. Here’s the
> direction I'm thinking:

I like the process you sketch, that sounds like the right way to go to
me.

A few notes, also trying to address some of Adam's comments:

> - the initial submission process involves doing a pull request on the
> bro/cban repo where the only change made is the addition of a
> submodule.  These merges probably can be automated, but if a human
> were to do it, I’d expect it wouldn’t be a time-consuming task

Yeah, maybe that initial merge is one task we leave to a human, who
could then actually take a quick 30sec look at the module to see if
it's not totally off the mark. That would address Adam's point about
what if somebody submits something that's not even a Bro thing (but I
wouldn't go further; e.g., don't try to compile, etc. Everything
looking roughly right gets in at this time.)
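
A minimal sketch of how the "only change is the addition of a submodule"
check could be automated before a human takes the quick look. It compares
the text of .gitmodules before and after the pull request. The function
names are hypothetical, and a real check would presumably also confirm that
the diff touches nothing besides .gitmodules and the new submodule entry.

```python
import re

# Matches a section header such as: [submodule "jan/awesome-script"]
SECTION = re.compile(r'^\[submodule "(?P<name>[^"]+)"\]$')


def parse_gitmodules(text):
    """Return {submodule name: {key: value}} parsed from .gitmodules text."""
    modules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = SECTION.match(line)
        if match:
            current = modules.setdefault(match.group("name"), {})
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            current[key.strip()] = value.strip()
    return modules


def only_adds_one_submodule(before, after):
    """True if 'after' has exactly one new submodule and no other changes."""
    old, new = parse_gitmodules(before), parse_gitmodules(after)
    added = set(new) - set(old)
    unchanged = all(new.get(name) == entry for name, entry in old.items())
    return len(added) == 1 and unchanged
```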

> - submodules that are found to have never been in a working state are
> auto-removed (or could initially be a task that’s not a big deal for a
> human to do every so often if metrics of brokenness are consistently
> available)

Auto-removal sounds dangerous to me; there may be different reasons
why something's not in a good state. I'd leave cleanup to humans too:
if there's a module that's consistently flagged as broken, that's when
we can send a mail to the author and remove it manually if no
improvement is in sight. I'd rather err on the side of having a broken
module than remove something that could actually still be useful.

> - metadata logically can be categorized in two types, one type is
> related to discoverability (tags, author, license, etc) and one type
> is related to interoperability (version number, dependencies)

I wouldn't object to making some meta-data mandatory, per Adam's
comments. For example enforcing having an author and a license would
be useful I think. Author gets us contact information and license is
always worth clarifying.

> - discoverability metadata is aggregated during the nightly quality
> check processes and automatically commits that information to the
> “bro/cban” repo.

Would it be better to maintain this information outside of git in a
state file that clients download? Otherwise the repository will
clutter up quite a bit over time with tons of automatic commits.
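
To make the state-file idea concrete, here is a rough sketch, assuming the
nightly process writes one JSON document that clients download and search
locally. The field names and the "format" key are assumptions for
illustration, not an agreed layout.

```python
import json


def build_index(plugins):
    """Serialize aggregated discoverability metadata to a JSON string.

    'plugins' maps a plugin name to a dict of whatever metadata was found.
    """
    entries = [
        {
            "name": name,
            "tags": sorted(meta.get("tags", [])),
            "author": meta.get("author", ""),
            "license": meta.get("license", ""),
        }
        for name, meta in sorted(plugins.items())
    ]
    return json.dumps({"format": 1, "plugins": entries}, indent=2)


def search(index_text, keyword):
    """Return names of plugins whose name or tags match the keyword."""
    index = json.loads(index_text)
    keyword = keyword.lower()
    return [
        plugin["name"]
        for plugin in index["plugins"]
        if keyword in plugin["name"].lower()
        or keyword in [tag.lower() for tag in plugin["tags"]]
    ]
```

A client would fetch this file once per "search" instead of visiting each
submodule, and the git history of the repository stays free of nightly
commits.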


Overall, I agree that we can always add more restrictions later if it
turns out necessary. It's not that we'll have 1000s of Bro modules in
there within the first two weeks (as long as we prevent somebody
spamming us).

Robin

-- 
Robin Sommer * ICSI/LBNL * ro...@icir.org * www.icir.org/robin
___
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev


Re: [Bro-Dev] CBAN design proposal

2016-05-24 Thread Slagell, Adam J

> On May 24, 2016, at 4:46 PM, Matthias Vallentin wrote:
> 
> If I understood it correctly, I don't think that the central CBAN
> repository will accumulate clutter. The automatic checks will help
> simply age out repos that do not comply with the minimal standards. It's
> up to the devs to comply if they want to be integrated.

I think that’s fine, but that isn’t what I thought Robin was saying. I thought 
he did not want minimal standards.



Re: [Bro-Dev] CBAN design proposal

2016-05-24 Thread Matthias Vallentin

(I will respond to the actual proposal in more depth later.)

> That is a good point. I am more concerned about accumulating clutter.

If I understood it correctly, I don't think that the central CBAN
repository will accumulate clutter. The automatic checks will help
simply age out repos that do not comply with the minimal standards. It's
up to the devs to comply if they want to be integrated.

More generally, there will presumably be some functionality to add
"remotes" to one's configuration, allowing plugin writers to point to
experimental code if they wish. Then they can still hack out code and
mix it with existing CBAN plugins, at their own risk.

With a small linter in place, we would also lower the bar for devs to
comply. Homebrew has a nice checker, for example.
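
As a rough illustration of what such a linter might report, here is a
minimal sketch. It assumes plugin metadata has already been loaded into a
dict; treating author and license as required follows the suggestion made
elsewhere in this thread, and the field names themselves are hypothetical.

```python
# Assumption: which fields are required versus recommended is not settled.
REQUIRED = ("author", "license")
RECOMMENDED = ("tags", "version", "dependencies")


def lint(metadata):
    """Return a list of (severity, message) tuples for a metadata dict."""
    problems = []
    for field in REQUIRED:
        # A field that is present but blank counts as missing.
        if not str(metadata.get(field, "")).strip():
            problems.append(("error", f"missing required field: {field}"))
    for field in RECOMMENDED:
        if field not in metadata:
            problems.append(("warning", f"missing recommended field: {field}"))
    return problems
```

Authors could run this locally before submitting, so the nightly checks
rarely tell them anything they have not already seen.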

On a cosmetic note, will this thing be called CBAN? I find it a very cryptic
name, have often confused it with BPAN (even though it doesn't make sense),
and was wondering whether we should converge on some more pronounceable
candidates.

Matthias


Re: [Bro-Dev] CBAN design proposal

2016-05-24 Thread Jan Grashöfer

> - it’s not a big deal for a submodule to temporarily enter into a broken 
> state — cban users can always roll back to a previous version or just 
> uninstall it.  It’s up to the community to communicate/collaborate directly 
> w/ the author here to get things fixed.

I really like the community-centric approach. Regarding the discussion
about checks and consistency, I think that basically all conventions
that can be enforced automatically will make the archive easier to
work with (for authors and for end users). But there is another thing
that came to my mind: how are situations handled in which the author
becomes the bottleneck?

Imagine someone published an awesome script, but a new version of Bro
breaks it. Someone else patches the script and sends the patch to the
original author. What happens if the author does not respond?
Personally, I don't like repositories that end up with entries like
"awesome-script", "awesome-script v2", "awesome-script by Jan" ...
To avoid this, one might consider supporting forked plugins or organizing
the plugins per user ("jan/awesome-script", "anna/awesome-script").
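
A small sketch of how a client might resolve names under the user-centered
scheme, assuming every plugin is listed as "owner/name". The function name
and the rule that a bare name returns all matching forks are assumptions
for illustration.

```python
def resolve(name, available):
    """Return the qualified plugin names that match a user-supplied name.

    A qualified name ("jan/awesome-script") matches only itself. A bare
    name ("awesome-script") matches every fork with that plugin name, so
    the caller can ask the user to pick one when several come back.
    """
    if "/" in name:
        return [name] if name in available else []
    return sorted(q for q in available if q.split("/", 1)[-1] == name)
```

With "jan/awesome-script" and "anna/awesome-script" both listed, asking for
"awesome-script" returns both, and the client can prompt instead of guessing.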

Best regards,
Jan


Re: [Bro-Dev] CBAN design proposal

2016-05-24 Thread Siwek, Jon

> On May 24, 2016, at 11:49 AM, Slagell, Adam J wrote:
> 
> I propose that we keep mandatory checks minimal, but not non-existent, and 
> then we reevaluate when we have real data about how well this works. But I 
> would really like more feedback from the community. Maybe I am an outlier 
> here?

I think starting w/ either approach could end up evolving/devolving into the 
other? 

If you had no checks in place, but then later instituted mandatory checks, you 
might be able to have the cban client not remove things a user has already 
checked out.  So you can delist plugins if they fail the new checks, but users 
would still have the local version they can use (if somehow they’ve got it in a 
configuration that’s usable to them, but that doesn’t pass the new mandatory 
quality checks).
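
The delisting behavior could be sketched roughly like this, assuming the
client knows the set of currently listed plugin names and the set of local
checkouts. The function and key names are hypothetical.

```python
def plan_sync(listed, installed):
    """Decide what the client does with each plugin on a sync.

    listed:    set of plugin names currently in the central index
    installed: set of plugin names the user has checked out locally
    """
    return {
        # Still listed and installed: safe to fetch updates.
        "update": sorted(installed & listed),
        # Delisted but installed: leave the local checkout untouched.
        "keep_unlisted": sorted(installed - listed),
        # Listed but not installed: offered in search results only.
        "available": sorted(listed - installed),
    }
```

Nothing in the plan removes a local checkout, so a plugin that fails newly
introduced checks disappears from the listing but keeps working for users
who already have it.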

I lean toward starting w/ the most streamlined and least complicated approach 
and seeing what quality control checks you need to layer on top of it because 
we might just expend a lot of effort planning for problems that don’t actually 
ever pop up in practice.  But as a person that has to do development work on 
cban I might be biased toward doing what seems easier for me, so I’m fine not 
having a vote.

- Jon



Re: [Bro-Dev] CBAN design proposal

2016-05-24 Thread Slagell, Adam J

> On May 24, 2016, at 12:07 PM, Siwek, Jon wrote:
> 
>> 
>> On May 23, 2016, at 6:30 PM, Slagell, Adam J wrote:
>> 
>> I guess there is a balance here. If we do no mandatory checks and you could 
>> submit something that isn’t even a Bro plugin, the repository could become 
>> cluttered with junk. Do we really want things that don’t even “compile”?
> 
> The clutter could still be removed by an out-of-band process.  e.g. there’s 
> no initial check for whether a submission actually works, but after X days of 
> a nightly process finding it is broken, it gets auto-removed.

That is a good point. I am more concerned about accumulating clutter.

--

Adam J. Slagell
Chief Information Security Officer
Director, Cybersecurity Division
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign
www.slagell.info

"Under the Illinois Freedom of Information Act (FOIA), any written 
communication to or from University employees regarding University business is 
a public record and may be subject to public disclosure." 



Re: [Bro-Dev] CBAN design proposal

2016-05-24 Thread Siwek, Jon

> On May 23, 2016, at 6:30 PM, Slagell, Adam J wrote:
> 
> I guess there is a balance here. If we do no mandatory checks and you could 
> submit something that isn’t even a Bro plugin, the repository could become 
> cluttered with junk. Do we really want things that don’t even “compile”?

The clutter could still be removed by an out-of-band process.  e.g. there’s no 
initial check for whether a submission actually works, but after X days of a 
nightly process finding it is broken, it gets auto-removed.
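
A minimal sketch of the "broken for X days" rule, assuming the nightly
process records one pass/fail result per plugin per night. The names and
the threshold parameter are hypothetical, and the output is only a list of
candidates, which could feed either automatic removal or a human review.

```python
def consecutive_failures(history):
    """Count failed nightly checks at the end of a pass/fail history.

    'history' is a list of booleans, oldest first, True meaning "passed".
    """
    count = 0
    for passed in reversed(history):
        if passed:
            break
        count += 1
    return count


def removal_candidates(results, max_broken_days):
    """Return plugins whose most recent failures span at least the limit."""
    return sorted(
        name
        for name, history in results.items()
        if consecutive_failures(history) >= max_broken_days
    )
```

A plugin that breaks for a night or two and then recovers never reaches
the threshold, since any passing result resets the count.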

> However, where we can’t do that is with the metadata we collect. If we don’t 
> require what we think is important metadata in the beginning, then we will 
> have a gap if we decide it was important all along. So there I would err 
> towards overcorrecting in the beginning, and make things optional in the 
> future if it turns out not to be important.

I think the most important metadata has to do w/ plugin interoperability 
(versioning and dependency info) and metadata that improves discoverability and 
search features is less important.  For one reason, the former has a more 
objective correctness to it and the latter is more subjective.  Having wrong or 
missing discoverability metadata is also not going to cause things to break 
the way missing interoperability data would.

But even though I think interoperability metadata is important, I’m also not 
sure it needs to be collected/aggregated before plugin submissions are accepted 
— it might be something the client can collect “just in time” directly from a 
clone of the plugin itself.  Even if a plugin doesn’t initially include this, 
the expected behavior could be for the cban client to use the plugin’s master 
branch and assume it will work w/ everything.  If the user finds that not to be 
true, then they just uninstall it and ask the author to add proper 
versioning/dependency info or they might even try to add it themselves and 
submit the fixes back to the author.

Metadata that helps improve discoverability and search features 
(topics/tags/keywords, author, license, etc) I don’t see becoming so important 
but underused to the point that you’d wish it were a requirement for 
submissions to be accepted.

I don't expect adding metadata to be so much of a burden that people avoid it 
entirely.  Were there other reasons to think people won’t eventually add 
metadata info even if none is initially required?

- Jon



Re: [Bro-Dev] CBAN design proposal

2016-05-24 Thread Slagell, Adam J

> On May 24, 2016, at 11:21 AM, Siwek, Jon wrote:
> 
> I think all those points make things easy on contributors, minimize direct 
> involvement of the Bro Team in sorting out problems related to particular 
> plugins, and provide a useful way for users to discover and maintain Bro 
> plugins.  There’s more potential for users to encounter broken/bad plugins, 
> but maybe that also encourages stronger community involvement w/ users more 
> likely to try and help get problems resolved.

I don’t feel like we have converged on agreement regarding the balance of 
mandatory vs. optional checks.

I think we need to define some basic metadata as a requirement for 
interoperability and discovery. Otherwise, what do we really end up providing 
above and beyond GitHub?

Other quality checks can be optional, as long as we can change that in the 
future. I still think we should do some basic checks to avoid completely 
broken stuff. It might mean more work for us in making sure we have good 
feedback and documentation.

In general we all want to avoid human interaction becoming a bottleneck to 
submissions.

I propose that we keep mandatory checks minimal, but not non-existent, and then 
we reevaluate when we have real data about how well this works. But I would 
really like more feedback from the community. Maybe I am an outlier here?



Re: [Bro-Dev] CBAN design proposal

2016-05-24 Thread Siwek, Jon

> On May 23, 2016, at 4:59 PM, Robin Sommer wrote:
> 
>> That would make life easier for authors, and that’s maybe even a
>> higher priority than maximizing the quality/consistency of user
>> experience because, without authors, there won’t be much for users to
>> experience in the first place.
> 
> Yeah, that's exactly what I'm advocating: making it easy should really
> be priority number one, with everything else coming second. If you see
> ways to adapt the design to target that specifically, I'm all for it.

Yeah, I have ideas, but seems like there might be another day of some 
discussion before I try to formally reframe a design doc. Here’s the direction 
I'm thinking:

- still have a "bro/cban" github repository, but it now contains git submodules 
that point directly to other github accounts/repos

- the initial submission process involves doing a pull request on the bro/cban 
repo where the only change made is the addition of a submodule.  These merges 
probably can be automated, but if a human were to do it, I’d expect it wouldn’t 
be a time-consuming task — just check if the change is adding a submodule and 
then click a button to merge (don’t even have to look at the contents of the 
submodule).

- a person that has submitted something to cban needs no further interaction 
with it and they resume their typical development workflow — cban client's 
“update” command will fetch/pull directly from their git repo.

- all submodules get scanned by an out-of-band nightly process which checks for 
brokenness and other quality metrics

- submodules that are found to have never been in a working state are 
auto-removed (or could initially be a task that’s not a big deal for a human to 
do every so often if metrics of brokenness are consistently available)

- it’s not a big deal for a submodule to temporarily enter into a broken state 
— cban users can always roll back to a previous version or just uninstall it.  
It’s up to the community to communicate/collaborate directly w/ the author here 
to get things fixed.

- metadata associated w/ plugins is all optional, but its existence contributes 
to some arbitrary “quality” rating/metrics

- metadata logically can be categorized in two types, one type is related to 
discoverability (tags, author, license, etc) and one type is related to 
interoperability (version number, dependencies)

- discoverability metadata is aggregated during the nightly quality check 
processes and automatically commits that information to the “bro/cban” repo.  
Without doing this, I think cban clients would have an incredibly slow “search” 
command that goes out to each submodule individually and gathers metadata.  
(features related to discoverability might be lower priority in general)

- interoperability metadata can also be aggregated nightly along with the 
discoverability metadata, but when the cban client is actually going to perform 
specific operations on a particular submodule, it gets this data directly from 
the cloned submodule(s) to make sure the info is up-to-date.  Version numbering 
can probably be done via git tags, with dependency info stored in a canonically 
named text file.
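
The last point could be sketched as follows, assuming version tags look
like "v1.2.3" or "1.2" and the dependency file holds one "name constraint"
pair per line. Both formats are assumptions made up for illustration; the
design above does not fix either.

```python
import re

# Assumed tag format: optional "v", then major.minor and optional patch.
TAG = re.compile(r"^v?(\d+)\.(\d+)(?:\.(\d+))?$")


def parse_version(tag):
    """Return (major, minor, patch) for a version tag, or None."""
    match = TAG.match(tag)
    if match is None:
        return None
    return tuple(int(part or 0) for part in match.groups())


def latest_version_tag(tags):
    """Pick the tag with the highest version, ignoring non-version tags."""
    versioned = [(parse_version(tag), tag) for tag in tags]
    versioned = [(version, tag) for version, tag in versioned if version]
    return max(versioned)[1] if versioned else None


def parse_dependencies(text):
    """Parse lines such as 'bro >=2.4' into {name: constraint}."""
    deps = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        name, _, constraint = line.partition(" ")
        deps[name] = constraint.strip() or "*"  # "*" means any version
    return deps
```

Comparing versions as integer tuples avoids the string-sorting trap where
"v1.10.0" would rank below "v1.2.0".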

I think all those points make things easy on contributors, minimize direct 
involvement of the Bro Team in sorting out problems related to particular 
plugins, and provide a useful way for users to discover and maintain Bro 
plugins.  There’s more potential for users to encounter broken/bad plugins, but 
maybe that also encourages stronger community involvement w/ users more likely 
to try and help get problems resolved.

- Jon
