Re: New tags for biology and medicine.

2007-09-07 Thread Rudi Cilibrasi
On 9/6/07, Benjamin Mesing [EMAIL PROTECTED] wrote:
 Hello,
 Are there packages out there which work on general sequences (i.e. are
 independent of the type of the sequence)? The utility sort comes to my
 mind, which can work on many different types (numbers, strings, dates).
 What you describe is obviously a nice idea, but I think it is beyond the scope
 of debtags. A package for DNA analysis will probably not work when
 fed with written language (without modification). And debtags is
 about describing what a package can do as it is.

The complearn-tools package is one example; it works well with genetic
sequences, protein sequences, written human languages, compiled
executable code, and many other domains.  It is in the class of
algorithms called universal learners which have recently gained
popularity. [1,2]  This terminology is not without support given
recent results in universal type theory. [3]
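
To illustrate why such a tool does not care about the type of sequence: the
clustering-by-compression approach of [2] is built on the normalized
compression distance (NCD), which only needs a general-purpose compressor.
Below is a minimal sketch, assuming Python's zlib as a stand-in compressor.
The function names are made up for this example, and complearn-tools itself
may use other compressors and options.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a stand-in for C(x)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    Values near 0 suggest similar inputs, values near 1 unrelated inputs.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

The inputs are plain bytes, so the same function can be fed DNA, protein
sequences, text or executable code without modification.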

Best regards,

Rudi

[1]: http://citeseer.ist.psu.edu/689388.html
[2]:  R. Cilibrasi, P.M.B. Vitanyi, Clustering by compression, IEEE
Trans. Information Theory, 51:4(2005)
[3]: Seroussi, Gadiel, On universal types, HPL-2004-153 20040917, HP Tech report



   +Tag: works-with::sequence:nuceleic
   +Description: Nucleic acids
   + Sequence of nucleic acids: DNA, RNA but also non-natural nucleic acids
   such as PNA or LNA.
   +
   +Tag: works-with::sequence:peptidic
   +Description: Proteins
   + Sequence of aminoacids: peptides and proteins.
  
   Quite detailed, though otherwise, people probably won't pick
   works-with::sequence if searching for algorithms working on DNA.
 
  I made this proposition with the goal of having a lot of debian-med
  packages which manipulate sequences. In that context, the biologist would
  naturally want to distinguish between proteins and nucleic acids: this
  is a very common distinction. But shall we wait until we have, say, 50
  packages which have field::biology and works-with::sequence?

 I have suggested to move those into the biology:: facet, so you get full
 expressivity without bloating the works-with:: facet.


   +Tag: works-with-format::plaintext:aln
   +Tag: works-with-format::plaintext:fasta
   +Tag: works-with-format::plaintext:nexus
 
  This is definitely an area where there is an overlap between mime types
  and tags. But I would definitely be excited if debtags could propose
  toolchains which are connected by the formats they accept. Once again,
  we do not have the critical mass yet...

 Same here, I proposed to put them into biology::.


  A few words of the proposals you made in another mail:
 
   * ::bioinformatics, ::molecular-biology, ::structural-biology
  I would rather see field::biology:molecular than
  field::biology:molecular-biology,

 Sure.
 However, my proposal was to have biology::molecular-biology. Though, you
 seem to prefer to keep this in the main field facet, which is also ok.


  biology::molecular-biology:structural instead of
  biology::structural(-biology) may horrify some of our colleagues, though.

 I think you have misread my proposal here. Or I misunderstand you.
 What would horrify your colleagues?


 * ::emboss
  I strongly advocate suite::emboss; we will get the critical mass for it.

 Again I would move that into a biology facet.


  In conclusion, about the possibility of managing our sets of tags
  ourselves. In everyday work, one has a very narrow point of view of one's
  tools. I use a PCR machine to make a PCR, I use a Pipetman® to
  pipette,... this could be expressed by biology::PCR, and
  biology::pipetting. But if we think harder, we can have a higher point
  of view. Instead of biology::PCR it would be use::amplification, or
  use::diagnostic, for instance, because the PCR machine produces DNA, but
  sometimes we want to keep it as a reagent, and some other times we just
  want to see its size and then we throw it away.
 
  So the questions I am wondering about are :
 
   - What is the most powerful approach?
   - What are the expectations of our users?
   - How can we interest our users in an unexpected and powerful usage of
 the DebTags?

 We had the discussion about the degree of detail for the vocabulary (which
 is the set of facets and tags) before, and most agree that a high degree
 is desirable. The complexity of a larger number of tags can be made
 manageable by a good user interface. However, I think this applies only
 to the general purpose domain (i.e. search criteria required by the
 majority of users). The other special purpose domains (like devel,
 security, medicine,..) IMO should be provided in separate modules of
 encapsulation (if you forgive me using this term from software
 engineering terminology) - which in this case can be represented as
 separate facets. Within those facets a high degree of detail can be
 achieved again.


  I think that an advanced usage of Debtags is the only way to bring
  attention of users and ourselves to programs which we do not expect to
  be relevant to 

Re: New tags for biology and medicine.

2007-09-06 Thread Andreas Tille
On Wed, 5 Sep 2007, Benjamin Mesing wrote:

 We had a short discussion on IRC about your proposal, and as far as we
 are concerned, Option 2. would be Ok for us (obviously Option 1. would
 also be ok, since we wouldn't have anything to do with that ;-). We
 would like to put the following tags in the main hierarchy either way:
  * field::medicine
  * use::comparison (though Enrico warned about the name - we would
imagine a diff tool from that, but I think it is just fine to
use it with different interpretation)
  * use::analysis
  * field::medicine:imaging (I wouldn't want to place that into
biology:: and don't see the need for a med:: facet yet)

I'm perfectly fine with this except the last item.  The currently
available packages for medical imaging definitely do not belong in
a biology section.  They are clearly about medicine and handle medical
image formats like DICOM.  Moreover we have a medical practice management
system (GNUmed) which does not really fit in any existing category.

 If there are no objections I will add those in roughly a week.

This would be great.

 And the following tags in the biology facet (note that I have adapted
 some of the tag names):
  * ::bioinformatics, ::molecular-biology, ::structural-biology
(though those could go into field::biology if you rather see
that)
  * ::format:aln, ::format:fasta, ::format:nexus (or would you
rather have aln-format, fasta-format,..?)
  * ::emboss
  * ::nucleic-acids, ::peptides
  * ::alignment-analysis, ::phylogeny-analysis (if you really think
this is necessary)

 Once this is agreed upon and the remaining questions are answered, I
 will add the biology facet.

I would regard this as a very reasonable compromise.

 We are not sure about the ::algorithm:* thing. They are not biology
 specific so it would be odd to put them there. Besides, Enrico pointed
 out, that nearly everything (at least the software) is made-of
 algorithms. Additionally, to me the whole made-of facet does not seem
 very concise anyways...

I trust in Enrico's sane experience. ;-))

Kind regards

 Andreas.

-- 
http://fam-tille.de

___
Debtags-devel mailing list
Debtags-devel@lists.alioth.debian.org
http://lists.alioth.debian.org/mailman/listinfo/debtags-devel


Re: New tags for biology and medicine.

2007-09-06 Thread Benjamin Mesing
On Thu, 2007-09-06 at 09:11 +0200, Andreas Tille wrote:
 On Wed, 5 Sep 2007, Benjamin Mesing wrote:
 
  We had a short discussion on IRC about your proposal, and as far as we
  are concerned, Option 2. would be Ok for us (obviously Option 1. would
  also be ok, since we wouldn't have anything to do with that ;-). We
  would like to put the following tags in the main hierarchy either way:
   * field::medicine
   * use::comparison (though Enrico warned about the name - we would
 imagine a diff tool from that, but I think it is just fine to
 use it with different interpretation)
   * use::analysis
   * field::medicine:imaging (I wouldn't want to place that into
 biology:: and don't see the need for a med:: facet yet)
 
 I'm perfectly fine with this except the last item.  The currently
 available packages for medical imaging do definitely not belong into
 a biology section.  It is clearly about medicine and handles medical
 image formats like DICOM.  Moreover we have a medical practice management
 system (GNUmed) which does not really fit in any yet existing category.

I think you have misunderstood me. It's the same that I thought.
Therefore I left it in the  field:: facet as field::medicine:imaging.

Regards Ben






Re: New tags for biology and medicine.

2007-09-06 Thread Charles Plessy
Dear all,

I was a bit lazily waiting for the conversation to settle before trying
to answer :)


 +Tag: field::biology:bioinformatics
 +Tag: field::biology:molecular
 +Tag: field::biology:structural
 
 This is probably a reasonable distinction, though we have to decide if
 we want such a fine-grained separation of the field facet. We would
 also end up with needing the same level of detail for electronics,
 chemistry, physics,...

I think that I would take a pragmatic approach: fine-graining as long
as there is a consensual demand. By this I mean that fine-graining a facet
should not become a hassle for the package maintainers who are not
interested in it. In the case of the Debian-Med project, I think that
each time we propose such tags it will mean that we have
people dedicated to screening all the parent tags and assigning the
fine-grained ones if necessary.

(by the way, could there be a subscription mechanism to monitor addition
and removal of tags ?)


 +Tag: field::medicine:imaging

I support creating field::medicine:imaging, and using field::biology +
use::analysis and works-with::image instead of field::biology:imaging. I
think that as long as we do not package software such as microscope
control tools this would make sense. field::medicine:imaging, on the other
hand, already has candidate packages whose usage is broader than just
taking and viewing pictures.


 +Tag: made-of::algorithm:dynamic-programming
 +Tag: made-of::algorithm:hashing
 +Tag: made-of::algorithm:hidden-markov-model
 +Tag: made-of::algorithm:neural-network

I like the idea, but I see that there is no consensus on it, and I think that
we have not reached a critical mass yet. I propose to postpone: let us keep
the proposal in debian-med's SVN, and see in a few months when we
have improved our coverage.


 +Tag: works-with::sequence
 +Description: Sequence
 + The program manipulates data made of a sequence of elements from a
 finite set.
 
 Somehow this is different to the current tags in works-with, but I
 believe it could fit in. E.g. sorting applications could also fit in
 here?

I think that this is exactly the goal. Sometimes there is innovative
research which is done by taking tools for analysing genome sequence and
utilizing them on written language, or vice-versa. I would see this tag
with a high level of abstraction.


 +Tag: works-with::sequence:nuceleic
 +Description: Nucleic acids
 + Sequence of nucleic acids: DNA, RNA but also non-natural nucleic acids
 such as PNA or LNA.
 +
 +Tag: works-with::sequence:peptidic
 +Description: Proteins
 + Sequence of aminoacids: peptides and proteins.
 
 Quite detailed, though otherwise, people probably won't pick
 works-with::sequence if searching for algorithms working on DNA.

I made this proposition with the goal of having a lot of debian-med
packages which manipulate sequences. In that context, the biologist would
naturally want to distinguish between proteins and nucleic acids: this
is a very common distinction. But shall we wait until we have, say, 50
packages which have field::biology and works-with::sequence?


 +Tag: works-with-format::plaintext:aln
 +Tag: works-with-format::plaintext:fasta
 +Tag: works-with-format::plaintext:nexus

This is definitely an area where there is an overlap between mime types
and tags. But I would definitely be excited if debtags could propose
toolchains which are connected by the formats they accept. Once again,
we do not have the critical mass yet...


 I am not sure it is a good idea to put those beneath plaintext. There
 are the two cases: 
  1. Someone searching for a tool for editing plaintext would end up
 with the special purpose plaintext:aln editors, which IMO is
 undesirable.
  2. Someone searching for a special purpose plaintext:aln editor
 could deduce from the tag name, that he could also use
 plaintext, and if he knows that ALN is a plaintext format he
 could navigate there smoothly (which assumes that the tags are
 displayed in a hierarchical manner).
 
 So the formats could as well be top level. Though this would mean
 cluttering the works-with-format facet. Could there be a
 works-with-format::special-purpose:* group?
 Do we need a way to express relationships between tags like: show
 works-with-format::plaintext:aln only if field::biology or
 field::medicine is selected? Or do we want to cover this by requiring
 sophisticated UIs, which detect this in an automatic fashion.

I will bravely let you choose, as you know Debtags much better than I
do. I think that it could be useful to know that fasta, nexus, aln, ...
are plaintext formats.


 +Tag: use::comparison:alignment
 +Description: Alignment
 + To identify similarities in two objects by maximising the overlap of
 identical parts.
 +
 +Tag: use::comparison:phylogeny
 +Description: Phylogenetic analysis
 + To infer lineage relationships.
 
 Those seem to be covered by use::analysis to me.

Alignment and phylogeny are very 

Re: New tags for biology and medicine.

2007-09-06 Thread Benjamin Mesing
Hello,

On Thu, 2007-09-06 at 20:30 +0900, Charles Plessy wrote:
 Dear all,
 
 I was a bit lazily waiting for the conversation to settle before trying
 to answer :)
 
 
  +Tag: field::biology:bioinformatics
  +Tag: field::biology:molecular
  +Tag: field::biology:structural
  
  This is probably a reasonable distinction, though we have to decide if
  we want such a fine-grained separation of the field facet. We would
  also end up with needing the same level of detail for electronics,
  chemistry, physics,...
 
 I think that I would take a pragmatic approach: fine-graining as long
 as there is a consensual demand. By this I mean that fine-graining a facet
 should not become a hassle for the package maintainers who are not
 interested in it. In the case of the Debian-Med project, I think that
 each time we propose such tags it will mean that we have
 people dedicated to screening all the parent tags and assigning the
 fine-grained ones if necessary.

There are two more things to consider:
 1. the users who do searching based on tags and
 2. the people doing the tagging.
With each tag, the complexity of the vocabulary increases, and
only a small percentage of the people mentioned above are interested in
the level of detail provided by the med-specific tags. However, they
have to deal with those tags either way. It is to reduce the burden on those
people that I proposed keeping the tags in a separate facet. It
might even make things easier for med-interested people, because they
would probably recognise the biology:: facet as an important one and go
straight there to look for interesting tags.

 (by the way, could there be a subscription mechanism to monitor addition
 and removal of tags ?)

I believe the best thing right now is an SVN diff, which could
theoretically be hooked into sending an email upon changes. However, no
such thing is currently implemented (I believe).
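
For what it is worth, the comparison part of such a notification is small.
Below is a minimal sketch, assuming the vocabulary is a text file in which
each tag is introduced by a line starting with "Tag: " (as in the proposed
diff). The function names are made up for this example, and fetching the two
revisions from SVN and sending the mail are left out.

```python
def tags_in(vocabulary_text):
    """Return the set of tag names declared in a vocabulary file's text."""
    prefix = "Tag: "
    return {
        line[len(prefix):].strip()
        for line in vocabulary_text.splitlines()
        if line.startswith(prefix)
    }


def vocabulary_changes(old_text, new_text):
    """Return (added, removed) tag names between two vocabulary versions."""
    old_tags = tags_in(old_text)
    new_tags = tags_in(new_text)
    return sorted(new_tags - old_tags), sorted(old_tags - new_tags)
```

A commit hook could call vocabulary_changes() on the previous and the new
revision and mail the two lists to whoever subscribed.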


  +Tag: works-with::sequence
  +Description: Sequence
  + The program manipulates data made of a sequence of elements from a
  finite set.
  
  Somehow this is different to the current tags in works-with, but I
  believe it could fit in. E.g. sorting applications could also fit in
  here?
 
 I think that this is exactly the goal. Sometimes there is innovative
 research which is done by taking tools for analysing genome sequence and
 utilizing them on written language, or vice-versa. I would see this tag
 with a high level of abstraction.

Are there packages out there which work on general sequences (i.e. are
independent of the type of the sequence)? The utility sort comes to my
mind, which can work on many different types (numbers, strings, dates).
What you describe is obviously a nice idea, but I think it is beyond the scope
of debtags. A package for DNA analysis will probably not work when
fed with written language (without modification). And debtags is
about describing what a package can do as it is.


  +Tag: works-with::sequence:nuceleic
  +Description: Nucleic acids
  + Sequence of nucleic acids: DNA, RNA but also non-natural nucleic acids
  such as PNA or LNA.
  +
  +Tag: works-with::sequence:peptidic
  +Description: Proteins
  + Sequence of aminoacids: peptides and proteins.
  
  Quite detailed, though otherwise, people probably won't pick
  works-with::sequence if searching for algorithms working on DNA.
 
 I made this proposition with the goal of having a lot of debian-med
 packages which manipulate sequences. In that context, the biologist would
 naturally want to distinguish between proteins and nucleic acids: this
 is a very common distinction. But shall we wait until we have, say, 50
 packages which have field::biology and works-with::sequence?

I have suggested to move those into the biology:: facet, so you get full
expressivity without bloating the works-with:: facet.


  +Tag: works-with-format::plaintext:aln
  +Tag: works-with-format::plaintext:fasta
  +Tag: works-with-format::plaintext:nexus
 
 This is definitely an area where there is an overlap between mime types
 and tags. But I would definitely be excited if debtags could propose
 toolchains which are connected by the formats they accept. Once again,
 we do not have the critical mass yet...

Same here, I proposed to put them into biology::.


 A few words of the proposals you made in another mail:
 
  * ::bioinformatics, ::molecular-biology, ::structural-biology
 I would rather see field::biology:molecular than
 field::biology:molecular-biology, 

Sure.
However, my proposal was to have biology::molecular-biology. Though, you
seem to prefer to keep this in the main field facet, which is also ok.


 biology::molecular-biology:structural instead of
 biology::structural(-biology) may horrify some of our colleagues, though.

I think you have misread my proposal here. Or I misunderstand you.
What would horrify your colleagues?


* ::emboss
 I strongly advocate suite::emboss; we will get the critical mass for it.

Again I would move that into a 

Re: New tags for biology and medicine.

2007-09-06 Thread Charles Plessy
On Thu, Sep 06, 2007 at 10:21:57PM +0200, Benjamin Mesing wrote:
 
 There are two more things to consider:
  1. the users who do searching based on tags and
  2. the people doing the tagging.
 With each tag, the complexity of the vocabulary will be increased and
 only a small percentage of the people mentioned above is interested in
 the level of detail provided by the med-specific tags.

Hello,

nobody wants to be lost in a space with too many dimensions which are
almost empty. This is why I would prefer to express the properties of
the package with already existing tags rather than with private
biology:: tags. But I will of course not oppose anybody using this
approach, and will do my best so that the packages in our radar are
using them appropriately if they exist.

So unless a new idea pops up, my recommendation is to
commit the tags we all agreed on, and re-open the discussion
in a few months, when:

 - we in Debian-Med have extended our software coverage,
 - you have got diverse feedback from other Debian teams.

Have a nice day,


PS:

  biology::molecular-biology:structural instead of
  biology::structural(-biology) may horrify some of our colleagues, though.
 
 I think you have misread my proposal here. Or I misunderstand you.
 What would horrify your colleagues?

That structural biology is a whole discipline of its own, and not a mere
offspring of molecular biology ;)

-- 
Charles Plessy
http://charles.plessy.org
Wako, Saitama, Japan



Re: New tags for biology and medicine.

2007-09-05 Thread Benjamin Mesing
Hello

  Thus we need to decide, if those details should become
  part of the main vocabulary database.
 
 Well, I don't think that we should make a harsh difference compared
 to the main vocabulary database.  Considering the effect of a less
 fine grained tagging: People will be presented a list of (guess)
 20 items instead of 3-5 items for the more fine grained list, but
 I think 20 packages in a list are manageable.  The danger of bloating
 the system with about 15 more packages you might not need is not
 really a thing many people are scared of.

Sorry, I can't really follow your thoughts here, do you vote against a
fine-grained tagging? With the fine-grained tags, you will have more
tags, but usually a smaller result set (i.e. package list). So what you
are bloating is the vocabulary (the set of all available tags and
facets).
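
To make the distinction concrete, here is a minimal sketch of a tag-based
search. The package names and their tag assignments are invented for the
example, and real debtags tooling works on the actual tag database rather
than on a dictionary like this one.

```python
# Hypothetical packages with hypothetical tag assignments.
PACKAGES = {
    "example-aligner": {"field::biology", "field::biology:bioinformatics"},
    "example-cloning-helper": {"field::biology", "field::biology:molecular"},
    "example-structure-viewer": {"field::biology", "field::biology:structural"},
    "example-dicom-viewer": {"field::medicine", "field::medicine:imaging"},
}


def search(packages, wanted):
    """Return the names of all packages carrying every tag in `wanted`."""
    wanted = set(wanted)
    return sorted(name for name, tags in packages.items() if wanted <= tags)


coarse = search(PACKAGES, {"field::biology"})
fine = search(PACKAGES, {"field::biology:structural"})
print(len(coarse), "packages for the coarse tag,", len(fine), "for the fine one")
```

The vocabulary grows by three tags here, but searching on a fine-grained tag
returns one package where the coarse tag returns three.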

  Another way would be to provide
  them in a different vocabulary/tag database - debtags supports multiple
  of those.
 
 Just for the sake of academical interest: What are the consequences of
 a different vocabulary/tag database?  I guess the drawback is higher
 than a fine grained tagging.

Advantages: 
  * clean separation 
  * you keep the full expressivity of the main vocabulary (i.e. you
can add tags into the other facets like works-with, made-of...) 
Disadvantages: 
  * additional administrative overhead for hosting the tag database 
  * additional overhead for users of this tag database, which must
be enabled one way or another 
  * tagging infrastructure must be provided (or happen centrally by
the Debian-med team) 


  +Tag: field::biology:bioinformatics
  +Description: Bioinformatics
  + Sequence analysis software.
  +
  +Tag: field::biology:molecular
  +Description: Molecular biology
  + Software useful to molecular cloning and related wet biology.
  +
  +Tag: field::biology:structural
  +Description: Structural biology
  + Software useful to model tridimentional structures.
  +
 
  This is probably a reasonable distinction, though we have to decide if
  we want such a fine-grained separation of the field facet.
 
 I also wonder whether we gain much at the users' end.  It might happen that
 users have a slightly different perception of these terms and we could

This would hint to have them only inside a special debian-med:: area.

  We would
  also end up with needing the same level of detail for electronics,
  chemistry, physics,...
 
 Well, this is always the same - you need someone who does the job.
 Debian-Med just joins forces for people interested in medicine and
 biology so we are a little bit ahead. :)

Sure, I am not saying that we actually *need* the level of detail there,
but that eventually the same level of detail will arise in the other
areas, which will bloat the vocabulary.

Regards Ben




Re: New tags for biology and medicine.

2007-09-04 Thread Benjamin Mesing
Hello,

I believe it is past time to react to this proposal, we don't want to be
seen as some kind of black hole, everything that goes in never comes
out. And since I have some spare time at hand, I will make a start.
Generally speaking the proposed tags are relatively detailed. I believe
this level of detail is required only by biologists or people in the
medical field. Thus we need to decide, if those details should become
part of the main vocabulary database. Another way would be to provide
them in a different vocabulary/tag database - debtags supports multiple
of those.

Below you can find my thoughts towards the proposal under the assumption
that the tags should become part of the main database.

Index: debian-packages
===
--- debian-packages (révision 2253)
+++ debian-packages (copie de travail)
@@ -559,6 +559,18 @@
 
+Tag: field::biology:bioinformatics
+Description: Bioinformatics
+ Sequence analysis software.
+
+Tag: field::biology:molecular
+Description: Molecular biology
+ Software useful to molecular cloning and related wet biology.
+
+Tag: field::biology:structural
+Description: Structural biology
+ Software useful to model three-dimensional structures.
+

This is probably a reasonable distinction, though we have to decide if
we want such a fine-grained separation of the field facet. We would
also end up with needing the same level of detail for electronics,
chemistry, physics,...

+Tag: field::medicine
+Description: Medicine

I believe that one is agreed upon.


+Tag: field::medicine:imaging
+Description: Medical Imaging
+

Same as for the ::biology:* tags

 
+Tag: made-of::algorithm:dynamic-programming
+Description: Dynamic programming
+
+Tag: made-of::algorithm:hashing
+Description: Hashing
+
+Tag: made-of::algorithm:hidden-markov-model
+Description: Hidden Markov Model (HMM)
+
+Tag: made-of::algorithm:neural-network
+Description: Neural Network
+

Can you please give an example of such a package? I have no idea what a
package made of an algorithm looks like.

 
+Tag: works-with::sequence
+Description: Sequence
+ The program manipulates data made of a sequence of elements from a
finite set.

Somehow this is different to the current tags in works-with, but I
believe it could fit in. E.g. sorting applications could also fit in
here?


+Tag: works-with::sequence:nuceleic
+Description: Nucleic acids
+ Sequence of nucleic acids: DNA, RNA but also non-natural nucleic acids
such as PNA or LNA.
+
+Tag: works-with::sequence:peptidic
+Description: Proteins
+ Sequence of aminoacids: peptides and proteins.

Quite detailed, though otherwise, people probably won't pick
works-with::sequence if searching for algorithms working on DNA.

 
+Tag: works-with-format::plaintext:aln
+Description: Clustal/ALN
+ Used in multiple alignment of biological sequences.
+
+Tag: works-with-format::plaintext:fasta
+Description: Fasta/Pearson
+ Very popular format for biological sequences.
+
+Tag: works-with-format::plaintext:nexus
+Description: Nexus
+ Popular format for phylogenetic trees.

I am not sure it is a good idea to put those beneath plaintext. There
are the two cases: 
 1. Someone searching for a tool for editing plaintext would end up
with the special purpose plaintext:aln editors, which IMO is
undesirable.
 2. Someone searching for a special purpose plaintext:aln editor
could deduce from the tag name, that he could also use
plaintext, and if he knows that ALN is a plaintext format he
could navigate there smoothly (which assumes that the tags are
displayed in a hierarchical manner).

So the formats could as well be top level. Though this would mean
cluttering the works-with-format facet. Could there be a
works-with-format::special-purpose:* group?
Do we need a way to express relationships between tags like: show
works-with-format::plaintext:aln only if field::biology or
field::medicine is selected? Or do we want to cover this by requiring
sophisticated UIs, which detect this in an automatic fashion.
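
Such a relationship could be expressed as a simple rule table. Below is a
minimal sketch, assuming a mapping from a special purpose tag to the tags of
which at least one must be selected before it is shown. Neither the rule
table nor the function exists in debtags; both are invented for illustration.

```python
# Hypothetical rule table: tag -> tags of which at least one must be selected.
DISPLAY_RULES = {
    "works-with-format::plaintext:aln": {"field::biology", "field::medicine"},
}


def visible_tags(all_tags, selected, rules):
    """Return the tags a UI would offer, given the currently selected tags."""
    selected = set(selected)
    visible = []
    for tag in all_tags:
        prerequisites = rules.get(tag)
        # Tags without a rule are always shown; tags with a rule are shown
        # only once one of their prerequisite tags has been selected.
        if prerequisites is None or prerequisites & selected:
            visible.append(tag)
    return visible
```

With nothing selected, plaintext:aln stays hidden; once field::biology is
selected it appears next to the general purpose tags.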

+Tag: suite::emboss
+Description: EMBOSS
+ Software and data related to the European Molecular Biology Open
Software Suite.
+

Sounds good to me.

 
+Tag: use::analysis
+Description: Analysis
+ Software for turning data into knowledge.
+

Agreed.
 
+Tag: use::comparison
+Description: Comparison
+ To find what relates or differs in two or more objects.

Agreed.

+
+Tag: use::comparison:alignment
+Description: Alignment
+ To identify similarities in two objects by maximising the overlap of
identical parts.
+
+Tag: use::comparison:phylogeny
+Description: Phylogenetic analysis
+ To infer lineage relationships.

Those seem to be covered by use::analysis to me.


Thanks Charles for bringing the topic up again.

Regards Ben



Re: New tags for biology and medicine.

2007-09-04 Thread Andreas Tille

On Tue, 4 Sep 2007, Benjamin Mesing wrote:


I believe it is past time to react to this proposal, we don't want to be
seen as some kind of black hole, everything that goes in never comes
out. And since I have some spare time at hand, I will make a start.
Generally speaking the proposed tags are relatively detailed. I believe
this level of detail is required only by biologists or people in the
medical field.


Probably.


Thus we need to decide, if those details should become
part of the main vocabulary database.


Well, I don't think that we should make a harsh difference compared
to the main vocabulary database.  Considering the effect of a less
fine grained tagging: People will be presented a list of (guess)
20 items instead of 3-5 items for the more fine grained list, but
I think 20 packages in a list are manageable.  The danger of bloating
the system with about 15 more packages you might not need is not
really a thing many people are scared of.


Another way would be to provide
them in a different vocabulary/tag database - debtags supports multiple
of those.


Just for the sake of academic interest: What are the consequences of
a different vocabulary/tag database?  I guess the drawbacks are greater
than those of fine-grained tagging.


Index: debian-packages
===
--- debian-packages (révision 2253)
+++ debian-packages (copie de travail)
@@ -559,6 +559,18 @@

+Tag: field::biology:bioinformatics
+Description: Bioinformatics
+ Sequence analysis software.
+
+Tag: field::biology:molecular
+Description: Molecular biology
+ Software useful to molecular cloning and related wet biology.
+
+Tag: field::biology:structural
+Description: Structural biology
+ Software useful to model tridimentional structures.
+

This is probably a reasonable distinction, though we have to decide if
we want such a fine-grained separation of the field facet.


I also wonder whether we gain much at the users' end.  It might happen that
users have a slightly different perception of these terms and we could


We would
also end up with needing the same level of detail for electronics,
chemistry, physics,...


Well, this is always the same - you need someone who does the job.
Debian-Med just joins forces for people interested in medicine and
biology so we are a little bit ahead. :)


+Tag: field::medicine
+Description: Medicine

I believe that one is agreed upon.


+Tag: field::medicine:imaging
+Description: Medical Imaging
+

Same as for the ::biology:* tags


Well, I do not agree here completely.  We have a fair amount of
packages (becoming more soon) that deal with medical imaging.
If people are interested just in imaging they probably do not
like things like a practice management system (depending on
PostgreSQL server and other stuff).  So IMHO this differentiation
might be worth the effort, but if you think it would spoil
the principle of keeping things simple - just leave it out.


Quite detailed, though otherwise, people probably won't pick
works-with::sequence if searching for algorithms working on DNA.


I'm afraid you are right here.


So the formats could as well be top level. Though this would mean
cluttering the works-with-format facet. Could there be a
works-with-format::special-purpose:* group?


Sounds very reasonable.

Thanks for your input.  It partly shows that outsiders are able
to bring some abstraction into the focussed view of specialists.

Kind regards

   Andreas.

--
http://fam-tille.de

Re: New tags for biology and medicine.

2007-09-04 Thread Steffen Moeller
Hello,

On Tuesday 04 September 2007 12:10:30 Benjamin Mesing wrote:
 Generally speaking the proposed tags are relatively detailed. I believe
 this level of detail is required only by biologists or people in the
 medical field. Thus we need to decide, if those details should become
 part of the main vocabulary database. Another way would be to provide
 them in a different vocabulary/tag database - debtags supports multiple
 of those.

[...]
 +Tag: field::biology:structural
 +Description: Structural biology
 + Software useful to model three-dimensional structures.
 +

 This is probably a reasonable distinction, though we have to decide if
 we want such a fine-grained separation of the field facet. We would
 also end up with needing the same level of detail for electronics,
 chemistry, physics,...

Yes, I think we do. The following two reasons jump to mind:
 * When thinking about automated installations of software (e.g. in grid
computing) we need a language that allows us to talk about what is eligible
for installation and what is not. Debtags is not perfect, and there are other
efforts to describe various kinds of properties that software can have, but
there is nothing as sweet as Debtags to talk about what the software is
actually doing.
 * Debian integrates communities. This is how I read Custom Debian
Distributions: they basically say that people flock together to extend
Debian in a particular direction. Specialisation of Debian comes with a
specialisation of terms. It is natural.

I like the suggestion sketched above to allow for disjoint sets of facets that
are maintained by different communities. It would seem natural to me to
eventually allow for sub-facets of some kind, with a higher number of ':' in
their IDs, to allow for an easier reduction of complexity. Though ...
well ... it may not be needed tomorrow.

Many greetings

Steffen





Re: New tags for biology and medicine.

2007-09-04 Thread Andreas Tille
On Tue, 4 Sep 2007, Steffen Moeller wrote:

 * When thinking about automated installations of software (e.g. in grid
 computing) we need a language that allows us to talk about what is eligible
 for installation and what is not. Debtags is not perfect, and there are other
 efforts to describe various kinds of properties that software can have, but
 there is nothing as sweet as Debtags to talk about what the software is
 actually doing.

Well, I'm convinced that DebTags might be a great tool for different
things, but I doubt that it is the best idea to base installations of
clusters on DebTags technology.  You certainly want to know _exactly_
what is installed on your cluster and do not really want it to be changed
by any change in the DebTags database.  I think for this purpose the
meta package approach is the better way to go.

 * Debian integrates communities. This is how I read Custom Debian
 Distributions: they basically say that people flock together to extend
 Debian in a particular direction. Specialisation of Debian comes with a
 specialisation of terms. It is natural.

Sure.  But I think subsetting makes sense in case your main set is
too large to be managed with the means you have at hand.  IMHO this
is actually not (yet) the case.  We want to extend Debian but I don't
think that we should try to make a science out of classifying and
subsetting what finally might end up on a real-life installation
all together again.

 I like the suggestion sketched above to allow for disjoint sets of facets that
 are maintained by different communities. It would seem natural to me to
 eventually allow for sub-facets of some kind, with a higher number of ':' in
 their IDs, to allow for an easier reduction of complexity. Though ...
 well ... it may not be needed tomorrow.

I think we could wait with our fine-grained subsets until this is
implemented.  Once this is done, the number of packages that
justifies a more fine-grained subsetting will also have increased. :)

Kind regards

   Andreas.

-- 
http://fam-tille.de



New tags for biology and medicine.

2007-08-23 Thread Charles Plessy
Hi Debtags team!

This is the monthly reminder that the Debian-Med team proposed new tags
in May,

http://lists.alioth.debian.org/pipermail/debtags-devel/2007-May/001630.html
http://lists.alioth.debian.org/pipermail/debtags-devel/2007-July/001658.html

Have a nice day,

-- 
Charles Plessy
http://charles.plessy.org
Wako, Saitama, Japan
