Re: LyX CVS Depository

2001-01-04 Thread Jean-Marc Lasgouttes

> "Amir" == Amir Michail <[EMAIL PROTECTED]> writes:

Amir> I would like to download the entire CVS depository for LyX.
Amir> However, webcvs gives a depository that has been cut off at some
Amir> point in the past. I can no longer go to the very first versions
Amir> of the files.

The oldest repositories are lyx-1_0_x and lyx. The current one,
lyx-devel, dates back to September 99. I'm afraid we do not have
anything older.

Amir> P.S. This is to test a new kind of search tool for code. It
Amir> would only work well if I have access to early commits to the
Amir> code.

This looks very interesting.

JMarc



LyX CVS Depository

2001-01-03 Thread Amir Michail

Hi,

I would like to download the entire
CVS depository for LyX.  However,
webcvs gives a depository that has been
cut off at some point in the past.  I can no longer
go to the very first versions of the files.

Do you know where I can get a fuller CVS
depository for LyX?  While I need
versions from the very beginning (or as early
as possible), I do not need the very latest versions.

Amir

P.S.  This is to test a new kind of search tool
for code.  It would only work well if I have access
to early commits to the code.

--- FYI, tool description follows

CVSSearch:  A New Way to Search through Source Code
 
CVSSearch searches for code fragments
using CVS comments. Specifically, it
takes advantage of the fact that a CVS comment
typically describes the lines of code involved
in the commit and that this description
will typically hold for many future versions.
 
In other words, CVSSearch lets you search
the most recent version of the code more
effectively by using what was said about
previous versions of the same lines.
 
It works as follows:
 
* typically, each comment in a CVS commit not only
  describes the change made but also
  indirectly describes the purpose of the lines of code
  involved in that change (e.g., "added footnote feature"
  indirectly tells you that the lines involved
  in the commit have something to do with footnotes)
 
* each line in the code accumulates
  a "profile" that contains all words in the
  comments of commits that involved that line,
  and each word has an associated frequency,
  which is the number of commits that involved
  that line with a comment containing that word.
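
The two steps above can be sketched roughly as follows. This is only
an illustration, not the actual CVSSearch code: it assumes that each
commit is already available as a (comment, touched line ids) pair and
that a line keeps the same id across versions, which in practice has
to be derived from the diffs. The function names, line ids, and
comments are made up.

```python
import re
from collections import Counter, defaultdict


def words(comment):
    """Return the set of lowercased words in a commit comment."""
    return set(re.findall(r"[a-z]+", comment.lower()))


def build_profiles(commits):
    """Build a word-frequency profile for every line.

    commits: iterable of (comment, line_ids) pairs (assumed input format).
    Returns {line_id: Counter}; each count is the number of commits that
    touched the line and whose comment contains the word.
    """
    profiles = defaultdict(Counter)
    for comment, line_ids in commits:
        vocab = words(comment)  # a set, so a commit counts once per word
        for line_id in set(line_ids):
            profiles[line_id].update(vocab)
    return profiles


# Hypothetical history, for illustration only.
commits = [
    ("added footnote feature", ["text.C:10", "text.C:11"]),
    ("footnote bug fixed", ["text.C:11"]),
    ("bug fixed in editing window", ["text.C:11", "view.C:3"]),
]
profiles = build_profiles(commits)
```

Here the line "text.C:11" was touched by two commits mentioning
"footnote", so that word gets frequency 2 in its profile.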
 
The idea is to let you search the code base
based on the profiles extracted from the CVS
comments. 

This has several advantages:
 
* if a line is affected by many commits, then
  you get multiple summaries/aspects of
  the purpose of that line, as described by
  multiple authors in multiple commits
  (in contrast, a comment in the code itself
  can be viewed as just one summary)
 
* you can search for something like
  "editing window" and get a match even
  if the code does not contain these words
  but at least one author decided to use
  those terms to describe his modifications
  to the code. (That is, this allows us to
  address the vocabulary mismatch problem.)
 
* you can search for "bug" to find lines
  in the code that are especially bug prone
  (since those lines were involved in many
  commits with "bug fixed" or something similar)
 
* you get very precise information about
  the exact lines in the code that relate
  to your query (which need not appear
  in a contiguous region of code)
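
A search over such profiles might look like the sketch below. The
ranking rule (sum of the frequencies of the query words, requiring
every query word to be present) is an assumption made for the
example; the description above does not fix a scoring function. The
profiles and line ids are made up.

```python
from collections import Counter


def search(profiles, query):
    """Return line ids whose profile contains every query word.

    Lines are ranked by the summed frequency of the query words
    (an assumed scoring rule), with ties broken by line id.
    """
    terms = query.lower().split()
    hits = []
    for line_id, profile in profiles.items():
        if all(profile[t] > 0 for t in terms):
            hits.append((sum(profile[t] for t in terms), line_id))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [line_id for _, line_id in hits]


# Hypothetical per-line profiles, for illustration only.
profiles = {
    "text.C:10": Counter({"added": 1, "footnote": 1, "feature": 1}),
    "text.C:11": Counter({"footnote": 2, "bug": 2, "fixed": 2,
                          "editing": 1, "window": 1}),
    "view.C:3": Counter({"bug": 1, "fixed": 1, "editing": 1, "window": 1}),
}

bug_prone = search(profiles, "bug")
window_lines = search(profiles, "editing window")
```

Note that the lines returned for "editing window" come from two
different files and need not contain those words themselves.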
 
Intuitively speaking, a comment on a particular
version of an application will probably continue
to hold for a lot of versions that follow, so
it makes sense to combine commit comments
in this way.

The method I described can be viewed as computing
a *vertical* profile for each line from previous changes to the code.
 
It is also possible to compute a *horizontal* profile for lines
by looking at CVS comments in other projects with similar code.
Thus, to get a meaningful profile for a line/group of lines, it is
only necessary that a CVS comment has applied to those lines
in the past of the current application or in some other application
with similar code.  (You can use local similarity, as is done
with DNA, to identify similar code fragments in different contexts.)
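
One way to measure local similarity between two code fragments is a
Smith-Waterman style local alignment over their tokens. The sketch
below is an assumption about how this could be done, not a
description of what CVSSearch does; the scoring values (match 2,
mismatch -1, gap -1) are arbitrary.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score of sequences a and b.

    a and b can be strings or lists of tokens. A higher score means
    the two sequences share a longer or closer matching region.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Two made-up token sequences that share the region "A C G T".
left = "X X A C G T Y Y".split()
right = "Z Z A C G T W W".split()
shared = local_alignment_score(left, right)
```

Fragments from another project whose score exceeds some threshold
could then lend their CVS comments to the matching lines here.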
 
Of course, you can combine vertical and horizontal profiles.
In this way, we can get around the great variation in CVS
comment quality.
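
Combining the two kinds of profile could be as simple as a weighted
sum of word frequencies, as in this sketch. The weight of 0.5 for
horizontal counts is an arbitrary assumption, chosen only to show
that comments from other projects might be trusted less than
comments on the line's own history.

```python
from collections import Counter


def combine(vertical, horizontal, weight=0.5):
    """Merge two word-frequency profiles for the same line.

    Vertical counts are kept as they are; horizontal counts are
    scaled by `weight` (an assumed down-weighting factor).
    """
    combined = Counter()
    for word, count in vertical.items():
        combined[word] += count
    for word, count in horizontal.items():
        combined[word] += weight * count
    return combined


# Hypothetical profiles: a poor local comment, better ones elsewhere.
vertical = Counter({"fix": 1})
horizontal = Counter({"footnote": 4, "fix": 2})
merged = combine(vertical, horizontal)
```

In this example the word "footnote" enters the profile only through
the horizontal part, which makes up for the uninformative local
comment.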