Re: [gentoo-user] Re: attic

2023-09-04 Thread Alan McKinnon
On Mon, Sep 4, 2023 at 2:36 PM Rich Freeman  wrote:

> On Mon, Sep 4, 2023 at 4:38 AM William Kenworthy 
> wrote:
> >
> > On 4/9/23 16:04, Nuno Silva wrote:
> > >
> > > (But note that Rich was suggesting using the *search* feature of the
> > > gitweb interface, which, in this case, also finds the same topmost
> > > commit if I search for "reedsolomon".)
> > >
> > tkx, missed that!
>
> Note that in terms of indexing git and CVS have their pros and cons,
> because they use different data structures.  I've heard the saying
> that Git is a data structure masquerading as an SCM, and certainly the
> inconsistencies in the command line operations bear that out.
>

I'd always heard that Git is a file system and all useful side effects are
pure luck

-- 
Alan McKinnon
alan dot mckinnon at gmail dot com


Re: [gentoo-user] Re: attic

2023-09-04 Thread Rich Freeman
On Mon, Sep 4, 2023 at 4:38 AM William Kenworthy  wrote:
>
> On 4/9/23 16:04, Nuno Silva wrote:
> >
> > (But note that Rich was suggesting using the *search* feature of the
> > gitweb interface, which, in this case, also finds the same topmost
> > commit if I search for "reedsolomon".)
> >
> tkx, missed that!

Note that in terms of indexing git and CVS have their pros and cons,
because they use different data structures.  I've heard the saying
that Git is a data structure masquerading as an SCM, and certainly the
inconsistencies in the command line operations bear that out.  Git
tends to be much more useful in general, but for things like finding
deleted files CVS was definitely more time-efficient.

The reason for this is that everything in git is reachable via
commits, and these are reachable from a head via a linked list.  The
most recent commit gives access to the current version of the
repository, and a pointer to the immediately previous commit(s).  To
find a deleted file, git must go to the most recent commit in whatever
branch you are searching, then descend its tree to look for the file.
If it is not found, it then goes to the previous commit and descends
that tree.  There are 745k commits in the active Gentoo repository.  I
think there are something like 2M of them in the historical one.  Each
commit is a random seek, and then each step down the directory tree to
find a file is another random seek.
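
As a rough illustration of the walk described above, here is a toy model in
Python. It is not how git stores anything on disk: the Commit class, the
nested-dict trees and the "EAPI=8" file content are all made up for the
example, and each dictionary lookup stands in for one random seek.

```python
# Toy model (not real git): each commit holds a full tree snapshot and a
# pointer to its parent, like the linked list described above.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commit:
    tree: dict                        # nested dicts = directories, str = file
    parent: Optional["Commit"] = None


def lookup(tree, path):
    """Descend the tree one directory level at a time, counting the reads."""
    node, reads = tree, 0
    for part in path.split("/"):
        reads += 1
        if not isinstance(node, dict) or part not in node:
            return None, reads
        node = node[part]
    return node, reads


def find_deleted(head, path):
    """Walk back from head until some commit still contains path."""
    commit, total_reads = head, 0
    while commit is not None:
        total_reads += 1              # reading the commit object itself
        content, reads = lookup(commit.tree, path)
        total_reads += reads
        if content is not None:
            return content, total_reads
        commit = commit.parent
    return None, total_reads


# Three commits: the file exists in the oldest one and is gone afterwards.
c1 = Commit({"dev-python": {"reedsolomon": {"reedsolomon-1.6.1.ebuild": "EAPI=8"}}})
c2 = Commit({"dev-python": {}}, parent=c1)
c3 = Commit({"dev-python": {}}, parent=c2)

content, reads = find_deleted(c3, "dev-python/reedsolomon/reedsolomon-1.6.1.ebuild")
print(content, reads)
```

The read count grows with the number of commits between the head and the
last commit that still had the file, which is the cost being described.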

In CVS everything is organized first by file, and then each file has
its own commit history.  So finding a file, deleted or otherwise, just
requires a seek for each level in the directory tree.  Then you can
directly read its history.
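
For comparison, a similarly rough sketch of the per-file layout. Again this
is a toy model, not real CVS: the cvs_root dict and the revision tuples are
invented for illustration, and real CVS moves the ,v file of a removed file
into an Attic subdirectory, which the sketch ignores.

```python
# Toy model (not real CVS): the repository is a directory tree whose leaves
# are per-file revision lists, roughly like the ,v files in a CVS root.
cvs_root = {
    "dev-python": {
        "reedsolomon": {
            "reedsolomon-1.6.1.ebuild": [
                ("1.1", "added"),
                ("1.2", "deleted"),   # a removal is one more revision of the file
            ],
        },
    },
}


def file_history(root, path):
    """One lookup per path component, then the whole history is in hand."""
    node, seeks = root, 0
    for part in path.split("/"):
        seeks += 1
        node = node[part]
    return node, seeks


history, seeks = file_history(
    cvs_root, "dev-python/reedsolomon/reedsolomon-1.6.1.ebuild"
)
print(seeks, history)
```

The number of seeks depends only on the depth of the path, not on how many
commits the repository has.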

So finding an old deleted file in the gentoo git repo can require
millions of reads, while doing so in CVS only required about 3.  It is
no surprise that the web interfaces were designed to make that
operation much easier - if you do sufficiently complex searches in the
git web interface it will time you out to avoid bogging down the
server, which is why some searches may require you to clone the repo
and do it locally.

Now, if you want to find out what changed in a particular commit the
situation is reversed.  If you identify a commit in git and want to
see what changed, it can directly read the commit from disk using its
hash.  It then looks at the parent commit, then descends both trees
doing a diff at each level.  Since everything is content-hashed only
directory trees that contain differences need to be read.  If a commit
had changes to 50 files, it might only take 10 reads to figure out
which files changed, and then another 100 to compare the contents of
each file and generate diffs.  If you wanted to do that in CVS you'd
have to read every single file in the repository and read the
sequential history of each file to find any commits that have the same
time/author.  CVS commits also aren't atomic so ordering across files
might not be the same.
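
The "only trees that contain differences need to be read" part can be
sketched with a toy content-addressed store. This is an illustration under
simplifying assumptions, not git's object format: store() and
changed_paths() are hypothetical helpers, the file names and contents are
made up, and a file being replaced by a directory is not handled.

```python
import hashlib


def store(objects, node):
    """Save a blob (str) or tree (dict) under its content hash; return the hash."""
    if isinstance(node, dict):
        entries = sorted((name, store(objects, child)) for name, child in node.items())
        payload, value = "tree:" + repr(entries), dict(entries)
    else:
        payload, value = "blob:" + node, node
    key = hashlib.sha1(payload.encode()).hexdigest()
    objects[key] = value
    return key


def changed_paths(objects, old, new, path="", trees_read=None):
    """List changed files, descending only into trees whose hashes differ."""
    if old == new:
        return []                     # identical hash: skip without reading
    a, b = objects.get(old), objects.get(new)
    if not isinstance(a, dict) and not isinstance(b, dict):
        return [path]                 # a blob was added, removed or modified
    a = a if isinstance(a, dict) else {}
    b = b if isinstance(b, dict) else {}
    if trees_read is not None:
        trees_read.append(path or "/")
    out = []
    for name in sorted(set(a) | set(b)):
        child = f"{path}/{name}" if path else name
        out += changed_paths(objects, a.get(name), b.get(name), child, trees_read)
    return out


objects = {}
old_root = store(objects, {
    "dev-python": {"reedsolomon": {"a.ebuild": "v1"}, "other": {"b.ebuild": "x"}},
    "sys-apps": {"c": {"c.ebuild": "y"}},
})
new_root = store(objects, {
    "dev-python": {"reedsolomon": {"a.ebuild": "v2"}, "other": {"b.ebuild": "x"}},
    "sys-apps": {"c": {"c.ebuild": "y"}},
})

trees_read = []
changed = changed_paths(objects, old_root, new_root, trees_read=trees_read)
print(changed, trees_read)
```

Only the root, dev-python and dev-python/reedsolomon are opened; sys-apps
and dev-python/other have matching hashes in both versions and are skipped.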

Git is a thing of beauty when you think about what it was designed to
do and how well-suited to this design its architecture is.  The same
can be said of several data-driven FOSS applications.  The right
algorithm can make a huge difference...

-- 
Rich



Re: [gentoo-user] Re: attic

2023-09-04 Thread William Kenworthy



On 4/9/23 16:04, Nuno Silva wrote:

On 2023-09-04, William Kenworthy wrote:


On 3/9/23 18:29, Rich Freeman wrote:

On Sun, Sep 3, 2023 at 4:44 AM Michael  wrote:

On Sunday, 3 September 2023 07:49:36 BST William Kenworthy wrote:

Hi, I used to be able to get old ebuilds from "the attic" but I can't
find it on Google - is it still around?

Perhaps have a look here at the archives?

https://gitweb.gentoo.org/

The archives will only contain data migrated from CVS - so only things
from more than a few years ago.

You want to look into the main repo for anything recently deleted.

[...]

This can be done via the website, though the search capability is a
little limited.  I ended up having to search from a local clone
because your package name contains an error and the web search found
nothing.

To find your file, go to:
https://gitweb.gentoo.org/repo/gentoo.git/
Go to the search box in the top right and search for:
dev-python/reedsolomon (note that the package category is
different from what was in your email)
Find the commit one commit before the one that removed your package.
(ie one that contains your package in its most recent version)  If you
find the one that deleted your file, then just look at the parent in
the commit header and click on that to go back one version where it is
still present.
Click the tree hash to browse the historical version of the repository
that existed before your file was deleted.
For example, you can find v1.6.1 of that package at:
https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-python/reedsolomon/reedsolomon-1.6.1.ebuild?id=149a131188ebce76a87fd8363fb212f5f1620a02

[...]

The web git interface is capable of displaying past commits.  It just
can't search for wildcards/etc.


Thanks Rich,

unfortunately the web interface isn't helpful - I can't just navigate
the tree to find commits -
"https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-python/reedsolomon/"
gives path not found - it looks like you have to know the commit first
by downloading the git tree to search it - not friendly at all!

With /log/ instead of /tree/ in the URL it at least shows the list of
commits. From a quick check, this seems to include the commit removing
the directory when it's removed instead of renamed, so hopefully it
helps too with retrieving older ebuilds?
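
The tree/ and log/ URLs mentioned in this thread follow the same pattern,
so they can be built with a few lines of Python. This is only a sketch based
on the URLs quoted here: gitweb_url() is a made-up helper, and nothing is
assumed about the web interface beyond the /tree/, /log/ and ?id= forms
that appear above.

```python
from urllib.parse import quote, urlencode

BASE = "https://gitweb.gentoo.org/repo/gentoo.git"


def gitweb_url(view, path, commit=None):
    """Build a tree/ or log/ URL for a path, optionally pinned to a commit."""
    url = f"{BASE}/{view}/{quote(path)}"
    if commit:
        url += "?" + urlencode({"id": commit})
    return url


# The v1.6.1 ebuild at the commit given earlier in the thread.
tree_url = gitweb_url(
    "tree",
    "dev-python/reedsolomon/reedsolomon-1.6.1.ebuild",
    "149a131188ebce76a87fd8363fb212f5f1620a02",
)
# The commit list for the (now removed) package directory.
log_url = gitweb_url("log", "dev-python/reedsolomon/")
print(tree_url)
print(log_url)
```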

(But note that Rich was suggesting using the *search* feature of the
gitweb interface, which, in this case, also finds the same topmost
commit if I search for "reedsolomon".)


tkx, missed that!

BillK





Re: [gentoo-user] Re: Attic files (app-admin/rackview) removed?

2016-06-07 Thread R0b0t1
On Jun 7, 2016 5:00 PM, "James"  wrote:
>
> R0b0t1  gmail.com> writes:
>
>
> > > cvs -d :pserver:anonymous@anoncvs.gentoo.org:/var/cvsroot co
> > > gentoo-x86/app-admin/rackview/files
> > >
> > > was supposed to download the files and the ebuild, manifest etc etc.?
> > >
> > > Is there a single (anoncvs) command syntax to use, in general to pull
> > > complete (theoretically compilable) sources from the archive? It's been
> > > a while so my cvs could easily be incorrect
> > >
> > > wget is a champ.
> > >
> > > curiously,
> > > James
> > >
> > Use the recursive option. Save the files to your portage tree.
>
>
> man cvs::
> -R
> Process  directories  recursively.   This is the default for all cvs
> commands, with the exception of ls & rls.
>
> So, I have a '/usr/local/portage/app-admin/rackview/' dir.
>
> What is the correct syntax string (I borked this a few times
> to no avail)?
>
>
> curiously,
> James
> ???
> James
>

`cd /usr/portage && wget -r $url`
or
`wget -P /usr/portage -r $url`
or
`wget --help`


Re: [gentoo-user] Re: Attic files (app-admin/rackview) removed?

2016-06-07 Thread R0b0t1
On Jun 7, 2016 12:11 PM, "James"  wrote:
>
> James  tampabay.rr.com> writes:
>
>
> > https://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/app-admin/
> > rackview/?hideattic=0
>
> >  rackview-0.09-r3.ebuild  seems to have been removed from the attic?
>
>
> I have to revert to using 'wget' to snag the files and a copy
> of the latest ebuild. I thought the command string given on the page::
>
> cvs -d :pserver:anonym...@anoncvs.gentoo.org:/var/cvsroot co
> gentoo-x86/app-admin/rackview/files
>
> was supposed to download the files and the ebuild, manifest etc etc.?
>
> Is there a single (anoncvs) command syntax to use, in general to pull
> complete (theoretically compilable) sources from the archive? It's been
> a while so my cvs could easily be incorrect
>
> wget is a champ.
>
> curiously,
> James
>

Use the recursive option. Save the files to your portage tree.


Re: [gentoo-user] Re: Attic (cvs) -> ???(git)

2016-02-25 Thread Rich Freeman
On Mon, Feb 22, 2016 at 4:49 PM, James  wrote:
> Rich Freeman  gentoo.org> writes:
>
>> If I were doing anything too
>> crazy with all this I'd probably use the python git module.
>
> dev-python/git-python ???   Any others or related docs/howtos/examples?
>

I used pygit2, but there are a few different implementations and
plenty of docs online in general.

Here is an example program that runs through a history and dumps a
list of commits and their metadata in csv format:
https://github.com/rich0/gitvalidate/blob/master/gitdump/parsetrees.py

There are some other scripts that retrieve blobs and manipulate them
in the same directory.  This was part of the validation of the git
migration, which uses a map-reduce algorithm to diff every single
commit in a git history and identify all file revisions (which creates
a cvs-like per-file history which can then be compared with results
obtained from parsing a cvs repository for the same information).  The
only single-threaded step in the process is walking the list of
commits - all the diffs can be highly parallelized.

I doubt you need anything quite so fancy.  As you can see from the
script, pulling metadata out of commits and walking through parents is
pretty easy.

My example doesn't account for merge commits.  There weren't any in
the cvs->git migration.  Obviously walking commits with merges will
get a lot messier.
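
To give an idea of what changes with merges, here is a minimal sketch in
plain Python. It does not use pygit2 and is not taken from the script linked
above: the parents dict is a made-up history, and walk() is a hypothetical
helper. The main difference from a linear walk is that a commit can be
reached through more than one child, so visited commits have to be tracked.

```python
# Hypothetical history: commit id -> list of parent ids.
# "m" is a merge commit with two parents.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "m": ["b", "c"],
    "d": ["m"],
}


def walk(head, parents):
    """Visit every commit reachable from head exactly once."""
    seen, stack, order = set(), [head], []
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue                  # already reached through another parent
        seen.add(commit)
        order.append(commit)
        stack.extend(parents[commit])
    return order


visited = walk("d", parents)
print(visited)
```

This simple depth-first walk does not guarantee that children are always
listed before their parents; a real tool would normally want a topological
or date ordering on top of it.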

-- 
Rich



Re: [gentoo-user] Re: Attic (cvs) -> ???(git)

2016-02-24 Thread Mike Gilbert
On Wed, Feb 24, 2016 at 9:21 PM, walt  wrote:
> On Mon, 22 Feb 2016 19:49:22 + (UTC)
> James  wrote:
>
>> So using wget to fetch {package/files} from the gentoo attic was/is a
>> reliable exercise to build things removed from the tree, into one's
>> /usr/local/portage tree. It still works
>
> Hi James.  I need a version of net-libs/gnutls from before the switch
> to git.  Could I trouble you for an example of how you use wget?  So
> far my googling hasn't even revealed the URL of the attic :-/


https://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/?hideattic=0