Re: [gentoo-user] Re: attic
On Mon, Sep 4, 2023 at 2:36 PM Rich Freeman wrote:
> On Mon, Sep 4, 2023 at 4:38 AM William Kenworthy wrote:
> >
> > On 4/9/23 16:04, Nuno Silva wrote:
> > >
> > > (But note that Rich was suggesting using the *search* feature of the
> > > gitweb interface, which, in this case, also finds the same topmost
> > > commit if I search for "reedsolomon".)
> > >
> > tkx, missed that!
>
> Note that in terms of indexing git and CVS have their pros and cons,
> because they use different data structures. I've heard the saying
> that Git is a data structure masquerading as an SCM, and certainly the
> inconsistencies in the command line operations bear that out.
>

I'd always heard that Git is a file system and all useful side effects are pure luck

--
Alan McKinnon
alan dot mckinnon at gmail dot com
Re: [gentoo-user] Re: attic
On Mon, Sep 4, 2023 at 4:38 AM William Kenworthy wrote:
>
> On 4/9/23 16:04, Nuno Silva wrote:
> >
> > (But note that Rich was suggesting using the *search* feature of the
> > gitweb interface, which, in this case, also finds the same topmost
> > commit if I search for "reedsolomon".)
> >
> tkx, missed that!

Note that in terms of indexing git and CVS have their pros and cons, because they use different data structures. I've heard the saying that Git is a data structure masquerading as an SCM, and certainly the inconsistencies in the command line operations bear that out.

Git tends to be much more useful in general, but for things like finding deleted files CVS was definitely more time-efficient.

The reason for this is that everything in git is reachable via commits, and these are reachable from a head via a linked list. The most recent commit gives access to the current version of the repository, and a pointer to the immediately previous commit(s).

To find a deleted file, git must go to the most recent commit in whatever branch you are searching, then descend its tree to look for the file. If it is not found, it then goes to the previous commit and descends that tree. There are 745k commits in the active Gentoo repository. I think there are something like 2M of them in the historical one. Each commit is a random seek, and then each step down the directory tree to find a file is another random seek.

In CVS everything is organized first by file, and then each file has its own commit history. So finding a file, deleted or otherwise, just requires a seek for each level in the directory tree. Then you can directly read its history.

So finding an old deleted file in the gentoo git repo can require millions of reads, while doing so in CVS only required about 3.
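The read counts described above can be made concrete with a toy model. This is only a sketch: it is not how git or CVS store data on disk, it simply mimics the two layouts with nested Python dicts and counts one "read" per commit object and one per directory level. The names Commit, lookup, find_in_history and find_by_path are invented for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commit:
    """Toy commit: a snapshot tree plus a pointer to the previous commit."""
    tree: dict                      # nested dicts are directories, leaves are files
    parent: Optional["Commit"] = None


def lookup(tree, parts):
    """Descend one directory level per path component; return (entry, reads)."""
    reads = 0
    node = tree
    for part in parts:
        reads += 1                  # one seek per level of the directory tree
        if not isinstance(node, dict) or part not in node:
            return None, reads
        node = node[part]
    return node, reads


def find_in_history(head, path):
    """Git-like search: walk back from the head until a commit has the path."""
    parts = path.split("/")
    reads = 0
    commit = head
    while commit is not None:
        reads += 1                  # reading the commit object itself
        entry, tree_reads = lookup(commit.tree, parts)
        reads += tree_reads
        if entry is not None:
            return commit, reads
        commit = commit.parent
    return None, reads


def find_by_path(repository, path):
    """CVS-like search: the repository is keyed by file, history hangs off it."""
    return lookup(repository, path.split("/"))


# Demo: the file exists in the oldest commit only, 1000 commits ago.
path = "dev-python/reedsolomon/reedsolomon-1.6.1.ebuild"
with_file = {"dev-python": {"reedsolomon": {"reedsolomon-1.6.1.ebuild": "..."}}}
without_file = {"dev-python": {}}

head = Commit(tree=with_file)
for _ in range(1000):
    head = Commit(tree=without_file, parent=head)

_, history_reads = find_in_history(head, path)
_, per_file_reads = find_by_path(
    {"dev-python": {"reedsolomon": {"reedsolomon-1.6.1.ebuild": ["rev 1.1"]}}},
    path,
)
print("history walk reads:", history_reads)
print("per-file lookup reads:", per_file_reads)
```

In this model the cost of the history walk grows with the number of commits made since the deletion, while the per-file lookup depends only on the depth of the path.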
It is no surprise that the web interfaces were designed to make that operation much easier - if you do sufficiently complex searches in the git web interface it will time you out to avoid bogging down the server, which is why some searches may require you to clone the repo and do it locally.

Now, if you want to find out what changed in a particular commit the situation is reversed. If you identify a commit in git and want to see what changed, it can directly read the commit from disk using its hash. It then looks at the parent commit, then descends both trees doing a diff at each level. Since everything is content-hashed only directory trees that contain differences need to be read. If a commit had changes to 50 files, it might only take 10 reads to figure out which files changed, and then another 100 to compare the contents of each file and generate diffs.

If you wanted to do that in CVS you'd have to read every single file in the repository and read the sequential history of each file to find any commits that have the same time/author. CVS commits also aren't atomic so ordering across files might not be the same.

Git is a thing of beauty when you think about what it was designed to do and how well-suited to this design its architecture is.

The same can be said of several data-driven FOSS applications. The right algorithm can make a huge difference...

--
Rich
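The point about content hashing can be illustrated with a small sketch. It assumes a simplified object store (a plain dict keyed by SHA-1 hex digests) rather than git's real object format, and the helper names store_tree and changed_paths are hypothetical. The idea it demonstrates is the one described above: two subtrees with the same hash are identical, so the diff never has to open them.

```python
import hashlib


def _digest(kind, payload):
    """Hash a value together with its kind, loosely like git's typed objects."""
    return hashlib.sha1(f"{kind}:{payload}".encode()).hexdigest()


def store_tree(tree, store):
    """Save a nested dict as content-addressed tree objects; return the root hash."""
    entries = {}
    for name, value in tree.items():
        if isinstance(value, dict):
            entries[name] = ("tree", store_tree(value, store))
        else:
            entries[name] = ("blob", _digest("blob", value))
    tree_hash = _digest("tree", sorted(entries.items()))
    store[tree_hash] = entries
    return tree_hash


def changed_paths(store, old_hash, new_hash, prefix="", stats=None):
    """List paths that differ, reading only trees whose hashes differ."""
    if stats is None:
        stats = {"tree_reads": 0}
    if old_hash == new_hash:
        return []                       # same hash, same content: skip the subtree
    old_entries = store[old_hash]
    new_entries = store[new_hash]
    stats["tree_reads"] += 2            # one read for each side of the comparison
    changed = []
    for name in sorted(set(old_entries) | set(new_entries)):
        old = old_entries.get(name)
        new = new_entries.get(name)
        if old == new:
            continue                    # unchanged entry, never opened
        path = prefix + name
        if old is not None and new is not None and old[0] == new[0] == "tree":
            changed += changed_paths(store, old[1], new[1], path + "/", stats)
        else:
            changed.append(path)        # added, removed, or modified entry
    return changed


# Demo: one file changes in a repository with several untouched directories.
store = {}
before = {
    "app-admin": {"rackview": {"rackview-0.09-r3.ebuild": "a"}},
    "dev-python": {
        "reedsolomon": {"reedsolomon-1.6.1.ebuild": "v1", "Manifest": "m"},
        "other": {"other-1.0.ebuild": "o"},
    },
}
after = {
    "app-admin": {"rackview": {"rackview-0.09-r3.ebuild": "a"}},
    "dev-python": {
        "reedsolomon": {"reedsolomon-1.6.1.ebuild": "v2", "Manifest": "m"},
        "other": {"other-1.0.ebuild": "o"},
    },
}
stats = {"tree_reads": 0}
print(changed_paths(store, store_tree(before, store), store_tree(after, store), stats=stats))
print(stats)
```

Only the trees on the path from the root to the changed file are read; app-admin and dev-python/other are skipped because their hashes match on both sides.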
Re: [gentoo-user] Re: attic
On 4/9/23 16:04, Nuno Silva wrote:
> On 2023-09-04, William Kenworthy wrote:
> > On 3/9/23 18:29, Rich Freeman wrote:
> > > On Sun, Sep 3, 2023 at 4:44 AM Michael wrote:
> > > > On Sunday, 3 September 2023 07:49:36 BST William Kenworthy wrote:
> > > > > Hi, I used to be able to get old ebuilds from "the attic" but I
> > > > > can't find it on google - is it still around?
> > > >
> > > > Perhaps have a look here at the archives?
> > > >
> > > > https://gitweb.gentoo.org/
> > >
> > > The archives will only contain data migrated from CVS - so only things
> > > from more than a few years ago. You want to look into the main repo
> > > for anything recently deleted.
> > > [...]
> > > This can be done via the website, though the search capability is a
> > > little limited. I ended up having to search from a local clone
> > > because your package name contains an error and the web search found
> > > nothing.
> > >
> > > To find your file, go to:
> > > https://gitweb.gentoo.org/repo/gentoo.git/
> > >
> > > Go to the search box in the top right and search for:
> > > dev-python/reedsolomon (note that the package category is different
> > > from what was in your email)
> > >
> > > Find the commit one commit before the one that removed your package.
> > > (ie one that contains your package in its most recent version)
> > >
> > > If you find the one that deleted your file, then just look at the
> > > parent in the commit header and click on that to go back one version
> > > where it is still present.
> > >
> > > Click the tree hash to browse the historical version of the repository
> > > that existed before your file was deleted.
> > >
> > > For example, you can find v1.6.1 of that package at:
> > > https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-python/reedsolomon/reedsolomon-1.6.1.ebuild?id=149a131188ebce76a87fd8363fb212f5f1620a02
> > > [...]
> > > The web git interface is capable of displaying past commits. It just
> > > can't search for wildcards/etc.
> >
> > Thanks Rich, unfortunately the web interface isn't helpful - I can't just
> > navigate the tree to find commits -
> > "https://gitweb.gentoo.org/repo/gentoo.git/tree/dev-python/reedsolomon/"
> > gives path not found - it looks like you have to know the commit first
> > by downloading the git tree to search it - not friendly at all!
>
> With /log/ instead of /tree/ in the URL it at least shows the list of
> commits. From a quick check, this seems to include the commit removing
> the directory when it's removed instead of renamed, so hopefully it
> helps too with retrieving older ebuilds?
>
> (But note that Rich was suggesting using the *search* feature of the
> gitweb interface, which, in this case, also finds the same topmost
> commit if I search for "reedsolomon".)

tkx, missed that!

BillK
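The two URL shapes mentioned in this exchange (/tree/ with an ?id= commit hash, and /log/ for the list of commits touching a path) can be put together with a small helper. This is a sketch that only reproduces the patterns quoted above; gitweb_url is a hypothetical name, and nothing here checks that the server actually has the path or commit.

```python
from urllib.parse import quote, urlencode

# Base URL of the main Gentoo repository, as given in the thread.
BASE = "https://gitweb.gentoo.org/repo/gentoo.git"


def gitweb_url(view, path, commit_id=None):
    """Build a /tree/ or /log/ URL for a path, optionally pinned to a commit."""
    if view not in ("tree", "log"):
        raise ValueError("view must be 'tree' or 'log'")
    url = f"{BASE}/{view}/{quote(path)}"    # quote() keeps '/' separators intact
    if commit_id:
        url += "?" + urlencode({"id": commit_id})
    return url


# History of a (possibly deleted) package directory:
print(gitweb_url("log", "dev-python/reedsolomon/"))

# A file as it existed at a known commit, here the example from the thread:
print(gitweb_url(
    "tree",
    "dev-python/reedsolomon/reedsolomon-1.6.1.ebuild",
    "149a131188ebce76a87fd8363fb212f5f1620a02",
))
```

A /tree/ URL for a deleted path without an id refers to the current head, which is why it reports "path not found"; the commit id has to come from the /log/ view or from the search box first.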
Re: [gentoo-user] Re: Attic files (app-admin/rackview) removed?
On Jun 7, 2016 5:00 PM, "James" wrote:
>
> R0b0t1 gmail.com> writes:
>
> > > vs -d :pserver:anonymous anoncvs.gentoo.org:/var/cvsroot co
> > > gentoo-x86/app-admin/rackview/files
> > >
> > > was supposed to download the files and the ebuild, manifest etc etc.?
> > >
> > > Is there a single (anoncvs) command syntax to use, in general to pull
> > > complete (theoretically compilable) sources from the archive? It's been
> > > a while so my cvs could easily be incorrect
> > >
> > > wget is a champ.
> > >
> > > curiously,
> > > James
> > >
>
> > Use the recursive option. Save the files to your portage tree.
>
>
> man cvs::
> -R
> Process directories recursively. This is the default for all cvs
> commands, with the exception of ls & rls.
>
> So, I have a '/usr/local/portage/app-admin/rackview/' dir.
>
> What is the correct syntax string (I borked this a few times
> to no avail)?
>
>
> curiously,
> James
> ???
> James
>

`cd /usr/portage && wget -r $url` or `wget -P /usr/portage -r $url` or `wget --help`
Re: [gentoo-user] Re: Attic files (app-admin/rackview) removed?
On Jun 7, 2016 12:11 PM, "James" wrote:
>
> James tampabay.rr.com> writes:
>
>
> > https://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/app-admin/rackview/?hideattic=0
>
>
> rackview-0.09-r3.ebuild seems to have been removed from the attic?
>
>
> I have to revert to using 'wget' to snag the files and a copy
> of the latest ebuild. I thought the command string given on the page::
>
> vs -d :pserver:anonym...@anoncvs.gentoo.org:/var/cvsroot co
> gentoo-x86/app-admin/rackview/files
>
> was supposed to download the files and the ebuild, manifest etc etc.?
>
> Is there a single (anoncvs) command syntax to use, in general to pull
> complete (theoretically compilable) sources from the archive? It's been
> a while so my cvs could easily be incorrect
>
> wget is a champ.
>
> curiously,
> James
>

Use the recursive option. Save the files to your portage tree.
Re: [gentoo-user] Re: Attic (cvs) -> ???(git)
On Mon, Feb 22, 2016 at 4:49 PM, James wrote:
> Rich Freeman gentoo.org> writes:
>
>> If I were doing anything too
>> crazy with all this I'd probably use the python git module.
>
> dev-python/git-python ??? Any others or related docs/howtos/examples?
>

I used pygit2, but there are a few different implementations and plenty of docs online in general.

Here is an example program that runs through a history and dumps a list of commits and their metadata in csv format:
https://github.com/rich0/gitvalidate/blob/master/gitdump/parsetrees.py

There are some other scripts that retrieve blobs and manipulate them in the same directory. This was part of the validation of the git migration, which uses a map-reduce algorithm to diff every single commit in a git history and identify all file revisions (which creates a cvs-like per-file history which can then be compared with results obtained from parsing a cvs repository for the same information). The only single-threaded step in the process is walking the list of commits - all the diffs can be highly parallelized.

I doubt you need anything quite so fancy. As you can see from the script pulling metadata out of commits and walking through parents is pretty easy. My example doesn't account for merge commits. There weren't any in the cvs->git migration. Obviously walking commits with merges will get a lot messier.

--
Rich
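For readers who want a starting point without opening the linked script, here is a minimal sketch of the same idea: walk a history and write one CSV row per commit. It is not the parsetrees.py code. It assumes pygit2 is installed and that Repository.walk, GIT_SORT_TOPOLOGICAL and the Commit attributes used below behave as in the pygit2 documentation; the function names and the column layout are made up for this example. The pygit2 import is kept inside walk_repo so the CSV part can be tried without the library.

```python
import csv
import sys


def commit_rows(commits):
    """Turn commit-like objects into flat rows of metadata."""
    for c in commits:
        subject = (c.message.splitlines() or [""])[0]
        yield [
            str(c.id),
            c.author.name,
            c.author.email,
            c.commit_time,                           # seconds since the epoch
            ";".join(str(p) for p in c.parent_ids),  # several ids for a merge
            subject,
        ]


def dump_csv(commits, out):
    """Write a header plus one row per commit to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["id", "author", "email", "time", "parents", "subject"])
    for row in commit_rows(commits):
        writer.writerow(row)


def walk_repo(path):
    """Iterate over commits reachable from HEAD (requires pygit2)."""
    import pygit2  # third-party; assumed available, e.g. dev-python/pygit2

    repo = pygit2.Repository(path)
    return repo.walk(repo.head.target, pygit2.GIT_SORT_TOPOLOGICAL)


if __name__ == "__main__":
    # Usage (hypothetical): python dump_commits.py /path/to/clone > commits.csv
    dump_csv(walk_repo(sys.argv[1]), sys.stdout)
```

Merge commits simply show up with more than one id in the parents column; reconstructing a per-file history across merges, as discussed above, would need more than this.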
Re: [gentoo-user] Re: Attic (cvs) -> ???(git)
On Wed, Feb 24, 2016 at 9:21 PM, walt wrote:
> On Mon, 22 Feb 2016 19:49:22 + (UTC)
> James wrote:
>
>> So using wget to fetch {package/files} from the gentoo attic was/is a
>> reliable exercise to build things removed from the tree, into one's
>> /usr/local/portage tree. It still works
>
> Hi James. I need a version of net-libs/gnutls from before the switch
> to git. Could I trouble you for an example of how you use wget? So
> far my googling hasn't even revealed the URL of the attic :-/

https://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/?hideattic=0