Re: [aur-general] AUR Maintenance

2013-03-03 Thread Phillip Smith
On 1 March 2013 15:07, Connor Behan connor.be...@gmail.com wrote:


 Except that line there is 161 characters and contains two comments (one
 comment deleted by its poster about Ruby and one non-deleted comment
 about GNOME). The line in the real file is a million characters and
 contains ~20k comments. And there are 28 such lines. Reading this would
 be like reading War And Peace 10 times but it would teach you a lot
 about the history of the AUR.


While it is a lot of data, I agree that it shouldn't be that difficult to
recover. What am I missing? Any chance those of us who aren't TUs can get
access to the file?


Re: [aur-general] AUR Maintenance

2013-03-03 Thread Martti Kühne
On 3/3/13, Phillip Smith li...@fukawi2.nl wrote:
 While it is a lot of data, I agree that it shouldn't be that difficult to
 recover. What am I missing? Any chance those of us who aren't TUs can get
 access to the file?



I also came close to that question, which indeed is kind of obvious.

cheers!
mar77i


Re: [aur-general] AUR Maintenance

2013-03-01 Thread Martti Kühne
On Fri, Mar 1, 2013 at 5:07 AM, Connor Behan connor.be...@gmail.com wrote:
[...]

 INSERT INTO `PackageComments` VALUES (17,46,68,'ruby bindings for
 fastcgi',1113164127,68),(28,69,65,'A countdown timer applet for the
 GNOME panel.',1113178883,0);

 Except that line there is 161 characters and contains two comments (one
 comment deleted by its poster about Ruby and one non-deleted comment
 about GNOME). The line in the real file is a million characters and
 contains ~20k comments. And there are 28 such lines. Reading this would
 be like reading War And Peace 10 times but it would teach you a lot
 about the history of the AUR.



That's why we use machines to do this kind of work for us. Also, a
lovely idea to restore comments that are older than two years, that'll
be extremely beneficial to the quality of the aur.
Another thought on this: we don't need comments on packages that don't
exist any more, comments that were already deleted, or comments made by
users who aren't in the db any more.
If I understood correctly, none of that data is currently in the AUR's
comments? Or is all of it?

cheers!
mar77i


Re: [aur-general] AUR Maintenance

2013-03-01 Thread Connor Behan
On 01/03/13 06:02 AM, Martti Kühne wrote:
 On Fri, Mar 1, 2013 at 5:07 AM, Connor Behan connor.be...@gmail.com wrote:
 [...]
 INSERT INTO `PackageComments` VALUES (17,46,68,'ruby bindings for
 fastcgi',1113164127,68),(28,69,65,'A countdown timer applet for the
 GNOME panel.',1113178883,0);

 Except that line there is 161 characters and contains two comments (one
 comment deleted by its poster about Ruby and one non-deleted comment
 about GNOME). The line in the real file is a million characters and
 contains ~20k comments. And there are 28 such lines. Reading this would
 be like reading War And Peace 10 times but it would teach you a lot
 about the history of the AUR.


 That's why we use machines to do this kind of work for us. Also, a
 lovely idea to restore comments that are older than two years, that'll
 be extremely beneficial to the quality of the aur.
Right, inserting this data into a db can be automated. It would just
require minor syntax changes to account for the newer MySQL version.
This hasn't been done, I gather, because the devs hold themselves to a
high standard and don't want corrupted text littering the AUR comments.
Fixing the encoding of the text is what might require reading. Loui
Chang seemed to think there was a way to automate this as well but it
would be nontrivial so the project got put on the back burner. I should
ask him.
 Other thoughts on this, we don't need comments on packages that don't
 exist any more, that were deleted already or are made by users which
 aren't in the db any more.
 If I understood correctly, none of that data is currently in the aur's
 comments? or all?
Whether a comment is a deleted comment is stored in the AUR database.
Whether it belongs to a deleted package or a deleted user, I believe, is
not. If you delete an AUR package, the PHP file will only delete the
record for that package. Comments that were part of it stay in the db as
orphan data. In fact, package tarballs don't even get deleted by the
PHP file. This is done by a helper script that periodically runs a cleanup.

However, if this 2010 backup does get imported into the AUR, I agree
that we can take the liberty of removing such orphan data so there is
less to import.
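
A rough Python sketch of that pruning step, in case it helps the discussion. It assumes comment rows use the column order of the PackageComments table in the dump (ID, PackageID, UsersID, Comments, CommentTS, DelUsersID). The function name drop_orphans and the two ID sets are hypothetical; the sets would have to be built from whatever the live AUR database says still exists.

```python
from typing import Iterable, List, Sequence, Set


def drop_orphans(
    comments: Iterable[Sequence],
    live_package_ids: Set[int],
    live_user_ids: Set[int],
    keep_deleted: bool = False,
) -> List[Sequence]:
    """Return only the comment rows that are still worth importing.

    Assumed row layout (from the PackageComments table definition):
    (ID, PackageID, UsersID, Comments, CommentTS, DelUsersID)
    """
    kept = []
    for row in comments:
        _id, package_id, user_id, _text, _ts, del_user_id = row
        if package_id not in live_package_ids:
            continue  # comment belongs to a package that no longer exists
        if user_id not in live_user_ids:
            continue  # comment was written by a user who no longer exists
        if del_user_id != 0 and not keep_deleted:
            continue  # DelUsersID != 0 means the comment was deleted
        kept.append(row)
    return kept


# The two example comments quoted in this thread, plus made-up ID sets.
EXAMPLE_ROWS = [
    (17, 46, 68, "ruby bindings for fastcgi", 1113164127, 68),
    (28, 69, 65, "A countdown timer applet for the GNOME panel.", 1113178883, 0),
]
EXAMPLE_KEPT = drop_orphans(EXAMPLE_ROWS, live_package_ids={46, 69}, live_user_ids={65, 68})
```

With those made-up sets only the GNOME comment survives, because the Ruby one has a non-zero DelUsersID.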
 cheers!
 mar77i




signature.asc
Description: OpenPGP digital signature


Re: [aur-general] AUR Maintenance

2013-02-28 Thread Florian Pritz
On 28.02.2013 07:14, Connor Behan wrote:
 I was stupid enough not to make a backup so can someone with
 access please put this on nymeria? Thank-you.

I've put it in your home on nymeria. You're lucky SevenL didn't yet shut
down sigurd.






Re: [aur-general] AUR Maintenance

2013-02-28 Thread Connor Behan
On 28/02/13 01:54 PM, Phillip Smith wrote:
 I'd be willing to try and assist with this too. What is the format of
 that backup file?


It is a 46MB text file of SQL commands; the kind you would get by
running mysqldump. It only has 462 lines, but some of them are very
long. The important lines are 97-117 that specify the PackageComments
table:

CREATE TABLE `PackageComments` (
  `ID` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `PackageID` int(10) unsigned NOT NULL DEFAULT '0',
  `UsersID` int(10) unsigned NOT NULL DEFAULT '0',
  `Comments` text NOT NULL,
  `CommentTS` bigint(20) unsigned NOT NULL DEFAULT '0',
  `DelUsersID` int(10) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`ID`),
  KEY `UsersID` (`UsersID`),
  KEY `PackageID` (`PackageID`),
  KEY `DelUsersID` (`DelUsersID`)
) ENGINE=MyISAM AUTO_INCREMENT=154508 DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `PackageComments`
--

LOCK TABLES `PackageComments` WRITE;
/*!40000 ALTER TABLE `PackageComments` DISABLE KEYS */;

ID is a number identifying the comment; PackageID is the package to
which it belongs; UsersID is the user who posted it; Comments is the
actual text; CommentTS is the timestamp of when it was posted; and
DelUsersID is the ID of the user who deleted the comment, or 0 if it
has not been deleted. The next important lines are 118-146, which
hold the actual comment data. An example:

INSERT INTO `PackageComments` VALUES (17,46,68,'ruby bindings for
fastcgi',1113164127,68),(28,69,65,'A countdown timer applet for the
GNOME panel.',1113178883,0);

Except that line there is 161 characters and contains two comments (one
comment deleted by its poster about Ruby and one non-deleted comment
about GNOME). The line in the real file is a million characters and
contains ~20k comments. And there are 28 such lines. Reading this would
be like reading War And Peace 10 times but it would teach you a lot
about the history of the AUR.
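
For anyone who would rather inspect the dump by machine than by eye, the INSERT format described above can be split into rows with a small parser. What follows is a minimal Python sketch, not a tool anyone on the list has used. The names parse_insert_rows and SAMPLE are made up for illustration, and the code assumes every unquoted value is an integer or NULL, which matches the PackageComments columns shown above.

```python
from typing import Iterator, List, Union

Value = Union[int, str, None]

# Backslash escapes that can appear inside quoted strings in a dump.
_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "0": "\0", "Z": "\x1a"}


def parse_insert_rows(statement: str) -> Iterator[List[Value]]:
    """Yield one list of column values per (...) tuple of an INSERT line."""
    i = statement.index(" VALUES ") + len(" VALUES ")
    n = len(statement)
    while i < n:
        if statement[i] != "(":
            i += 1  # skip the ',' between tuples and the final ';'
            continue
        i += 1  # step past '('
        row: List[Value] = []
        while True:
            if statement[i] == "'":
                # Quoted string: read up to the matching unescaped quote.
                i += 1
                buf = []
                while True:
                    ch = statement[i]
                    if ch == "\\":
                        nxt = statement[i + 1]
                        buf.append(_ESCAPES.get(nxt, nxt))
                        i += 2
                    elif ch == "'":
                        if statement[i + 1:i + 2] == "'":
                            buf.append("'")  # '' is an escaped quote
                            i += 2
                        else:
                            i += 1
                            break
                    else:
                        buf.append(ch)
                        i += 1
                row.append("".join(buf))
            else:
                # Unquoted token: assumed to be an integer or NULL.
                j = i
                while statement[j] not in ",)":
                    j += 1
                token = statement[i:j]
                row.append(None if token == "NULL" else int(token))
                i = j
            if statement[i] == ")":
                i += 1
                break
            i += 1  # skip the ',' between values
        yield row


# The example line from this message, joined back onto one line.
SAMPLE = (
    "INSERT INTO `PackageComments` VALUES "
    "(17,46,68,'ruby bindings for fastcgi',1113164127,68),"
    "(28,69,65,'A countdown timer applet for the GNOME panel.',1113178883,0);"
)

for row in parse_insert_rows(SAMPLE):
    comment_id, package_id, user_id, text, timestamp, del_user_id = row
    status = "deleted" if del_user_id != 0 else "visible"
    print(comment_id, status, text)
```

A character-by-character scan is used instead of splitting on commas because comment text can itself contain commas, parentheses and escaped quotes.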





[aur-general] AUR Maintenance

2013-02-27 Thread Connor Behan
Some of you may remember the lost comments fiasco of 2010. When this
happened, a file named aur-20100205-1859.sql.fixed2.xz was created on
sigurd, containing the AUR comments that had been lost. I volunteered
(on the mailing list) to help manually restore them to the AUR [1].
Since I did not know SQL or how to speak the languages that were
causing problems, my offer was laughed off as enthusiasm that wasn't
helpful.

This hasn't really changed. My help at this point would still probably
be a bit useless. But that doesn't mean I don't still think about one
day having a go at it. The zipped sql file was on sigurd as recently as
a few months ago. But since the move to nymeria, I can't find it
anymore. I was stupid enough not to make a backup so can someone with
access please put this on nymeria? Thank-you.

[1] https://mailman.archlinux.org/pipermail/aur-general/2010-May/008847.html





[aur-general] AUR maintenance works

2011-11-27 Thread Lukas Fleischer
Some of you might have noticed that the AUR has been in maintenance mode
for the last three hours.

We did a full backup of the server and prepared everything for a drive
replacement that is scheduled for tomorrow, 28. Nov 2011. The server
might be down again tomorrow for a couple of minutes.

Sorry for the inconvenience!


[aur-general] AUR Maintenance

2010-09-19 Thread Loui Chang
I'm going to be updating the AUR in the next few minutes.
Don't be alarmed. Please stand by.



Re: [aur-general] AUR Maintenance

2010-09-19 Thread Loui Chang
On Sun 19 Sep 2010 20:24 -0400, Loui Chang wrote:
 I'm going to be updating the AUR in the next few minutes.
 Don't be alarmed. Please stand by.

Should be good now. Let me know if there are any issues.
Cheers.



Re: [aur-general] AUR Maintenance

2010-09-19 Thread Thomas Dziedzic
On Sun, Sep 19, 2010 at 8:09 PM, Loui Chang louipc@gmail.com wrote:
 On Sun 19 Sep 2010 20:24 -0400, Loui Chang wrote:
 I'm going to be updating the AUR in the next few minutes.
 Don't be alarmed. Please stand by.

 Should be good now. Let me know if there are any issues.
 Cheers.



There are no issues.. thanks for updating the aur!

Although I did notice that the header is inconsistent with the other
headers for archlinux. (the words up top don't seem to be as bold)


Re: [aur-general] AUR Maintenance

2010-09-19 Thread Loui Chang
On Sun 19 Sep 2010 20:40 -0500, Thomas Dziedzic wrote:
 On Sun, Sep 19, 2010 at 8:09 PM, Loui Chang louipc@gmail.com wrote:
  On Sun 19 Sep 2010 20:24 -0400, Loui Chang wrote:
  I'm going to be updating the AUR in the next few minutes.
  Don't be alarmed. Please stand by.
 
  Should be good now. Let me know if there are any issues.
  Cheers.
 
 There are no issues.. thanks for updating the aur!
 
 Although I did notice that the header is inconsistent with the other
 headers for archlinux. (the words up top don't seem to be as bold)

Dammmit!!



Re: [aur-general] AUR Maintenance

2010-09-19 Thread Hilton Medeiros
On Sun, 19 Sep 2010 20:40:23 -0500
Thomas Dziedzic gos...@gmail.com wrote:

 On Sun, Sep 19, 2010 at 8:09 PM, Loui Chang louipc@gmail.com
 wrote:
  On Sun 19 Sep 2010 20:24 -0400, Loui Chang wrote:
  I'm going to be updating the AUR in the next few minutes.
  Don't be alarmed. Please stand by.
 
  Should be good now. Let me know if there are any issues.
  Cheers.
 
 
 
 There are no issues.. thanks for updating the aur!
 
 Although I did notice that the header is inconsistent with the other
 headers for archlinux. (the words up top don't seem to be as bold)

You fell for the duck, man... :)
http://stackoverflow.com/questions/2349378/new-programming-jargon-you-coined/2444361#2444361


Re: [aur-general] AUR Maintenance

2010-05-12 Thread Slash
On Mon, May 3, 2010 at 6:53 PM, Loui Chang louipc@gmail.com wrote:
 Hah. Thanks for your enthusiasm, but it wouldn't be very effective to go
 through it manually. Most of the scrambled strings are in languages
 other than English, and it would take waay too much work.

 I do plan on restoring them, but I haven't had the chance to look into
 it. I can't say when that will be though.

 Cheers.


This is really becoming a great hindrance. The comments on the AUR had
a lot of very valuable information on them. It's not like it's a
twitter feed of inane babble- it's documentation, QA, a changelog,
brainstorming and external references. This is hurting me much more
than if the wiki was completely nuked. I have PKGBUILDs and other
build data in version control, but I don't have any of that other
stuff in the AUR comments anywhere.

Anything you could do would be a great help here. I would rather 5% of
the comments be corrupted or completely deleted because of encoding
issues than have all of them missing for another 2 months.
Unfortunately, I have no expertise with encoding issues, otherwise I
would have immediately offered help at the time. Let us know if there
is anything we can help with, at all (besides stop bugging me! :)

Thanks,

Slash


Re: [aur-general] AUR Maintenance

2010-05-03 Thread Loui Chang
On Fri 30 Apr 2010 22:04 -0400, Connor Behan wrote:
 I read in the archives that comments were 95% repaired by Firmicus
 and the copy with a few illegal characters is:
 /home/francois/aur-20100205-1859.sql.fixed2.xz on a server to which I
 do not yet have access. Since the comments are not back, I gather
 there is more to be done.
 
 Could I please go through the comments and repair sentences that make
 sense? i.e. s/Th#s wor#s for kd#4 but #ot kdemod/This works for kde4
 but not kdemod/ I will do this until everything is done or the only
 sentences left are unintelligible. The other problem is merging this
 with new comments that have been posted since the update. No one
 knows a reliable way to do this automatically, correct? And it would
 require adding countless database entries by hand? I am also prepared
 to get started on this brute force work. I will have several hours
 per week to devote to it. Please give me what I need to contribute!

Hah. Thanks for your enthusiasm, but it wouldn't be very effective to go
through it manually. Most of the scrambled strings are in languages
other than English, and it would take waay too much work.

I do plan on restoring them, but I haven't had the chance to look into
it. I can't say when that will be though.

Cheers.


[aur-general] AUR Maintenance

2010-04-30 Thread Connor Behan
I read in the archives that comments were 95% repaired by Firmicus and 
the copy with a few illegal characters is:
/home/francois/aur-20100205-1859.sql.fixed2.xz on a server to which I do 
not yet have access. Since the comments are not back, I gather there is 
more to be done.


Could I please go through the comments and repair sentences that make 
sense? i.e. s/Th#s wor#s for kd#4 but #ot kdemod/This works for kde4 
but not kdemod/ I will do this until everything is done or the only 
sentences left are unintelligible. The other problem is merging this 
with new comments that have been posted since the update. No one knows a 
reliable way to do this automatically, correct? And it would require 
adding countless database entries by hand? I am also prepared to get 
started on this brute force work. I will have several hours per week to 
devote to it. Please give me what I need to contribute! Thanks.


Re: [aur-general] AUR Maintenance

2010-04-30 Thread Farhan Yousaf
I can help out as well, though I am not a trusted user so I wonder how much I 
can really help. But my offer is there. :)

On 2010-04-30, at 10:04 PM, Connor Behan wrote:

 I read in the archives that comments were 95% repaired by Firmicus and the 
 copy with a few illegal characters is:
 /home/francois/aur-20100205-1859.sql.fixed2.xz on a server to which I do not 
 yet have access. Since the comments are not back, I gather there is more to 
 be done.
 
 Could I please go through the comments and repair sentences that make sense? 
 i.e. s/Th#s wor#s for kd#4 but #ot kdemod/This works for kde4 but not 
 kdemod/ I will do this until everything is done or the only sentences left 
 are unintelligible. The other problem is merging this with new comments that 
 have been posted since the update. No one knows a reliable way to do this 
 automatically, correct? And it would require adding countless database 
 entries by hand? I am also prepared to get started on this brute force work. 
 I will have several hours per week to devote to it. Please give me what I 
 need to contribute! Thanks.



Re: [aur-general] AUR Maintenance

2010-03-30 Thread Allan McRae
So...  is there an official word on the final decision about
restoring these?  Is it still being investigated how to fix this, or is
it just being left?


Allan


Re: [aur-general] AUR Maintenance

2010-03-29 Thread Pierre Schmitz
On Monday, 29 March 2010 01:21:09, Allan McRae wrote:
 Is there any progress on fixing this?  There are a lot of packaging 
 notes on those pages that would be a shame to lose.

It's very likely the same issue I had updating the wiki. This is caused by a 
mysql packaging change which switched the default encoding from latin1 to 
utf8. Here are some tips: 
http://en.gentoo-wiki.com/wiki/Convert_latin1_to_UTF-8_in_MySQL

But I guess we lost the chance to fix this more or less easily because the AUR 
content has changed since the last backup. This requires some kind of script 
that imports and merges the old and new comments.
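
To make the merge idea a little more concrete, here is a minimal Python sketch of one possible policy. It is only an illustration under loud assumptions: rows follow the PackageComments column order (ID first, CommentTS fifth), old and new comment IDs may collide, old comments keep their IDs, and colliding new comments get fresh IDs above the current maximum. Nothing in this thread says the real merge should work that way, and the name merge_comments is hypothetical.

```python
from typing import Iterable, List, Sequence


def merge_comments(
    old_rows: Iterable[Sequence],
    new_rows: Iterable[Sequence],
) -> List[list]:
    """Combine backed-up and current comment rows without ID clashes.

    Assumed row layout: (ID, PackageID, UsersID, Comments, CommentTS, DelUsersID)
    Assumed policy: old rows keep their IDs; a new row whose ID is already
    taken is renumbered above the highest ID seen so far.
    """
    merged = [list(row) for row in old_rows]
    used = {row[0] for row in merged}
    next_id = max(used, default=0) + 1
    for row in new_rows:
        row = list(row)
        if row[0] in used:
            row[0] = next_id  # renumber the colliding new comment
        used.add(row[0])
        next_id = max(next_id, row[0] + 1)
        merged.append(row)
    # Order by timestamp, then ID, so the result reads chronologically.
    merged.sort(key=lambda r: (r[4], r[0]))
    return merged


# Made-up rows: two backed-up comments and two posted after the wipe.
EXAMPLE_OLD = [
    (1, 10, 5, "old a", 100, 0),
    (2, 10, 6, "old b", 200, 0),
]
EXAMPLE_NEW = [
    (1, 10, 7, "new a", 300, 0),
    (3, 11, 7, "new b", 250, 0),
]
EXAMPLE_MERGED = merge_comments(EXAMPLE_OLD, EXAMPLE_NEW)
```

Renumbering changes primary keys, so any real import would also need to check whether other tables or external links refer to comment IDs before adopting a policy like this.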

-- 

Pierre Schmitz, https://users.archlinux.de/~pierre


Re: [aur-general] AUR Maintenance

2010-03-29 Thread Firmicus

On 29/03/2010 09:00, Pierre Schmitz wrote:
 On Monday, 29 March 2010 01:21:09, Allan McRae wrote:
  Is there any progress on fixing this?  There are a lot of packaging
  notes on those pages that would be a shame to lose.


I did it last Thursday. I've done my best to repair the mysql backup 
Loui pointed me at. I'd say it's 95% fixed now, but the procedure left a 
few isolated illegal characters in its trail (like this: �), especially 
within Cyrillic and CJK. The text should be legible however. You can 
compare the original on sigurd

/srv/http/aur.archlinux.org/backup/aur-20100205-1859.sql.gz
with my repaired version:
/home/francois/aur-20100205-1859.sql.fixed2.xz
and judge whether any further effort is needed or justified.

 It's very likely the same issue I had updating the wiki. This is caused by a
 mysql packaging change which switched the default encoding from latin1 to
 utf8. Here are some tips:
 http://en.gentoo-wiki.com/wiki/Convert_latin1_to_UTF-8_in_MySQL

 But I guess we lost the chance to fix this more or less easily because the AUR
 content has changed since the last backup.

Indeed. Believe me, the encoding of the strings was in a terrible mess 
(mostly the comments, but also the names of users), so it was no longer 
simply a matter of doing a conversion from one charset to another. 
Basically what I did was to convert from windows-1252 (!) to UTF-8, and 
then repair all doubly-encoded UTF-8 characters using the perl module 
Encode::DoubleEncodedUTF8 (on CPAN). But as I said above, there is no 
way to automatically recover everything from that one backup alone.
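
The second step, undoing doubly-encoded UTF-8, can be illustrated with a short Python sketch. This is a stand-in written for this archive, not a port of Encode::DoubleEncodedUTF8 and not the procedure actually run on the backup. It assumes the damage is UTF-8 bytes that were read as windows-1252 and then encoded to UTF-8 again; the name fix_double_encoded is made up.

```python
import re

# Mojibake produced by reading UTF-8 bytes as windows-1252 consists only of
# non-ASCII characters, so each non-ASCII run can be tested on its own.
_NON_ASCII_RUN = re.compile(r"[^\x00-\x7f]+")


def _repair_run(match: re.Match) -> str:
    run = match.group(0)
    try:
        # Undo one layer: map the characters back to the windows-1252 bytes
        # and reinterpret those bytes as UTF-8.
        return run.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not double-encoded (or not recoverable this way): leave it alone.
        return run


def fix_double_encoded(text: str) -> str:
    """Repair doubly-encoded runs and keep everything else unchanged."""
    return _NON_ASCII_RUN.sub(_repair_run, text)


# Simulate the damage on a Cyrillic word, then repair it.
original = "Привет"
broken = original.encode("utf-8").decode("cp1252")
repaired = fix_double_encoded(broken)
print(repaired == original)
```

Python's cp1252 codec has no mapping for a few bytes such as 0x81, so text whose UTF-8 form contains them cannot make this round trip. That may be related to the stray illegal characters left in Cyrillic and CJK text, though that is only a guess.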



 This requires some kind of script
 that imports and merges the old and new comments.

The problem with that import and merge operation – unless it is done 
with a reliable and well-tested tool – is that it risks damaging the 
data more than it currently is ;) I'll leave it to Loui to decide 
whether it's worth the trouble.


F


Re: [aur-general] AUR Maintenance

2010-03-24 Thread Firmicus

On 23/03/2010 22:24, Loui Chang wrote:
 On Tue 23 Mar 2010 16:51 -0400, Daenyth Blank wrote:
  On Tue, Mar 23, 2010 at 16:43, Loui Chang louipc@gmail.com wrote:
   It may be possible to restore most of the old comments, but that's
   something that we'd have to look into later.

  What's needed for this, and what ways could someone contribute?

 We need someone with a keen knowledge of mysql and encodings to be able
 to restore the backed up comments properly in utf8.

I've done encoding conversions and repairs countless times (mostly using 
Perl). So perhaps I could help on this... (Not today though, but 
probably tomorrow). Contact me off-list and give me more detailed 
instructions of what the issue is. I do have access to sigurd but I 
can't look at the data right now as I am not in the mysql group.


F


[aur-general] AUR Maintenance

2010-03-23 Thread Loui Chang
Hello everyone!
I'm going to look into fixing some issues with the AUR right now.
Please don't be alarmed if the site isn't working for a little while.



Re: [aur-general] AUR Maintenance

2010-03-23 Thread Loui Chang
On Tue 23 Mar 2010 15:09 -0400, Loui Chang wrote:
 Hello everyone!
 I'm going to look into fixing some issues with the AUR right now.
 Please don't be alarmed if the site isn't working for a little while.

I've deleted existing comments from the AUR.

I ran into a problem juggling the encodings, which was the problem I was
trying to fix. The aur should properly display utf8 in comments now
though.

It may be possible to restore most of the old comments, but that's
something that we'd have to look into later.

Cheers!



Re: [aur-general] AUR Maintenance

2010-03-23 Thread Xavier Chantry
On Tue, Mar 23, 2010 at 9:43 PM, Loui Chang louipc@gmail.com wrote:
 On Tue 23 Mar 2010 15:09 -0400, Loui Chang wrote:
 Hello everyone!
 I'm going to look into fixing some issues with the AUR right now.
 Please don't be alarmed if the site isn't working for a little while.

 I've deleted existing comments from the AUR.

 I ran into a problem juggling the encodings, which was the problem I was
 trying to fix. The aur should properly display utf8 in comments now
 though.

 It may be possible to restore most of the old comments, but that's
 something that we'd have to look into later.


Probably a stupid question, but just to be sure: does being able to look
into it later suppose that there is an easy way to restore the old
comments while keeping the new ones (i.e. merging both)?
Or will the new ones be lost when restoring the old ones?


Re: [aur-general] AUR Maintenance

2010-03-23 Thread Daenyth Blank
On Tue, Mar 23, 2010 at 16:43, Loui Chang louipc@gmail.com wrote:
 It may be possible to restore most of the old comments, but that's
 something that we'd have to look into later.

What's needed for this, and what ways could someone contribute?


Re: [aur-general] AUR Maintenance

2010-03-23 Thread Loui Chang
On Tue 23 Mar 2010 16:51 -0400, Daenyth Blank wrote:
 On Tue, Mar 23, 2010 at 16:43, Loui Chang louipc@gmail.com wrote:
  It may be possible to restore most of the old comments, but that's
  something that we'd have to look into later.
 
 What's needed for this, and what ways could someone contribute?

We need someone with a keen knowledge of mysql and encodings to be able
to restore the backed up comments properly in utf8.



Re: [aur-general] AUR Maintenance

2010-03-23 Thread Loui Chang
On Tue 23 Mar 2010 21:51 +0100, Xavier Chantry wrote:
 On Tue, Mar 23, 2010 at 9:43 PM, Loui Chang louipc@gmail.com wrote:
  On Tue 23 Mar 2010 15:09 -0400, Loui Chang wrote:
  Hello everyone!
  I'm going to look into fixing some issues with the AUR right now.
  Please don't be alarmed if the site isn't working for a little while.
 
  I've deleted existing comments from the AUR.
 
  I ran into a problem juggling the encodings, which was the problem I was
  trying to fix. The aur should properly display utf8 in comments now
  though.
 
  It may be possible to restore most of the old comments, but that's
  something that we'd have to look into later.

 Probably a stupid question but just to be sure : being able to look
 into it later supposes that there is an easy way to restore old
 comments by keeping the new ones ? (i.e. merging both)
 Or will the new ones be lost when restoring the old ones ?

Old comments are mostly backed up but suffer from some
encoding issues - that's the first hurdle.

There should be a way to merge old and new comments. I'm not exactly
sure how easy that would be however. Probably pretty easy for a real
sysadmin. That I am not unfortunately.

I'm not sure how much value is in the old comments, but it's not worth
keeping the AUR locked down while I try to figure it out.



Re: [aur-general] AUR Maintenance

2010-03-23 Thread Xavier Chantry
On Tue, Mar 23, 2010 at 10:39 PM, Loui Chang louipc@gmail.com wrote:

 I'm not sure how much value is in the old comments, but it's not worth
 keeping the AUR locked down while I try to figure it out.



I would say there are 90% of crap and 10% that would be a shame to lose :)

I hope someone more knowledgeable about mysql/encoding/sysadmin can help.


Re: [aur-general] AUR Maintenance

2010-03-23 Thread Daenyth Blank
On Tue, Mar 23, 2010 at 17:41, Xavier Chantry chantry.xav...@gmail.com wrote:
 I would say there are 90% of crap and 10% that would be a shame to lose :)

 I hope someone more knowledgeable about mysql/encoding/sysadmin can help.


I might throw up an announcement on the forums that they've been
removed and a call for help on fixing it.