Re: [Toolserver-l] Compressing stats files better (was; Re: /mnt/user-store is full)

2011-01-03 Thread Daniel Kinzler
On 03.01.2011 01:35, River Tarnell wrote:
> When budgeting for this upgrade, we assumed 5TB would be used for 
> user-store.  In reality, there should be a lot more than this (we 
> originally planned to use 1TB disks); but it won't be a full 24TB.

indeed. listen to river. sorry for throwing around numbers :P

-- daniel

___
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: 
https://wiki.toolserver.org/view/Mailing_list_etiquette


Re: [Toolserver-l] Compressing stats files better (was; Re: /mnt/user-store is full)

2011-01-03 Thread Daniel Kinzler
On 03.01.2011 01:29, River Tarnell wrote:
> If it turns out to be too slow, we could consider 7z or something else 
> (rzip, bzip2, ..., or maybe even just gzip -9).

i found pbzip2 to be nice. bzip2, just faster :)

-- daniel



Re: [Toolserver-l] Compressing stats files better (was; Re: /mnt/user-store is full)

2011-01-03 Thread Platonides
Frederic Schutz wrote:
> emijrp wrote:
> 
>> Hi Frederic, thanks for your work. Have you tested 7z?
> 
> It makes no difference to me. River suggested (and installed) xz, so I 
> used it, but 7z would have worked too.
> 
> A quick test using my biased data for one day (but it should be 
> representative enough):
> 
> $ du -s *
> 1027260   7z 1004 M, 25.27% saved
> 1374804   gz  1.4 G, 0% saved
> 1020692   xz  997 M, 25.75% saved
> 
> The difference between xz and 7z is negligible (<1%).

xz has a much saner syntax.



Re: [Toolserver-l] Compressing stats files better (was; Re: /mnt/user-store is full)

2011-01-03 Thread Frederic Schutz
emijrp wrote:

> Hi Frederic, thanks for your work. Have you tested 7z?

It makes no difference to me. River suggested (and installed) xz, so I 
used it, but 7z would have worked too.

A quick test using my biased data for one day (but it should be 
representative enough):

$ du -s *
1027260 7z 1004 M, 25.27% saved
1374804 gz  1.4 G, 0% saved
1020692 xz  997 M, 25.75% saved

The difference between xz and 7z is negligible (<1%). I haven't 
benchmarked anything formally, but 7z was much faster on my system. It 
looks like this is mainly because the software can use several cores 
simultaneously.
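
For reference, the percentages above follow directly from the du counts. A minimal Python sketch (the function name is hypothetical and not something used in this thread); the results agree with the figures quoted above up to rounding in the last digit:

```python
def percent_saved(baseline: int, compressed: int) -> float:
    """Return the size reduction relative to the baseline, in percent."""
    return 100.0 * (baseline - compressed) / baseline

# Block counts as reported by `du -s` in the message above.
sizes = {"7z": 1027260, "gz": 1374804, "xz": 1020692}

# gz is the baseline, so it reports 0% saved.
for name, size in sorted(sizes.items()):
    print(f"{name}: {percent_saved(sizes['gz'], size):.2f}% saved")
```

Since only the ratio matters, the unit du reports in does not affect the result.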

> We can compress to xz while the new disks arrive. I read that it is 
> about 24 TB, so, we can revert to gzip in the future.

Is there any particular reason to use gzip? When I use these files, I 
mostly uncompress them on the fly from Perl, and there is a module to do 
this with xz too (haven't tested it, though). I am sure Python and other 
languages can do the same.
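
As a rough illustration of reading such a file on the fly from Python: a minimal sketch, assuming a Python version whose standard library includes the lzma module (3.3 or later). The function name and the file name in the usage comment are hypothetical.

```python
import lzma
from typing import Iterator

def iter_xz_lines(path: str, encoding: str = "utf-8") -> Iterator[str]:
    """Yield decoded lines from an .xz file without unpacking it to disk."""
    # In text mode, lzma.open decompresses incrementally as lines are read,
    # so the whole file never has to fit in memory.
    with lzma.open(path, mode="rt", encoding=encoding, errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")

# Hypothetical usage:
#     for line in iter_xz_lines("stats-for-one-hour.xz"):
#         process(line)
```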

Even if we have plenty of space, it makes sense to use xz (or another 
format that offers good compression) and to benefit from the size 
reduction, for example if/when these files are backed up or moved around. 
Also, I'd like to be able to provide the files for download for those 
people who want local copies [several academic groups have already 
requested them], and the 25% size reduction is a big bonus here too.

But as I wrote earlier, these files are mostly archived on the 
toolserver, and I assume that most users don't dig often through the 
older ones, so that the best compression should not be a problem.

A better file format (e.g. one file per day, with separate data for 24 
hours, and another file with data aggregated per day) is probably what 
is most needed for "real uses" -- as far as I know, this is how Erik 
Zachte handles this data. A database would be best, of course, but 
requires much more work...

As always, comments are very welcome.

Frédéric



Re: [Toolserver-l] [Toolserver-announce] Maintenance

2011-01-03 Thread River Tarnell
emijrp:
> 2011/1/3 River Tarnell 
> >  $ crontab $HOME/crontab.before_nightshade_reinstall
> I get errors http://toolserver.org/~emijrp/cronerrors.txt
 
> > You may want to review
> >  before installing
> > the new crontab.

Please read the page I linked to.

- river.



Re: [Toolserver-l] [Toolserver-announce] Maintenance

2011-01-03 Thread Andre Koopal
As explained before, syntax like */5 won't work under Solaris; you will have
to write it out in full, like 0,5,10,15,20,25,30,35,40,45,50,55
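
For anyone converting many entries, a minimal Python sketch that writes out such a step expression. The helper name is hypothetical, and it only handles the plain */N form for a minute field (0-59), not ranges or lists:

```python
def expand_step(field: str, low: int = 0, high: int = 59) -> str:
    """Expand a cron field of the form '*/N' into an explicit comma list."""
    if not field.startswith("*/"):
        return field  # anything else is returned unchanged
    step = int(field[2:])
    if step <= 0:
        raise ValueError("step must be a positive integer")
    return ",".join(str(value) for value in range(low, high + 1, step))

# Prints the same list as written out above.
print(expand_step("*/5"))
```

For an hour field, the same helper could be called with high=23.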

Regards,

Andre

On Mon, Jan 03, 2011 at 12:42:16PM +0100, emijrp wrote:
> 2011/1/3 River Tarnell 
> 
> > Note: crontabs from nightshade have been saved as
> > $HOME/crontab.before_nightshade_reinstall.  You should review this for
> > any necessary changes, then install it:
> >
> >  $ crontab $HOME/crontab.before_nightshade_reinstall
> >
> >
> I get errors http://toolserver.org/~emijrp/cronerrors.txt
> 
> 
> > Until you do this, your crontab will not run.
> >
> > You may want to review
> >  before installing
> > the new crontab.
> >
> >- river/
> >
> >



Re: [Toolserver-l] [Toolserver-announce] Maintenance

2011-01-03 Thread emijrp
Also, I'm receiving e-mails: can't access your crontab file. Resubmit it.
The error on nightshade was "No such file or directory"

2011/1/3 emijrp 

> 2011/1/3 River Tarnell 
>
> Note: crontabs from nightshade have been saved as
>> $HOME/crontab.before_nightshade_reinstall.  You should review this for
>> any necessary changes, then install it:
>>
>>  $ crontab $HOME/crontab.before_nightshade_reinstall
>>
>>
> I get errors 
> http://toolserver.org/~emijrp/cronerrors.txt
>
>
>> Until you do this, your crontab will not run.
>>
>> You may want to review
>>  before installing
>> the new crontab.
>>
>>- river/
>>

Re: [Toolserver-l] [Toolserver-announce] Maintenance

2011-01-03 Thread emijrp
2011/1/3 River Tarnell 

> Note: crontabs from nightshade have been saved as
> $HOME/crontab.before_nightshade_reinstall.  You should review this for
> any necessary changes, then install it:
>
>  $ crontab $HOME/crontab.before_nightshade_reinstall
>
>
I get errors http://toolserver.org/~emijrp/cronerrors.txt


> Until you do this, your crontab will not run.
>
> You may want to review
>  before installing
> the new crontab.
>
>- river/
>