Re: [Toolserver-l] Wikidata tables

2013-04-18 Thread Byrial Jensen

On 18-04-2013 11:21, Lydia Pintscher wrote:

On Thu, Apr 18, 2013 at 9:52 AM, Magnus Manske wrote:

Just wondering what the status of exposing all wikidata tables on the
toolserver is.

Currently, there are a few wb_* tables with item labels, descriptions,
aliases, and language links.

But the tables (whatever they are called) containing item-to-item
connections appear to be missing. Maybe because they were added later?


As far as I know, they are stored only as JSON, in the place where the
article text is usually stored, and not in separate tables.


You can see in the pagelinks table which properties and which items an 
item is connected to by statements, but not how the properties and items 
are paired together.
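
A minimal sketch of such a lookup, assuming the replica follows the
standard MediaWiki page/pagelinks schema and that items live in
namespace 0 with titles like "Q42" (properties are, to my knowledge, in
namespace 120 on Wikidata). The function name is hypothetical, and the
item ID is only an illustration. The function prints the query rather
than running it:

```shell
#!/bin/sh
# Sketch only: print a query listing every page an item links to.
# Assumed: standard MediaWiki page/pagelinks columns; items in namespace 0.

# item_links_query ITEM_ID -> prints SQL; rejects anything but Q<digits>
item_links_query() {
    digits=${1#Q}
    case $digits in
        ''|*[!0-9]*) echo "invalid item id: $1" >&2; return 1 ;;
    esac
    # Make sure the argument really started with "Q".
    [ "$1" = "Q$digits" ] || { echo "invalid item id: $1" >&2; return 1; }
    cat <<EOF
SELECT pl_namespace, pl_title
FROM pagelinks
JOIN page ON page_id = pl_from
WHERE page_namespace = 0
  AND page_title = '$1';
EOF
}
```

The output can be piped to the mysql client for whichever host and
database hold the Wikidata replica. The result mixes properties and
items in one flat list, which is exactly the limitation described above.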


___
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/toolserver-l
Posting guidelines for this list: 
https://wiki.toolserver.org/view/Mailing_list_etiquette

Re: [Toolserver-l] When is the best time of day to run programs?

2013-04-10 Thread Byrial Jensen

On 10-04-2013 21:35, Merlissimo wrote:

If you are using SGE you do not really have to care about this. If you
can use the whole cluster (Linux and Solaris) we mostly have enough
capacity. It is only important that you specify which resources
(memory, runtime) you need.

If you need user database access on s3 you simply add -l sql-s3-user=1.
If you raise the number of db resources, replag must be lower for your
job to get scheduled (e.g. -l sql-s3-user=3 currently only gets
scheduled if replag is below 1 hour).
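
A minimal sketch of such a submission, assuming the usual SGE resource
names h_rt (runtime) and virtual_free (memory); only the
-l sql-sN-user=K form comes from the explanation above. The function
name and the example values are hypothetical. It prints the qsub line
instead of running it, so the request can be checked first:

```shell
#!/bin/sh
# Sketch only: assemble a qsub command line with resource requests.
# Assumed: h_rt and virtual_free are the runtime/memory resources in use;
# check the local SGE configuration before relying on them.

# build_qsub_cmd CLUSTER DB_SLOTS RUNTIME MEMORY SCRIPT [ARGS...]
build_qsub_cmd() {
    cluster=$1
    slots=$2
    runtime=$3
    memory=$4
    shift 4
    # Remaining arguments are the job script and its own arguments.
    echo "qsub -l sql-s${cluster}-user=${slots} -l h_rt=${runtime} -l virtual_free=${memory} $*"
}
```

For example, build_qsub_cmd 3 1 1:00:00 200M report.sh would print a
request for one db resource on s3, one hour of runtime and 200M of
memory for a hypothetical report.sh.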

The deadline option is not available on the Toolserver. -p mainly
changes the priority relative to your own other jobs. For the global
scheduling order, job waiting time and the server resources used by
your user account in the last hours are more important.

Web server requests, which also cause many database queries, are high
at 14-23 UTC on workdays. Most SGE jobs are submitted between 0 and 3 UTC.


Thank you all for the explanations. They could be used to improve the 
documentation for SGE.


BTW I can use the whole cluster, as I made this little script to start 
my compiled C programs:


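#!/bin/sh
# Run the build of the program that matches the current OS (uname output).
ARCH=$(uname)
PROG=$1
shift
# Quote "$@" so that arguments containing spaces are passed through intact.
"/home/byrial/bin/$ARCH/$PROG" "$@"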




Re: [Toolserver-l] When is the best time of day to run programs?

2013-04-10 Thread Byrial Jensen

On 10-04-2013 20:06, Tim Landscheidt wrote:

Byrial Jensen  wrote:


I am planning to make some maintenance reports for the Danish
Wikipedia at regular intervals, like once a week or once
every few days, but the exact time to run the programs
doesn't really matter.



So what time of the day is it best to run such programs?



And is there a way to tell SGE that it may choose the most
convenient time to start the job?


It will do that by itself, in fact, it is its whole purpose :-).


Well, no. SGE cannot know if I want the job run as early as possible or 
if I can happily wait for several hours for a less busy part of the day, 
unless there is some option I can use to tell that.




[Toolserver-l] When is the best time of day to run programs?

2013-04-10 Thread Byrial Jensen

Hi,

I am planning to make some maintenance reports for the Danish Wikipedia 
at regular intervals, like once a week or once every few days, but the 
exact time to run the programs doesn't really matter.


So what time of the day is it best to run such programs?

And is there a way to tell SGE that it may choose the most convenient 
time to start the job?



Re: [Toolserver-l] a new way to run tools: batch job scheduling

2009-10-08 Thread Byrial Jensen
River Tarnell wrote:
> it's now possible to schedule jobs that run SQL queries, as well as
> those that use CPU resources.  this is described on the wiki at
> 
>   
> 

Is there any way to schedule jobs when you don't know in advance which 
SQL cluster they will use? I normally just give the database name to my 
programs on the command line and leave it to the program to look up 
which cluster to use.

How can I schedule such jobs?

Another matter is that I use binaries which are compiled for a specific 
architecture. Would it be possible in a script to test which 
architecture it is run on, and then select the binary to run 
accordingly? Any code examples would be great.

/byrial




[Toolserver-l] Unmaintained web tools (was: [stable] server migration)

2009-06-20 Thread Byrial Jensen
River Tarnell wrote:
> we are currently considering disabling the unmaintained web tools of expired
> user accounts, meaning these tools will stop working.  stable tools will only
> be disabled if *all* maintainers leave, and no new maintainer can be found.

Is it possible to make a list of the unmaintained web tools? I would 
consider adopting one if it is possible and it seems useful.

- Byrial




Re: [Toolserver-l] SQL-S2 down and new status-files

2007-09-05 Thread Byrial Jensen
DaB. wrote:
> To inform our tool-user better in future, I created some files:
> 
> http://tools.wikimedia.de/status_s1 (same for s2 and s3).

That accounts for connections to sql-s1, sql-s2 and sql-s3.

What about connections to sql to access the toolserver database?
Shouldn't there be a status file for that?

Best regards,
Byrial



Re: [Toolserver-l] commons now available on yarrow

2007-08-13 Thread Byrial Jensen
River Tarnell wrote:
> hello,
> 
> a replicated commonswiki_p is now available on yarrow (sql-s1/sql-s3) as
>  well as zedler (sql-s2).

Any plan to also make commonswiki_p available on the new sql-s1 (vandale)?



Re: [Toolserver-l] slow queries

2007-06-19 Thread Byrial Jensen
Mashiah Davidson wrote:
> Hello, All!
> 
> Can anyone suggest how to improve the performance of a query like this:
> 
> CREATE TABLE u_mashiah.pagelinks (
>   `pl_from` int(8) unsigned NOT NULL default '0',
>   `pl_namespace` int(11) NOT NULL default '0',
>   `pl_title` varchar(255) binary NOT NULL default '',
>   KEY `pl_from` (`pl_from`,`pl_namespace`)
> ) TYPE=MyISAM AS /* SLOW_OK */
> SELECT pl_from,
>        pl_namespace,
>        pl_title
>   FROM ruwiki_p.pagelinks;
>
> This is a complete copy of a table from the read-only database to a
> personal one with some changes to the keys (not sufficient, I
> suppose). It looks as if copying like this is much slower than
> accessing the same amount of data within one database.
>
> Please, advise.

Create the table without keys and add them later as needed once the
table is populated. Inserting is much faster when there are no keys to
update for each inserted row.
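
As a sketch of that order of operations, the function below prints the
two statements for the table from the question: first the copy without
any key, then the key. The function name is hypothetical, and
ENGINE=MyISAM is used instead of TYPE=MyISAM on the assumption that a
newer MySQL version is in use (TYPE= was removed in later releases).

```shell
#!/bin/sh
# Sketch only: print SQL that copies a table without keys, then adds the key.
# Assumed: the source has the pagelinks columns used in the question.

# copy_then_index_sql SOURCE_TABLE TARGET_TABLE
copy_then_index_sql() {
    cat <<EOF
CREATE TABLE $2 ENGINE=MyISAM AS /* SLOW_OK */
SELECT pl_from, pl_namespace, pl_title
FROM $1;

ALTER TABLE $2
  ADD KEY pl_from (pl_from, pl_namespace);
EOF
}
```

Calling copy_then_index_sql ruwiki_p.pagelinks u_mashiah.pagelinks and
piping the output to the mysql client would populate the table first
and build the index once, instead of updating it for every row.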

Best regards
Byrial

