Re: Making a prefixed install / hard-wired global configuration files?

2007-03-25 Thread Nick Hill

Hello Jeremy

Thank you for the pointer. After make clean, re-configured, re-compiled then 
tried, but /path/to/bin/mysql --help still shows /etc/my.cnf as the default file.


Jeremy Cole wrote:

Hi Nick,

This should do it:

  --sysconfdir=DIR   read-only single-machine data [PREFIX/etc]

Regards,

Jeremy

Nick Hill wrote:
Is there an easy way to build MySQL client library, server and 
start-up script so that /etc/my.cnf is ignored?


I need to build client and server so that it in no way interacts with 
the global system install, and build programs against that client 
library which also don't interact with the global system install.


My attempts so far show that I need to make hacks to get it to work, 
where I expected something like ./configure --prefix=/my/prefix 
--global_config=blah would do the trick but doesn't seem to be the case.


I notice Linux distros often have their configs in places other than 
/etc/my.cnf.  Have I overlooked something?






--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Making a prefixed install / hard-wired global configuration files?

2007-03-25 Thread Nick Hill
Is there an easy way to build MySQL client library, server and start-up script 
so that /etc/my.cnf is ignored?


I need to build client and server so that it in no way interacts with the global 
system install, and build programs against that client library which also don't 
interact with the global system install.


My attempts so far show that I need to make hacks to get it to work, where I 
expected something like ./configure --prefix=/my/prefix --global_config=blah 
would do the trick but doesn't seem to be the case.


I notice Linux distros often have their configs in places other than 
/etc/my.cnf.  Have I overlooked something?


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Struggling with the logic

2006-07-23 Thread Nick Hill

Hello John

There are two approaches to this.

1) Best, save date into table as a number. Select date like
WHERE date>=$date1 AND date<=$date2

2)Otherwise, if storing dates a strings:
WHERE date=$date1 OR date=$date2 OR date=$date3 OR date=$date4 OR 
date=$date5.


In general, if you want to select date ranges and perform other date 
calculations, save the date as a unix date number then convert to human 
readable form as necessary.


John Berman wrote:

Hi

 


I'm struggling with some logic

 


I have a table called: submissions and each record has an approvedate field
which stores the date mm/dd/

 

 


I want to display all records for 7 days only from their  approved date so I
guess something like

 

 


Select * from submissions were approvedate  - this is were im getting stuck

 

 


Pointers appreciated, im sure its simple ?

 


Regards

 


John B

 





--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Optimising for many rows and returned records (de-coupling query time to record set size for range queries)

2006-04-24 Thread Nick Hill

Hello Adam

Adam Wolff wrote:

Actually runs through the table four times instead of twice, and maybe
can't even use the index for the whole query.


Assuming my results are not typical of MySQL query times, this would 
explain the sqrt() relationship of returned rows to query time.


I have tried your suggestions of using a sub-query and have had trouble 
getting the syntax valid. But on using explain, it seems that 4 bytes of 
the index (either lat or lon) are being used and a brute force search on 
the index for the other constraint.


If the query is returning 25600 points from a 100m dataset, it is brute 
seaching through 1.6m records in the second part of the index.


If it were an option of creating 2 1.6M lists then looking for 
commonalities, it may be faster to instead use 1 1.6m item list then 
brute force constraint search.


I have received suggestions to use spatial indexes, which I am looking 
into. Alternatively, I could optimise queries by creating multiple 
slices of the data set accross one axis then use a key on the other 
axis. MySQL 5.1 partitioning scheme may help.



--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Optimising for many rows and returned records (de-coupling query time to record set size for range queries)

2006-04-23 Thread Nick Hill

Alexey Polyakov wrote:

On 4/23/06, Nick Hill <[EMAIL PROTECTED]> wrote:
I've noticed a couple things.
1) Right now you're emulating spatial index.
2) In future, you're going to emulate partitioning.

Why do you think that doing this stuff manually is better than using
builtin capabilities?


1) I am ignorant about any performance advantages of spatial indexes.
2) I am ignorant about built-in partitioning capabilities.



Selects against a table use b-trees too. Splitting data into lot of
tables won't help with selects at all (well, it may help on scans with
concurrent large data sets if data will be spread across different
physical drives, but not with regular range lookups that you're
doing). It will only help with inserts.


Assuming even distribution, selecting a table amongst 1000 will only 
take a few ms while 2^log10(1000) gives an 8 fold improvement in select 
performance. But then, I may be running inefficient queries as 
postulated by Adam Wolff.



Given that there is such a strong relationship between the number of
records returned, and query time, I conclude that the whole index tree
is matched for every given number of root x records returned. If all
records we are matching are under a single node or under a small number
of nodes in the index tree, perhaps there is some way of telling the
database engine to ignore the rest of the index tree.



What is a 'root record'? Are you speaking about internal
representation of b-tree?


Yes. I am suggesting that a lower node in the B-tree may have below it 
all records the select query is looking for, thereby providing a short-cut.



Could this work, or am I misunderstanding how the index tree works? Are
there existing optimisations which can de-couple the relationship
between number of records and query time where the records I am
selecting are within a small range?



For studying select query performance issues it's better think about
index as simply about a sorted array with random-access, where each
random access costs O(lgN) and accesses to adjanced data cost O(1).
If your points are spread uniformly in space, cost of select query
you've shown is O(N*lgN)


I am unfamiliar with this representation. I am not sure I understand.

-Nick Hill.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Optimising for many rows and returned records (de-coupling query time to record set size for range queries)

2006-04-23 Thread Nick Hill

Hello

I have been looking at planning the database strategy for openstreetmap 
(http://www.openstreetmap.org).


There are several data types stored in tables with longitude and 
latitude columns. Select statements work by selecting


where lat>$lat1 and lat<$lat2 and lon>$lon1 and lon<$lon2

I have made many empirical tests and have concluded:

1) I can improve performance by a factor of 2-2.5 by changing the double 
lat/lon to an integer then selecting on an integer.


2) I have concluded that for each 10 fold increase in the number of 
records, select queries take twice as long. For each doubling of the 
number of returned records, there is a sqrt(2) increase in select query 
time.


All this is assuming all relevant database information is in memory.


As the database grows, it would likely improve database performance by 
splitting an individual table into several thousand tables using the 
file system directory btree algorithm to effectively pre-select the data 
before the query is handled to the MySQL engine. This is not a neat 
solution. A much better way would be to improve the mysql index 
performance on very large numbers of records.


Given that there is such a strong relationship between the number of 
records returned, and query time, I conclude that the whole index tree 
is matched for every given number of root x records returned. If all 
records we are matching are under a single node or under a small number 
of nodes in the index tree, perhaps there is some way of telling the 
database engine to ignore the rest of the index tree.


Could this work, or am I misunderstanding how the index tree works? Are 
there existing optimisations which can de-couple the relationship 
between number of records and query time where the records I am 
selecting are within a small range?




Background information:

We can boil all this down to a mathematical relationship where
query1 selects s number of records from r records dataset
and
query2 selects b number of records from c records dataset

Tquery1 is time to execue query 1 and Tquery2 is time to execute query2.

Tquery2=Tquery1 * sqrt(b/s) * (2^log(r/c)) + (b-s*CONST/15000)+CONST
Where for my processor, CONST is 0.03


This can be simplified (loosing some accuracy) to:

Tquery2=Tquery1 * sqrt(b/s) * (2^log(r/c)




Raw data for selects:
Creating a plan with 10 points and averaging over 25 queries
Points_per_tile Query_Time
25600   0.118
25600   0.119
25600   0.119
25600   0.119
12800   0.069
64000.042
32000.026
16000.017
800 0.011
400 0.008
200 0.005
100 0.004
50  0.003
Creating a plan with 100 points and averaging over 25 queries
Points_per_tile Query_Time
25600   0.224
25600   0.223
25600   0.222
25600   0.223
12800   0.145
64000.093
32000.062
16000.043
800 0.029
400 0.020
200 0.015
100 0.011
50  0.008
Creating a plan with 1000 points and averaging over 25 queries
Points_per_tile Query_Time
25600   0.558
25600   0.548
25600   0.551
25600   0.551
12800   0.376
64000.257
32000.181
16000.125
800 0.087
400 0.062
200 0.044
100 0.031
Creating a plan with 1 points and averaging over 25 queries
Points_per_tile Query_Time
25600   2.422
25600   2.332
25600   2.493
25600   2.446
12800   1.769
64001.295
32000.866
16000.657
800 0.456
400 0.328
200 0.233
100 0.159
50  0.118

Source code for the above test:
#!/usr/bin/perl -w

#Program creates random point fields eqyuivalent to bitfieldtest.pl 
except the data is stored
#as regular signed integers. To represent the globe as closely as 
possible, extents between
#-180 and +179.99 will be used. Therefore, adding 180 normalises for 
international date line 0.

#Prime Meridian 180. 11**0.01


use DBI;
use Time::HiRes qw( usleep ualarm gettimeofday tv_interval );

$DBHOST = "localhost";
$DBNAME = "nickh";
$DBUSER = "nickh";
$DBPASS = "xx";

#initialise database
$driver = "mysql";
$dsn = "DBI:$driver:database=$DBNAME;host=$DBHOST";
$dbh = DBI->connect($dsn, $DBUSER, $DBPASS);

[EMAIL PROTECTED](1);
@plane_densities=(10,100,1000,1);
@tile_points=(25600,25600,25600,25600,12800,6400,3200,1600,800,400,200,100,50);
$query_iterations=25;
$debug=0;

sub create_bitfield;
sub run_tests;

foreach $density(@plane_densities){
print "Creating a plan with $density points and averaging over 
$query_iterations queries\nPoints_per_tile Query_Time\n";

create_bitfield($density);
foreach $tilepoints(@tile_points){
my $testtime=run_tests($densit