Re: data file too big

2004-01-22 Thread Gregory Newby
I believe that this will flush those logs:

mysql reset master;

  -- Greg

On Thu, Jan 22, 2004 at 05:23:07PM -0500, Asif Iqbal wrote:
 Hi All
 
 My data file has all these files
 
 (root)@webrt:/usr/local/mysql/data# du -sh *
 25K ib_arch_log_00
 3.0Kib_arch_log_02
 3.0Kib_arch_log_04
 101Mib_logfile0
 101Mib_logfile1
 1.9Gibdata1
 1.5Gibdata2
 2.0Kmy.cnf
 70K mysql
 2.0Knewdb
 39M rt3
 12K test
 67K webrt.err
 1.0Kwebrt.pid
 
 Is there anyway I can remove some of them so I can get some space back ?
 
 I am using mysql -4.0.13 on Solaris 8
 
 -- 
 Asif Iqbal
 PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu
 There's no place like 127.0.0.1
 
 -- 
 MySQL General Mailing List
 For list archives: http://lists.mysql.com/mysql
 To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
 

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: MySQL slowness on querying indexed columns

2004-01-20 Thread Gregory Newby
Don't forget to run optimize table after you build
the combined index or make significant changes.
  -- Greg

PS: Sorry to hear of MySQL's gender issues ;-)

On Mon, Jan 19, 2004 at 10:20:12AM -0500, Balazs Rauznitz wrote:
 
 I saw surprisingly slow query results on columns that were indexed.
 
 Here's the table definition:
 
 create table sex (
 id integer,
 sex char(1));
 create index id_index on sex (id);
 create index sex_index on sex (sex);
 
 Then I loaded a million rows, id was from 1 to 1_000_000, sex was 
 randomly 'F' or 'M'.
 
 When searching on 'id' everything is snappy:
 
 mysql select count(*) from sex where id459000 and id =46;
 +--+
 | count(*) |
 +--+
 | 1000 |
 +--+
 1 row in set (0.00 sec)
 
 However when the 'sex' column is involved:
 
 mysql select count(*) from sex where id459000 and id =46 and sex = 'M';
 +--+
 | count(*) |
 +--+
 |  504 |
 +--+
 1 row in set (5.09 sec)
 
 Any way to make this faster ?
 
 Insert jokes about sex making MySQL slow here
 
 I'm running MySQL 4.0.16 on a 1GHz AMD CPU under linux.
 
 Thanks,
 
 Balazs

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: 100,000,000 row limit?

2004-01-19 Thread Gregory Newby
There is an issue in ISAM/MyISAM tables of needing to have correctly
sized row pointers.  I submitted this as a bug report (it's really a
documentation bug), but don't know if it will be fixed.  I was happy
to get an answer, though, so will share it with you :-) 

If you get an error like this:
ERROR 1030: Got error 136 from table handler.

Try:
ALTER TABLE tablename MAX_ROWS=[very large value];

INNODB tables do not have the same limitation.  I do not
know about BDB tables.

My large table:
mysql select count(*) from inv0web02;
+---+
| count(*)  |
+---+
| 498093481 |
+---+

  -- Greg

On Mon, Jan 19, 2004 at 11:24:19AM -0800, Daevid Vincent wrote:
  mysql select count(*) from atoms_in_universe; 
  +-+
  | count(*)|
  +-+
  | 30204541410292874012341 |
  +-+
  1 row in set (0.07 sec)
 
 Daevid Vincent
 http://daevid.com
   
 
  -Original Message-
  From: Will Lowe [mailto:[EMAIL PROTECTED] 
  Sent: Tuesday, December 23, 2003 8:38 PM
  To: Andrew Braithwaite
  Cc: 'Paul DuBois'; 'Andres Montiel'; '[EMAIL PROTECTED]'
  Subject: Re: 100,000,000 row limit?
  
   I don't believe this.  I'm going to write a script to 
  disprove this theory
   right now..
  
  We have a lot more than 100,000,000 more than that in a single MyISAM
  table at work:
  
  mysql select count(*) from probe_result; 
  +---+
  | count(*)  |
  +---+
  | 302045414 |
  +---+
  1 row in set (0.00 sec)
  
  -- 
  thanks,
  
  Will

Dr. Gregory B. Newby, Research Faculty, Arctic Region Supercomputing Center
University of Alaska Fairbanks.  PO Box 756020, Fairbanks, AK 99775
e: newby AT arsc.edu v: 907-474-7160 f: 907-474-5494 w: www.arsc.edu/~newby

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: mysql load

2004-01-16 Thread Gregory Newby
On Fri, Jan 16, 2004 at 10:14:08AM -0800, Andrew Kwiczola wrote:
 I was wondering how many records A mysql table can comfortably handle.. and
 which table type supports the greatest amount of record capacity.  Can I
 have a few million records in 1 table? Over 10 million? Thanks !
 

Hi, Andrew.  I posted yesterday explaining that my table seems fairly
slow for queries.  The list archive should have the message at
http://lists.mysql.com

My table has over 498,000,000 records of fixed-length:

CREATE TABLE `inv0web02` (
  `docid` int(10) unsigned NOT NULL default '0',
  `offset` smallint(5) unsigned NOT NULL default '0',
  `termid` int(10) unsigned NOT NULL default '0',
  `taglistid` smallint(5) unsigned NOT NULL default '0',
  `whichpara` tinyint(3) unsigned NOT NULL default '0',
  `weight_in_subdoc` float unsigned NOT NULL default '0',
  PRIMARY KEY  (`docid`,`offset`),
  KEY `termid_index` (`termid`),
  KEY `whichpara_index` (`whichpara`),
  KEY `taglistid_index` (`taglistid`),
  KEY `weight_index` (`weight_in_subdoc`),
  KEY `docid_index` (`docid`),
  KEY `offset_index` (`offset`),
  KEY `termid_docid_whichpara_offset` (`termid`,`docid`,`whichpara`,`offset`)
) TYPE=MyISAM;

mysql show table status;
| Name  | Type   | Row_format | Rows  | Avg_row_length | Data_length | 
Max_data_length | Index_length | Data_free | Auto_increment | Create_time | 
Update_time | Check_time  | Create_options | Comment |
+---+++---++-+-+--+---++-+-+-++-+
| inv0web02 | MyISAM | Fixed  | 498093481 | 18 |  8965682658 | 
77309411327 |  33526264832 | 0 |   NULL | 2004-01-15 13:54:28 | 
2004-01-15 14:42:01 | 2004-01-15 23:16:29 || |


This takes about 40GB on disk, from 20GB of raw input.  (I used INNODB
previously, it took about 120GB on disk with comparable performance
otherwise).  To allow so many records with MyISAM, you need to ALTER
TABLE tablename MAX_ROWS=[very large value]; to allow for pointers to
be big enough for all the rows.

So, I'd say that yes, you can create quite large tables in MySQL.
Certainly for most purposes a few million records should not be any
problem.
  -- Greg

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



large table performance for WHERE queries

2004-01-15 Thread Gregory Newby
I'm using MySQL for an information retrieval application where word
occurrences are indexed.  It seems that performance is not as good as
I would expect (it seems nearly linear with the number of rows).
Any advice would be welcome.  I'll lay out a lot of detail.

Some details follow.  The basic scenario is:
- there are over 498M rows.  This seems to qualify it as a
  rather large table
- the table has 6 fixed-length fields
- I have created indexes on each field
- I set MAX_ROW_SIZE to 550,000,000
- I've tried this with both MyISAM and INNODB tables, with very
  similar performance
- Slower inserts are not really a problem, if I could get faster searching
- I've run ANALYZE, OPTIMIZE, myisamchk etc. (see below)
- Using a socket for local communication; MySQL++ for apps, but I've
  also confirmed benchmarking results using the mysql command line 
- I'm not presenting a lot of variety in queries and response times below,
  but can send more if needed.  
- Due to the patterns in the data, there are many many duplicate
  values in the indexes.  I suspect this might be important.
- This is not a results transport issue.  I'm able to retrieve
  about 1000 rows/second, which is OK.  It's generating the result set,
  prior to results transport, which is slow.

Queries are definitely disk-bound, with 5-40% of CPU utilization
during a long-running query.  (I did a little experimentation with a
ramfs to confirm this...too bad I don't have 40GB of RAM).

The application: I want to run queries that will identify particular
DOCUMENTS (via docid) that have particular TERMS (via termid).
This is your garden variety task for a search engine or other
information retrieval system.  I often submit additional criteria,
such as adjacent terms (offset A - offset B=1) or terms in a
particular tag (term A in title and term B in title) or proximity
(paragraph A == paragraph B).  Being able to do such powerful queries
is the whole point of using MySQL, rather than trying to manage things
myself (using BerkeleyDB or home-grown B-trees, as I've done in the
past).

I have made a quickie 2000-document, then a longer 100,000 document,
then the full 1.2M document database (input is about 20GB of HTML).
The challenge is that on the 1.2M document dataset with nearly 1/2 a
billion rows (term occurrences), I just can't seem to get a quick
response for combinations with terms occurring in a lot of rows.

Maybe the answer is that this is the best I can expect due to
duplicate key values and other factors (that is, there are many terms
with each docidmany many terms with each paragraph idmany many
many terms with each offset, etc.).  But I hope there might be other
tuning parameters or tips that will speed things up.

If I look for terms in only a few hundred rows, no problem.  But for
terms with thousands of rows, it takes a very long time (several
minutes for the first 1000 rows to an hour for all rows) to get the
query output, even when I order by and limit to get a smaller output
set.

Concrete example:

select * from inv0web02 as t0,inv0web02 as t1 where ((t0.termid=35)
and (t1.termid=141) and (t0.docid=t1.docid) and
(t0.whichpara=t1.whichpara) and ( (t1.offset - t0.offset) = 1)) order
by t0.weight_in_subdoc limit 1000

This takes about 3 minutes where term35=web and term141=page (very
common terms).  (Timing depends slightly on whether I employ a Result
or ResUse in my MySQL++ code - but I don't think the MySQL++ is the
controlling factor here since I've also experimented with mysql on the
command line).  This is on a RedHat 9 box: Dell 4600 with 12GB RAM,
2x2.8Ghz Xeon (hyperthreaded to look like 4 CPUs) and 10KRPM drives.
It's a pretty speedy system.  I'm using mysql-max-4.0.17-pc-linux-i686
(binary build).

With less common terms, I'm able to get a response in just a few
seconds.  The subtraction in the offset is not a deciding factor;
performance is comparable without it.  Clearly, the issue is the merge
within the same table using WHERE criteria.

The order by doesn't matter much; the limit speeds things up quite
a bit since the response set is smaller.  (I use these so that the
program that uses the results only needs to look at the best, and to
limit the response set size).  If I remove the limit 1000, the query
can take well over 1/2 hour (even without actually retrieving the
rows).

I have my.cnf tuned to use a lot of memory, so we're not doing a lot
of swapping or thrashing.  The total memory footprint for the mysql
process maxes at 1.5GB or so.  Most of the query wall-clock time seems
to be disk access.

myisam_sort_buffer_size = 640M

Explain looks fine, in terms of using indices.  For the query above:

| table | type | possible_keys| key  | 
key_len | ref  | rows   | Extra   |

Re: large table performance for WHERE queries

2004-01-15 Thread Gregory Newby
On Thu, Jan 15, 2004 at 02:52:30PM -0500, Brent Baisley wrote:
 It sounds like you are trying to do full text searching, but you 
 implemented it manually. Was MySQL's full text indexing not 
 sufficient for your needs or am I totally missing what you are trying 
 to do?

You're absolutely right: it's full text searching.

While the MySQL functionality for this is quite nice, I'm
trying to do a lot more than is available with MySQL currently.
So, yes: MySQL full text indexing is not sufficient (at least,
as I understand it).

More background:

Fundamentally, MySQL offers only one major method for
doing text retrieval (with some configurability, of course).

My application is an experimental platform for different
fundamental approaches to information retrieval.  So,
I need to build very different functionality than MySQL has
for full text searching, on top of the core ability to
identify documents with particular terms.  (Some of the
fundamental approaches are: variations on Boolean retrieval,
the Vector Space Model, Latent Semantic Indexing, and
Probabilistic Information Retrieval.)

In the lingo of IR, what I'm trying to build with MySQL
is the inverted index.  This is where the terms are the keys,
and the data items are the documents in which those terms occur.
I've expanded my data items to also include which paragraph,
which HTML tag, which term sequence, and the weight of the term
in the document.  Among other things, this enables sub-document
retrieval (i.e., paragraph-level), adjacency searching,
phrase searching, and more.  It lets me experiment with
very different term and document weighting methods, too. 

If you're really a glutton for punishment, look at the
current source tree via CVS at http://sourceforge.net/projects/irtools
  -- Greg

 On Jan 15, 2004, at 1:53 PM, Gregory Newby wrote:
 
 I'm using MySQL for an information retrieval application where word
 occurrences are indexed.  It seems that performance is not as good as
 I would expect (it seems nearly linear with the number of rows).
 Any advice would be welcome.  I'll lay out a lot of detail.
 
 -- 
 Brent Baisley
 Systems Architect
 Landover Associates, Inc.
 Search  Advisory Services for Advanced Technology Environments
 p: 212.759.6400/800.759.0577
 

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: newbie - connect error

2004-01-15 Thread Gregory Newby
Hi, Tait.  It sounds like your server is never actually
starting properly.  There could be any number of reasons for
this - it's best to work through the installation instructions
that came with the package.

To see what failed, chances are good there is an entry
in the error log (it will probably be named hostname.err where
hostname is your system's name) under your data directory
(perhaps /usr/local/mysql/data, but you might have installed
it in another location).  View with something like this:
tail /usr/local/mysql/data/hostname.err

If you can't find the data directory or the err file,
chances are that something fundamental is misconfigured
(such as file permissions or directory locations).  For this
go to the instructions in INSTALL-BINARY or whatever's appropriate
for your build, and make sure you've followed them all.

I hope this helps!
  -- Greg

On Fri, Jan 16, 2004 at 10:16:16AM +1100, tait sanders wrote:
 Hi,
 
 I'm a newbie to mysql.
 I'm running an os10.3 server with mysql v. 4.0.16
 
 When I try to start mysql with 'mysqld_safe ' I get a message that:
 Starting mysqld daemon with databases from /var/mysql
 040116 10:07:57 mysqld ended
 then
 nothing... the terminal is blank.
 
 when I then do 'ps auwx | grep mysql' I get that only the grep process 
 is running.
 
 then when I do 'mysql -u root' I get this error:
 ERROR 2002: Can't connect to local MySQL server through socket 
 '/tmp/mysql.sock' (61)
 
 I checked in /tmp and mysql.sock is there.
 
 I'm think that there's something wrong with my /var/mysql database?
 Also thinking that I'd be best off re-installing mysql - but how?
 I've tried 'rpm -qa \ grep MySQL' but aparently this 'rpm' command is 
 not available.
 
 any help appreciated.
 
 ta
 tait
 
 
 
 
 
 tait sanders
 computer technician
 sir robert webster bldg
 trc
 unsw

Dr. Gregory B. Newby, Research Faculty, Arctic Region Supercomputing Center
University of Alaska Fairbanks.  PO Box 756020, Fairbanks, AK 99775
e: newby AT arsc.edu v: 907-474-7160 f: 907-474-5494 w: www.arsc.edu/~newby

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: newbie - connect error

2004-01-15 Thread Gregory Newby
On Fri, Jan 16, 2004 at 12:04:38PM +1100, tait sanders wrote:
 well I found my hostname.err file.
 It states this:
 
 mysqld started
 InnoDB: Operating system error number 13 in a file operation.
 The error means that mysqld does not have the access rights to the 
 directory.
 mysqld ended.
 
 so okay - how do I give mysqld the rights to the directory??

Here are the steps from INSTALL-BINARY, a file you should
already have with your MySQL software:

 shell groupadd mysql
 shell useradd -g mysql mysql
 shell cd /usr/local
 shell gunzip  /path/to/mysql-VERSION-OS.tar.gz | tar xvf -
 shell ln -s full-path-to-mysql-VERSION-OS mysql
 shell cd mysql
*shell scripts/mysql_install_db
*shell chown -R root  .
*shell chown -R mysql data
*shell chgrp -R mysql .
 shell bin/mysqld_safe --user=mysql 

The starred items are the ones you need to pay attention to.  You'll
need to run them as root, of course.  Make sure you have a user
called mysql, and a group called mysql.

From your description, it's not clear whether you already did this.
If you installed other than with a binary distribution from
mysql.com (or a mirror), it is possible that things are in somewhat
different locations than in the INSTALL-BINARY file.  But get
the file (it comes with the distributions you download) so you
can see the steps. 

It sounds like you are almost there!
  -- Greg

 I've got mysql installed in /usr/bin/mysql
 and
 the db's and logs in /var/mysql - this is where the mysql db itself is.
 
 ta
 tait
 
 
 On 16/01/2004, at 10:36 AM, Gregory Newby wrote:
 
 Hi, Tait.  It sounds like your server is never actually
 starting properly.  There could be any number of reasons for
 this - it's best to work through the installation instructions
 that came with the package.
 
 To see what failed, chances are good there is an entry
 in the error log (it will probably be named hostname.err where
 hostname is your system's name) under your data directory
 (perhaps /usr/local/mysql/data, but you might have installed
 it in another location).  View with something like this:
  tail /usr/local/mysql/data/hostname.err
 
 If you can't find the data directory or the err file,
 chances are that something fundamental is misconfigured
 (such as file permissions or directory locations).  For this
 go to the instructions in INSTALL-BINARY or whatever's appropriate
 for your build, and make sure you've followed them all.
 
 I hope this helps!
   -- Greg
 
 On Fri, Jan 16, 2004 at 10:16:16AM +1100, tait sanders wrote:
 Hi,
 
 I'm a newbie to mysql.
 I'm running an os10.3 server with mysql v. 4.0.16
 
 When I try to start mysql with 'mysqld_safe ' I get a message that:
 Starting mysqld daemon with databases from /var/mysql
 040116 10:07:57 mysqld ended
 then
 nothing... the terminal is blank.
 
 when I then do 'ps auwx | grep mysql' I get that only the grep process
 is running.
 
 then when I do 'mysql -u root' I get this error:
 ERROR 2002: Can't connect to local MySQL server through socket
 '/tmp/mysql.sock' (61)
 
 I checked in /tmp and mysql.sock is there.
 
 I'm think that there's something wrong with my /var/mysql database?
 Also thinking that I'd be best off re-installing mysql - but how?
 I've tried 'rpm -qa \ grep MySQL' but aparently this 'rpm' command is
 not available.
 
 any help appreciated.
 
 ta
 tait
 
 
 
 
 
 tait sanders
 computer technician
 sir robert webster bldg
 trc
 unsw
 
 Dr. Gregory B. Newby, Research Faculty, Arctic Region Supercomputing 
 Center
 University of Alaska Fairbanks.  PO Box 756020, Fairbanks, AK 99775
 e: newby AT arsc.edu v: 907-474-7160 f: 907-474-5494 w: 
 www.arsc.edu/~newby
 
 
 tait sanders
 computer technician
 sir robert webster bldg
 trc
 unsw
 

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]