Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Peter
On Thu, 22 Mar 2007, Omer Zak wrote: On Thu, 2007-03-22 at 00:20 +0200, Peter wrote: You also probably want to read up on more advanced indexing methods (like Bloom and Blooming filters and such) than what's available with ordinary off the shelf databases. I searched for information about

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Muli Ben-Yehuda
On Thu, Mar 22, 2007 at 10:55:11AM +0200, Peter wrote: I am not an expert on this, but any algorithm that runs in O(1) or close to that for the data size you use is a candidate. The data size should be obviously less than 2^32 for x86 at least in any indexable dimension if you want a

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Geoffrey S. Mendelson
On Thu, Mar 22, 2007 at 10:55:11AM +0200, Peter wrote: 64-bit x86 and up. F.ex. 30 million records will require less than 7 bits per entry just to keep a complete linear index in 3GB of RAM (the maximum usable you can put in a x86 32 bit machine). 7 bits is not enough to even make 30

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Geoffrey S. Mendelson
On Thu, Mar 22, 2007 at 11:14:47AM +0200, Nadav Har'El wrote: Reminds me of a group in my workplace whose abbreviated name is STD. Unfortunately, when I found out about it, and told them what STD means to the typical American, it was too late to change the group's name. Or an Israeli company

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Tzahi Fadida
On Thursday 22 March 2007 11:25, Muli Ben-Yehuda wrote: On Thu, Mar 22, 2007 at 10:55:11AM +0200, Peter wrote: I am not an expert on this, but any algorithm that runs in O(1) or close to that for the data size you use is a candidate. The data size should be obviously less than 2^32 for x86

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Muli Ben-Yehuda
On Thu, Mar 22, 2007 at 11:46:36AM +0200, Tzahi Fadida wrote: With PAE you can stick in a lot more RAM on x86-32, but then your bottleneck becomes the 3GB of virtual address space for a single process. Perhaps some kind of a hardware solution can be used. I.e. attaching a pci with

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Tzahi Fadida
On Thursday 22 March 2007 11:49, Muli Ben-Yehuda wrote: On Thu, Mar 22, 2007 at 11:46:36AM +0200, Tzahi Fadida wrote: With PAE you can stick in a lot more RAM on x86-32, but then your bottleneck becomes the 3GB of virtual address space for a single process. Perhaps some kind of a

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Peter
On Thu, 22 Mar 2007, Tzahi Fadida wrote: Perhaps some kind of a hardware solution can be used. I.e. attaching a pci with memory and addressing it's 64gb memory either directly (if you have 64 bit bus) or in two phases. It can also be any RAM space size you choose, but it will cost you (each

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Tzahi Fadida
On Thursday 22 March 2007 14:58, Peter wrote: On Thu, 22 Mar 2007, Tzahi Fadida wrote: Perhaps some kind of a hardware solution can be used. I.e. attaching a pci with memory and addressing it's 64gb memory either directly (if you have 64 bit bus) or in two phases. It can also be any RAM

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Tzahi Fadida
On Thursday 22 March 2007 16:18, Peter wrote: On Thu, 22 Mar 2007, Tzahi Fadida wrote: Advocating is a strong word, i was suggesting. How exactly would you address 128gb,256gb? Unless of course your system board and CPU supports such sizes... The board does not care about sizes. Disk

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Peter
On Thu, 22 Mar 2007, Tzahi Fadida wrote: Advocating is a strong word, i was suggesting. How exactly would you address 128gb,256gb? Unless of course your system board and CPU supports such sizes... The board does not care about sizes. Disk requests are serialized and they can be any

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Peter
On Thu, 22 Mar 2007, Tzahi Fadida wrote: On Thursday 22 March 2007 16:18, Peter wrote: On Thu, 22 Mar 2007, Tzahi Fadida wrote: Advocating is a strong word, i was suggesting. How exactly would you address 128gb,256gb? Unless of course your system board and CPU supports such sizes... The

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Tzahi Fadida
On Thursday 22 March 2007 17:45, Peter wrote: On Thu, 22 Mar 2007, Tzahi Fadida wrote: On Thursday 22 March 2007 16:18, Peter wrote: On Thu, 22 Mar 2007, Tzahi Fadida wrote: Advocating is a strong word, i was suggesting. How exactly would you address 128gb,256gb? Unless of course your

Re: Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-22 Thread Peter
On Thu, 22 Mar 2007, Tzahi Fadida wrote: Listen, i did not suggest to map 1024 bits, i was using your example. What you are talking about is PCI and other buses. On the same 32bit address bus you can address many data buses using bridges, which is exactly what i said from the beginning and

[Job] MySQL consultation

2007-03-21 Thread Elazar Leibovich
I need a MySQL consultant for a small consultation (20 min.) to help me designing a DB for ~30mil rows, whose size is about 10Gb. The consultant must have proven experience with building using and optimizing DBs of this scale. I only need to consult him by phone for 10-20 minutes, consultation by

Re: [Job] MySQL consultation

2007-03-21 Thread Elazar Leibovich
It seems that my question boils down to this: I'm having a 3Gb table with three TEXT columns and an ID. Counting it takes between 1 to 2 minutes, is it normal for such a size of a table? On 3/21/07, Elazar Leibovich [EMAIL PROTECTED] wrote: I need a MySQL consultant for a small consultation (20

Re: [Job] MySQL consultation

2007-03-21 Thread Jonathan Ben Avraham
Hi Elazar, What is it exactly that you are counting? - yba On Wed, 21 Mar 2007, Elazar Leibovich wrote: Date: Wed, 21 Mar 2007 15:42:21 +0200 From: Elazar Leibovich [EMAIL PROTECTED] To: linux-il@linux.org.il Subject: Re: [Job] MySQL consultation It seems that my question boils down

Re: [Job] MySQL consultation

2007-03-21 Thread Elazar Leibovich
: Elazar Leibovich [EMAIL PROTECTED] To: linux-il@linux.org.il Subject: Re: [Job] MySQL consultation It seems that my question boils down to this: I'm having a 3Gb table with three TEXT columns and an ID. Counting it takes between 1 to 2 minutes, is it normal for such a size of a table? On 3

Re: [Job] MySQL consultation

2007-03-21 Thread Jonathan Ben Avraham
On Wed, 21 Mar 2007, Elazar Leibovich wrote: Date: Wed, 21 Mar 2007 16:02:22 +0200 From: Elazar Leibovich [EMAIL PROTECTED] To: Jonathan Ben Avraham [EMAIL PROTECTED] Cc: linux-il@linux.org.il Subject: Re: [Job] MySQL consultation The whole table. SELECT COUNT(*) FROM tbl WHERE id1; Thanks

Re: [Job] MySQL consultation

2007-03-21 Thread Elazar Leibovich
+0200 From: Elazar Leibovich [EMAIL PROTECTED] To: Jonathan Ben Avraham [EMAIL PROTECTED] Cc: linux-il@linux.org.il Subject: Re: [Job] MySQL consultation The whole table. SELECT COUNT(*) FROM tbl WHERE id1; Thanks for the speedy reply. Hi Elazar, One to three minutes seems entirely reasonable

Re: [Job] MySQL consultation

2007-03-21 Thread Geoffrey S. Mendelson
On Wed, Mar 21, 2007 at 03:42:21PM +0200, Elazar Leibovich wrote: It seems that my question boils down to this: I'm having a 3Gb table with three TEXT columns and an ID. Counting it takes between 1 to 2 minutes, is it normal for such a size of a table? I think it's pretty obvious that you

Re: [Job] MySQL consultation

2007-03-21 Thread Jonathan Ben Avraham
Ben Avraham [EMAIL PROTECTED] Cc: ILUG linux-il@linux.org.il Subject: Re: [Job] MySQL consultation This is not the problem. The problem is that running Queries like: SELECT SQL_CALC_FOUND_ROWS c.cat_ID FROM posts AS l,post2cat AS a2c, categories AS c WHERE (date2006-01-01 AND date2006-01-02

Re: [Job] MySQL consultation

2007-03-21 Thread Geoffrey S. Mendelson
On Wed, Mar 21, 2007 at 04:42:55PM +0200, Jonathan Ben Avraham wrote: Hi Geoffrey, I beg to disagree. A simple redesign should do the trick. The problem IMHO is that the query in question has to go through the whole database each time. Busting the table up into smaller tables ordered by date

Re: [Job] MySQL consultation

2007-03-21 Thread Tzahi Fadida
On Wednesday 21 March 2007 16:26, Elazar Leibovich wrote: This is not the problem. The problem is that running Queries like: SELECT SQL_CALC_FOUND_ROWS c.cat_ID FROM posts AS l,post2cat AS a2c, categories AS c WHERE (date2006-01-01 AND date2006-01-02) AND(c.cat_ID=a2c.category_id AND

Re: [Job] MySQL consultation

2007-03-21 Thread Shlomi Fish
: Geoffrey S. Mendelson [EMAIL PROTECTED] To: Elazar Leibovich [EMAIL PROTECTED] Cc: linux-il@linux.org.il Subject: Re: [Job] MySQL consultation On Wed, Mar 21, 2007 at 03:42:21PM +0200, Elazar Leibovich wrote: It seems that my question boils down to this: I'm having a 3Gb table with three TEXT

Re: [Job] MySQL consultation

2007-03-21 Thread Geoffrey S. Mendelson
On Wed, Mar 21, 2007 at 11:09:21PM +0200, Shlomi Fish wrote: However, I disagree with Geoffrey on the fact that you need Oracle or DB2 or a different operating system than Linux (or BSD). By all means, Linux runs very well on this hardware, and PostgreSQL and MySQL are as fast as Oracle or

Re: [Job] MySQL consultation

2007-03-21 Thread Shlomi Fish
Hi Geoffrey! On Wednesday 21 March 2007, Geoffrey S. Mendelson wrote: On Wed, Mar 21, 2007 at 11:09:21PM +0200, Shlomi Fish wrote: However, I disagree with Geoffrey on the fact that you need Oracle or DB2 or a different operating system than Linux (or BSD). By all means, Linux runs very

Re: [Job] MySQL consultation

2007-03-21 Thread Peter
On Wed, 21 Mar 2007, Geoffrey S. Mendelson wrote: With 30 million records in a database, I would seriously consider a different hardware platform. Some of the perform quite well with Linux some do not. 30 million records is not a lot. But it is too much if one expects to write things like

Re: [Job] MySQL consultation

2007-03-21 Thread Peter
By the way, for example on how NOT to do SQL web programming see the white pages service at fwd.pulver.net or the search in Skype. Both of these have well under a million records usually and it takes minutes to hours to get answers. At least Skype starts displaying as soon as there is data.

Blooming Filters (was: Re: [Job] MySQL consultation)

2007-03-21 Thread Omer Zak
On Thu, 2007-03-22 at 00:20 +0200, Peter wrote: You also probably want to read up on more advanced indexing methods (like Bloom and Blooming filters and such) than what's available with ordinary off the shelf databases. I searched for information about Blooming filters. Google gave me