On Mon, Jun 29, 2009 at 10:22 PM, Robert Haas robertmh...@gmail.com wrote:
I'm finding myself unable to follow all the terminology on this thread.
What's dimension reduction? What's PCA?
[snip]
Imagine you have a dataset with two variables, say height in inches
and age in years. For the purpose
On Fri, Jan 2, 2009 at 5:48 PM, Martijn van Oosterhout
klep...@svana.org wrote:
So you compromise. You split the data into say 1MB blobs and compress
each individually. Then if someone does a substring at offset 3MB you
can find it quickly. This barely costs you anything in the compression
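A minimal sketch of the chunked-compression compromise described above, assuming zlib as the compressor (the message does not name one). The function names `compress_chunks` and `substring` are hypothetical and are not PostgreSQL internals.

```python
import zlib

CHUNK_SIZE = 1024 * 1024  # 1MB blobs, the size suggested in the message


def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size pieces and compress each one independently."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def substring(chunks: list[bytes], offset: int, length: int,
              chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return data[offset:offset + length], decompressing only the chunks it touches."""
    if length <= 0:
        return b""
    first = offset // chunk_size                 # chunk holding the first byte
    last = (offset + length - 1) // chunk_size   # chunk holding the last byte
    raw = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size          # offset within the first chunk
    return raw[start:start + length]
```

With this layout a substring at offset 3MB only needs the fourth blob, at the price of restarting the compressor's dictionary at every chunk boundary.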
On Nov 14, 2007 10:12 PM, Joshua D. Drake [EMAIL PROTECTED] wrote:
http://www.intel.com/performance/server/xeon/intspd.htm
http://www.intel.com/performance/server/xeon/fpspeed.htm
That says precisely nothing about the matter at hand. Someone should
simply change it and benchmark it in pgsql. I
There seems to be some behavior change in current CVS with respect to
gist and gin indexes on varchar[]. Some side effect of the tsearch2
merge?
\d search_pages
Table public.search_pages
Column |Type | Modifiers
---+-+---
On 6/15/07, Gregory Stark [EMAIL PROTECTED] wrote:
While in theory spreading out the writes could have a detrimental effect I
think we should wait until we see actual numbers. I have a pretty strong
suspicion that the effect would be pretty minimal. We're still doing the same
amount of i/o
On 6/14/07, Simon Riggs [EMAIL PROTECTED] wrote:
On Thu, 2007-06-14 at 16:39 +0900, ITAGAKI Takahiro wrote:
Greg Smith [EMAIL PROTECTED] wrote:
On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote:
If the kernel can treat sequential writes better than random writes, is
it worth sorting dirty
On 11/1/06, Teodor Sigaev [EMAIL PROTECTED] wrote:
[snip]
Brain storm method:
Develop a dictionary which returns all substrings of a lexeme, for example for
word foobar it will be 'foobar fooba foob foo fo oobar ooba oob oo obar oba ob
bar ba ar'. And make GIN functional index over your column
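The substring expansion in the example above can be sketched as follows. This is only an illustration of the enumeration, not a tsearch2 dictionary; the function name `substrings` and the minimum length of 2 are assumptions read off the 'foobar' example.

```python
def substrings(lexeme: str, min_len: int = 2) -> list[str]:
    """List every substring of at least min_len characters.

    Order follows the example in the message: for each start position,
    longest substring first.
    """
    out = []
    for start in range(len(lexeme)):
        # end runs from the full length down to the shortest allowed substring
        for end in range(len(lexeme), start + min_len - 1, -1):
            out.append(lexeme[start:end])
    return out


print(" ".join(substrings("foobar")))
# foobar fooba foob foo fo oobar ooba oob oo obar oba ob bar ba ar
```

The number of substrings grows quadratically with the lexeme length, so an index built this way would be much larger than the column itself.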
On 10/24/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
I wasn't aware that a system could protect against this. :-)
I write 8 Kbytes - how can I guarantee that the underlying disk writes
all 8 Kbytes before it loses power? And why isn't the CRC a valid means
of dealing with this? :-)
[snip]
On 10/21/06, Tom Lane [EMAIL PROTECTED] wrote:
[snip]
It hasn't even been tested. One thing I'd want to know about is the
performance effect on non-Intel machines.
On Opteron 265 his test code shows SB8 (the intel alg) is 2.48x faster
for checksum and 1.95x faster for verify for the 800 *
On 8/21/06, Alvaro Herrera [EMAIL PROTECTED] wrote:
But the confirmation that needs to come is that the WAL changes have
been applied (fsync'ed), so the performance will be terrible. So bad,
that I don't think anyone will want to use such a replication system ...
Okay. I give up... Why is
On 7/19/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
[snip]
\d does list bdata__ident_filed_departure before bdata_ident; I'm
wondering if the planner is finding the first index with ident_id in it
and stopping there?
From my own experience it was grabbing the first that has the
requested field
Oh come on, Sorry to troll but this is too easy.
On 5/15/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
You guys have to kill your Windows hate - in jest or otherwise. It's
zealous, and blinding.
[snip]
Why would it
be assumed, that a file system designed for use from a desktop, would be
On 4/9/06, Tom Lane [EMAIL PROTECTED] wrote:
Gregory Maxwell [EMAIL PROTECTED] writes:
For example, one case made in this thread involved bursty performance
with seqscans presumably because the I/O was stalling while processing
was being performed.
Actually, the question that that raised
On 4/9/06, Tom Lane [EMAIL PROTECTED] wrote:
Certainly. If the OS has readahead logic at all, it ought to think that
a seqscan of a large table qualifies. Your arguments seem to question
whether readahead is useful at all --- but they would apply *just as
well* to an app doing its own
On 4/9/06, Luke Lonergan [EMAIL PROTECTED] wrote:
Gregory,
On 4/9/06 1:36 PM, Gregory Maxwell [EMAIL PROTECTED] wrote:
It might also be interesting for someone with the right testing rig on
linux to try the adaptive readahead patch to see if that improves PG's
ability to keep the disk
On 4/8/06, Tom Lane [EMAIL PROTECTED] wrote:
This is exactly the bit of optimism I was questioning. We've already
been sweating blood trying to reduce multiprocessor contention on data
structures in which collisions ought to be avoidable (ie, buffer arrays
where you hope not everyone is
On 3/21/06, Jim C. Nasby [EMAIL PROTECTED] wrote:
ISTM that having a currency type is pretty common for most databases; I
don't really see any reason not to just include it. Likewise for a type
that actually stores timezone info with a timestamp.
This really should be generalized to work with
On 2/17/06, Ragnar [EMAIL PROTECTED] wrote:
Say again ?
Let us say you have 1 billion rows, where the
column in question contains strings like
baaaaaa
baaaaab
baaaaac
...
not necessarily in this order on disc of course
The minimum value
On 2/13/06, Joshua D. Drake [EMAIL PROTECTED] wrote:
Well as one of the people that deploys and manages many, many
postgresql installations I can say I have never run into the need to
have dns names and the thought of dns names honestly seems silly. It
will increase overhead and dependencies
On 12/26/05, Pavel Stehule [EMAIL PROTECTED] wrote:
(1,1) * (1,2) = true
(1,2) * (2,1) is NULL
(2,3) * (1,2) = false
it's useful for multi-criteria optimization
This is indeed a sane and useful function which should be adopted by
the SQL standard.. in postgresql this would easily enough
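A small sketch of the three-valued comparison shown in the examples above, written as a plain function rather than an SQL operator. The name `dominates` is hypothetical, and treating two equal tuples as true is an assumption; the message does not cover that case.

```python
from typing import Optional, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> Optional[bool]:
    """Three-valued comparison of two tuples, component by component.

    True  if every component of a is <= the matching component of b,
    False if every component of a is >= the matching component of b,
    None  (SQL NULL) if the components disagree.
    Equal tuples return True here, which is an assumption.
    """
    if len(a) != len(b):
        raise ValueError("tuples must have the same length")
    if all(x <= y for x, y in zip(a, b)):
        return True
    if all(x >= y for x, y in zip(a, b)):
        return False
    return None


print(dominates((1, 1), (1, 2)))  # True
print(dominates((1, 2), (2, 1)))  # None
print(dominates((2, 3), (1, 2)))  # False
```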
On 12/8/05, Bruce Momjian pgman@candle.pha.pa.us wrote:
A script which identifies non-utf-8 characters and provides some
context, line numbers, etc, will greatly speed up the process of
remedying the situation.
I think the best we can do is the iconv -c with the diff
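As one possible shape for the script described above, here is a rough sketch that reports each invalid UTF-8 byte with its line number and some surrounding bytes. The function name `find_invalid_utf8` and the 10-byte context width are assumptions, and this is not the iconv-plus-diff approach mentioned in the reply.

```python
def find_invalid_utf8(data: bytes, context: int = 10) -> list[tuple[int, int, bytes]]:
    """Return (line number, byte offset within line, surrounding bytes)
    for every position where UTF-8 decoding fails.

    Splitting on b"\\n" is safe because a newline byte never occurs
    inside a multi-byte UTF-8 sequence.
    """
    problems = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        pos = 0
        while pos < len(line):
            try:
                line[pos:].decode("utf-8")
                break  # the rest of the line is valid
            except UnicodeDecodeError as err:
                bad = pos + err.start  # err.start is relative to the slice
                snippet = line[max(0, bad - context):bad + context]
                problems.append((lineno, bad, snippet))
                pos = bad + 1  # skip the offending byte and keep scanning
    return problems


sample = b"ok line\nbad \xff byte\nfine \xc3\xa9\n"
for lineno, offset, snippet in find_invalid_utf8(sample):
    print(f"line {lineno}, byte {offset}: {snippet!r}")
```

To check a dump, the file would be read in binary mode, for example `open("dump.sql", "rb").read()`, so that Python does not attempt to decode it first.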
On 12/6/05, Jan Wieck [EMAIL PROTECTED] wrote:
IMO this is not true. You can get affordable 10GBit network adapters, so
you can have plenty of bandwidth in a db server pool (if they are located in
the same area). Even 1GBit Ethernet greatly helps here, and would make it
possible to
On 12/5/05, Tom Lane [EMAIL PROTECTED] wrote:
Not only does 4000! not work, but 400! doesn't even work. I just lost
demo wow factor points!
It looks like the limit would be about factorial(256).
The question remains, though, is this computational range good for
anything except demos?
On 12/4/05, Tom Lane [EMAIL PROTECTED] wrote:
Paul Lindner [EMAIL PROTECTED] writes:
On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
Paul Lindner [EMAIL PROTECTED] writes:
iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
Is that really a one-size-fits-all solution? Especially
On 02 Dec 2005 15:25:58 -0500, Greg Stark [EMAIL PROTECTED] wrote:
I suspect this comes out of a very different storage model from Postgres's.
Postgres would have no trouble building an index of the existing data using
only shared locks. The problem is that any newly inserted (or updated)
On 02 Dec 2005 15:49:02 -0500, Greg Stark [EMAIL PROTECTED] wrote:
Rod Taylor [EMAIL PROTECTED] writes:
The missing capability in this case is to be able to provide or generate
(self learning?) statistics for a function that describe a typical result
and the cost of getting that result.
On 12/1/05, Pollard, Mike [EMAIL PROTECTED] wrote:
Optimizer hints were added because some databases just don't have a very
smart optimizer. But you are much better served tracking down cases in
which the optimizer makes a bad choice, and teaching the optimizer how
to make a better one. That
On 11/21/05, Jim C. Nasby [EMAIL PROTECTED] wrote:
What about Greg Stark's idea of combining Simon's idea of storing
per-heap-block xmin/xmax with using that information in an index scan?
ISTM that's the best of everything that's been presented: it allows for
faster index scans without adding
On 11/18/05, Merlin Moncure [EMAIL PROTECTED] wrote:
In Sybase ASE (and I'm pretty sure the same is true in Microsoft SQL
Server) the leaf level of the narrowest index on the table is scanned,
following a linked list of leaf pages. Leaf pages can be pretty dense
under Sybase, because they
On 11/15/05, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
I don't understand why a user can't WILLINGLY (by EXPLICITLY setting an
OPTION) allow a privileged administrator to run PostGre.
It is a MAJOR problem for me, that will force me to use another database
because my database will be on a
On 11/13/05, Robert Treat [EMAIL PROTECTED] wrote:
On Saturday 12 November 2005 04:06, Matteo Beccati wrote:
| 1 |1 | NULL |
Wow, that seems ugly maybe there's a reason for it, but I'm not sure we
could deviate from my$ql's behavior on this even if we wanted... they are the
standard
On 11/8/05, Tom Lane [EMAIL PROTECTED] wrote:
Teodor Sigaev [EMAIL PROTECTED] writes:
Layout of GIST_SPLITVEC struct has been changed from 8.0, I'm afraid that
an old .so is used. spl_(right|left)valid fields were added to GIST_SPLITVEC.
Does look a bit suspicious ... Robert, are you *sure*
On 07 Nov 2005 14:22:37 -0500, Greg Stark [EMAIL PROTECTED] wrote:
IIRC, floating point registers are actually longer than a double so if the
entire calculation is done in registers and then the result rounded off to
store in memory it may get the right answer. Whereas if it loses the extra
On 11/4/05, Martijn van Oosterhout kleptog@svana.org wrote:
Yeah, and while one way of removing that dependence is to use ICU, that
library wants everything in UTF-16. So we replace copying to add NULL
to string with converting UTF-8 to UTF-16 on each call. Ugh! The
argument for UTF-16 is that
On 11/4/05, Tom Lane [EMAIL PROTECTED] wrote:
Martijn van Oosterhout kleptog@svana.org writes:
Yeah, and while one way of removing that dependence is to use ICU, that
library wants everything in UTF-16.
Really? Can't it do UCS4 (UTF-32)? There's a nontrivial population
of our users that
On 11/4/05, Martijn van Oosterhout kleptog@svana.org wrote:
[snip]
: ICU does not use UCS-2. UCS-2 is a subset of UTF-16. UCS-2 does not
: support surrogates, and UTF-16 does support surrogates. This means
: that UCS-2 only supports UTF-16's Base Multilingual Plane (BMP). The
: notion of UCS-2
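The difference between UCS-2 and UTF-16 described in the quoted passage can be checked with a short sketch. The helper names `utf16_code_units` and `fits_in_ucs2` are hypothetical; only Python's built-in `utf-16-le` codec is used.

```python
def utf16_code_units(ch: str) -> int:
    """Number of 16-bit code units UTF-16 needs for one character."""
    return len(ch.encode("utf-16-le")) // 2


def fits_in_ucs2(text: str) -> bool:
    """True if every character lies in the Basic Multilingual Plane,
    that is, the text needs no surrogate pairs."""
    return all(ord(c) <= 0xFFFF for c in text)


print(utf16_code_units("A"))            # 1 (BMP character)
print(utf16_code_units("\U0001D11E"))   # 2 (surrogate pair, outside the BMP)
print(fits_in_ucs2("caf\u00e9"))        # True
print(fits_in_ucs2("\U0001D11E"))       # False
```

Characters outside the BMP take two code units, which is why UTF-16 remains a variable-length encoding.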
On 11/3/05, Martijn van Oosterhout kleptog@svana.org wrote:
That's called UTF-16 and is currently not supported by PostgreSQL at
all. That may change, since the locale library ICU requires UTF-16 for
everything.
UTF-16 doesn't get us out of the variable length character game, for
that we need
On 10/31/05, Jim C. Nasby [EMAIL PROTECTED] wrote:
On Mon, Oct 31, 2005 at 01:34:17PM -0500, Bruce Momjian wrote:
There is no way if the system has some incorrect value whether that
would later corrupt the data or not. Anything the system does that it
shouldn't do is a potential corruption
On 10/26/05, Christopher Kings-Lynne [EMAIL PROTECTED] wrote:
iconv -c -f UTF8 -t UTF8
recode UTF-8..UTF-8 dump_in.sql dump_out.sql
I've got a file with characters that pg won't accept that recode does
not fix but iconv does. Iconv is fine for my application, so I'm just
posting to the
On 10/27/05, Andrew Dunstan [EMAIL PROTECTED] wrote:
Yes, MySQL is broken in some regards, as usual. However, the API isn't
bad (except for the fact that it doesn't care what invalid crap you
throw at it), and more importantly there are thousands of apps and
developers who think around that
On 10/27/05, Jim Nasby [EMAIL PROTECTED] wrote:
Adding -hackers back to the list...
You could as equally say that it's ordering it by the order of the
enum declaration, which seems quite reasonable to me.
I don't really see why that's considered reasonable, especially as a default.
I
On 10/27/05, Andrew Dunstan [EMAIL PROTECTED] wrote:
That seems counter-intuitive. It's also exposing an implementation
detail (that the enum is stored internally as a number).
No it is not. Not in the slightest. It is honoring the enumeration order
defined for the type. That is the ONLY
I don't recall this being mentioned in the prior threads:
http://www.cs.duke.edu/TPIE/
GPLed, but perhaps it has some good ideas.
On 10/3/05, Ron Peacetree [EMAIL PROTECTED] wrote:
[snip]
Just how bad is this CPU bound condition? How powerful a CPU is
needed to attain a DB IO rate of 25MBps?
If we replace said CPU with one 2x, 10x, etc faster than that, do we
see any performance increase?
If a modest CPU can drive a
On 9/30/05, Ron Peacetree [EMAIL PROTECTED] wrote:
4= I'm sure we are paying all sorts of nasty overhead for essentially
emulating the pg filesystem inside another filesystem. That means
~2x as much overhead to access a particular piece of data.
The simplest solution is for us to implement a
On 9/28/05, Ron Peacetree [EMAIL PROTECTED] wrote:
2= We use my method to sort two different tables. We now have these
very efficient representations of a specific ordering on these tables. A
join operation can now be done using these Btrees rather than the
original data tables that involves
On 9/15/05, Tom Lane [EMAIL PROTECTED] wrote:
Yesterday's CVS tip:
1 32s 2 46s 4 88s 8 168s
plus no-cmpb and spindelay2:
1 32s 2 48s 4 100s 8 177s
plus just-committed code to pad LWLock to 32:
1 33s 2 50s 4 98s 8 179s
alter to pad to 64:
1
On 8/16/05, Joshua D. Drake [EMAIL PROTECTED] wrote:
Sure... it hasn't been found. We can play the it might have or might
not have game all day long but it won't get us anywhere. Today, and
yesterday pl/Ruby can be run trust/untrusted, pl/python can not.
Both of these things could be said
On 8/16/05, David Fetter [EMAIL PROTECTED] wrote:
It's not. In PL/parlance, trusted means prevented from ever
opening a filehandle or a socket, and PL/PythonU is called
PL/Python*U* (U for *un*trusted) because it cannot be so prevented.
If somebody has figured out a way to make a PL/Python
On 6/23/05, Gavin Sherry [EMAIL PROTECTED] wrote:
inertia) but seeking to a lot of new tracks to write randomly-positioned
dirty sectors would require significant energy that just ain't there
once the power drops. I seem to recall reading that the seek actuators
eat the largest share of
On 6/18/05, Tom Lane [EMAIL PROTECTED] wrote:
What is important is that it is possible, and useful, to build Postgres
in a completely non-GPL environment. If that were not so then I think
we'd have some license issues. But the fact that building PG in a
GPL-ized environment creates a
- Who has permissions to set the user's quota per tablespace, the
superuser and the tablespace owner?
It would be nice if this were nestable, that is, if the sysadmin could
carve out a tablespace for a user then the user could carve that into
separately quota-limited sub-tables.
The idea being, a
Has any thought been given to adding bloom filter indexes to PostgreSQL?
A bloom index would be created on a column, and could then be used to
accelerate exact matches where it is common that the user may query
for a value that doesn't exist. For example, with the query select
userid from
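To illustrate why a bloom filter helps with lookups for values that do not exist, here is a minimal in-memory sketch. It is not a PostgreSQL index; the class name `BloomFilter`, the sizing defaults and the use of SHA-256 to derive bit positions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Tiny bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value: str):
        # Derive num_hashes bit positions by salting the value with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for p in self._positions(value):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, value: str) -> bool:
        # If any bit is unset the value was definitely never added,
        # so the lookup in the underlying table can be skipped.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(value))


bf = BloomFilter()
bf.add("alice")
print(bf.might_contain("alice"))  # True
```

A negative answer is exact, so an exact-match query for a missing value can stop at the filter; a positive answer still has to be confirmed against the real data.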