From: MauMau
> I intentionally have put little conclusion on our specification and
> design. I'd like you to look at recent distributed databases, and
> then think about and discuss what we want to aim for together. I feel
> it's better to separate a thread per topic or group of topics.
Finally
Hello all,
# I'm resending because some error occurred
I've overhauled the scaleout design wiki I presented at PGCon 2018
developer unconference and assembled the research of other DBMSs'
scale-out features.
Scaleout Design
https://wiki.postgresql.org/wiki/Scaleout_Design
I intentionally have p
Hello all,
I've overhauled the scaleout design wiki I presented at PGCon 2018
developer unconference and assembled the research of other DBMSs'
scale-out features.
Scaleout Design
https://wiki.postgresql.org/wiki/Scaleout_Design
I intentionally have put little conclusion on our specification an
>
>
> https://pdfs.semanticscholar.org/8b60/163593931cebc58e9f637cfb501500230adc.pdf
>
>
>
>
>
> Regards
>
> Takayuki Tsunakawa
>
>
>
>
>
> --- below is Sumanta's original mail ---
>
> *From:* Sumanta Mukherjee
> *Sent:* Wednesday, June 17, 2020 5:34 PM
To: Tsunakawa, Takayuki/綱川 貴之
Cc: Bruce Momjian ; Merlin Moncure ;
Robert Haas ; maumau...@gmail.com
Subject: Re: I'd like to discuss scaleout at PGCon
Hello,
I saw the presentation and it is great except that it seems to be unclear of
both SD and SN if the storage and the comput
On Sat, Jun 23, 2018 at 12:41:00PM +1000, Haribabu Kommi wrote:
>
> On Sat, Jun 23, 2018 at 12:17 PM Bruce Momjian wrote:
>
> On Fri, Jun 22, 2018 at 01:28:58PM -0500, Merlin Moncure wrote:
> > On Fri, Jun 22, 2018 at 12:34 PM Bruce Momjian wrote:
> > >
> > > What we don't want
On Sat, Jun 23, 2018 at 12:17 PM Bruce Momjian wrote:
> On Fri, Jun 22, 2018 at 01:28:58PM -0500, Merlin Moncure wrote:
> > On Fri, Jun 22, 2018 at 12:34 PM Bruce Momjian wrote:
> > >
> > > What we don't want to do is to add a bunch of sharding-specific code
> > > without knowing which workloads
On Fri, Jun 22, 2018 at 01:28:58PM -0500, Merlin Moncure wrote:
> On Fri, Jun 22, 2018 at 12:34 PM Bruce Momjian wrote:
> >
> > What we don't want to do is to add a bunch of sharding-specific code
> > without knowing which workloads it benefits, and how many of our users
> > will actually use shar
On 06/22/2018 11:28 AM, Merlin Moncure wrote:
Key features from my perspective:
*) fdw in parallel. how do i do it today? ghetto implemented parallel
queries with asynchronous dblink
*) column store
Although not in core, we do have this as an extension through Citus
don't we?
JD
--
Comma
On Fri, Jun 22, 2018 at 12:34 PM Bruce Momjian wrote:
>
> On Fri, Jun 1, 2018 at 11:29:43AM -0500, Merlin Moncure wrote:
> > FWIW, Distributed analytical queries is the right market to be in.
> > This is the field in which I work, and this is where the action is at.
> > I am very, very, sure abou
On Fri, Jun 1, 2018 at 11:29:43AM -0500, Merlin Moncure wrote:
> FWIW, Distributed analytical queries is the right market to be in.
> This is the field in which I work, and this is where the action is at.
> I am very, very, sure about this. My view is that many of the
> existing solutions to this
> From: Konstantin Knizhnik [mailto:k.knizh...@postgrespro.ru]
> I can not completely agree with it. I have done a lot of benchmarking of
> PostgreSQL, CitusDB, SparkSQL and native C/Scala code generated for
> TPC-H queries.
Wow, you have amazingly abundant experience.
> I do not want to say
From: Michael Paquier [mailto:mich...@paquier.xyz]
> > On Wed, Jun 6, 2018 at 8:16 PM, MauMau wrote:
> >> Looking at the XL source code, the following sequence of functions
> >> are called when the coordinator handles the Row Description message
> >> ('T') from the data node. I guess the parsing
On Thu, Jun 7, 2018 at 10:53 AM, Michael Paquier wrote:
> On Thu, Jun 07, 2018 at 10:28:15AM +0530, Ashutosh Bapat wrote:
>> On Wed, Jun 6, 2018 at 8:16 PM, MauMau wrote:
>>> Looking at the XL source code, the following sequence of functions are
>>> called when the coordinator handles the Row Des
On Thu, Jun 07, 2018 at 10:28:15AM +0530, Ashutosh Bapat wrote:
> On Wed, Jun 6, 2018 at 8:16 PM, MauMau wrote:
>> Looking at the XL source code, the following sequence of functions are
>> called when the coordinator handles the Row Description message ('T')
>> from the data node. I guess the par
On Wed, Jun 6, 2018 at 11:46 PM, Alvaro Herrera wrote:
> On 2018-Jun-06, Ashutosh Bapat wrote:
>
>> On Tue, Jun 5, 2018 at 10:04 PM, MauMau wrote:
>> > From: Ashutosh Bapat
>> >> In order to normalize parse trees, we need to at least replace
>> >> various OIDs in parse-tree with something that th
On Wed, Jun 6, 2018 at 8:16 PM, MauMau wrote:
> From: Ashutosh Bapat
>> Keeping OIDs same across the nodes would require extra communication
>> between nodes to keep track of next OID, dropped OIDs etc. We need to
>> weigh the time spent in that communication and the time saved during
>> parsing
From: Alvaro Herrera
> Maybe an easy (hah) thing to do is use 2PC for DDL, agree on an OID
> that's free on every node, then create the object in all servers at the
> same time. We currently use the system-wide OID generator to assign the
> OID, but seems an easy thing to change (much harder is to
On 2018-Jun-06, Ashutosh Bapat wrote:
> On Tue, Jun 5, 2018 at 10:04 PM, MauMau wrote:
> > From: Ashutosh Bapat
> >> In order to normalize parse trees, we need to at least replace
> >> various OIDs in parse-tree with something that the foreign server
> >> will understand correctly like table name
From: Simon Riggs
On 5 June 2018 at 17:14, MauMau wrote:
>> Furthermore, an extra hop and double parsing/planning could matter for
>> analytic queries, too. For example, SAP HANA boasts of scanning 1
>> billion rows in one second. In HANA's scaleout architecture, an
>> application can connect t
From: Michael Paquier
> Greenplum's orca planner (and Citus?) have such facilities if I recall
> correctly, just mentioning that pushing down directly to remote nodes
> compiled plans ready for execution exists here and there (that's not the
> case of XC/XL). For queries whose planning time is way
From: Ashutosh Bapat
> Keeping OIDs same across the nodes would require extra communication
> between nodes to keep track of next OID, dropped OIDs etc. We need to
> weigh the time spent in that communication and the time saved during
> parsing.
If we manage the system catalog for cluster-wide obj
2018-06-06 10:58 GMT+02:00 Konstantin Knizhnik :
>
>
> On 05.06.2018 20:17, MauMau wrote:
>
>> From: Merlin Moncure
>>
>>> FWIW, Distributed analytical queries is the right market to be in.
>>> This is the field in which I work, and this is where the action is at.
>>> I am very, very, su
On 05.06.2018 20:17, MauMau wrote:
From: Merlin Moncure
FWIW, Distributed analytical queries is the right market to be in.
This is the field in which I work, and this is where the action is at.
I am very, very, sure about this. My view is that many of the
existing solutions to this problem
On 5 June 2018 at 17:14, MauMau wrote:
> Furthermore, an extra hop and double parsing/planning could matter for
> analytic queries, too. For example, SAP HANA boasts of scanning 1
> billion rows in one second. In HANA's scaleout architecture, an
> application can connect to any worker node and
On Tue, Jun 5, 2018 at 10:04 PM, MauMau wrote:
> From: Ashutosh Bapat
>> In order to normalize parse trees, we need to at
>> least replace various OIDs in parse-tree with something that the
>> foreign server will understand correctly like table name on the
>> foreign table pointed to by local fore
On Wed, Jun 06, 2018 at 01:14:04AM +0900, MauMau wrote:
> I don't think an intermediate server like the coordinators in XL is
> necessary. That extra hop can be eliminated by putting both the
> coordinator and the data node roles in the same server process. That
> is, the node to which an applicatio
From: Simon Riggs
On 1 June 2018 at 16:56, Ashutosh Bapat wrote:
>> I think partitioning + FDW provide basic infrastructure for
>> distributing data, planning queries working with such data. We need
>> more glue to support node management, cluster configuration. So, I
>> agree with your statement.
From: Merlin Moncure
> FWIW, Distributed analytical queries is the right market to be in.
> This is the field in which I work, and this is where the action is at.
> I am very, very, sure about this. My view is that many of the
> existing solutions to this problem (in particular hadoop class
> solt
From: Ashutosh Bapat
> Each node needs to be configured and maintained. That requires effort.
> So we need to keep the number of nodes to a minimum. With a
> coordinator and worker node segregation, we require at least two nodes
> in a cluster and just that configuration doesn't provide much
> scal
But managing the catalog in one place and using
the same OID values seems concise to me as a concept.
Regards
MauMau
-----Original Message-----
From: Ashutosh Bapat
Sent: Saturday, June 2, 2018 1:00 AM
To: Tom Lane
Cc: MauMau ; Robert Haas ; PostgreSQL Hackers
Subject: Re: I'd like to
From: Ashutosh Bapat
> In order to normalize parse trees, we need to at
> least replace various OIDs in parse-tree with something that the
> foreign server will understand correctly like table name on the
> foreign table pointed to by local foreign table OR (schema qualified)
> function names and
From: Robert Haas
On Thu, May 31, 2018 at 8:12 AM, MauMau wrote:
>> Oh, I didn't know you support FDW approach mainly for analytics. I
>> guessed the first target was OLTP read-write scalability.
>
> That seems like a harder target to me, because you will have an extra
> hop involved -- SQL from
From: Simon Riggs
> Passing detailed info between servers is exactly what XL does.
>
> It requires us to define a cluster, exactly as XL does.
>
> And yes, it's a good idea to replicate some tables to all nodes, as XL does.
>
> So it seems we have at last some agreement that some of the things XL
>
On Sun, Jun 3, 2018 at 2:00 AM, Simon Riggs wrote:
>
> In XL, GTM is a single component managing transaction ids. That has a
> standby, so is not a SPOF.
>
> But that is not what I mean. I don't believe that a GTM-style
> component is necessary in a future in-core scalability solution.
>
I agree.
On 2 June 2018 at 22:46, Ashutosh Bapat wrote:
And that is why both XL and "FDW approach" rely on a central coordinator.
>>>
>>> I don't think we ever specified that "FDW approach" "relies" on a
>>> central coordinator. One could configure and setup a cluster with
>>> multiple coordinators u
On Sat, Jun 2, 2018 at 4:05 AM, Simon Riggs wrote:
> On 1 June 2018 at 16:56, Ashutosh Bapat wrote:
>> On Fri, Jun 1, 2018 at 11:10 AM, Simon Riggs wrote:
>>>
>>> Using a central coordinator also allows multi-node transaction
>>> control, global deadlock detection etc..
>>
>> But that becomes
On 1 June 2018 at 16:56, Ashutosh Bapat wrote:
> On Fri, Jun 1, 2018 at 11:10 AM, Simon Riggs wrote:
>>
>> Using a central coordinator also allows multi-node transaction
>> control, global deadlock detection etc..
>
> But that becomes an SPOF and then we have to configure a standby for
> that. I
On Wed, May 30, 2018 at 9:26 PM Robert Haas wrote:
> The FDW approach, of which I have been a supporter for some years now,
> is really aiming at a different target, which is to allow efficient
> analytics queries across a multi-node cluster. I think we're getting
> pretty close to being able to
On Fri, Jun 1, 2018 at 11:27 AM, Tom Lane wrote:
> Ashutosh Bapat writes:
>> In order to avoid double parsing, we might want to find a way to pass
>> a "normalized" parse tree down to the foreign server. We need to
>> normalize the OIDs in the parse tree since those may be different
>> across the
On Fri, Jun 1, 2018 at 11:10 AM, Simon Riggs wrote:
>
> Using a central coordinator also allows multi-node transaction
> control, global deadlock detection etc..
But that becomes an SPOF and then we have to configure a standby for
that. I am not saying that that's a bad design but it's not very g
Ashutosh Bapat writes:
> In order to avoid double parsing, we might want to find a way to pass
> a "normalized" parse tree down to the foreign server. We need to
> normalize the OIDs in the parse tree since those may be different
> across the nodes.
I don't think this is a good idea at all. It b
On 1 June 2018 at 04:00, MauMau wrote:
> The SQL processor should be one layer, not two layers.
For OLTP, that would be best. But it would be restricted to
single-node requests, leaving you the problem of how you know ahead of
time whether an SQL statement was single node or not.
Using a centra
On 1 June 2018 at 15:44, Ashutosh Bapat wrote:
> On Thu, May 31, 2018 at 11:00 PM, MauMau wrote:
>> 2018-05-31 22:44 GMT+09:00, Robert Haas :
>>> On Thu, May 31, 2018 at 8:12 AM, MauMau wrote:
Oh, I didn't know you support FDW approach mainly for analytics. I
guessed the first target
On Thu, May 31, 2018 at 11:00 PM, MauMau wrote:
> 2018-05-31 22:44 GMT+09:00, Robert Haas :
>> On Thu, May 31, 2018 at 8:12 AM, MauMau wrote:
>>> Oh, I didn't know you support FDW approach mainly for analytics. I
>>> guessed the first target was OLTP read-write scalability.
>>
>> That seems like
2018-05-31 22:44 GMT+09:00, Robert Haas :
> On Thu, May 31, 2018 at 8:12 AM, MauMau wrote:
>> Oh, I didn't know you support FDW approach mainly for analytics. I
>> guessed the first target was OLTP read-write scalability.
>
> That seems like a harder target to me, because you will have an extra
>
On Thu, May 31, 2018 at 8:12 AM, MauMau wrote:
> I anticipated a decision process at the unconference like this:
> "Do we want to build on shared everything architecture?"
> "No, because it limits scalability, requires expensive shared storage,
> and it won't run on many clouds."
> "Then do we wan
2018-05-31 11:26 GMT+09:00, Robert Haas :
> It was nice to meet you in person.
Me too. And it was very kind of you to help me to display the wiki
page well and guide the session. When I first heard your voice at the
Developer Meeting, I thought Bruce Momjian was speaking, because your
voice soun
On Sun, May 27, 2018 at 1:20 AM, MauMau wrote:
> I'm going to attend PGCon in Ottawa for the first time. I would be
> happy if I could meet you.
It was nice to meet you in person.
> I'd like to have a session on scaleout design at the unconference.
> I've created a wiki page for that (this is still jus
Hello,
I'm going to attend PGCon in Ottawa for the first time. I would be
happy if I could meet you.
Because I'm visually impaired, I only have vision to sense light. If
you see a Japanese man with a height of 171 cm with a white cane, it's
probably me. I'd be happy if you talk to me. But as I'm stil