Re: [HACKERS] Extended statistics is not working on Vars hidden under a RelabelType
On 15 October 2017 at 06:49, Robert Haas wrote: > On Fri, Oct 13, 2017 at 4:49 PM, David Rowley > wrote: >> tps = 8282.481310 (including connections establishing) >> tps = 8282.750821 (excluding connections establishing) > > vs. > >> tps = 8520.822410 (including connections establishing) >> tps = 8521.132784 (excluding connections establishing) >> >> With the patch we are making use of the extended statistics, which we >> do expect to be more work for the planner. Although, we didn't add >> extended statistics to speed up the planner. > > Sure, I understand. That's actually a pretty substantial regression - > I guess that means that it's pretty important to avoid creating > extended statistics that are not needed, at least for short-running > queries. To be honest, I ran that on a VM on my laptop. I was getting quite a bit of noise. I just posted that to show that the 12x slowdown didn't exist. I don't know what the actual slowdown is. I just know extended stats are not free and that nobody expected that they ever would be. The good news is that they're off by default and if the bad ever outweighs the good then the fix for that starts with "DROP STATISTICS" I personally think it's great we're starting to see a useful feature materialise that can help with poor row estimates from the planner. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Partition-wise aggregation/grouping
On Fri, Oct 13, 2017 at 1:13 PM, David Rowley wrote: > > I looked over the patch and saw this: > > @@ -1800,6 +1827,9 @@ cost_merge_append(Path *path, PlannerInfo *root, > */ > run_cost += cpu_operator_cost * tuples; > > + /* Add MergeAppend node overhead like we do it for the Append node */ > + run_cost += cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples; > + > path->startup_cost = startup_cost + input_startup_cost; > path->total_cost = startup_cost + run_cost + input_total_cost; > } > > You're doing that right after a comment that says we don't do that. It > also does look like the "run_cost += cpu_operator_cost * tuples" is > trying to do the same thing, so perhaps it's worth just replacing > that, which by default will double that additional cost, although > doing so would have the planner slightly prefer a MergeAppend to an > Append than previously. > I think we can remove that code block entirely. I have added relevant comments around DEFAULT_APPEND_COST_FACTOR already. However, I am not sure of doing this as you correctly said it may prefer MergeAppend to an Append. Will it be fine we remove that code block? > +#define DEFAULT_APPEND_COST_FACTOR 0.5 > > I don't really think the DEFAULT_APPEND_COST_FACTOR adds much. it > means very little by itself. It also seems that most of the other cost > functions just use the magic number. > Agree, but those magic numbers used only once at that place. But here there are two places. So if someone wants to update it, (s)he needs to make sure to update that at two places. To minimize that risk, having a #define seems better. > > -- > David Rowley http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services > -- Jeevan Chalke Technical Architect, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company
Re: [HACKERS] relkind check in DefineIndex
On 2017/10/14 4:32, Robert Haas wrote: > On Fri, Oct 13, 2017 at 12:38 PM, Alvaro Herrera > wrote: >> The relkind check in DefineIndex has grown into an ugly rats nest of >> 'if' statements. I propose to change it into a switch, as per the >> attached. > > wfm +1 Thanks, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] How does postgres store the join predicate for a relation in a given query
On Sat, Oct 14, 2017 at 3:15 AM, Gourav Kumar wrote: > Why does have_relevant_joinclause() and have_relevant_eclass_joinclause() > return true for all possible joins for the query given below. > Even when they have no join predicate between them. > e.g. join between ss1 & ws3, ss2 & ws3 etc. > The prologues of those functions and comments within those explain that. /* * have_relevant_joinclause * Detect whether there is a joinclause that involves * the two given relations. * * Note: the joinclause does not have to be evaluable with only these two * relations. This is intentional. For example consider * SELECT * FROM a, b, c WHERE a.x = (b.y + c.z) * If a is much larger than the other tables, it may be worthwhile to * cross-join b and c and then use an inner indexscan on a.x. Therefore * we should consider this joinclause as reason to join b to c, even though * it can't be applied at that join step. */ /* * have_relevant_eclass_joinclause * Detect whether there is an EquivalenceClass that could produce * a joinclause involving the two given relations. * * This is essentially a very cut-down version of * generate_join_implied_equalities(). Note it's OK to occasionally say "yes" * incorrectly. Hence we don't bother with details like whether the lack of a * cross-type operator might prevent the clause from actually being generated. */ May be you want to see whether those comments are applicable in your case and also see how the callers handle the return values. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] UPDATE of partition key
Hi Amit. On 2017/10/04 22:51, Amit Khandekar wrote: > Main patch : > update-partition-key_v20.patch Guess you're already working on it but the patch needs a rebase. A couple of hunks in the patch to execMain.c and nodeModifyTable.c fail. Meanwhile a few comments: +void +pull_child_partition_columns(Bitmapset **bitmapset, +Relation rel, +Relation parent) Nitpick: don't we normally list the output argument(s) at the end? Also, "bitmapset" could be renamed to something that conveys what it contains? + if (partattno != 0) + child_keycols = + bms_add_member(child_keycols, + partattno - FirstLowInvalidHeapAttributeNumber); + } + foreach(lc, partexprs) + { Elsewhere (in quite a few places), we don't iterate over partexprs separately like this, although I'm not saying it is bad, just different from other places. + * the transition tuplestores can be built. Furthermore, if the transition + * capture is happening for UPDATEd rows being moved to another partition due + * partition-key change, then this function is called once when the row is + * deleted (to capture OLD row), and once when the row is inserted to another + * partition (to capture NEW row). This is done separately because DELETE and + * INSERT happen on different tables. Extra space at the beginning from the 2nd line onwards. + (event == TRIGGER_EVENT_UPDATE && ((oldtup == NULL) ^ (newtup == NULL Is there some reason why a bitwise operator is used here? + * 'update_rri' has the UPDATE per-subplan result rels. Could you explain why they are being received as input here? + * 'perleaf_parentchild_maps' receives an array of TupleConversionMap objects + * with on entry for every leaf partition (required to convert input tuple + * based on the root table's rowtype to a leaf partition's rowtype after + * tuple routing is done) Could this be named leaf_tupconv_maps, maybe? It perhaps makes clear that they are maps needed for "tuple conversion". And the other field holding the reverse map as leaf_rev_tupconv_maps. Either that or use underscores to separate words, but then it gets too long I guess. + tuple = ConvertPartitionTupleSlot(mtstate, + mtstate->mt_perleaf_parentchild_maps[leaf_part_index], + The 2nd line here seems to have gone over 80 characters. ISTM, ConvertPartitionTupleSlot has a very specific job and a bit complex interface. I guess it could simply have the following interface: static HeapTuple ConvertPartitionTuple(ModifyTabelState *mtstate, HeapTuple tuple, bool is_update); And figure out, based on the value of is_update, which map to use and which slot to set *p_new_slot to (what is now "new_slot" argument). You're getting mtstate here anyway, which contains all the information you need here. It seems better to make that (selecting which map and which slot) part of the function's implementation if we're having this function at all, imho. Maybe I'm missing some details there, but my point still remains that we should try to put more logic in that function instead of it just do the mechanical tuple conversion. + * We have already checked partition constraints above, so skip them + * below. How about: ", so skip checking here."? ISTM, the newly introduced logic in ExecSetupTransitionCaptureState() to try to reuse the per-subplan child-to-parent map as per-leaf child-to-parent map could be simplified a bit. I mean the following code: +/* + * But for Updates, we can share the per-subplan maps with the per-leaf + * maps. + */ +update_rri_index = 0; +update_rri = mtstate->resultRelInfo; +if (mtstate->mt_nplans > 0) +cur_reloid = RelationGetRelid(update_rri[0].ri_RelationDesc); -/* Choose the right set of partitions */ -if (mtstate->mt_partition_dispatch_info != NULL) +for (i = 0; i < numResultRelInfos; ++i) +{ How about (pseudo-code): j = 0; for (i = 0; i < n_leaf_parts; i++) { if (j < n_subplans && leaf_rri[i]->oid == subplan_rri[j]->oid) { leaf_childparent_map[i] = subplan_childparent_map[j]; j++; } else { leaf_childparent_map[i] = new map } } I think the above would also be useful in ExecSetupPartitionTupleRouting() where you've added similar code to try to reuse per-subplan ResultRelInfos. In ExecInitModifyTable(), can we try to minimize the number of places where update_tuple_routing_needed is being set. Currently, it's being set in 3 places: +boolupdate_tuple_routing_needed = node->part_cols_updated; & +/* + * If this is an UPDATE and a BEFORE UPDATE trigger is present, we may + * need to do update tuple routing. + */ +if (resultRelInfo->ri_TrigDesc && +resultRelInfo->ri_TrigDesc->trig_update_before_row && +
Re: [HACKERS] Determine state of cluster (HA)
On 13 October 2017 at 08:50, Joshua D. Drake wrote: > -Hackers, > > I had a long call with a firm developing front end proxy/cache/HA for > Postgres today. Essentially the software is a replacement for PGPool in > entirety but also supports analytics etc... When I was asking them about > pain points they talked about the below and I was wondering if this is a > problem we would like to solve. IMO: no one node knows the full state of the system, or can know it. I'd love PostgreSQL to help users more with scaling, HA, etc. But I think it's a big job. We'd need: - a node topology of some kind, communicated between nodes - heartbeat and monitoring - failover coordination - pooling/proxying - STONITH/fencing - etc. That said, I do think it'd be very desirable for us to introduce a greater link from a standby to master: - Get info about master. We should finish merging recovery.conf into postgresql.conf. - > b. Attempt to connect to the host directly, if not... > c. use the slave and use the hostname via dblink to connect to the master, > as the hostname , i.e. select * from dblink('" + connInfo + " > dbname=postgres', 'select inet_server_addr()') AS t(inet_server_addr inet). > This is necessary in the event the hostname used in the recovery.conf file > is not resolvable from the outside. OK, so "connect directly" here means from some 3rd party, the one doing the querying of the replica. > 1. The dblink call doesn't have a way to specify a timeout, so we have to > use Java futures to control how long this may take to a reasonable amount of > time; statement_timeout doesn't work? If not, that sounds like a sensible, separate feature to add. Patches welcome! > 2. NAT mapping may result in us detecting IP ranges that are not accessible > to the application nodes. PostgreSQL can't do anything about this one. > 3. there is no easy way to monitor for state changes as they happen, > allowing faster failovers, everything has to be polled based on events; It'd be pretty simple to write a function that sleeps in the backend until it's promoted. I don't know off the top of my head if we set all proc latches when we promote, but if we don't it's probably harmless and somewhat useful to do so. Either way, you'd do long-polling. Call the function and let the connection block until something interesting happens. Use TCP keepalives to make sure you notice if it dies. Have the function return when the state changes. > 4. It doesn't support cascading replication very well, although we could > augment the logic to allow us to map the relationship between nodes. > 5. There is no way to connect to a db node with something akin to > SQL-Server's "application intent" flags, to allow a connection to be > rejected if we wish it to be a read/write connection. This helps detect the > state of the node directly without having to ask any further questions of > the node, and makes it easier to "stall" during connection until a proper > connection can be made. That sounds desirable, and a good step toward eventually being able to transparently re-route read/write queries from replica to master. Which is where I'd like to land up eventually. Again, that'd be a sensible patch to submit, quite separately to the other topics. > 6. The master, on shutdown, will not actually close and disable connections > as it shuts down, instead, it will issue an error that it is shutting down > as it does so. Er, yes? I don't understand what you are getting at here. Can you describe expected vs actual behaviour in more detail? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Lockable views
>> >> test=# CREATE VIEW v3 AS SELECT count(*) FROM v1; >> >> CREATE VIEW >> >> test=# BEGIN; >> >> BEGIN >> >> test=# LOCK TABLE v3; >> >> ERROR: cannot lock view "v3" >> >> DETAIL: Views that return aggregate functions are not automatically >> >> updatable. >> > >> > It would be nice if the message would be something like: >> > >> > DETAIL: Views that return aggregate functions are not lockable > > This uses messages from view_query_is_auto_updatable() of the rewrite system > directly. > Although we can modify the messages, I think it is not necessary for now > since we can lock only automatically updatable views. You could add a flag to view_query_is_auto_updatable() to switch the message between DETAIL: Views that return aggregate functions are not automatically updatable. and DETAIL: Views that return aggregate functions are not lockable >> > I wonder if we should lock tables in a subquery as well. For example, >> > >> > create view v1 as select * from t1 where i in (select i from t2); >> > >> > In this case should we lock t2 as well? >> >> Current the patch ignores t2 in the case above. >> >> So we have options below: >> >> - Leave as it is (ignore tables appearing in a subquery) >> >> - Lock all tables including in a subquery >> >> - Check subquery in the view definition. If there are some tables >> involved, emit an error and abort. >> >> The first one might be different from what users expect. There may be >> a risk that the second one could cause deadlock. So it seems the third >> one seems to be the safest IMO. > > Make sense. Even if the view is locked, when tables in a subquery is > modified, the contents of view can change. To avoid it, we have to > lock tables, or give up to lock such views. > > We can say the same thing for functions in a subquery. If the definition > of the functions are changed, the result of the view can change. > We cannot lock functions, but should we abtain row-level lock on pg_proc > in such cases? (of cause, or give up to lock such views) I think we don't need to care about function definition changes used in where clauses in views. None view queries using functions do not care about the definition changes of functions while executing the query. So why updatable views need to care them? > BTW, though you mentioned the risk of deadlocks, even when there > are no subquery, deadlock can occur in the current patch. > > For example, lock a table T in Session1, and then lock a view V > whose base relation is T in Session2. Session2 will wait for > Session1 to release the lock on T. After this, when Session1 try to > lock view V, the deadlock occurs and the query is canceled. You are right. Dealocks could occur in any case. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] v10 bottom-listed
On 2017/10/13 22:58, Magnus Hagander wrote: > On Fri, Oct 6, 2017 at 3:59 AM, Amit Langote > wrote: > >> On 2017/10/05 22:28, Erik Rijkers wrote: >>> In the 'ftp' listing, v10 appears at the bottom: >>> https://www.postgresql.org/ftp/source/ >>> >>> With all the other v10* directories at the top, we could get a lot of >>> people installing wrong binaries... >>> >>> Maybe it can be fixed so that it appears at the top. >> >> Still see it at the bottom. Maybe ping pgsql-www? >> > > Turns out there was already quite an ugly hack to deal with this, so I just > made it a bit more ugly. Once the caches expire I believe sorting should be > correct. Saw that it's now listed all the way up, thanks! :) Regards, Amit -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SIGSEGV in BRIN autosummarize
On Sat, Oct 14, 2017 at 08:56:56PM -0500, Justin Pryzby wrote: > On Fri, Oct 13, 2017 at 10:57:32PM -0500, Justin Pryzby wrote: > > > Also notice the vacuum process was interrupted, same as yesterday (think > > > goodness for full logs). Our INSERT script is using python > > > multiprocessing.pool() with "maxtasksperchild=1", which I think means we > > > load > > > one file and then exit the subprocess, and pool() creates a new subproc, > > > which > > > starts a new PG session and transaction. Which explains why autovacuum > > > starts > > > processing the table only to be immediately interrupted. > > On Sun, Oct 15, 2017 at 01:57:14AM +0200, Tomas Vondra wrote: > > I don't follow. Why does it explain that autovacuum gets canceled? I > > mean, merely opening a new connection/session should not cancel > > autovacuum. That requires a command that requires table-level lock > > conflicting with autovacuum (so e.g. explicit LOCK command, DDL, ...). > > I was thinking that INSERT would do it, but I gather you're right about > autovacuum. Let me get back to you about this.. I confirmed that we're taking an explicit lock before creating new child tables (as I recall, to avoid errors in the logs shortly after midnight when multiple subprocesses see data for the new date for the first time): 2017-10-15 12:52:50.499-04 | 59e3925e.6951 | statement: LOCK TABLE cdrs_huawei_sgsnPDPRecord IN SHARE UPDATE EXCLUSIVE MODE Probably we can improve that with LOCK TABLE ONLY. Justin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] possible encoding issues with libxml2 functions
2017-08-21 6:25 GMT+02:00 Pavel Stehule : > > >> xpath-bugfix.patch affected only xml values containing an xml declaration >> with >> "encoding" attribute. In UTF8 databases, this latest proposal >> (xpath-parsing-error-fix.patch) is equivalent to xpath-bugfix.patch. In >> non-UTF8 databases, xpath-parsing-error-fix.patch affects all xml values >> containing non-ASCII data. In a LATIN1 database, the following works >> today >> but breaks under your latest proposal: >> >> SELECT xpath('text()', ('' || convert_from('\xc2b0', 'LATIN1') || >> '')::xml); >> >> It's acceptable to break that, since the documentation explicitly >> disclaims >> support for it. xpath-bugfix.patch breaks different use cases, which are >> likewise acceptable to break. See my 2017-08-08 review for details. >> > > The fact so this code is working shows so a universe is pretty dangerous > place :) > > ping? will we continue in this topic? >
Re: [HACKERS] [PATCH] pageinspect function to decode infomasks
On 10/14/2017 11:47 PM, Peter Geoghegan wrote: > On Sat, Oct 14, 2017 at 10:58 AM, Robert Haas wrote: >> I think it's perfectly sensible to view those 2 bits as making up a >> 2-bit field with 4 states rather than displaying each bit >> individually, but you obviously disagree. Fair enough.> > I guess it is that simple. FWIW, my opinion falls in line with Robert's. Also, whichever way it goes, this is a patch I've been wanting for a long time. -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SIGSEGV in BRIN autosummarize
Hi, On 10/15/2017 03:56 AM, Justin Pryzby wrote: > On Fri, Oct 13, 2017 at 10:57:32PM -0500, Justin Pryzby wrote: ... >> It's a bit difficult to guess what went wrong from this backtrace. For >> me gdb typically prints a bunch of lines immediately before the frames, >> explaining what went wrong - not sure why it's missing here. > > Do you mean this ? > > ... > Loaded symbols for /lib64/libnss_files-2.12.so > Core was generated by `postgres: autovacuum worker process gtt > '. > Program terminated with signal 11, Segmentation fault. > #0 pfree (pointer=0x298c740) at mcxt.c:954 > 954 (*context->methods->free_p) (context, pointer); > Yes. So either 'context' is bogus. Or 'context->methods' is bogus. Or 'context->methods->free_p' is bogus. >> Perhaps some of those pointers are bogus, the memory was already pfree-d >> or something like that. You'll have to poke around and try dereferencing >> the pointers to find what works and what does not. >> >> For example what do these gdb commands do in the #0 frame? >> >> (gdb) p *(MemoryContext)context > > (gdb) p *(MemoryContext)context > Cannot access memory at address 0x7474617261763a20 > OK, this means the memory context pointer (tracked in the header of a chunk) is bogus. There are multiple common ways how that could happen: * Something corrupts memory (typically out-of-bounds write). * The pointer got allocated in an incorrect memory context (which then was released, and the memory was reused for new stuff). * It's a use-after-free. * ... various other possibilities ... > > I uploaded the corefile: > http://telsasoft.com/tmp/coredump-postgres-autovacuum-brin-summarize.gz > Thanks, but I'm not sure that'll help, at this point. We already know what happened (corrupted memory), we don't know "how". And core files are mostly just "snapshots" so are not very useful in answering that :-( regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] oversight in EphemeralNamedRelation support
On Fri, Oct 13, 2017 at 10:01 AM, Tom Lane wrote: > But I see very > little case for allowing CTEs to capture such references, because surely > we are never going to allow that to do anything useful, and we have > several years of precedent now that they don't capture. For what it's worth, SQL Server allows DML in CTEs like us but went the other way on this. Not only are its CTEs in scope as DML targets, it actually lets you update them in cases where a view would be updatable, rewriting as base table updates. I'm not suggesting that we should do that too (unless of course it shows up in a future standard), just pointing it out as a curiosity. -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] On markers of changed data
Hello! > 9 окт. 2017 г., в 10:23, Andrey Borodin написал(а): > > Thanks, Stephen, this actually pointed what to look for > VM is WAL-logged [0] > FSM is not [1] > > [0] > https://github.com/postgres/postgres/blob/113b0045e20d40f726a0a30e33214455e4f1385e/src/backend/access/heap/visibilitymap.c#L315 > [1] > https://github.com/postgres/postgres/blob/1d25779284fe1ba08ecd57e647292a9deb241376/src/backend/storage/freespace/freespace.c#L593 After tests of binary equivalence before and after backup I've come to conclusion, that Visibility Map cannot be backuped incrementally: it's bits are unset without page LSN bump. This can lead to wrong results of Index Only Scans executed on freshly restored backups. In my implementation of incremental backup in WAL-G I will disable any FSM, VM and XACT\CLOG incrementation. Posting this for the record, so that if someone goes this way info will be available. Thank you for your attention. Best regards, Andrey Borodin. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SAP Application deployment on PostgreSQL
There is still some activities regarding this question like https://blogs.sap.com/2017/05/08/replicate-abap-database-table-definition-to-postgresql/ Maybe PG or EDB (with the Oracle compatibility layer) would have been better strategic choice for SAP to conter ORACLE ;o) -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)
2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI < horiguchi.kyot...@lab.ntt.co.jp>: > Hi, thanks for the new patch. > > # The patch is missing xpath_parser.h. That of the first patch was usable. > > At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule > wrote in mail.gmail.com> > > Hi > > > > now xpath and xpath_exists supports default namespace too > > At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule > wrote in gmail.com> > > > 1. Uniformity among simliar features > > > > > > As mentioned in the proposal, but it is lack of uniformity that > > > the xpath transformer is applied only to xmltable and not for > > > other xpath related functions. > > > > > > > I have to fix the XPath function. The SQL/XML function Xmlexists doesn't > > support namespaces/ > > Sorry, I forgot to care about that. (And the definition of > namespace array is of course fabricated by me). I'd like to leave > this to committers. Anyway it is working but the syntax (or > whether it is acceptable) is still arguable. > > SELECT xpath('/a/text()', 'http://example.com";> > test', > ARRAY[ARRAY['', 'http://example.com']]); > | xpath > | > | {test} > | (1 row) > > > The internal name is properly rejected, but the current internal > name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We > are preserving some short names and reject them as > user-defined. Doesn't just 'pgsqlxml' work? > > > Default namespace correctly become to be applied on bare > attribute names. > > > updated doc, > > fixed all variants of expected result test file > > Sorry for one by one comment but I found another misbehavior. > > create table t1 (id int, doc xml); > insert into t1 >values >(5, 'http://x.y";>50 rows>'); > select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x), > '/x:rows/x:row' passing t1.doc columns data int PATH > 'child::x:a[1][attribute::hoge="haha"]') as x; > | data > | -- > |50 > > but the following fails. > > select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), > '/rows/row' passing t1.doc columns data int PATH > 'child::a[1][attribute::hoge="haha"]') as x; > | data > | -- > | > | (1 row) > > Perhaps child::a is not prefixed by the transformation. > the problem was in unwanted attribute modification. The parser didn't detect "attribute::hoge" as attribute. Updated parser does it. I reduce duplicated code there more. > XPath might be complex enough so that it's worth switching to > yacc/lex based transformer that is formally verifiable and won't > need a bunch of cryptic tests that finally cannot prove the > completeness. synchronous_standy_names is far simpler than XPath > but using yacc/lex parser. > > > Anyway the following is nitpicking of the current xpath_parser.c. > > - NODENAME_FIRSTCHAR allows '-' as the first char but it is > excluded from NameStartChar (https://www.w3.org/TR/REC- > xml/#NT-NameStartChar) > I think characters with high-bit set is okay. > Also IS_NODENAME_CHAR should be changed. > fixed > - NODENAME_FIRSTCHAR and IS_NODENAME_CHAR is in the same category > but have different naming schemes. Can these are named in the same way? > fixed > - The current transoformer seems to using up to one token stack > depth. Maybe the stack is needless. (pushed token is always > popped just after) > fixed Regards Pavel > > regards, > > -- > Kyotaro Horiguchi > NTT Open Source Software Center > > diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index b52407822d..af72a07326 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -10468,7 +10468,8 @@ SELECT xml_is_well_formed_document('http://postgresql.org/stuf second the namespace URI. It is not required that aliases provided in this array be the same as those being used in the XML document itself (in other words, both in the XML document and in the xpath - function context, aliases are local). + function context, aliases are local). Default namespace has + empty name (empty string) and should be only one. @@ -10487,8 +10488,8 @@ SELECT xpath('/my:a/text()', 'http://example.com";>test', To deal with default (anonymous) namespaces, do something like this: